Nuclear Physics B 775 (2007) 1–30
Anomalous dimensions of high-spin operators beyond the leading order
B. Basso, G.P. Korchemsky
Laboratoire de Physique Théorique ¹, Université de Paris XI, 91405 Orsay Cédex, France
Received 9 January 2007; received in revised form 5 March 2007; accepted 14 March 2007
Available online 19 April 2007
Dedicated to the memory of Alexander Nikolaevich Vasil'ev
Abstract
Anomalous dimensions of Wilson operators with large Lorentz spin scale logarithmically with the spin. Recent multi-loop QCD calculations of twist-two anomalous dimensions revealed the existence of an interesting structure of the subleading corrections suppressed by powers of the Lorentz spin. We argue that this structure is a manifestation of the self-tuning property of the multi-loop anomalous dimensions: in a conformal gauge theory, the anomalous dimension of Wilson operators is a function of their conformal spin, which is modified in higher loops by an amount proportional to the anomalous dimension. Making use of the parity property of this function and incorporating the beta-function contribution, we demonstrate the existence of (an infinite number of) relations between subleading corrections to the twist-two anomalous dimensions in QCD and its supersymmetric extensions. They imply that the subleading corrections to the anomalous dimensions suppressed by odd powers of the conformal spin can be expressed in terms of the lower-loop corrections suppressed by smaller even powers of the spin. We show that these relations hold true in QCD to all loops in the large-$\beta_0$ limit. In the $\mathcal{N} = 4$ SYM theory, we employ the AdS/CFT correspondence to argue that the same relations survive in the strong coupling regime for higher-twist scalar operators dual to a folded string rotating on $AdS_3 \times S^1$.
© 2007 Elsevier B.V. All rights reserved.
* Corresponding author.
E-mail address: korchems@th.u-psud.fr (G.P. Korchemsky).

¹ Unité Mixte de Recherche du CNRS (UMR 8627).
0550-3213/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.nuclphysb.2007.03.044
1. Introduction
Success of QCD as the theory of strong interactions relies in part on the possibility to predict
the scale dependence of various physical observables. In the most prominent example of deeply
inelastic scattering, this dependence is governed by anomalous dimensions of composite Wilson operators that arise in the operator product expansion of conserved currents. The anomalous
dimensions depend on the coupling constant as well as on the quantum numbers of operators
(Lorentz spin, isotopic charge) and their properties reflect the symmetries of the underlying gauge
theory. In the maximally supersymmetric $\mathcal{N} = 4$ Yang–Mills theory, the gauge/string correspondence allows one to relate the anomalous dimensions at strong coupling to the energies of strings
propagating on the $AdS_5 \times S^5$ background [1]. In the case of QCD, due to lack of a similar dual
stringy description, one can only calculate the anomalous dimensions at weak coupling as series
in the coupling constant.
At present, the most advanced QCD calculations of anomalous dimensions have been performed to three-loop accuracy for the Wilson operators of twist two [2,3]. The resulting expressions for multi-loop anomalous dimensions are complicated functions of the Lorentz spin $N$ carried by the Wilson operators but they simplify significantly for large $N$. In this limit, for $N \gg 1$, the anomalous dimensions scale logarithmically, $\gamma(N) \sim \ln N$, with subleading corrections running in inverse powers of $N$ [4]. The explicit three-loop calculation [2,3] revealed the existence of an interesting structure of the subleading corrections. It turned out that, to three-loop accuracy, the correction $\sim 1/N$ to the twist-two anomalous dimension can be expressed in terms of the leading $\sim N^0$ correction evaluated to two loops only (the MVV relation). Recently, it was shown in Refs. [5] that this relation can be explained by modifying the evolution (renormalization group) equation in such a way [6] as to preserve the reciprocity relation between its solutions (see Eq. (1.2) below).
In this paper, we shall argue that the MVV relation is a manifestation of the following self-tuning property of the multi-loop anomalous dimensions [7]: in a conformal gauge theory, the anomalous dimension is a function of the conformal spin of the Wilson operator, which is modified at higher loops by an amount proportional to the anomalous dimension. Making use of the parity property of this function and incorporating the beta-function contribution, we shall demonstrate the existence of (an infinite number of) MVV-like relations between subleading corrections to the twist-two anomalous dimensions at large $N$ in QCD and its supersymmetric extensions. We shall show that subleading corrections to the anomalous dimensions suppressed by odd powers of $1/N$ are not independent and can be expressed in terms of coefficients accompanying smaller even powers of $1/N$. In the $\mathcal{N} = 4$ SYM theory, we shall employ the AdS/CFT correspondence to argue that the same relations survive in the strong coupling regime for scalar operators of higher twist.
The anomalous dimensions of twist-two operators govern the scale dependence of nonperturbative parton (quark, gluon) distribution functions in the deeply inelastic scattering (DIS) [8–10]. They are related to the parton space-like ($S$) splitting functions $P_S(z)$ through the Mellin transform
$$\gamma_S(N) = -\int_0^1 dz\, z^{N-1} P_S(z). \tag{1.1}$$
The function $P_S(z)$ defined in this way has a clear physical meaning. It defines the probability for a parton to split into another parton carrying the momentum fraction $z$.² One encounters another set of nonperturbative parton fragmentation functions in hadron production in the $e^+e^-$ annihilation, which is just the cross-channel of the DIS process (see review [12] and references therein). The scale dependence of the fragmentation functions is governed by the time-like ($T$) splitting functions $P_T(z)$ whose moments provide yet another set of anomalous dimensions, $\gamma_T(N) = -\int_0^1 dz\, z^{N-1} P_T(z)$. In distinction with the parton distribution functions, the fragmentation functions do not admit the operator product expansion and, as a consequence, their anomalous dimensions $\gamma_T(N)$ are not related to composite local Wilson operators. The two functions are related, however, by the crossing symmetry. This suggests writing down the following reciprocity relations between the space-like, $P_S(z)$, and time-like, $P_T(z)$, splitting functions
$$P_T(z) = P_S(z), \qquad P_T(z) = -z P_S(1/z), \tag{1.2}$$
the so-called Gribov–Lipatov [13] and Drell–Levy–Yan [14] relations. Here in the first relation the two functions are equated within their physical region $0 \le z < 1$, while in the second relation the space-like splitting function is analytically continued through $z = 1$. In general, the two relations in (1.2) are independent of each other. However, if they were fulfilled simultaneously, one would obtain a strong reciprocity relation for $P(z) = P_T(z) = P_S(z)$
$$P(z) = -z P(1/z). \tag{1.3}$$
It is known [15–19] that this relation is satisfied to one loop only, while both relations in (1.2) are violated to two loops.³
Another set of relations for the twist-two anomalous dimensions comes from a space–time picture of parton (quark, gluon) splitting in the space-like (distribution) and time-like (fragmentation) processes. In the latter case, imposing the condition of the angular ordering for successive partonic splittings one obtains the following remarkable functional equation for the leading asymptotic behaviour of the time-like gluon anomalous dimension $\gamma_T(N)$ at small $N$ [21]
$$\gamma_T(N) = \gamma_S\big(N - \gamma_T(N)\big), \qquad \gamma_S(N) = -\frac{\alpha_s N_c}{\pi N} + \ldots, \tag{1.4}$$
where ellipses stand for subleading terms as $N \to 0$ and $\alpha_s/N^2 = g_s^2/(4\pi N^2) = \text{fixed}$. The relation (1.4) leads to a quadratic equation for $\gamma_T(N)$ whose solution resums the leading corrections $\sim \alpha_s^n/N^{2n-1}$ to $\gamma_T(N)$ for all $n \ge 1$. It has been proposed in Refs. [5,6] that a relation similar to (1.4) should also hold at large $N$⁴
$$\gamma_\sigma(N) = f\Big(N - \frac{1}{2}\sigma\gamma_\sigma(N)\Big), \tag{1.5}$$
where $\sigma = -$ for the space-like and $\sigma = +$ for the time-like anomalous dimensions, $\gamma_-(N) = \gamma_S(N)$ and $\gamma_+(N) = \gamma_T(N)$. The function $f(N)$ is assumed to be the same for $\sigma = \pm$ and, most importantly, its large $N$ expansion takes the form
$$f(N) = A\big(\psi(N+1) + \gamma_E\big) + B + 0\cdot N^{-1} + O\big(N^{-2}\big), \tag{1.6}$$
where $\psi(x) = d\ln\Gamma(x)/dx$ and $\gamma_E$ is the Euler constant. Then, the relation (1.5) leads to the following expression for the twist-two (nonsinglet quark and diagonal elements of the singlet mixing matrix) anomalous dimensions at large $N$
$$\gamma_\sigma(N) = A\ln\bar N + B + C_\sigma N^{-1}\ln\bar N + \Big(D_\sigma + \frac{1}{2}A\Big)N^{-1} + O\big(N^{-2}\big), \tag{1.7}$$
with $\bar N = N e^{\gamma_E}$. Here the leading term is related to the universal cusp anomalous dimension [4]
$$A = 2\Gamma_{\rm cusp}(\alpha_s), \tag{1.8}$$
the $B$-coefficient is the same for the space-like and time-like distributions but it depends on the quantum numbers of partons (flavour, spin projection). From (1.5) and (1.6) one finds the remaining coefficients as [5]
$$C_\sigma^{(\rm MVV)} = -\frac{1}{2}\sigma A^2(\alpha_s), \qquad D_\sigma = -\frac{1}{2}\sigma A(\alpha_s)B(\alpha_s). \tag{1.9}$$
The expression for $C_\sigma$ is in agreement with explicit three-loop QCD result for the space-like [2,3] and (nonsinglet) time-like anomalous dimensions [22], while for $D_\sigma$ the mismatch is proportional to the beta-function
$$D_\sigma^{(\rm MVV)} = -\frac{1}{2}\sigma A(\alpha_s)B(\alpha_s) - \frac{1}{2}A(\alpha_s)\beta(\alpha_s). \tag{1.10}$$
As was emphasized in Refs. [2,5], this calls for a further structural explanation of subleading $1/N$ corrections to the anomalous dimensions (1.7) at large $N$ and, more generally, for a better understanding of the origin of the functional relations (1.4) and (1.5).
² The splitting functions cease to have a probabilistic interpretation at higher twists [11].
³ Notice that the second relation in (1.2) still holds true to two loops for the so-called physical evolution kernels governing the scale dependence of the DIS structure functions and the hadron fragmentation functions [20].
⁴ Due to different conventions for the anomalous dimension (see Eq. (2.1) below), $\gamma_\sigma(N)$ differs from similar expression in Ref. [5] by the overall factor $(-1/2)$.
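The coefficients (1.9) can be recovered in a few lines from (1.5) and (1.6). The following is a sketch of that expansion, keeping only terms up to $N^{-1}$ and using the standard large-argument behaviour $\psi(N+1) = \ln N + 1/(2N) + O(N^{-2})$. The shorthand $\delta$ for the shift of the argument is introduced here for convenience and is not part of the original notation.

```latex
\begin{align*}
\delta &\equiv -\tfrac12\,\sigma\,\gamma_\sigma(N)
        = -\tfrac12\,\sigma\big(A\ln\bar N + B\big) + O\big(N^{-1}\ln\bar N\big),\\
f(N+\delta) &= A\ln\bar N + B + \frac{A}{2N} + \frac{A\,\delta}{N}
        + O\big(N^{-2}\ln^2\bar N\big)\\
 &= A\ln\bar N + B - \tfrac12\,\sigma A^2\,\frac{\ln\bar N}{N}
        + \Big(-\tfrac12\,\sigma A B + \tfrac12 A\Big)\frac{1}{N} + \dots
\end{align*}
```

Comparing with (1.7) gives $-\frac12\sigma A^2$ for the coefficient of $N^{-1}\ln\bar N$ and $-\frac12\sigma AB$ for $D_\sigma$, that is (1.9). The shift $\delta$ grows like $\ln N$, which is why a function $f$ without an explicit $N^{-1}\ln N$ term still generates one in $\gamma_\sigma$.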
As we will see momentarily, for the space-like anomalous dimensions $\gamma_S(N)$, the relation (1.5) is ultimately tied to conformal invariance of a classical four-dimensional Yang–Mills theory. Moreover, the conformal invariance allows one to extend the relation (1.5) to anomalous dimensions of the so-called quasipartonic operators [23] of arbitrary twist $L$. A distinguished feature of these operators is that they are built from exactly $L$ fundamental fields of twist 1 and from an arbitrary number of covariant derivatives projected onto the light-cone. The twist-two operators correspond to $L = 2$.
The classical Yang–Mills Lagrangian is invariant under conformal transformations but this symmetry is broken on the quantum level unless the beta-function vanishes to all loops (see review [24] and references therein). If the conformal symmetry were exact (as in $\mathcal{N} = 4$ SYM theory), the quasipartonic operators could be classified according to representations of the collinear $SL(2;\mathbb{R})$ subgroup of the full $SO(2,4)$ conformal group [25]. Conformal invariance ensures that the operators belonging to different $SL(2;\mathbb{R})$ multiplets cannot mix under renormalization and their anomalous dimension, $\gamma_S(N)$, depends on the conformal $SL(2;\mathbb{R})$ spin of the multiplet. By definition, the conformal spin of the quasipartonic operator of twist $L$ equals $j = \frac{1}{2}(N + \Delta)$ with $N$ and $\Delta$ being its Lorentz spin and the scaling dimension, respectively. The former is protected from perturbative corrections, whereas the latter equals $N + L$ to the lowest order in the coupling constant and receives anomalous contribution $\gamma_S(N)$ at higher orders. As a result, the conformal spin of quasipartonic operators gets modified in higher loops as [7,26,27]
$$j_{\rm bare} = N + \frac{1}{2}L \quad\to\quad j = N + \frac{1}{2}L + \frac{1}{2}\gamma_S(N). \tag{1.11}$$
Then, the conformal symmetry implies that the anomalous dimension of the quasipartonic operator is a function of $j$, or equivalently
$$\gamma_S(N) = f^{(\beta=0)}\Big(N + \frac{1}{2}\gamma_S(N)\Big), \tag{1.12}$$
where the scaling function $f(N)$ depends on twist $L$ and other quantum numbers of the operator. Here the superscript indicates that the function $f^{(\beta=0)}$ is defined in gauge theory with vanishing beta-function. The relation (1.12) coincides with the relation (1.5) in the space-like case $\sigma = -$.
In gauge theory with nonvanishing beta-function, the relation (1.12) should be modified to incorporate the additional conformal symmetry breaking corrections. The latter are known to be renormalization scheme dependent. The analysis can be simplified by renormalizing the Wilson operators in the dimensional regularization scheme (DREG) with $d = 4 - 2\epsilon$.⁵ In this case, the coupling constant acquires a nonvanishing dimension and the beta-function in $(4 - 2\epsilon)$-dimensional gauge theory is given by
$$\beta_\epsilon(\alpha_s) = -2\epsilon + \beta(\alpha_s). \tag{1.13}$$
A crucial observation is that $\beta_\epsilon(\alpha_s)$ vanishes at $\epsilon_{\rm cr} = \beta(\alpha_s)/2$ and, therefore, the gauge theory is conformal in $d_{\rm cr} = 4 - 2\epsilon_{\rm cr}$ dimensions. This opens up the possibility to calculate the anomalous dimensions within the $\epsilon$-expansion using the conformal field theory technique as pioneered in Refs. [28,29]. As before, the quasipartonic operators can be classified according to representations of the $SL(2;\mathbb{R})$ collinear subgroup of the conformal group in $d_{\rm cr}$ dimensions and their anomalous dimension is a function of the conformal spin $j_{\rm cr} = \frac{1}{2}(N + \Delta_{\rm cr})$. The only difference as compared with the $d = 4$ case is that for $\epsilon_{\rm cr} \neq 0$ the scaling dimensions of fundamental fields get shifted by $(-\epsilon_{\rm cr})$ so that the scaling dimension of the quasipartonic operator built from $L$ fields takes the form $\Delta_{\rm cr} = N + L - L\epsilon_{\rm cr} + \gamma_S(N)$. Then, the conformal invariance in $d_{\rm cr}$ dimensions implies that the anomalous dimension of twist-$L$ operators in $d = 4$-dimensional gauge theory is a function of the conformal spin $j_{\rm cr}$
$$\gamma_S(N) = f^{(\beta\neq 0)}\Big(N + \frac{1}{2}\gamma_S(N) - \frac{1}{4}L\beta(\alpha_s)\Big). \tag{1.14}$$
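The step from the conformal spin in $d_{\rm cr}$ dimensions to the argument of the scaling function in (1.14) is a short piece of arithmetic. As a sketch, substituting the expressions for $\Delta_{\rm cr}$ and $\epsilon_{\rm cr}$ quoted in the text:

```latex
\begin{align*}
j_{\rm cr} &= \tfrac12\big(N + \Delta_{\rm cr}\big)
            = \tfrac12\big(2N + L - L\,\epsilon_{\rm cr} + \gamma_S(N)\big)\\
           &= N + \tfrac12 L + \tfrac12\gamma_S(N) - \tfrac12 L\,\epsilon_{\rm cr}
            = N + \tfrac12 L + \tfrac12\gamma_S(N) - \tfrac14 L\,\beta(\alpha_s).
\end{align*}
```

The $N$-independent term $\frac12 L$ is the same as in (1.11) and can be absorbed into the definition of the scaling function, which depends on $L$ in any case. What remains is the argument appearing in (1.14), and for $\beta(\alpha_s) = 0$ the expression reduces to (1.11).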
It is important to stress that, in distinction with $f^{(\beta=0)}(N)$, the function $f^{(\beta\neq 0)}(N)$ depends on the renormalization scheme. Invoking the same arguments that led to (1.5), one can extend the relation (1.14) to the time-like anomalous dimension
$$\gamma_T(N) = f^{(\beta\neq 0)}\Big(N - \frac{1}{2}\gamma_T(N) - \frac{1}{4}L\beta(\alpha_s)\Big). \tag{1.15}$$
Eqs. (1.14) and (1.15) generalize the relations (1.5) proposed in [5,6]. They relate the space-like and time-like anomalous dimension of twist $L$ to the same function $f^{(\beta\neq 0)}(N)$ and incorporate the beta-function contribution.
Conformal symmetry allows one to relate the space-like anomalous dimension $\gamma_S(N)$ to yet another function $f(N)$ but it does not tell us much about the properties of the latter. Therefore, to make use of the functional relation (1.14) it has to be supplemented with an additional relation for the scaling function $f(N)$. As a hint, let us consider the twist-two anomalous dimension (1.7) and examine the first few terms in the large $N$ expansion of $f(N)$, Eq. (1.6). Their substitution into (1.14) and (1.15) yields (for $L = 2$) the large $N$ expansion of the space-like and time-like anomalous dimensions (1.7) with the $D$-coefficient matching the exact three-loop result (1.10). Notice that it is due to the zero value of the coefficient in front of $N^{-1}$ in the right-hand side of (1.6) that the $C$- and $D$-coefficients parameterizing subleading $\sim N^{-1}$ corrections to the anomalous dimension can be expressed in terms of the leading coefficients, Eqs. (1.9) and (1.10).
⁵ In SYM theories, to preserve the supersymmetry, one uses the dimensional reduction scheme (DRED) instead.
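The substitution described above can be checked numerically. The sketch below is illustrative only: it takes the truncated scaling function (1.6), solves the functional relations (1.14) and (1.15) for $L = 2$ by fixed-point iteration, and compares the result with the expansion (1.7) using the coefficients (1.9) and (1.10). The numerical values of `a`, `b` and `beta` are arbitrary placeholders rather than QCD values, the sign conventions are those of (1.14) and (1.15) as written here, and all function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma(x):
    """psi(x) for real x > 0 via upward recurrence plus the asymptotic series."""
    shift = 0.0
    while x < 10.0:
        shift -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252 - inv2 / 240)))
    return shift + math.log(x) - 0.5 / x - tail


def f_truncated(n, a, b):
    """Leading terms of the scaling function, Eq. (1.6): A(psi(N+1) + gamma_E) + B."""
    return a * (digamma(n + 1.0) + EULER_GAMMA) + b


def solve_gamma(n, a, b, beta, sigma, twist=2, iterations=50):
    """Fixed-point solution of gamma = f(N - sigma*gamma/2 - L*beta/4).

    sigma = -1 is the space-like relation (1.14), sigma = +1 the time-like (1.15).
    """
    gamma = f_truncated(n, a, b)
    for _ in range(iterations):
        gamma = f_truncated(n - 0.5 * sigma * gamma - 0.25 * twist * beta, a, b)
    return gamma


def gamma_asymptotic(n, a, b, beta, sigma):
    """Eq. (1.7) with C from (1.9) and D from (1.10), for twist L = 2."""
    log_nbar = math.log(n) + EULER_GAMMA
    c = -0.5 * sigma * a * a
    d = -0.5 * sigma * a * b - 0.5 * a * beta
    return a * log_nbar + b + c * log_nbar / n + (d + 0.5 * a) / n


def residual(n, a=0.2, b=-0.3, beta=-0.1, sigma=-1):
    """Difference between the iterated solution and the truncated expansion."""
    return solve_gamma(n, a, b, beta, sigma) - gamma_asymptotic(n, a, b, beta, sigma)


if __name__ == "__main__":
    for sigma in (-1, +1):
        print(sigma, residual(1.0e4, sigma=sigma))
```

The residual is expected to be of relative order $N^{-2}\ln^2 N$, so it should shrink roughly a hundredfold when $N$ is increased by a factor of ten. Fixed-point iteration is adequate here because the map contracts with a rate of order $A/N$.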
The question arises whether similar relations also exist for the subleading $\sim N^{-n}$ corrections (for $n \ge 2$). We shall argue that the answer is positive and it relies on the following remarkable property of the function $f(N)$: the large $N$ expansion of the function $f(N)$ only runs in integer negative powers of the parameter
$$J^2 = N(N+1), \tag{1.16}$$
which has the meaning of a bare quadratic Casimir of the collinear $SL(2;\mathbb{R})$ group. For twist-two operators, the same property can be expressed as (with $\bar J = J e^{\gamma_E}$)
$$f(N) = \Gamma_{\rm cusp}(\alpha_s)\ln\bar J^2 + f^{(0)} + f^{(1)}/J^2 + f^{(2)}/J^4 + O\big(1/J^6\big), \tag{1.17}$$
where $f^{(k)}$ (with $k \ge 0$) are given by series in $\alpha_s$ with the coefficients depending on $\ln\bar J$. The relation (1.17) implies that each term of the asymptotic series (1.17) is invariant (modulo contribution from $\ln\bar J$) under $J \to -J$, or equivalently $N \to -1 - N$. In what follows we shall refer to (1.17) as a parity preserving asymptotic series.⁶ We will show that the relations (1.14), (1.15) and (1.17) are in perfect agreement with available two- and three-loop results for various anomalous dimensions in QCD and in supersymmetric Yang–Mills theories.
The relation (1.17) has a number of important consequences. Firstly, going back from the $N$-space to the $z$-representation, $f(N) = -\int_0^1 dz\, z^{N-1} P_f(z)$, one finds that (1.17) is translated into the reciprocity condition (1.3) for the splitting function $P_f(z)$. Secondly, substituting (1.17) into (1.14) and (1.15), one obtains that the coefficients in front of $N^{-2n}$ and $N^{-2n-1}$ in the large $N$ expansion of the twist-two anomalous dimensions are related to the same coefficients $f^{(m)}$ with $0 \le m \le n$. Excluding the latter, one gets an infinite set of relations between the former (odd and even) coefficients, including the relations (1.9) and (1.10) for $n = 0$. Moreover, re-expanding the anomalous dimensions at large $N$ in inverse powers of $J = \sqrt{N(N+1)}$, one finds that corrections suppressed by odd powers of $J$ can be expressed in terms of the coefficients accompanying smaller even powers of $J$ to a smaller number of loops.
The relations (1.14) and (1.17) are universal properties of the twist-two operators in gauge theories: they hold true both in QCD and in supersymmetric Yang–Mills theories with even number of supercharges $\mathcal{N} = 0, 2, 4$, to three loops at least. In the $\mathcal{N} = 1$ theory the function $f(N)$ develops an additional square-root singularity $\sim \sqrt{1/J^2}$ due to degeneracy of the parity-even and parity-odd anomalous dimensions in the large $N$ limit. Finally, the relation (1.17) can also be tested to all loops: in QCD, we will determine the function $f(N)$ in the so-called large-$\beta_0$ limit and show that it indeed verifies (1.17). In the $\mathcal{N} = 4$ theory, we will apply the gauge/string correspondence to evaluate $f(N)$ in the strong coupling regime for the special class of scalar operators of higher twist dual to a folded string rotating on the $AdS_3 \times S^1$ part of the target space of the type IIB string theory [30,31].
⁶ The same property would not be manifest if one replaced $J^2$ by its expression (1.16) and re-expanded $f(N)$ in powers of $1/N$. The resulting series for $f(N)$ involves both even and odd powers of $1/N$ and its first three terms match (1.6) for $f^{(0)} = B$.
The paper is organized as follows. In Section 2, we apply well-known one-loop expressions for various twist-two anomalous dimensions in QCD and in SYM theories to justify the parity preserving relation (1.17) and to establish its connection with the reciprocity relation (1.3). In Section 3, we use explicit results for two-loop and three-loop anomalous dimensions available in the literature to verify the relations (1.14), (1.15) and (1.17) beyond the leading order. Then, we make use of the parity preserving property (1.17) to find the relations between different terms in the large $N$ expansion of the anomalous dimensions analogous to (1.9) and (1.10). In Section 4, we argue that the property (1.17) holds true in QCD for twist-two anomalous dimensions to all loops in the large-$\beta_0$ limit. In Section 5, we employ the AdS/CFT correspondence to determine the function $f(N)$ in the planar $\mathcal{N} = 4$ theory in the strong coupling regime for scalar operators of higher twist. Section 6 contains concluding remarks. Some details of our calculations are presented in Appendix A.
2. Symmetries of twist-two anomalous dimensions
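Let us start with the definition of the space-like anomalous dimensions in a generic four-dimensional Yang–Mills theory with the $SU(N_c)$ gauge group. In general, the Wilson operators with the same Lorentz spin $N$ mix under renormalization and verify the Callan–Symanzik equation
$$\Big(\frac{\partial}{\partial\ln\mu} + \beta(\lambda)\frac{\partial}{\partial\ln\lambda}\Big) O_N^{(a)}(0) = -\gamma_{ab}(N)\, O_N^{(b)}(0), \tag{2.1}$$
where the coupling constant is defined slightly differently in QCD and in SYM theory
$$\lambda^{(\rm QCD)} = \frac{g_s^2}{8\pi^2} = \frac{\alpha_s}{2\pi}, \qquad \lambda^{(\rm SYM)} = \frac{g^2 N_c}{8\pi^2}. \tag{2.2}$$
Here the mixing matrix is given by a series in the coupling constant
$$\gamma(N) = \lambda\gamma_0(N) + \lambda^2\gamma_1(N) + \ldots, \tag{2.3}$$
with $\gamma_k(N)$ being matrices, and the beta-function is defined as
$$\frac{d\ln\lambda}{d\ln\mu} = \beta(\lambda) = -\beta_0\lambda - \beta_1\lambda^2 + O\big(\lambda^3\big). \tag{2.4}$$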
The beta-function coefficients are given in QCD by
$$\beta_0^{(\rm QCD)} = \frac{11}{3}N_c - \frac{2}{3}n_f, \qquad \beta_1^{(\rm QCD)} = \frac{17}{3}N_c^2 - \frac{5}{3}N_c n_f - C_F n_f \tag{2.5}$$
with $n_f$ the number of quark flavours and $C_F = (N_c^2 - 1)/(2N_c)$ the quadratic Casimir in the fundamental representation of the $SU(N_c)$ group. In the SYM theory with $\mathcal{N}$ supercharges one has
$$\beta_0^{(\rm SYM)} = (4 - \mathcal{N}), \qquad \beta_1^{(\rm SYM)} = (2 - \mathcal{N})(4 - \mathcal{N}). \tag{2.6}$$
For $\mathcal{N} = 2$ the exact beta-function is given by the one-loop expression while for $\mathcal{N} = 4$ it vanishes to all loops and the corresponding gauge theory is conformal.
2.1. Nonsinglet anomalous dimensions in QCD

Let us first consider nonsinglet twist-two quark operators in QCD. They take the form
$$O_N^{(\rm ns)}(0) = \bar q_{\alpha,i}\, T^a_{ij}\, \Gamma_{\alpha\beta}\, D_+^{N-1} q_{\beta,j}(0), \tag{2.7}$$
where $q_{\beta,j}$ denotes the quark field component carrying the flavour $j$ and the spinor index $\beta$. Also, $D_+ = \partial_+ + igA_+$ is the light-cone component of the covariant derivative. The matrices $T^a_{ij}$ stand for the generators of the flavour $SU(n_f)$ group while the spinor matrices $\Gamma_{\alpha\beta}$ serve to select the so-called good (twist-one) components of quark and antiquark fields either with a definite helicity, or with a transverse polarization [24]. A distinguished feature of the operators (2.7) is that they cannot mix under renormalization with other operators but only with the operators $\partial_+^n O^{(\rm ns)}_{N-n}(0)$ containing total derivatives. This mixing can be avoided by going over from the operators $O_N^{(\rm ns)}(0)$ to their forward matrix elements with respect to some reference state, $\langle O_N^{(\rm ns)}(0)\rangle$.
Depending on the choice of the $\Gamma$-matrices in (2.7) one can distinguish three different twist-two operators: unpolarized ($\Gamma = \gamma_+$), longitudinally polarized ($\Gamma = \gamma_+\gamma_5$) and transversity ($\Gamma = \sigma_{+\perp}$) operators.⁷ To the lowest order in the coupling constant their anomalous dimensions are given by well-known expressions [23]
$$\gamma_0(N) = C_F\Big[4\psi(N+1) + 4\gamma_E - 3 - \frac{1+\eta}{N(N+1)}\Big], \tag{2.8}$$
where $\eta = 1$ for the unpolarized and longitudinally polarized operators and $\eta = -1$ for the transversity operator. Here $\psi(x) = d\ln\Gamma(x)/dx$ and $\gamma_E = -\psi(1)$ is the Euler constant. In the momentum fraction representation (1.1), the corresponding splitting functions read
$$P_0(z) = C_F\Big[\frac{4z}{(1-z)_+} + 3\delta(1-z) + (1-z)(1+\eta)\Big], \tag{2.9}$$
where the "+" distribution is defined in a standard way
$$\int_0^1 dz\,\frac{\phi(z)}{(1-z)_+} = \int_0^1 dz\,\frac{\phi(z) - \phi(1)}{1-z} \tag{2.10}$$
with $\phi(z)$ being an arbitrary test function.
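The link between (2.8) and (2.9) through the Mellin transform (1.1) can be illustrated numerically for integer spin. The sketch below is only an illustration under stated assumptions: it uses $\psi(N+1) + \gamma_E = H_N$ (the harmonic number) for integer $N$, applies the plus-prescription (2.10) to the test function $\phi(z) = 4z^N$, and sets $C_F = 4/3$ (the value for $N_c = 3$) as a default. The function names and the Simpson-rule helper are hypothetical choices made here.

```python
def harmonic(n):
    """H_n = psi(n + 1) + gamma_E for a positive integer n."""
    return sum(1.0 / k for k in range(1, n + 1))


def gamma0_closed_form(n, eta, cf=4.0 / 3.0):
    """One-loop anomalous dimension, Eq. (2.8), at integer spin n."""
    return cf * (4.0 * harmonic(n) - 3.0 - (1.0 + eta) / (n * (n + 1)))


def simpson(func, a, b, panels=2000):
    """Composite Simpson rule; panels must be even."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def gamma0_from_moments(n, eta, cf=4.0 / 3.0):
    """-int_0^1 dz z^(n-1) P_0(z) with P_0 from Eq. (2.9)."""
    def plus_integrand(z):
        # (phi(z) - phi(1)) / (1 - z) for phi(z) = 4 z^n is the polynomial
        # -4 (1 + z + ... + z^(n-1)), so there is no singularity at z = 1
        return -4.0 * sum(z ** k for k in range(n))

    def regular_integrand(z):
        return z ** (n - 1) * (1.0 - z) * (1.0 + eta)

    plus_part = simpson(plus_integrand, 0.0, 1.0)
    delta_part = 3.0              # 3 delta(1 - z) picks out z^(n-1) at z = 1
    regular_part = simpson(regular_integrand, 0.0, 1.0)
    return -cf * (plus_part + delta_part + regular_part)


if __name__ == "__main__":
    for spin in range(1, 6):
        print(spin, gamma0_closed_form(spin, 1.0), gamma0_from_moments(spin, 1.0))
```

For $\eta = 1$ the first moment vanishes, as expected for a conserved current; the two evaluations should agree for every integer spin and both values of $\eta$.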

As was already mentioned in the Introduction, to one-loop order the space-like and time-like splitting functions satisfy the reciprocity relations (1.2). Indeed, one verifies that for $z \neq 1$ the splitting function $P_0(z)$, Eq. (2.9), fulfills the relation (1.3). The same property in the $N$-space finds its manifestation in the asymptotic behaviour of the anomalous dimension $\gamma_0(N)$ for large $N$. In this limit, the anomalous dimension (2.8) scales as $\gamma_0(N) = 4C_F\ln N + \ldots$ with subleading corrections running in powers of $1/N$. However, changing the expansion parameter from $N$ to $J^2 = N(N+1)$ and expanding the anomalous dimensions in inverse powers of $J$ one finds that the subleading corrections run in even powers of $1/J$ only
$$\gamma_0(N) = C_F\Big[2\ln\bar J^2 - 3 - \Big(\eta + \frac{1}{3}\Big)J^{-2} - \frac{2}{15}J^{-4} + \frac{16}{315}J^{-6} + O\big(J^{-8}\big)\Big], \tag{2.11}$$
where $\bar J = J e^{\gamma_E}$. As follows from (1.14), the functions $\gamma_S(N)$ and $f(N)$ coincide to the lowest order in the coupling constant, $f(N) = \lambda\gamma_0(N) + O(\lambda^2)$, but differ from each other starting from two loops. Matching (2.11) into (1.17), one can identify the one-loop contribution to the cusp anomalous dimension $\Gamma_{\rm cusp}(\alpha_s)$ and to the coefficients $f^{(n)}$.
⁷ In the case of transversity, the operators (2.7) evolve autonomously even for flavour singlet quark distributions. The same holds true for the linearly polarized gluon distribution [32].
Each term of the expansion (2.11) is invariant under substitution $J \to -J$, or equivalently $N \to -1 - N$, whereas the exact expression (2.8) is not
$$\gamma_0(N) - \gamma_0(-N-1) = -4\pi C_F\cot(\pi N). \tag{2.12}$$
Still, the two relations, Eqs. (2.11) and (2.12), are consistent with each other: the one-loop anomalous dimension (2.8) has (Regge) poles at negative integer $N$ and the expansion (2.11) is only well defined for $\mathrm{Re}\,N > 0$. Regge singularities of $\gamma_0(N)$ lead to a factorial growth of the expansion coefficients in the right-hand side of (2.11) (they scale as $(-1)^n(2n)!/(n(2\pi J)^{2n})$ for $n \gg 1$).⁸
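Both statements, the even-power expansion (2.11) and the reflection property (2.12), can be checked numerically from the closed form (2.8). The sketch below is an illustration under stated assumptions: the digamma function is implemented by hand (recurrence plus the standard asymptotic series) so that it also covers negative non-integer arguments, $C_F = 4/3$ is an illustrative default, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma(x):
    """psi(x) for real x that is not zero or a negative integer."""
    shift = 0.0
    while x < 10.0:
        shift -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x, valid for x < 0 too
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252 - inv2 / 240)))
    return shift + math.log(x) - 0.5 / x - tail


def gamma0_exact(n, eta, cf=4.0 / 3.0):
    """Closed form of Eq. (2.8) for real spin n."""
    return cf * (4.0 * (digamma(n + 1.0) + EULER_GAMMA) - 3.0
                 - (1.0 + eta) / (n * (n + 1.0)))


def gamma0_series(n, eta, cf=4.0 / 3.0):
    """Truncated expansion (2.11) in powers of 1/J^2, with J^2 = n(n + 1)."""
    j2 = n * (n + 1.0)
    log_jbar2 = math.log(j2) + 2.0 * EULER_GAMMA
    return cf * (2.0 * log_jbar2 - 3.0 - (eta + 1.0 / 3.0) / j2
                 - (2.0 / 15.0) / j2 ** 2 + (16.0 / 315.0) / j2 ** 3)


def reflection_defect(n, eta, cf=4.0 / 3.0):
    """gamma0(n) - gamma0(-n-1) + 4 pi C_F cot(pi n); zero if Eq. (2.12) holds."""
    lhs = gamma0_exact(n, eta, cf) - gamma0_exact(-n - 1.0, eta, cf)
    return lhs + 4.0 * math.pi * cf / math.tan(math.pi * n)


if __name__ == "__main__":
    print(gamma0_exact(10.0, 1.0) - gamma0_series(10.0, 1.0))
    print(reflection_defect(2.3, 1.0))
```

The term proportional to $1 + \eta$ depends on $N$ only through $N(N+1)$ and drops out of the difference in (2.12), so the whole asymmetry comes from the digamma function. The truncation error of the series is of order $J^{-8}$.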
Let us now show that the validity of the reciprocity relation (1.3) for the splitting function and the absence of odd powers of $1/J$ in the asymptotic expansion of the anomalous dimension (1.1) are in one-to-one correspondence with each other. To this end, one changes the integration variable in (1.1) as $z = e^{-x/j}$ with $j = N + \frac{1}{2} = (J^2 + \frac{1}{4})^{1/2}$ and evaluates the derivative $\gamma'(N) = \partial_N\gamma(N)$ as
$$\gamma'(N) = \frac{1}{j^2}\int_0^\infty dx\, e^{-x}\, x\, e^{x/(2j)}\, P\big(e^{-x/j}\big). \tag{2.13}$$
Expanding the integrand at large $j$ and integrating term-by-term one can obtain an asymptotic expansion of $\gamma'(N)$ in powers of $1/j$. If the asymptotic expansion of $\gamma(N)$ ran in even powers of $1/J$, or equivalently $1/j$, then the similar expansion of the derivative $\gamma'(N)$ would run in odd powers of $1/j$ leading to
$$x\, e^{x/(2j)}\, P\big(e^{-x/j}\big) = -x\, e^{-x/(2j)}\, P\big(e^{x/j}\big). \tag{2.14}$$
For $x > 0$ and $z = e^{-x/j}$ this relation is equivalent to the reciprocity condition (1.3). In a similar manner, if the splitting function verified the reciprocity relation (1.3), the corresponding anomalous dimension would be given at large $N$ by the parity preserving asymptotic series (1.17).
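For the one-loop splitting function the equivalence can be seen directly. The following sketch is an illustration, not part of the original analysis: it evaluates $P_0(z)$ of Eq. (2.9) away from the endpoint $z = 1$, where the plus-prescription and the delta function play no role, and checks both the reciprocity condition (1.3) and the oddness of the integrand under $j \to -j$ expressed by (2.14). $C_F = 4/3$ and the function names are assumptions made here.

```python
import math


def p0(z, eta, cf=4.0 / 3.0):
    """One-loop splitting function, Eq. (2.9), for z != 1 (continued to z > 1)."""
    if z == 1.0:
        raise ValueError("the distributions at z = 1 are outside this sketch")
    return cf * (4.0 * z / (1.0 - z) + (1.0 - z) * (1.0 + eta))


def reciprocity_defect(z, eta):
    """P(z) + z * P(1/z); vanishes when Eq. (1.3) holds."""
    return p0(z, eta) + z * p0(1.0 / z, eta)


def integrand(x, j, eta):
    """x * exp(x/(2j)) * P(exp(-x/j)), the j-dependent factor in Eq. (2.13)."""
    return x * math.exp(x / (2.0 * j)) * p0(math.exp(-x / j), eta)


def parity_defect(x, j, eta):
    """integrand(j) + integrand(-j); vanishes when Eq. (2.14) holds."""
    return integrand(x, j, eta) + integrand(x, -j, eta)


if __name__ == "__main__":
    print(reciprocity_defect(0.3, 1.0), parity_defect(0.7, 5.5, 1.0))
```

Flipping the sign of $j$ maps $z = e^{-x/j}$ to $1/z$, which is why the second check is just the first one multiplied by $x z^{-1/2}$.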
Explicit calculation shows [15,17] that for the nonsinglet twist-two quark operators the reciprocity relation (1.3) breaks down starting from two loops and, as a consequence, the asymptotic expansion of the nonsinglet anomalous dimension involves odd powers of $1/J$ at higher loops. We will demonstrate in Section 3 that the function $f(N)$ related to the nonsinglet anomalous dimension through (1.14) still possesses the parity preserving property (1.17) to three loops at least.
2.2. Singlet anomalous dimensions in QCD
As the next step, we consider flavour singlet quark and gluon operators of twist two
$$O_N^{(q)}(0) = \bar q_{\alpha,i}\,\Gamma_{\alpha\beta}\, D_+^{N-1} q_{\beta,i}(0), \qquad O_N^{(g)}(0) = \mathrm{tr}\big[F_{+\mu}\,\Sigma_{\mu\nu}\, D_+^{N-2} F_{\nu+}(0)\big], \tag{2.15}$$
where $F_{+\mu} = F^a_{+\mu}t^a = -\frac{i}{g}[D_+, D_\mu]$ is the light-like component of the gauge field strength tensor and $t^a$ are the generators of the $SU(N_c)$ gauge group in the fundamental (quark) representation. As before, the spinor matrix $\Gamma_{\alpha\beta}$ and the Lorentz tensor $\Sigma_{\mu\nu}$ select good (twist-one) components of the quark and gauge fields. For $\Gamma = \gamma_+$ and $\Sigma_{\mu\nu} = g^{\perp}_{\mu\nu}$ the matrix elements of the operators (2.15) define unpolarized quark and gluon distributions, while for $\Gamma = \gamma_+\gamma_5$ and $\Sigma_{\mu\nu} = \epsilon_{\mu\nu-+}$ they define longitudinally polarized distributions.
⁸ The asymptotic series (2.11) can be resummed using the Borel transform $\gamma_0(N) = \int_0^\infty dx\, e^{-x/\varepsilon}\,\tilde\gamma(x)$, with $\varepsilon = 1/N$ being the expansion parameter. Indeed, for $z = e^{-x}$ this relation coincides with the definition of the moments (1.1) provided that $\tilde\gamma(x) = -P_0(e^{-x})$.
The operators (2.15) mix under renormalization with each other and with operators involving
total derivatives. As before, the latter can be eliminated by going over to the forward matrix elements in (2.15). Then, the matrix elements $\langle O_N^{(q)}(0)\rangle$ and $\langle O_N^{(g)}(0)\rangle$ satisfy the evolution equation (2.1) with the mixing matrix $\gamma^{ab}(N)$ (for a, b = q, g) known to two loops for the polarized distributions [33] and to three loops for the unpolarized distributions [3]. For our purposes, we have to examine the eigenvalues of the mixing matrix, $\gamma_q$ and $\gamma_g$. They can be found as solutions to the characteristic equation

$$(\gamma_a)^2 - \gamma_a X(N) + Y(N) = 0, \tag{2.16}$$

with the functions X(N) and Y(N) defined as

$$X(N) = \mathrm{tr}\,\gamma(N) = \gamma^{qq} + \gamma^{gg}, \qquad Y(N) = \det\gamma(N) = \gamma^{qq}\gamma^{gg} - \gamma^{qg}\gamma^{gq}. \tag{2.17}$$

The explicit expressions for the roots of (2.16) are not particularly suitable for analyzing the large N asymptotics of the anomalous dimensions $\gamma_q$ and $\gamma_g$. As we will see in a moment, it is more advantageous to study the properties of X(N) and Y(N).
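The role of X(N) and Y(N) can be illustrated with a minimal numerical sketch. It assumes real eigenvalues and uses made-up matrix entries (they are not values from the text); the function name `eigenvalues_from_trace_det` is hypothetical.

```python
import math

def eigenvalues_from_trace_det(g_qq, g_qg, g_gq, g_gg):
    """Roots of gamma^2 - X*gamma + Y = 0 for a 2x2 mixing matrix."""
    X = g_qq + g_gg                   # trace, as in Eq. (2.17)
    Y = g_qq * g_gg - g_qg * g_gq     # determinant, as in Eq. (2.17)
    disc = X * X - 4.0 * Y            # equals (g_qq - g_gg)^2 + 4*g_qg*g_gq
    root = math.sqrt(disc)            # assumes disc >= 0 (real eigenvalues)
    return 0.5 * (X - root), 0.5 * (X + root)

# Illustrative entries only (assumption, not taken from the paper).
lo, hi = eigenvalues_from_trace_det(2.0, -0.3, -0.4, 5.0)
```

The two roots depend on the matrix entries only through X and Y, so any expansion property shared by X and Y is inherited by the eigenvalues as long as the discriminant does not vanish at leading order.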
As in Section 2.1, we first examine the large N expansion of the one-loop mixing matrix $\gamma_0^{ab}(N)$. Expanding the well-known expression for $\gamma_0^{ab}(N)$ in powers of the parameter 1/J defined in (1.16), one finds

$$\gamma_0^{qq} = C_F\big(2\ln\bar J^2 - 3\big) + O(1/J^2), \qquad \gamma_0^{qg} = -2n_f/J + O(1/J^2),$$
$$\gamma_0^{gq} = -2C_F/J + O(1/J^2), \qquad \gamma_0^{gg} = 2N_c\ln\bar J^2 - \beta_0 + O(1/J^2), \tag{2.18}$$

with $\beta_0$ given in (2.5). The matrix elements $\gamma_0^{qq}$ are the same for the unpolarized and polarized distributions. For $\gamma_0^{gg}$ their difference scales as $1/J^4$ while for $\gamma_0^{qg}$ and $\gamma_0^{gq}$ it scales as $1/J^3$. The large N expansion of the diagonal matrix elements, $\gamma_0^{qq}$ and $\gamma_0^{gg}$, runs in even powers of 1/J only. The similar expansion of the off-diagonal matrix elements, $\gamma_0^{qg}$ and $\gamma_0^{gq}$, involves all powers of 1/J, but odd powers of 1/J disappear in their product $\gamma_0^{qg}\gamma_0^{gq} = 4C_F n_f/J^2 + O(1/J^4)$. This implies that the matrix elements $\gamma_0^{ab}(N)$ do not satisfy the parity preserving relation (1.17) but this relation holds true for the trace and determinant of the mixing matrix, X(N) and Y(N), respectively. As a consequence, the eigenvalues of the one-loop mixing matrix, Eq. (2.16), have the same property: their asymptotic expansion runs in even powers of 1/J only.$^{9}$ This property can be thought of as a generalization of the relation (2.11) for the singlet anomalous dimensions.
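The cancellation of odd powers in the off-diagonal product can be made explicit in a short worked example. It assumes the standard one-loop unpolarized off-diagonal moments, normalized here so that their leading terms have magnitude $2n_f/J$ and $2C_F/J$; these explicit forms are an assumption of the sketch and are not quoted in the text above.

```latex
% Assumed one-loop unpolarized off-diagonal entries (standard moments, illustrative normalization)
\gamma_0^{qg} = -2 n_f\,\frac{N^2+N+2}{N(N+1)(N+2)}, \qquad
\gamma_0^{gq} = -2 C_F\,\frac{N^2+N+2}{(N-1)N(N+1)} .
% With J^2 = N(N+1) one has N^2+N+2 = J^2+2 and (N-1)(N+2) = J^2-2, so that
\gamma_0^{qg}\,\gamma_0^{gq}
  = 4 C_F n_f\,\frac{(J^2+2)^2}{J^4\,(J^2-2)}
  = \frac{4 C_F n_f}{J^2}\Big[1 + \frac{6}{J^2} + O\big(1/J^4\big)\Big].
```

Each factor separately depends on N through $1/(N+2)$ or $1/(N-1)$ and therefore contains all powers of 1/J, whereas the product is a function of $J^2$ alone.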
In close analogy to the nonsinglet case, one can apply (1.14) to relate the two eigenvalues $\gamma_a(N)$ of the mixing matrix to two different functions $f_a(N)$ (with a = q, g). To the lowest order in the coupling constant, one has $\gamma_a(N) = f_a(N) + O(\lambda^2)$ and, therefore, $f_a(N)$ satisfy the parity preserving property (1.17) to one-loop accuracy. We will demonstrate in Section 3 that for the functions $f_q(N)$ and $f_g(N)$ defined in this way, the property (1.17) holds true to three loops at least.
9 Provided that the difference $\gamma_0^{qq} - \gamma_0^{gg}$ scales as $O(J^0)$, which is indeed the case for (2.18) since $C_F \neq N_c$.
2.3. Anomalous dimensions in SYM
Let us now extend the analysis to twist-two operators in supersymmetric Yang-Mills theories.
In the simplest case of the SYM theory with N = 1 supercharge, the twist-two operators can be
obtained from the singlet operators (2.15) by substituting the quark fields with the gaugino fields
belonging to the adjoint representation of the SU(Nc ) group. As a result, the one-loop anomalous
dimensions in the N = 1 SYM theory coincide with the QCD expressions for CF = nf = Nc [10,
23]. In the SYM theories with N > 1 supercharges, construction of twist-two operators becomes
extremely cumbersome due to larger number of fundamental fields involved (gaugino, gauge
fields and scalars). As was shown in Ref. [34], the problem can be circumvented by employing the
formulation of SYM theories in terms of light-cone superfields [35]. This approach takes a full
advantage of superconformal invariance of the SYM theory and provides a unifying description
of the twist-two (superconformal) operators for different number of supercharges N .
Similar to the QCD case, the form of the one-loop mixing matrix in the SYM theory is constrained by the (super)conformal invariance of the classical Lagrangian [36]. It allows one to
classify the twist-two operators according to representations of the superconformal SU(2, 2|N )
group and, more precisely, its collinear subgroup SL(2|N ). In this way, one can show that, in the
SYM theories with N < 4 supercharges, all twist-two operators belong to two different SL(2|N )
multiplets whereas in the maximally supersymmetric N = 4 SYM theory they belong to a single
SL(2|4) multiplet. The reason for such difference is that for N = 4 all good components of
the fundamental fields, providing the building blocks for the twist-two operators, can be comprised into a single light-cone superfield whereas for N < 4 they are described by two different
light-cone superfields.
The superconformal symmetry ensures that the twist-two operators inside the supermultiplet
have the same anomalous dimension. In the SYM theory with N < 4 supercharges, the anomalous dimensions of twist-two operators belonging to two supermultiplets are given to one loop
by [34]

0,+ (N ) = 4 (N + 1) + E 0 ,

(N 1) (4 N )
0, (N ) = 2 (N + 3 N ) + (N 1) (1)N
+ 2E 0 ,
(N + 3 N )
(2.19)
with $\beta_0$ being the lowest order coefficient of the beta-function defined in (2.6). In the $\mathcal N = 4$ SYM theory, the twist-two operators belong to the same supermultiplet and their anomalous dimension is given to one loop by [37]

$$\gamma_{0,+}(N) = 4\big[\psi(N+1) + \gamma_E\big]. \tag{2.20}$$

Here the nonnegative integer N parameterizes the superconformal SL(2|$\mathcal N$) spin of the twist-two operators. For the operators with the anomalous dimension $\gamma_{0,+}(N)$ and $\gamma_{0,-}(N)$ the corresponding quadratic Casimir is given respectively by [34]

$$J_+^2 = N(N+1), \qquad J_-^2 = \big(N - \tfrac12\mathcal N\big)\big(N + 1 - \tfrac12\mathcal N\big). \tag{2.21}$$
For $\mathcal N = 0$ the expressions (2.19) coincide with the one-loop anomalous dimensions of twist-two operators in pure gluodynamics. For $\mathcal N = 1$ they match the QCD expressions evaluated in the supersymmetric limit $C_F = n_f = N_c$ [10,23]. In particular, $\gamma_{0,+}(N)$ corresponds to the anomalous dimensions of quark transversity and linearly polarized gluon distributions [32,38]. For N = even/odd the function $\gamma_{0,-}(N)$ defines the eigenvalues of the one-loop mixing matrix $\gamma_0^{ab}(N)$ for singlet unpolarized/polarized distributions [23].
Let us examine the dependence of the anomalous dimensions (2.19) and (2.20) on the superconformal Casimirs (2.21) at large N . The expression for the anomalous dimension 0,+ (N ),
Eq. (2.20), is similar to (2.8) (for = 1) and its asymptotic expansion in powers of 1/J+2 can
be easily obtained from (2.11). In the SYM theory with N < 4 supercharges, the expression for
0, (N ), Eq. (2.19), involves the factor (1)N and its analytical continuation from even and odd
N gives rise to two different functions [15] that we shall denote as 0 (N, ) (with = )

0 (N, ) = 2 ln J2 + 1 J2 J2 + 2 J2 J4+N 0 ,
(2.22)
where, by definition, 0, (N ) = 0 (N, = (1)N ) for integer N . Here the functions 1,2 are
given by asymptotic series in 1/J2 with the N -dependent coefficients.
We notice that the functions (2.22) have different analytical properties for even and odd
number of supercharges N . For N = 0 and N = 2 the functions 0 (N, ) admit asymptotic
expansion in integer powers of 1/J2 only. For N = 1 this property is not valid for 0 (N, )
anymore but it is restored in the sum 0 (N, +) + 0 (N, ). Still, the asymptotic expansion of
the difference 0 (N, +) 0 (N, ) runs in odd powers of 1/J only.
To understand the reason why the parity preserving property is modified for N = 1, we recall
that 0 (N, ) defines the eigenvalues of the singlet mixing matrix in QCD, Eq. (2.18), in the
supersymmetric limit CF = nf = Nc . More precisely, for integer N the solutions to the characteristic equation (2.16) are given by 1 = 0, (N + 1) and 2 = 0, (N ). Then, it follows from
(2.16) and (2.17) that

$$\gamma_1 - \gamma_2 = \sqrt{\big(\gamma_0^{qq} - \gamma_0^{gg}\big)^2 + 4\gamma_0^{qg}\gamma_0^{gq}} \tag{2.23}$$
with the matrix elements of the mixing matrix given by (2.18). We demonstrated in Section 2.2,
that $(\gamma_0^{qq} - \gamma_0^{gg})^2$ and $\gamma_0^{qg}\gamma_0^{gq}$ are given by asymptotic series in $1/J^2 = 1/(N(N+1))$ and, therefore, $\gamma_1 - \gamma_2$ should also admit a similar expansion provided that the leading $J^0$ term inside the square root on the right-hand side of (2.23) is different from zero, or equivalently $\gamma_1 - \gamma_2 \neq 0$ at large N. In the $\mathcal N = 1$ SYM theory, the eigenvalues of the mixing matrix become degenerate in the large N limit and the above condition is not satisfied. Indeed, one finds from (2.18) that $(\gamma_0^{qq} - \gamma_0^{gg})^2 + 4\gamma_0^{qg}\gamma_0^{gq} \sim 1/J^2$ in the supersymmetric limit, $C_F = n_f = N_c$, leading to$^{10}$

0 (N, ) 0 (N + 1, ) = J 1 J 2
(2.24)
with J 2 = N(N + 1) and (J 2 ) given by series in 1/J 2 . For N = 1, one replaces 0 (N, )
by its general expression (2.22) (without assuming parity property of the function 2 ), takes into
account the relation J2 = (N 12 )(N + 12 ), Eq. (2.21), and deduces from (2.24) that, in agreement
with (2.22), the -dependent term inside 0 (N, ) should admit an asymptotic expansion in odd
powers of 1/J only.
10 The phenomenon is well known in quantum mechanics for the system of two almost degenerate energy levels gg
0
qq
qg gq
qq
gg
and 0 whose interaction energy is of the same order as the level splitting, 0 0 (0 0 )2 1/J 2 .
3. Parity preserving relations at higher loops

We demonstrated in Section 2 that, to one-loop accuracy, the large N expansion of the twist-two anomalous dimensions in QCD and in its supersymmetric extensions (except the $\mathcal N = 1$ SYM theory) runs in integer powers of the (super)conformal Casimir $J^2$ and, therefore, possesses the parity preserving property (1.17). In the z-representation (1.1), this translates into the reciprocity relation (1.3) for the corresponding one-loop splitting functions P(z). Going over to higher loops, one finds that the conformal spin of Wilson operators gets renormalized by an amount proportional to their anomalous dimension (1.11). Moreover, in gauge theories with nonvanishing beta-function the anomalous dimensions receive conformal symmetry breaking contributions. We will argue in this section that, once both effects are taken into account, the parity preserving relation (1.3) does not hold true for the anomalous dimension $\gamma(N)$ but it remains valid (to three loops at least) for the function f(N) defined in (1.14).
3.1. Space-like anomalous dimensions
Let us examine the relation between the space-like anomalous dimension of twist L and the function f(N)

$$\gamma_S(N) = f\Big(N + \tfrac12\gamma_S(N) - \tfrac14 L\beta(\lambda)\Big). \tag{3.1}$$

We recall that the shift of the argument on the right-hand side of (3.1) takes into account renormalization of the conformal spin of the quasipartonic operator of twist L due to the nonzero value of the anomalous dimension and of the beta-function. The function f(N) admits a perturbative expansion in the coupling constant similar to (2.3)

$$f(N) = \lambda f_0(N) + \lambda^2 f_1(N) + \ldots \tag{3.2}$$
To the lowest order one has $f_0(N) = \gamma_0(N)$. Substituting (3.2) into (3.1) and matching the coefficients in front of powers of $\lambda$, one can establish the relations between higher order corrections to f(N) and $\gamma_S(N)$. It is convenient to introduce two functions

$$\hat\gamma_S(N) = \gamma_S(N) - \tfrac12 L\beta(\lambda), \qquad \hat f(N) = f(N) - \tfrac12 L\beta(\lambda), \tag{3.3}$$

so that the relation (3.1) simplifies as

$$\hat\gamma_S(N) = \hat f\Big(N + \tfrac12\hat\gamma_S(N)\Big). \tag{3.4}$$

Solving this functional relation one obtains its formal solution as (Lagrange-Bürmann formula)

$$\hat\gamma_S(N) = \sum_{k=1}^{\infty}\frac{1}{k!}\Big(\frac12\partial_N\Big)^{k-1}\hat f^{\,k}(N) = \hat f(N) + \frac14\big(\hat f^{\,2}(N)\big)' + \frac1{24}\big(\hat f^{\,3}(N)\big)'' + O\big(\hat f^{\,4}(N)\big), \tag{3.5}$$

where prime denotes a derivative with respect to N. To determine the function $\hat f(N)$ one has to invert this relation. To this end, one uses (3.4) to get

$$\hat f(N) = \hat\gamma_S\Big(N - \tfrac12\hat f(N)\Big) \tag{3.6}$$
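and obtains the solution to this equation as

$$\hat f(N) = \sum_{k=1}^{\infty}\frac{1}{k!}\Big(-\frac12\partial_N\Big)^{k-1}\hat\gamma_S^{\,k}(N) = \hat\gamma_S(N) - \frac14\big(\hat\gamma_S^{\,2}(N)\big)' + \frac1{24}\big(\hat\gamma_S^{\,3}(N)\big)'' + O\big(\hat\gamma_S^{\,4}(N)\big). \tag{3.7}$$

This relation allows one to reconstruct the function $\hat f(N)$ from the known expression for the space-like anomalous dimensions $\gamma_S(N)$.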
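The series solution of the functional relation can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a toy function f(N) = g ln N with a small toy coupling g (neither is taken from the paper), solves gamma = f(N + gamma/2) by fixed-point iteration, and compares the result with the first three terms of the Lagrange-Bürmann series.

```python
import math

def solve_gamma(f, N, iterations=200):
    """Solve gamma = f(N + gamma/2) by fixed-point iteration (toy check)."""
    gamma = f(N)
    for _ in range(iterations):
        gamma = f(N + 0.5 * gamma)
    return gamma

g = 0.05                          # small toy coupling (assumption)
N = 10.0
f = lambda n: g * math.log(n)     # toy scaling function (assumption)

f0 = f(N)
f1 = g / N                        # f'(N)
f2 = -g / N**2                    # f''(N)

# Truncated series: f + (1/4)(f^2)' + (1/24)(f^3)''
series = (f0
          + 0.25 * (2.0 * f0 * f1)
          + (1.0 / 24.0) * (6.0 * f0 * f1**2 + 3.0 * f0**2 * f2))

exact = solve_gamma(f, N)
residual = abs(exact - series)    # expected to be of fourth order in g
```

For this toy input the iteration is a strong contraction, so a modest number of iterations is more than sufficient; the residual is controlled by the first neglected term of the series.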
3.1.1. Parity preserving relations

We are now ready to verify the parity preserving relation (1.17) for the scaling function f(N) in higher loops. For the nonsinglet anomalous dimensions, the analysis goes along the same lines as in Section 2.1: one replaces $\gamma_S(N)$ in (3.7) and (3.3) by the available two- and three-loop nonsinglet anomalous dimensions, evaluates the function f(N) to the same loop accuracy and, then, examines its asymptotic expansion in $J^2 = N(N+1)$ at large N. For the singlet anomalous dimensions, one uses the two solutions to the characteristic equation (2.16), $\gamma_q(N)$ and $\gamma_g(N)$, and defines the corresponding functions $f_q(N)$ and $f_g(N)$ with the help of (3.7) (see Appendix A for more details).
In our analysis, we used expressions for multi-loop anomalous dimensions of various twist-two operators in QCD and in SYM theories available in the literature. They include:
• Two-loop longitudinally polarized singlet distributions in QCD [33];
• Two-loop gluon linearly polarized distribution in QCD [32];
• Two-loop quark transversity distribution in QCD [38] and its analogs in SYM theories with $\mathcal N = 0, 1, 2$ supercharges [27];
• Three-loop nonsinglet unpolarized distributions in QCD [2];
• Three-loop singlet unpolarized distributions in QCD [3];
• Three-loop universal distribution in $\mathcal N = 4$ SYM [39].
These anomalous dimensions are given by complicated expressions involving nested harmonic sums and various color factors. After a lengthy calculation, we worked out their large N expansion in powers of $J^{-1} = (N(N+1))^{-1/2}$ and split the corresponding series as

$$\gamma_S(N) = \phi_+\big(\ln\bar J, J^{-1}\big) + \phi_-\big(\ln\bar J, J^{-1}\big) + \tfrac12 L\beta(\lambda). \tag{3.8}$$

Here the beta-function term is introduced for later convenience and $\phi_\pm$ are series in $J^{-1}$ of a definite parity, $\phi_\pm(\ln\bar J, -J^{-1}) = \pm\phi_\pm(\ln\bar J, J^{-1})$,

$$\phi_+ = \phi_0(\ln\bar J) + \phi_2(\ln\bar J)\,J^{-2} + O\big(J^{-4}\big),$$
$$\phi_- = \phi_1(\ln\bar J)\,J^{-1} + \phi_3(\ln\bar J)\,J^{-3} + O\big(J^{-5}\big), \tag{3.9}$$

where $\phi_k(\ln\bar J)$ are given by series in the coupling constant with the coefficients depending on $\ln\bar J = \ln J + \gamma_E$. Matching (3.8) and (1.7) one can establish the correspondence between the two sets of coefficient functions (for L = 2)

$$\phi_0 = A(\lambda)\ln\bar J + B(\lambda) - \beta(\lambda), \qquad \phi_1 = C(\lambda)\ln\bar J + D(\lambda). \tag{3.10}$$
In all cases except the N = 4 SYM, the resulting multi-loop expressions for the coefficient functions k (ln J) are too cumbersome to be displayed here. For the three-loop universal anomalous
dimension our analysis gives (see Appendix A for details)

4 2
2
3
0 = 4 ln J 6 3 + 3 + 205 + O 4 ,
3
4
2
3
1 = 8 ln J 12 3 + O ,

2
2
2
2 = + 2 2 + 3 2 + 16 2 ln J 8(ln J)2 + O 4 ,
3
3
3

5
5
2 4
3
ln J + 4 + 3 8 ln J + O 4 ,
3 =
3 3
2
..
.
(3.11)
where stands for the so-called physical coupling constant [40]

1
1
11 4 3
= cusp () = 2 2 +
+ O 4 .
2
6
180
(3.12)
Replacing S (N ) in the relation (3.7) by its expression, Eqs. (3.8), (3.9) and (3.11), one finds the
scaling function in the N = 4 SYM theory after some algebra as

4
f (N =4) (N ) = 4 ln J 6 2 3 + 3 2 3 + 205
3

2 2 2 2 3 2
2
2
ln J J + O J 4 , 4 .
+ + 2 +
(3.13)
3
3
3
We observe that, in agreement with (1.17), this series does not involve odd powers of 1/J while
they are present in the series for the anomalous dimension (3.8).
We repeated similar analysis for the above-mentioned two-loop and three-loop anomalous
dimensions of twist two and found that in all cases the corresponding functions f (N ) verify
the parity preserving relation (1.17)! The properties of the functions f (N ) are discussed in
Appendix A.
3.1.2. MVV relations
As was mentioned in the Introduction, the parity preserving property (1.17) implies certain relations between different terms in the large N expansion of the anomalous dimensions. To obtain these relations, one notices that according to (1.17) the large N expansion of the derivatives $(\partial_N)^n f = \big(\sqrt{1 + 1/(4J^2)}\,\partial_J\big)^n f$ runs in odd/even powers of 1/J depending on the parity of n. Matching (3.5) and (3.8) one finds that the series $\phi_+$ and $\phi_-$ are given by the sum of terms involving derivatives of even and odd orders, respectively,

$$\phi_+ = \sum_{k=0}^{\infty}\frac{1}{(2k+1)!}\Big(\frac12\partial_N\Big)^{2k}\hat f^{\,2k+1}(N) = \hat f(N) + O\big(\lambda^3\big),$$
$$\phi_- = \sum_{k=1}^{\infty}\frac{1}{(2k)!}\Big(\frac12\partial_N\Big)^{2k-1}\hat f^{\,2k}(N) = \frac14\partial_N\hat f^{\,2}(N) + O\big(\lambda^4\big). \tag{3.14}$$
Inverting the first relation and substituting the resulting expression for $\hat f(N)$ into the second relation, one can express $\phi_-$ in terms of $\phi_+$. To four-loop accuracy one finds

$$\phi_- = \frac14\big(\phi_+^2\big)' - \frac1{48}\Big[\phi_+\big(\phi_+^3\big)'' - \frac14\big(\phi_+^4\big)''\Big]' + \ldots, \tag{3.15}$$

where prime denotes a derivative with respect to N and ellipses stand for terms involving a higher number of $\phi_+$ and their derivatives with respect to N. It follows from (3.15) that in order to determine $\phi_-$ to four loops it is sufficient to substitute $\phi_+$ by its three-loop expression.
Replacing $\phi_\pm$ in (3.15) by their expressions (3.9) and comparing the coefficients in front of odd powers of 1/J one obtains an infinite system of relations between the coefficient functions $\phi_k$ entering (3.9)

$$\phi_1 = \frac12\phi_0\dot\phi_0,$$
$$\phi_3 = -\frac1{12}\phi_0^3\dot\phi_0 + \frac14\phi_0^2(\dot\phi_0)^2 - \frac18\phi_0(\dot\phi_0)^3 + \frac1{16}\phi_0\dot\phi_0 + \frac12\big(\phi_0\phi_2\big)^{\boldsymbol\cdot} - \phi_0\phi_2,$$
$$\ldots \tag{3.16}$$

where dot denotes a derivative with respect to $\ln J$. Here, in the second relation, we took into account that $\phi_0$, Eq. (3.10), is linear in $\ln J$ and therefore $\ddot\phi_0 = 0$. It is straightforward to derive from (3.15) similar relations for the subleading functions $\phi_5, \phi_7, \ldots$ but they become more cumbersome. Substitution of (3.10) into the first relation in (3.16) yields $C = \frac12A^2$ and $D = \frac12A(B - \beta)$, in agreement with the exact three-loop result in QCD, Eqs. (1.9) and (1.10). In the $\mathcal N = 4$ SYM theory, the relations (3.16) can be easily verified with the help of (3.11).
We conclude from (3.15) and (3.16) that the coefficients in front of odd powers of 1/J in the large N expansion of the anomalous dimension, Eqs. (3.8) and (3.9), can be expressed in terms of the coefficients accompanying smaller even powers of 1/J at a lower number of loops.
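The first of these relations can be spelled out as a short worked derivation. The notation is an assumption of this sketch: $\phi_0$ and $\phi_1$ denote the coefficients of $J^0$ and $J^{-1}$ in the large N expansion (3.9), dot is a derivative with respect to $\ln J$, and only the leading term of (3.15) is kept.

```latex
% Chain rule with J^2 = N(N+1):  \partial_N = \sqrt{1 + 1/(4J^2)}\;\partial_J,
% and \partial_J g(\ln J) = J^{-1}\,\dot g  for any function g of \ln J.
\phi_- = \tfrac14\,\partial_N\big(\phi_+^2\big) + \ldots
       = \tfrac14\,J^{-1}\,\frac{d}{d\ln J}\big(\phi_0^2\big) + O\big(J^{-3}\big)
       = \tfrac12\,\phi_0\dot\phi_0\,J^{-1} + O\big(J^{-3}\big)
\quad\Longrightarrow\quad \phi_1 = \tfrac12\,\phi_0\dot\phi_0 .
% Inserting \phi_0 = A\ln\bar J + B - \beta  (so that \dot\phi_0 = A):
\phi_1 = \tfrac12 A^2 \ln\bar J + \tfrac12 A\,(B - \beta)
\quad\Longrightarrow\quad C = \tfrac12 A^2, \qquad D = \tfrac12 A\,(B - \beta).
```

The higher terms of (3.15) carry at least three derivatives with respect to N and therefore start at order $J^{-3}$, so they do not affect this leading relation.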
3.2. Time-like anomalous dimensions
Let us now extend consideration to the time-like anomalous dimensions $\gamma_T(N)$. They describe the scale dependence of the partonic fragmentation functions [12] and, in distinction with the space-like anomalous dimensions discussed above, are not related to local Wilson operators. In what follows we shall assume, following [5], that the time-like anomalous dimensions satisfy the relation

$$\gamma_T(N) = f\Big(N - \tfrac12\gamma_T(N) - \tfrac14 L\beta(\lambda)\Big) \tag{3.17}$$

with the function f(N) the same as for the space-like anomalous dimension (3.1). The relation (3.17) generalizes to higher twists the relation (1.5) proposed in [5,6] and incorporates the beta-function contribution. Inverting (3.17) and combining it with the similar relation for the space-like anomalous dimension (3.1), one obtains

$$f(N) = \gamma_T\Big(N + \tfrac12 f(N) + \tfrac14 L\beta(\lambda)\Big) = \gamma_S\Big(N - \tfrac12 f(N) + \tfrac14 L\beta(\lambda)\Big). \tag{3.18}$$

One can exclude f(N) from (3.18) and obtain two equivalent functional relations between the space-like and time-like anomalous dimensions

$$\gamma_S(N) = \gamma_T\big(N + \gamma_S(N)\big), \qquad \gamma_T(N) = \gamma_S\big(N - \gamma_T(N)\big). \tag{3.19}$$

Notice that the beta-function term dropped out from these relations. The second relation in (3.19) takes the same form as (1.4) but in distinction with the latter it now holds true for arbitrary N.
Following [5], one can apply the relations (3.19) to verify the reciprocity relations (1.2). It is convenient to switch in (1.2) from the splitting functions to their moments (1.1) and examine the relations between the anomalous dimensions $\gamma_S(N)$ and $\gamma_T(N)$. From the second relation in (3.19) one gets similarly to (3.6) and (3.7)

$$\gamma_T(N) = \sum_{k=1}^{\infty}\frac{1}{k!}\big(-\partial_N\big)^{k-1}\big[\gamma_S(N)\big]^k. \tag{3.20}$$

We conclude from this relation that the difference between the space-like and time-like anomalous dimensions is given by

$$\gamma_T(N) - \gamma_S(N) = -\frac12\big(\gamma_S^2(N)\big)' + \frac16\big(\gamma_S^3(N)\big)'' - \frac1{24}\big(\gamma_S^4(N)\big)''' + O\big(\gamma_S^5(N)\big), \tag{3.21}$$

where prime denotes a derivative with respect to N. It is different from zero starting from two loops. Moreover, replacing $\gamma_S(N)$ on the right-hand side of (3.21) by its three-loop expression, one can evaluate $\gamma_T(N) - \gamma_S(N)$ to four-loop accuracy and, then, translate it with the help of (1.1) into the difference of the splitting functions $P_S(z) - P_T(z)$.
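The two functional relations between the space-like and time-like anomalous dimensions can be illustrated numerically. The sketch below is not the authors' calculation: it assumes a toy space-like anomalous dimension g ln N with a small toy coupling g, defines the time-like one through gamma_T(N) = gamma_S(N - gamma_T(N)), and checks both the reciprocal relation and the leading term of the difference.

```python
import math

g = 0.05                                   # small toy coupling (assumption)
gamma_S = lambda n: g * math.log(n)        # toy space-like anomalous dimension (assumption)

def gamma_T(n, iterations=200):
    """Solve gamma_T = gamma_S(n - gamma_T) by fixed-point iteration."""
    t = gamma_S(n)
    for _ in range(iterations):
        t = gamma_S(n - t)
    return t

N = 10.0
s = gamma_S(N)

# Reciprocal relation: gamma_S(N) = gamma_T(N + gamma_S(N))
reciprocity_gap = abs(s - gamma_T(N + s))

# Leading term of the difference: -(1/2) (gamma_S^2)' = -gamma_S * gamma_S'
leading = -s * (g / N)
difference = gamma_T(N) - s
```

The gap in the reciprocal relation vanishes up to rounding, while `difference - leading` is of third order in g, in line with the next term of the series.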
Let us now examine the second reciprocity relation in (1.2) and evaluate the moments (1.1) of the difference of the splitting functions entering both sides of (1.2)

$$\Delta(N) = -\int_0^1 dz\, z^{N-1}\big[P_T(z) + zP_S(1/z)\big]. \tag{3.22}$$

By definition (1.1), the moments of $P_T(z)$ give rise to the time-like anomalous dimension $\gamma_T(N)$ which can be expressed in its turn in terms of the function f(N) using (3.17).$^{11}$ To calculate the moments of $P_S(1/z)$ in (3.22), we go over to the large N limit and change the integration variable $z = e^{-x/j}$ with $j = N + \frac12 = (J^2 + \frac14)^{1/2}$ as in (2.13). Expanding the x-integral in inverse powers of J, one finds that it is given by the asymptotic series (3.8) with $J^{-1} \to -J^{-1}$

$$\int_0^1 dz\, z^{N} P_S(1/z) = \phi_+\big(\ln\bar J, J^{-1}\big) - \phi_-\big(\ln\bar J, J^{-1}\big) + \tfrac12 L\beta(\lambda). \tag{3.23}$$

Then, one takes into account (3.14) and obtains

$$\Delta(N) = \sum_{k=2}^{\infty}\frac{1}{k!}\Big(-\frac12\partial_N\Big)^{k-1}\Big[\big(f(N) + \tfrac12 L\beta(\lambda)\big)^k - \big(f(N) - \tfrac12 L\beta(\lambda)\big)^k\Big]. \tag{3.24}$$

It follows from this relation that $\Delta(N)$ vanishes in a conformal gauge theory with $\beta(\lambda) = 0$. Otherwise, $\Delta(N)$ receives corrections starting from two loops.
11 One observes that the relation (3.17) transforms into (3.1) under the substitution $\gamma_S(N) \to -\gamma_T(N)$ and $f(N) \to -f(N)$. The expression for $\gamma_T(N)$ can be easily obtained by applying the same transformation to (3.5) and (3.3).
The relation (3.24) takes a simple form when expressed in terms of the time-like anomalous dimension $\gamma_T(N)$. Inverting (3.17) and substituting f(N) in (3.24) by its expression in terms of $\gamma_T(N)$, one gets after some algebra

$$\Delta(N) = \gamma_T(N) - \gamma_T\Big(N + \tfrac12 L\beta(\lambda)\Big). \tag{3.25}$$

Its expansion in powers of the beta function reads

$$\Delta(N) = -\frac12 L\beta(\lambda)\Big[\gamma_T'(N) + \frac14 L\beta(\lambda)\,\gamma_T''(N) + \frac1{24}\big(L\beta(\lambda)\big)^2\gamma_T'''(N)\Big] + O\big(\lambda^5\big). \tag{3.26}$$

We observe that this relation has the same remarkable property as Eqs. (3.15) and (3.21): using the known result for three-loop anomalous dimensions, one can determine $\Delta(N)$ to four loops.
Our analysis in this section relied on the parity preserving property (1.17) of the scaling function f(N). We verified this property in QCD and in SYM theories using known results for the multi-loop anomalous dimensions. In the next two sections, we will argue that the property (1.17) holds true to all loops in QCD in the so-called large $\beta_0$ limit and in the $\mathcal N = 4$ SYM theory for a special class of scalar operators of higher twist in the strong coupling regime.
4. Parity preserving relations in the large $\beta_0$-limit
The anomalous dimensions in QCD depend on the number of quark flavours $n_f$. One can make use of this fact to obtain the all-loop expression for the anomalous dimensions $\gamma(N)$ and the functions f(N) in the large $\beta_0$ limit

$$a = \tfrac12\beta_0\lambda = \text{fixed}, \qquad \beta_0 \to \infty. \tag{4.1}$$

Here $\beta_0 = \frac{11}{3}N_c - \frac23 n_f$ is the one-loop coefficient of the beta-function in QCD and $n_f$ is assumed to take large negative values. Notice that this limit does not exist in SYM theories, since $\beta_0$ is uniquely fixed there by the number of supercharges, Eq. (2.6).
Let us first consider the nonsinglet (unpolarized) twist-two anomalous dimension in QCD. To one-loop order, it takes the form $\gamma_{\rm ns} = \lambda\gamma_0(N) + O(\lambda^2)$ with $\gamma_0(N)$ given by (2.8) for $\eta = 1$. At higher loops, it receives corrections $\sim(\lambda n_f)^n$ enhanced in the large $\beta_0$ limit. They come from the so-called single renormalon chain Feynman diagrams which can be resummed to all orders in the rescaled coupling constant a, Eq. (4.1). Then, the nonsinglet anomalous dimension in QCD is given in the large $\beta_0$-limit by [41,42]

$$\gamma_{\rm ns}^{(\infty)}(N) = 2A(a)\bigg\{\psi(N+a) - \psi(1+a) + \frac{N-1}{2}\bigg[\frac{a^2+2a-1}{(1+a)(N+a)} - \frac{(1+a)^2}{(a+2)(N+1+a)}\bigg]\bigg\}, \tag{4.2}$$

where the superscript $(\infty)$ indicates the leading asymptotic behaviour for $\beta_0 \to \infty$ and A(a) is the cusp anomalous dimension in the limit (4.1)

$$A(a) = \frac{2C_F\,\sin(\pi a)\,\Gamma(4+2a)}{3\beta_0\,\pi\,\Gamma^2(a+2)}. \tag{4.3}$$

To lowest order in a, $\gamma_{\rm ns}^{(\infty)}(N)$ coincides with the exact one-loop result (2.8) (for $\eta = 1$).
Let us substitute (4.2) into the relation (3.1) and reconstruct the corresponding function $f^{(\infty)}(N)$. One finds that the relation (3.1) simplifies significantly in the large $\beta_0$ limit because the anomalous dimension scales as $\gamma_{\rm ns}^{(\infty)}(N) \sim 1/\beta_0$ and, therefore, it can be safely neglected on the right-hand side of (3.1). Together with $\beta(\lambda) = -\beta_0\lambda + \ldots = -2a + \ldots$ this leads to

$$\gamma_{\rm ns}^{(\infty)}(N) = f_{\rm ns}^{(\infty)}(N + a). \tag{4.4}$$

As a result, one finds from (4.2) after some algebra

$$f_{\rm ns}^{(\infty)}(N) = 2A(a)\bigg[\psi(N+1) - \psi(1+a) - \frac{(2a+1+\eta)^2}{8N(N+1)} - \frac{3-a^2}{2(1+a)(2+a)}\bigg], \tag{4.5}$$

with $\eta = 1$. The reason why we displayed the $\eta$-dependence in (4.5) is that for $\eta = -1$ the relations (4.5) and (4.4) describe the quark transversity anomalous dimension, Eq. (2.8), in the large $\beta_0$ limit [43]. As expected, the function $f^{(\infty)}(N)$ also depends on the parameter $\epsilon_{\rm cr} = \beta(\lambda)/2 = -a + \ldots$ defining the critical value of the spacetime dimension $d_{\rm cr} = 4 - 2\epsilon_{\rm cr}$ for which the gauge theory becomes conformal. Since $\gamma_{\rm ns}^{(\infty)}(N) \sim 1/\beta_0$, one deduces from (3.19) that the space-like and time-like anomalous dimensions coincide in the large $\beta_0$ limit and, therefore, the first of the two reciprocity relations in (1.2) is exact.
We recall that the expression (4.5) resums perturbative corrections to the function $f_{\rm ns}(N)$ of the form $(\lambda n_f)^n$. The N-dependence of the function $f_{\rm ns}^{(\infty)}(N)$, Eq. (4.5), is similar to that of the one-loop expression (2.8): higher order corrections in a only modify the coefficients in front of $\psi(N+1)$ and $1/(N(N+1))$. Therefore, in agreement with (1.17), the large N expansion of $f_{\rm ns}^{(\infty)}(N)$ runs in integer negative powers of $J^2 = N(N+1)$ only.
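The consistency of the shift relation between the resummed anomalous dimension and the scaling function can be checked numerically. The sketch below rests on assumptions: the bracketed expressions are written as they are read here from (4.2) and (4.5), with the common prefactor 2A(a) dropped, and the parameter name `eta` as well as the helper `digamma` (a simple recurrence plus asymptotic series, not a library call) are hypothetical.

```python
import math

def digamma(x):
    """Digamma function for x > 0: upward recurrence, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def gamma_ns(N, a):
    """Bracket of the resummed nonsinglet anomalous dimension, without 2A(a)."""
    return (digamma(N + a) - digamma(1.0 + a)
            + 0.5 * (N - 1.0) * ((a * a + 2.0 * a - 1.0) / ((1.0 + a) * (N + a))
                                 - (1.0 + a) ** 2 / ((a + 2.0) * (N + 1.0 + a))))

def f_ns(N, a, eta=1.0):
    """Bracket of the scaling function, without 2A(a); N enters via psi(N+1) and N(N+1)."""
    return (digamma(N + 1.0) - digamma(1.0 + a)
            - (2.0 * a + 1.0 + eta) ** 2 / (8.0 * N * (N + 1.0))
            - (3.0 - a * a) / (2.0 * (1.0 + a) * (2.0 + a)))

# Check gamma_ns(N) = f_ns(N + a) on a small grid of toy values.
checks = [abs(gamma_ns(N, a) - f_ns(N + a, a))
          for N in (2.0, 5.0, 20.0) for a in (0.1, 0.5, 1.3)]
max_gap = max(checks)
```

Since the N-dependence of `f_ns` enters only through psi(N+1) and N(N+1), its large N expansion involves $J^2 = N(N+1)$ only, which is the property stated in the text.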
Let us now examine the large $\beta_0$ limit of the twist-two (un)polarized singlet anomalous dimensions, $\gamma_q$ and $\gamma_g$, defined as solutions to the characteristic equation (2.16). To one-loop accuracy, the mixing matrix is given by (2.18) and the $n_f$-dependence resides in two matrix elements only, $\gamma_0^{qg}$ and $\gamma_0^{gg}$. Then, it is easy to see that the eigenvalues of the one-loop mixing matrix scale in the large $\beta_0$ limit (4.1) as

$$\gamma_q^{(\infty)}(N) \sim a/\beta_0, \qquad \gamma_g^{(\infty)}(N) = -\beta_0\lambda = -2a. \tag{4.6}$$

Going over to higher loops one finds [41,44] that leading corrections to both anomalous dimensions take the form $\lambda(\lambda n_f)^n \sim a^{n+1}/\beta_0$. The scaling behaviour of the smallest eigenvalue $\gamma_q^{(\infty)}(N)$ is the same as that of the nonsinglet anomalous dimension $\gamma_{\rm ns}^{(\infty)}(N)$. Therefore, in the large $\beta_0$ limit the relation (3.1) between $\gamma_q(N)$ and the corresponding function $f_q(N)$ takes the same form as (4.4)

$$\gamma_q^{(\infty)}(N) = f_q^{(\infty)}(N + a). \tag{4.7}$$

In distinction with $\gamma_q^{(\infty)}(N)$, the gluon dominated anomalous dimension $\gamma_g^{(\infty)}(N)$ approaches a finite value (4.6) and the defining relation (3.1) looks in this case as

$$\gamma_g^{(\infty)}(N) = f_g^{(\infty)}\Big(N + \tfrac12\gamma_g^{(\infty)}(N) + a\Big). \tag{4.8}$$

Substituting $\gamma_g^{(\infty)}(N) = -2a + O(1/\beta_0)$ into this relation, one finds in the large $\beta_0$ limit

$$\gamma_g^{(\infty)}(N) = f_g^{(\infty)}(N), \tag{4.9}$$

with $f_g^{(\infty)}(N) + 2a = O(1/\beta_0)$. The relations (4.7) and (4.9) hold true for the unpolarized and polarized anomalous dimensions although the corresponding scaling functions $f_q(N)$ and $f_g(N)$ are different in the two cases.
The resummed (un)polarized singlet anomalous dimensions $\gamma_q^{(\infty)}(N)$ and $\gamma_g^{(\infty)}(N)$ have been calculated in Refs. [41,44].$^{12}$ Their expressions are lengthy and to save space we do not display them here. Making use of (4.7) and (4.9), it is straightforward to calculate the corresponding functions $f_q^{(\infty)}(N)$ and $f_g^{(\infty)}(N)$ and, then, work out their large N expansion. In this way, we found that for the polarized and unpolarized anomalous dimensions the expansion of $f_q^{(\infty)}(N)$ and $f_g^{(\infty)}(N)$ runs in integer negative powers of $J^2 = N(N+1)$ only! We conclude that the parity preserving property (1.17) holds true in QCD for the twist-two (non)singlet anomalous dimensions to all loops in the large $\beta_0$ limit.
5. Parity preserving relations in the AdS/CFT
In this section, we will employ the AdS/CFT correspondence to determine the function f(N) in the maximally supersymmetric $\mathcal N = 4$ theory in the strong coupling regime.
The gauge/string duality allows one to establish the correspondence between the anomalous dimensions of Wilson operators in the planar $\mathcal N = 4$ SYM theory at strong coupling and energies of strings on the AdS$_5\times$S$^5$ background [1,30,45]. This relationship takes a very precise form for Wilson operators built from a complex scalar field Z and the light-cone component of the covariant derivatives $D_+ = (n\cdot D)$ (see review [46] and references therein)

$$O_L^{\{k\}}(0) = \mathrm{tr}\big[D_+^{k_1}Z(0)\,D_+^{k_2}Z(0)\cdots D_+^{k_L}Z(0)\big], \tag{5.1}$$

with $k_i$ being nonnegative integers. In gauge theory, $O_L^{\{k\}}(0)$ can be identified as quasipartonic operators with twist L and the Lorentz spin $N = \sum_i k_i$. The operators (5.1) with the same quantum numbers L and N mix under renormalization and their anomalous dimensions can be found by diagonalizing the corresponding mixing matrix. For L = 2, the anomalous dimension of the operators (5.1) is uniquely specified by the total Lorentz spin N whereas for higher twists $L \ge 3$ the anomalous dimensions occupy a band whose size depends on N (see review [47] and references therein). The AdS/CFT correspondence provides a prediction for the minimal anomalous dimension $\gamma_L(N)$ of the Wilson operators (5.1) carrying large quantum numbers (twist L and/or Lorentz spin N) in the planar $\mathcal N = 4$ SYM theory at strong coupling. Namely, $\gamma_L(N)$ is related to the energy of a (semi)classical folded string which rotates with the angular momentum N on the AdS$_3$ part of the target space and whose center-of-mass is also moving with the angular momentum L along a big circle of S$^5$ [30,31]

$$\gamma_L(N) = E - N - L. \tag{5.2}$$

Here the string energy E is given by the series in $1/\sqrt\lambda$

$$E = \sqrt\lambda\,E_0\Big(\frac{N}{\sqrt\lambda},\frac{L}{\sqrt\lambda}\Big) + E_1\Big(\frac{N}{\sqrt\lambda},\frac{L}{\sqrt\lambda}\Big) + \ldots \tag{5.3}$$

with $E_0$ being the classical string energy and $E_1$ being the one-loop quantum string correction.
12 Notice that the expression for the quark dominated polarized anomalous dimensions, Eq. (5.3) in the second reference
in [41], contains a misprint. The factor (2 + n 1) should be replaced there by (2 + n 2). We would like to thank
J. Gracey for clarifying this point.
For the folded rotating string, the classical energy $E_0$ can be found explicitly in terms of the Jacobi elliptic K- and E-functions [31,48,49]. The quantum correction $E_1$ has been recently calculated in Ref. [50] and its expression is more involved. Keeping in (5.3) the first term only, one finds from (5.2) the leading order expression for the anomalous dimension as $\lambda \gg 1$ and $N/\sqrt\lambda$, $L/\sqrt\lambda$ = fixed [51,52]

1 ,
L (N ) = L K( )bstr E( )
bstr

1
N/L = E( ) astr +
(5.4)
K( ) bstr +
,
2
bstr
astr
where = g 2 Nc /(L)2 is the so-called BMN coupling [45]. The auxiliary variables , astr and
bstr parameterize classical rotating string solution and verify the relations
astr = bstr / 1 ,
1
bstr =
K( )

1 2
astr

1 2
bstr
1/2
.
(5.5)
Their substitution into (5.4) yields the anomalous dimension L (N )/L as a function of N/L
through parametric dependence of both functions on the auxiliary modular parameter .
Let us apply (5.4) to reconstruct the corresponding function $f_L(N)$. In the $\mathcal N = 4$ SYM theory, the defining relation (1.12) takes the form

$$\gamma_L(N) = f_L\Big(N + \tfrac12\gamma_L(N)\Big). \tag{5.6}$$

We recall that the shift of the argument on the right-hand side of this relation takes into account renormalization of the conformal spin of the operators (5.1) in higher loops. The operators (5.1) have Lorentz spin N and canonical dimension N + L, so that their bare conformal spin equals $j = N + \frac12 L$.
We have seen in the previous sections that for the twist-two operators with large Lorentz spin N the function $f_{L=2}(N)$ admits an asymptotic expansion in inverse powers of the SL(2) Casimir $J^2 = j(j-1)$. In the case under consideration, an important difference as compared with twist two is that, for $\lambda \gg 1$ and $N/\sqrt\lambda$, $L/\sqrt\lambda$ = fixed, both the Lorentz spin N and the twist L are necessarily large and, therefore, discussing the large N asymptotics of $f_L(N)$ one has to distinguish two different limits, $N \gg L \gg 1$ and $L \gg N \gg 1$. We will see in a moment that
the function fL (N ) has the parity preserving property in the former limit only. Introducing the
scaling variable = N/L (not to be confused with the physical coupling (3.12)), one finds the
quadratic Casimir for the operators (5.1) as

J =L
2
1
+
2
2
1
1
L( +
1
2)

=L
1
+
2
2

1 + O 1/ .
(5.7)
We expect that in the scaling limit, 1 and = fixed, the function fL (N ) should admit
asymptotic expansion in integer negative powers of J 2 , or equivalently in even negative powers
of ( + 12 ). To verify this property, we shall apply Eqs. (5.4)(5.6) to reconstruct the large N
expansion of f (N ) and, then, examine the resulting series for different values of = N/L.
The coupling constant enters into the right-hand side of both relations in (5.4) through the
BMN coupling = g 2 Nc /(L)2 . Assuming that the modular parameter admits a regular
expansion in powers of 13
= (0) + (1) + ,
(5.8)
one finds from the second relation in (5.4) for = 0

+
1 1
E( (0) )
.
=
2 2 1 (0) K( (0) )
(5.9)
Examining subleading ( )k corrections to (5.4) one can express (k) -coefficients in terms of
the leading one (0) . Then, substitution of (5.8) into the first relation in (5.4) yields asymptotic
expansion of the anomalous dimension in powers of the BMN coupling [48,51]14

2
L (N ) = L (0) + ( ) (1) + .
(5.10)
The first few coefficients are given by

1
(0) = K (0) 2 (0) K (0) 2E (0) ,
2
2

1 3 (0)
(1)
4 2 (0) 1 (0) (0) K (0) 8 1 (0) E (0) .
= K
8
(5.11)
The relations (5.9)(5.11) define parametric dependence of the anomalous dimension L (N ) on
the rescaled Lorentz spin = N/L. The leading term in (5.10) coincides with the one-loop
expression for the minimal anomalous dimension of the operators (5.1) in the thermodynamical
limit L 1 [48,51].
Let us combine together the relations (5.6) and (5.10) and determine the function fL (N ). We
find that fL (N ) also admits an expansion in powers of the BMN coupling

2
fL (N ) = L f (0) + ( ) f (1) +
(5.12)
with the coefficients related to (5.11) as
f (0) = (0) ,

2
(0) (0)
1
1
= K4 (0) (0) .
f (1) = (1) (0)
2
(0)
8
(5.13)
We observe a remarkable simplification of f (1) as compared to (1) , Eq. (5.11). Let us examine
the expressions for L (N ) and fL (N ) for different values of = N/L.
$L \gg N \gg 1$
1
1
In this limit, for 1, one finds from (5.9) that = 32
( (0) )2 + 32
( (0) )3 + with (0) 1.
(0)
Expanding (5.10) and (5.12) in powers of one gets the leading term

1
3
1
(0) = f (0) = 2 2 + 3 + ,
(5.14)
2
2
8
13 As was shown in Refs. [51,52], this assumption is only justified for ln2 (N/L) < 1.
14 The one-loop quantum string corrections to the energy of folded rotating string (5.3) break the BMN scaling of the
anomalous dimension [50].
and the first subleading correction

1 4
11 3
(1)
2
= + 2 + ,
8
4

1 4
1 2 3 3
(1)
f = + + .
4
4
8
(5.15)
We observe that the two functions, L (N ) and fL (N ), vanish for 0. Expansion of fL (N )

involves all powers of = N/L and does not reveal any symmetry.15
$N \gg L \gg 1$
In this limit, for 1, one finds from (5.9) that (0) 1. Denoting (0) = 1 16z2 one
obtains from (5.9) for z 0

4
1 1 1
2 2 2 + 1 2
+ =
(5.16)
z +O z ,
+
2 z 8
2 2
with = ln(1/z). One notices that the series inside the square brackets in the right-hand side of
(5.16) only involves even powers of z. Therefore, inverting (5.16) one obtains z as a series in odd
negative powers of ( + 12 ) with the expansion coefficients depending on ln( + 12 ).
Replacing (0) = 1 16z2 in (5.11) and (5.13) and expanding the resulting expressions around
z = 0 one gets the leading term

4
2 + 1 2
(0)
(0)
2 2
=f =
(5.17)
z +O z ,
+4
2
2
as well as the subleading corrections to the anomalous dimension L (N ), Eq. (5.10),

3
2( + 1) 2
1 2( 2)
(1)
4
= +
z+
z +O z ,
8

3
4 4 43 3 + 137 2 160 + 61 2
1 + 1 (3 5)
(2) = 6
,
z
+
O
z
z
16 1 2( 1)
2 ( 1)2
(5.18)
and to the function fL (N ), Eq. (5.12),

4
1 2( + 1) 2
(1)
4
f = +
z +O z ,
8

4
1 + 1 3 + 2 3 2
.
z
+
O
z
f (2) = 6
16 1
2 ( 1)2
(5.19)
Here as compared with (5.11) and (5.13) we also included the next-to-subleading correction to
the functions (5.10) and (5.12).
As follows from (5.17), expansion of the leading order term, (0) = f (0) , runs in even powers
of z ( + 12 )1 . Going over to subleading terms, Eqs. (5.18) and (5.19), one finds that (k)
and f (k) (with k = 1, 2, . . .) coincide at the level of z0 corrections but deviate otherwise.
15 The situation is similar to that for the one-loop anomalous dimension of twist-two operators in the N = 4 SYM
theory, Eq. (2.20). The anomalous dimension 0 (N ) = 4[(N + 1) (1)] vanishes for N = 0, its expansion around
N = 0 involves all powers of N but its large N expansion runs in integer powers of 1/(N (N + 1)).
Remarkably enough, the functions f (k) do not contain odd powers of z while they are present
in (k) . Together with (5.16) this implies that, in agreement with our expectations, the large
expansion of the function fL (N ) runs in even negative powers of ( + 12 ) only.
The relations (5.18) and (5.19) were derived under tacit assumption that the coefficient functions entering (5.10) and (5.12) are uniformly bounded functions of = N/L. Examining the
expressions (5.17), (5.18) and (5.19) one finds that for N L, or equivalently , this
assumption is justified provided that 2 1, or equivalently (/L2 ) ln2 (N/L) 1. In the
opposite limit, for 2 1, one finds that the leading corrections ( 2 )n to (5.12) can be
resummed to all orders [51]

g 2 Nc

2
fL (N ) = L 1 + 1 +
(5.20)
ln N + ,
where = g 2 Nc /(L)2 and ellipses denote terms suppressed by powers of ln( + 12 ). We

notice that despite the fact that each term of the expansion is suppressed by inverse power of L,
the resummed expression does not depend on the twist L and exhibits a well-known logarithmic
scaling [30]. For the twist-two anomalous dimensions, this scaling behaviour arises for N 1.
Specific feature of operators of higher twist L 1 is that the logarithmic scaling (5.20) sets up
for much larger values of the Lorentz spin such that
2N
ln
1.
(5.21)
L
L2
In this kinematical region, analysis becomes more involved due to a complicated form of the
defining relations (5.4) and (5.5).
There is however a much simpler way to understand analytical properties of the function
fL (N ). Let us consider the relation (5.6) and rewrite it using (5.4) as

x( )L
fL
(5.22)
= L K( )bstr ( ) E( )
1 ,
2
bstr ( )
where x is a function of the modular parameter
x( ) = E( )astr ( ) K( )

1,
astr ( )
(5.23)
with astr = bstr / 1 and bstr defined in (5.5). We expect that for large Lorentz spin N the
asymptotic expansion
of the function fL (N ) should run in integer negative powers of J 2 =
2
j [1 + O(1/ )], Eq. (5.7), with j = N + 12 L being the conformal spin. Substituting N with
1
2 x( )L as in (5.22), one finds that the conformal spin takes the form

L
K( )
L
x( ) + 1 =
E( )astr ( )
.
j=
(5.24)
2
2
astr ( )
To verify the parity preserving property (1.17) for $f_L$ it is sufficient to check that the right-hand side of (5.22) stays invariant under the transformation $j \to -j$ modulo the $i\pi$-contribution
from ln j terms. Let us show that this transformation corresponds to going around = 1 in
the complex -plane in the right-hand side of (5.24). According to (5.16), the limit N L corresponds to 1. Let us introduce a new variable = 1 16z2 and examine the relations
(5.22) and (5.24) for z 0. One first verifies that the elliptic functions K( ) and E( ) have
logarithmic branch cuts that start at = 1. Their expansion around z = 0 runs in even powers of z2 with the coefficients linear in ln z. Examining the relations (5.5) one finds that bstr
has similar analytical properties around = 1, while astr has an additional square-root singularity
astr = bstr / 1 = bstr /(4z). Therefore, going around = 1 in the complex -plane, or equivalently replacing z z, one finds that, modulo contribution from logarithmic cuts, the functions
K( ), E( ) and bstr stay invariant while astr changes a sign. Going back to (5.22) and (5.24), we
conclude that the function $f_L(N)$ is indeed invariant under $j \to -j$ and, therefore, it has the
parity preserving property.
6. Conclusions
Anomalous dimensions of Wilson operators with large Lorentz spin N admit an asymptotic
expansion in inverse powers of N . The leading term of the expansion scales logarithmically with
the spin and is related to the universal cusp anomalous dimension. The subleading, power suppressed, corrections to the anomalous dimensions are not universal and depend on the operator
under consideration. Motivated by the findings of Refs. [2,3,5,6], we argued in this paper that the subleading corrections satisfy an infinite number of coupled relations. They allow one
to reconstruct the corrections to the anomalous dimensions suppressed by odd powers of $J = \sqrt{N(N+1)}$ in terms
of the coefficients accompanying smaller even powers of J at a smaller number of loops. Remarkably
enough, these relations take the same form for different twist-two operators even though they
relate to each other quantities that are operator dependent.
We demonstrated that the above-mentioned properties of the subleading corrections naturally
follow from the parity preserving property of the scaling function f (N ). This function is related to the anomalous dimension (N ) through the functional relation (1.14) which generalizes
similar relation proposed in [5,6]. It incorporates the constraints imposed by the conformal invariance and takes into account the conformal symmetry breaking corrections to the anomalous
dimensions due to nonzero beta-function. We would like to stress that the functional relation
(1.14) taken alone does not tell us much about the properties of the anomalous dimensions unless it is supplemented by additional condition for the function f (N ). The latter is just the parity
preserving property of the scaling function f (N ), Eq. (1.17).
It is important to keep in mind that the anomalous dimensions (N ) and the scaling functions
f (N) have (Regge) singularities at negative N , and therefore they cannot be invariant under the
transformation $N \to -1 - N$, or equivalently $J \to -J$. The parity preserving property (1.17)
only holds true for each individual term in the asymptotic series for f (N ) and the Regge singularities manifest themselves through a factorial growth of the expansion coefficients at higher
orders in 1/J . It is interesting to note that the parity respecting relation is not a unique feature
of gauge theories. Using the four-loop result for the twist-two anomalous dimension in scalar
$\phi^4$-theory [53], we verified that the corresponding scaling function f(N) has the same form as
(1.17) with the only difference that the leading ln J term is missing.
In a gauge theory with nonzero beta-function, both the anomalous dimensions and the scaling functions are scheme-dependent. In our analysis, we employed the dimensional regularization/reduction scheme because the beta-function contribution to the anomalous dimension can be
described in this scheme by going over to the critical number of spacetime dimensions $d_{\rm cr} = 4 - 2\epsilon_{\rm cr}$ in which the underlying gauge theory becomes conformal [29]. The change of the scheme
(R)
corresponds to a finite renormalization of the Wilson operator, say ON (0) = Z(N)ON (0).
(R)
Then, the anomalous dimension in a new scheme reads S (N ) = S (N ) ()ln ln Z(N),
(R)
so that the properties of S (N ) at large N are now governed by the scheme-dependent factor
(R)
Z(N). Substituting S (N ) into (3.7) one can reconstruct the corresponding scaling function
f (R) (N ) = f (N) + O(()) and verify that it does not satisfy (1.17) for generic Z(N). However, imposing (1.17) as a condition for Z(N), one can define the renormalization scheme in
which the subleading corrections to the anomalous dimensions still verify the relations discussed
above.
In this paper, we used explicit expressions for the two-loop and three-loop anomalous dimensions of twist two in QCD and in SYM theories to verify the parity preserving relation (1.17).
One may wonder whether the same property holds true for the anomalous dimensions of higher
twist. Going over to higher twists, one finds that, in general, there are few operators carrying
the same conformal spin. In distinction with the twist two, the size of their mixing matrix grows
with N and its eigenvalues cannot be written in explicit form as functions of N . This makes
the large N analysis of higher-twist anomalous dimensions extremely difficult. Tremendous simplification occurs for the special subclass of quasipartonic operators due to hidden integrability
symmetry of their mixing matrix in the planar ('t Hooft) limit.16 In QCD and in SYM theories
with $\mathcal{N} = 0, 1, 2$ supercharges, integrability arises for the so-called maximal-helicity operators
[34,56] while in the N = 4 theory it gets extended to a larger class of operators [57]. Making
use of the integrability, one can work out a systematic large N expansion of higher-twist anomalous dimensions in the planar approximation. In this way, one finds [51,58] that for the so-called
single-trace operators built from L fundamental fields belonging to the adjoint representation of
the SU(Nc) group and carrying the conformal spin s (s = 1/2 for scalars, s = 1 for gaugino and
s = 3/2 for the gauge field strength tensor), the asymptotic expansion of the one-loop anomalous dimension runs in integer negative powers of the conformal Casimir $J^2 = (N + Ls)(N + Ls - 1)$. It
would be interesting to extend analysis to higher orders in the coupling constant and to verify
the parity respecting relation for the scaling function fL (N ) in higher loops. The gauge/string
correspondence suggests that it should hold true at least for the minimal anomalous dimension
in the strong coupling regime in the N = 4 SYM theory for higher twists and large spins such
that $N \gg L \gg 1$.
Note added
When this paper was in writing we learned from Yuri Dokshitzer and Pino Marchesini that
they studied the reciprocity respecting equation (1.5) in the N = 4 SYM theory and analyzed
the properties of the corresponding three-loop scaling function f (N ) in important kinematical
limits. Our results agree with theirs [59] as far as the parity respecting property (1.17) and the
structure of the large N expansion of f (N), Eq. (3.13), are concerned.
Acknowledgements
We are most grateful to S. Moch for collaboration at an early stage of the project. We would
like to thank Yu. Dokshitzer, E. Gardi, A. Gorsky, A. Manashov, G. Marchesini and A. Mueller
for interesting discussions and J. Gracey for useful correspondence. The work was supported in
part by the French Agence Nationale de la Recherche under grant ANR-06-BLAN-0142-02.
16 The integrability phenomenon in four-dimensional Yang–Mills theories was first discovered in Refs. [54,55] in the context of the high-energy (Regge) asymptotics of scattering amplitudes.
Appendix A. Large N expansion of the anomalous dimensions

In this appendix we present expressions for the large N expansion of various twist-two
anomalous dimensions S (N ) and the corresponding functions f (N ). They were obtained in
collaboration with S. Moch. The anomalous dimensions are given by a series (2.3) in the coupling constant defined in (2.2). At large N their leading asymptotic behaviour is

S (N ) = 2cusp () ln N + O N 0 ,
(A.1)
with the cusp anomalous dimension cusp () depending on the gauge theory under consideration.
Perturbative expansion of S (N ) can be simplified by using the physical coupling [40], =
1
2 cusp (). Then, the relations (2.3) and (3.2) can be rewritten in terms of this coupling as

S (N ) = (4 ln N + 0,ph ) + 2 1,ph + 3 2,ph + O 4 ,

f (N ) = 2 ln J2 + f0,ph + 2 f1,ph + 3 f2,ph + O 4 ,
(A.2)
where $\bar N = N e^{\gamma_E}$, $\bar J = J e^{\gamma_E}$ and $J^2 = N(N + 1)$ is the conformal Casimir. Here, the $\gamma_{\rm ph}$-coefficients are given by series in integer negative powers of N while the $f_{\rm ph}$-coefficients admit
expansion in even negative powers of J only, Eq. (1.17).
The multi-loop twist-two anomalous dimensions $\gamma_S(N)$ are expressed in terms of generalized harmonic (Euler–Zagier) sums. In QCD, they also depend on the SU(Nc) color and SU(nf)
flavour factors and their explicit expressions are cumbersome [2,3]. In the $\mathcal{N} = 4$ SYM, to three-loop accuracy, the perturbative coefficients in (A.2) are Nc-independent and, as a consequence,
they are much simpler as compared to the QCD case. For the sake of simplicity, we shall present
below the explicit expressions (A.2) in the N = 4 SYM theory and outline a difference as compared with the QCD expressions. In N = 4 SYM the anomalous dimensions of all twist-two
operators are given by the same, universal function of N [39]. To three-loop accuracy, the cusp
anomalous dimension, or equivalently the physical coupling constant is given by (3.12). Expanding the three-loop result of Ref. [39] at large N and matching it with the first relation in (A.2)
one finds

$$\gamma_{0,\mathrm{ph}} = 2N^{-1} - \frac{1}{3}N^{-2} + \frac{1}{30}N^{-4} + O\big(N^{-6}\big),$$
1,ph = 63 + (8 ln N )N 1 + (6 4 ln N )N 2

4
14
+
ln N
N 3 + 4N 4 + O N 5 ,
3
3

4
2,ph = 2 3 + 205 123 N 1
3

2 2
2 2
2
+ 8(ln N ) + + 16 ln N + + 63 N 2
3
3

2 2
+ 8(ln N )2 +
32 ln N + 12 2 23 N 3
3

110
70 13 2
+ 4(ln N )2 +
ln N
+ N 4 + O N 5 .
3
3
18
(A.3)
One evaluates the corresponding function f(N) with the help of Eqs. (3.7) and (3.3) and, then,
expands it in inverse powers of J 2 = N (N + 1) to get from (A.2)

$$f_{0,\mathrm{ph}} = \frac{2}{3}J^{-2} - \frac{2}{15}J^{-4} + O\big(J^{-6}\big),$$

f1,ph = 63 + 2J 2 + J 4 + O J 6 ,

4
2
2
f2,ph = 205 2 3 + 2 ln J + 2 J 2
3
3
3

2 2
4
+
6 + ln J 2 2 J 4 + O J 6 .
3
9
(A.4)
Examining these relations, one observes that for the anomalous dimension $\gamma_S(N)$ the coefficients
in front of negative powers of N in the right-hand side of (A.3) are enhanced by powers of $\ln \bar N$
with the leading term $\gamma_{n,\mathrm{ph}} \sim (\ln \bar N/N)^n$ (for n = 1, 2). For the function f(N) one finds that,
in agreement with the parity preserving relation (1.17), its expansion runs in negative powers
of $J^2$. Remarkably enough, the one-loop and two-loop coefficients, $f_{0,\mathrm{ph}}$ and $f_{1,\mathrm{ph}}$ respectively,
do not involve logarithmically enhanced terms, but they reappear starting from three loops, $f_{2,\mathrm{ph}} \sim \ln \bar J/J^2 \sim \ln \bar N/N^2$ (to be compared with $\gamma_{2,\mathrm{ph}} \sim (\ln \bar N)^2/N^2$).
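The one-loop statement can be checked numerically. The sketch below uses the one-loop anomalous dimension $4[\psi(N+1) - \psi(1)]$ quoted in footnote 15, which for integer N equals four times the harmonic number, and compares it with the leading terms of the $\gamma_{0,\mathrm{ph}}$ series in 1/N and of the $f_{0,\mathrm{ph}}$ series in $1/J^2$, using the fact that the shift of the argument does not affect f(N) at one loop. The function names are illustrative choices and only the Python standard library is assumed.

```python
import math

EULER_GAMMA = 0.5772156649015329

def gamma0(N):
    """One-loop result 4[psi(N+1) - psi(1)] = 4 * H_N for integer N >= 1."""
    return 4.0 * math.fsum(1.0 / k for k in range(1, N + 1))

def residual_in_N(N):
    """gamma0 minus 4 ln(N e^gamma_E) and the 1/N series up to N^-4."""
    series = 2.0 / N - 1.0 / (3 * N**2) + 1.0 / (30 * N**4)
    return gamma0(N) - 4.0 * (math.log(N) + EULER_GAMMA) - series

def residual_in_J(N):
    """gamma0 minus 2 ln(J^2 e^(2 gamma_E)) and the 1/J^2 series up to J^-4."""
    J2 = N * (N + 1)  # conformal Casimir J^2 = N(N+1)
    series = 2.0 / (3 * J2) - 2.0 / (15 * J2**2)
    return gamma0(N) - 2.0 * (math.log(J2) + 2.0 * EULER_GAMMA) - series

for N in (5, 10, 20, 40):
    print(N, residual_in_N(N), residual_in_J(N))
```

Both residuals fall off rapidly with N. The series in N contains the odd power $N^{-1}$, while the series in J involves only $J^{-2}$ and $J^{-4}$, which is the one-loop instance of the parity preserving property.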
Repeating similar analysis for the twist-two operators in QCD, we found that, in spite of the
fact that the large N expressions for the functions (N ) and f (N) are much more cumbersome,
the ln N and ln J terms inside the ph - and fph -coefficients have the same properties as in the
N = 4 theory for the following anomalous dimensions: nonsinglet unpolarized quark operators
to three loops [2], quark transversity operator to two loops [38] and linearly polarized gluon
operator to two loops [32].
For the singlet (un)polarized twist-two anomalous dimensions in QCD, the analysis becomes
more cumbersome since one has to solve the characteristic equation (2.16) and, then, use its
solutions, q (N ) and g (N ), to define the corresponding functions fq (N ) and fg (N ) with a help
of (3.7) and (3.3). As in (2.17), it is more advantageous to analyze the properties of the functions
fq (N ) + fg (N ) and fq (N )fg (N ). Making use of (3.7) it is straightforward to verify that

1
1 3
X 3X Y + O 4 ,
fq + fg = X X 2 2Y +
4
24

1
1
1 1

2
fq fg = Y 1 X + (X ) + X X Y + O 5 ,
2
4
8
8
(A.5)
where the notation was introduced for the functions

) = q + g = X 2(),
X(N

2
Y (N ) = q g = Y X() + () ,
(A.6)
with X and Y given in (2.17). In the right-hand side of these relations, the N -dependence of all
functions (except of ()) is tacitly assumed and prime denotes a derivative with respect to N .
We verified by explicit calculation that for the three-loop singlet unpolarized anomalous dimensions [3] and the two-loop singlet polarized anomalous dimensions [33], the right-hand side of
both relations in (A.5) has asymptotic expansion in integer powers of 1/J 2 . This implies that
the corresponding functions fq and fg (or equivalently fq and fg ) possess the parity preserving
property (1.17). The resulting expressions can be described as follows. The off-diagonal elements
of the mixing matrix scale as $\gamma_{gq}, \gamma_{qg} \sim 1/J$ (see Eq. (2.18)) and they affect fq and fg at the
level of $O(J^{-2})$ corrections only. As a result, for both functions, fq and fg, the corresponding
$f_{\rm ph}$-coefficients (A.2) take the form (A.4) with the only difference that the coefficients in front
of $J^{-2}, J^{-4}, \ldots$ receive an additional contribution given by (an infinite) series in $1/\ln J$.
References
[1] J.M. Maldacena, Adv. Theor. Math. Phys. 2 (1998) 231;
S.S. Gubser, I.R. Klebanov, A.M. Polyakov, Phys. Lett. B 428 (1998) 105;
E. Witten, Adv. Theor. Math. Phys. 2 (1998) 253.
[2] S. Moch, J.A.M. Vermaseren, A. Vogt, Nucl. Phys. B 688 (2004) 101.
[3] A. Vogt, S. Moch, J.A.M. Vermaseren, Nucl. Phys. B 691 (2004) 129.
[4] G.P. Korchemsky, Mod. Phys. Lett. A 4 (1989) 1257;
G.P. Korchemsky, G. Marchesini, Nucl. Phys. B 406 (1993) 225.
[5] Yu.L. Dokshitzer, G. Marchesini, G.P. Salam, Phys. Lett. B 634 (2006) 504;
G. Marchesini, hep-ph/0605262.
[6] Yu.L. Dokshitzer, V.A. Khoze, S.I. Troian, Phys. Rev. D 53 (1996) 89.
[7] A.V. Belitsky, G.P. Korchemsky, D. Müller, hep-th/0605291.
[8] V.N. Gribov, L.N. Lipatov, Sov. J. Nucl. Phys. 15 (1972) 438.
[9] G. Altarelli, G. Parisi, Nucl. Phys. B 126 (1977) 298.
[10] Yu.L. Dokshitzer, Sov. Phys. JETP 46 (1977) 641.
[11] R.L. Jaffe, M. Soldate, Phys. Rev. D 26 (1982) 49.
[12] R. Brock, et al., CTEQ Collaboration, Rev. Mod. Phys. 67 (1995) 157.
[13] V.N. Gribov, L.N. Lipatov, Sov. J. Nucl. Phys. 15 (1972) 675.
[14] S.D. Drell, D.J. Levy, T.M. Yan, Phys. Rev. 187 (1969) 2159;
S.D. Drell, D.J. Levy, T.M. Yan, Phys. Rev. D 1 (1970) 1617.
[15] G. Curci, W. Furmanski, R. Petronzio, Nucl. Phys. B 175 (1980) 27.
[16] W. Furmanski, R. Petronzio, Phys. Lett. B 97 (1980) 437.
[17] E.G. Floratos, C. Kounnas, R. Lacaze, Nucl. Phys. B 192 (1981) 41;
E.G. Floratos, C. Kounnas, R. Lacaze, Phys. Lett. B 98 (1981) 89, 285.
[18] M. Stratmann, W. Vogelsang, Nucl. Phys. B 496 (1997) 41.
[19] M. Stratmann, W. Vogelsang, Phys. Rev. D 65 (2002) 057502.
[20] J. Blümlein, V. Ravindran, W.L. van Neerven, Nucl. Phys. B 586 (2000) 349.
[21] A.H. Mueller, Nucl. Phys. B 228 (1983) 351.
[22] A. Mitov, S. Moch, A. Vogt, Phys. Lett. B 638 (2006) 61.
[23] A.P. Bukhvostov, G.V. Frolov, L.N. Lipatov, E.A. Kuraev, Nucl. Phys. B 258 (1985) 601.
[24] V.M. Braun, G.P. Korchemsky, D. Müller, Prog. Part. Nucl. Phys. 51 (2003) 311.
[25] T. Ohrndorf, Nucl. Phys. B 198 (1982) 26.
[26] D. Müller, Phys. Rev. D 49 (1994) 2525;
D. Müller, Phys. Rev. D 58 (1998) 054005;
A.V. Belitsky, D. Müller, Nucl. Phys. B 537 (1999) 397.
[27] A.V. Belitsky, G.P. Korchemsky, D. Müller, Phys. Rev. Lett. 94 (2005) 151603;
A.V. Belitsky, G.P. Korchemsky, D. Müller, Nucl. Phys. B 735 (2006) 17.
[28] A.N. Vasil'ev, Yu.M. Pis'mak, Yu.R. Khonkonen, Theor. Math. Phys. 46 (1981) 104;
A.N. Vasil'ev, Yu.M. Pis'mak, Yu.R. Khonkonen, Theor. Math. Phys. 47 (1981) 465;
A.N. Vasil'ev, Yu.M. Pis'mak, Yu.R. Khonkonen, Theor. Math. Phys. 50 (1982) 127.
[29] A.N. Vasil'ev, The Field Theoretic Renormalization Group in Critical Behavior Theory and Stochastic Dynamics,
Chapman & Hall/CRC, 2004.
[30] S.S. Gubser, I.R. Klebanov, A.M. Polyakov, Nucl. Phys. B 636 (2002) 99.
[31] S. Frolov, A.A. Tseytlin, JHEP 0206 (2002) 007.
[32] W. Vogelsang, Acta Phys. Pol. B 29 (1998) 1189.
[33] R. Mertig, W.L. van Neerven, Z. Phys. C 70 (1996) 637;
W. Vogelsang, Nucl. Phys. B 475 (1996) 47;
W. Vogelsang, Phys. Rev. D 54 (1996) 2023.
[34] A.V. Belitsky, S.E. Derkachov, G.P. Korchemsky, A.N. Manashov, Phys. Lett. B 594 (2004) 385;
A.V. Belitsky, S.E. Derkachov, G.P. Korchemsky, A.N. Manashov, Nucl. Phys. B 708 (2005) 115;
A.V. Belitsky, S.E. Derkachov, G.P. Korchemsky, A.N. Manashov, Nucl. Phys. B 722 (2005) 191.
[35] S. Mandelstam, Nucl. Phys. B 213 (1983) 149;

S. Mandelstam, Phys. Lett. B 121 (1983) 30;
L. Brink, O. Lindgren, B.E. Nilsson, Nucl. Phys. B 212 (1983) 401;
L. Brink, O. Lindgren, B.E. Nilsson, Phys. Lett. B 123 (1983) 323.
[36] S.J. Gates, M.T. Grisaru, M. Rocek, W. Siegel, Superspace, or one thousand and one lessons in supersymmetry,
Front. Phys. 58 (1983) 1;
M.F. Sohnius, Phys. Rep. 128 (1985) 39.
[37] F.A. Dolan, H. Osborn, Nucl. Phys. B 629 (2002) 3.
[38] S. Kumano, M. Miyama, Phys. Rev. D 56 (1997) 2504;
W. Vogelsang, Phys. Rev. D 57 (1998) 1886;
A. Hayashigaki, Y. Kanazawa, Y. Koike, Phys. Rev. D 56 (1997) 7350.
[39] A.V. Kotikov, L.N. Lipatov, A.I. Onishchenko, V.N. Velizhanin, Phys. Lett. B 595 (2004) 521;
A.V. Kotikov, L.N. Lipatov, A.I. Onishchenko, V.N. Velizhanin, Phys. Lett. B 632 (2006) 754, Erratum.
[40] S. Catani, G. Marchesini, B.R. Webber, Nucl. Phys. B 349 (1991) 635.
[41] J.A. Gracey, Phys. Lett. B 322 (1994) 141;
J.A. Gracey, Nucl. Phys. B 480 (1996) 73.
[42] E. Gardi, JHEP 0502 (2005) 053.
[43] J.A. Gracey, Nucl. Phys. B 667 (2003) 242.
[44] J.F. Bennett, J.A. Gracey, Nucl. Phys. B 517 (1998) 241;
J.F. Bennett, J.A. Gracey, Phys. Lett. B 432 (1998) 209.
[45] D. Berenstein, J.M. Maldacena, H. Nastase, JHEP 0204 (2002) 013.
[46] A.A. Tseytlin, in: M. Shifman, A. Vainshtein, J. Wheater (Eds.), Ian Kogan Memorial Volume, From Fields to
Strings: Circumnavigating Theoretical Physics, World Scientific, 2004, p. 1648;
A.A. Tseytlin, hep-th/0311139.
[47] A.V. Belitsky, V.M. Braun, A.S. Gorsky, G.P. Korchemsky, in: M. Shifman, A. Vainshtein, J. Wheater (Eds.), Ian
Kogan Memorial Volume, from Fields to Strings: Circumnavigating Theoretical Physics, World Scientific, 2004,
p. 266;
A.V. Belitsky, V.M. Braun, A.S. Gorsky, G.P. Korchemsky, Int. J. Mod. Phys. A 19 (2004) 4715.
[48] N. Beisert, S. Frolov, M. Staudacher, A.A. Tseytlin, JHEP 0310 (2003) 037.
[49] V.A. Kazakov, K. Zarembo, JHEP 0410 (2004) 060.
[50] S. Frolov, A. Tirziu, A.A. Tseytlin, hep-th/0611269.
[51] A.V. Belitsky, A.S. Gorsky, G.P. Korchemsky, Nucl. Phys. B 748 (2006) 24.
[52] K. Sakai, Y. Satoh, hep-th/0607190.
[53] S.E. Derkachov, J.A. Gracey, A.N. Manashov, Eur. Phys. J. C 2 (1998) 569.
[54] L.N. Lipatov, Phys. Lett. B 309 (1993) 394;
L.N. Lipatov, JETP Lett. 59 (1994) 596.
[55] L.D. Faddeev, G.P. Korchemsky, Phys. Lett. B 342 (1995) 311.
[56] V.M. Braun, S.E. Derkachov, A.N. Manashov, Phys. Rev. Lett. 81 (1998) 2020;
V.M. Braun, S.E. Derkachov, G.P. Korchemsky, A.N. Manashov, Nucl. Phys. B 553 (1999) 355;
A.V. Belitsky, Phys. Lett. B 453 (1999) 59;
A.V. Belitsky, Nucl. Phys. B 558 (1999) 259.
[57] L.N. Lipatov, Evolution equations in QCD, in: S. Boffi, C. Ciofi Degli Atti, M. Giannini (Eds.), Perspectives in
Hadronic Physics, World Scientific, Singapore, 1998, p. 413;
J.A. Minahan, K. Zarembo, JHEP 0303 (2003) 013;
N. Beisert, M. Staudacher, Nucl. Phys. B 670 (2003) 439.
[58] G.P. Korchemsky, Nucl. Phys. B 498 (1997) 68.
[59] Yu.L. Dokshitzer, G. Marchesini, hep-th/0612248.
Tri-bimaximal neutrino mixing from orbifolding

Guido Altarelli a,b , Ferruccio Feruglio c , Yin Lin c,
a Dipartimento di Fisica 'E. Amaldi', Università di Roma Tre, INFN, Sezione di Roma Tre, I-00146 Rome, Italy
b CERN, Department of Physics, Theory Division, CH-1211 Geneva 23, Switzerland
c Dipartimento di Fisica 'G. Galilei', Università di Padova, INFN, Sezione di Padova,
Via Marzolo 8, I-35131 Padua, Italy

Received 23 December 2006; accepted 23 March 2007
Abstract
We show that the A4 discrete symmetry that naturally leads to tri-bimaximal neutrino mixing can be simply obtained as a result of an orbifolding starting from a model in 6 dimensions. This particular orbifolding
has four fixed points where 4-dimensional branes are located and the tetrahedral symmetry of A4 connects
these branes. In this approach A4 appears after the reduction from six to four dimensions as a remnant of
the 6D spacetime symmetry. A previously discussed supersymmetric version of A4 is reinterpreted along
these lines.
1. Introduction
It is an experimental fact [1] that within measurement errors the observed neutrino mixing
matrix is compatible with the so-called tri-bimaximal form, introduced by Harrison, Perkins
and Scott (HPS) [2]. It is an interesting challenge to formulate dynamical principles that, in
a completely natural way, can lead to this specific mixing pattern as a first approximation, with
small corrections determined by higher order terms in a well defined expansion. In a series of
papers [3,4] it has been pointed out that a broken flavour symmetry based on the discrete group
A4 appears to be particularly fit for this purpose. Other solutions based on continuous flavour
groups like SU(3) or SO(3) have also been recently presented [5,6], but the A4 models have
a very economical and attractive structure (for example, in terms of field content). A crucial
E-mail addresses: guido.altarelli@cern.ch (G. Altarelli), feruglio@pd.infn.it (F. Feruglio), lin@pd.infn.it (Y. Lin).
doi:10.1016/j.nuclphysb.2007.03.042
G. Altarelli et al. / Nuclear Physics B 775 (2007) 31–44
feature of all HPS models is the mechanism used to guarantee the necessary VEV alignment of
the flavon field $\varphi_T$ which determines the charged lepton mass matrix with respect to the direction
in flavour space chosen by the flavon $\varphi_S$ that gives the neutrino mass matrix. In recent papers
[7,8] we have constructed explicit versions of the A4 model where the alignment problem is solved.
In Ref. [7] we adopted an extra dimensional framework, with $\varphi_T$ and $\varphi_S$ on different branes
so that the minimization of the respective potentials is kept to a large extent independent. In
Ref. [8], we presented an alternative, perhaps more conventional, formulation of the A4 model in
4 dimensions with supersymmetry (SUSY) at the price of introducing a somewhat less economic
set of fields. Versions either with see-saw or without see-saw can be constructed. The existence
of different realizations shows that the connection of A4 with the HPS matrix is robust and does
not necessarily require extra dimensions.
Another important aspect of the problem is that of trying to understand the dynamical origin
of A4 . As a first move in this direction, in Ref. [8] we have reformulated A4 as a subgroup of
the modular group which often plays a role in the formalism of string theories, for example in
the context of duality transformations [9]. In the present note we show that the A4 symmetry can
be simply obtained by orbifolding starting with a model in 6 dimensions (6D). In this approach
A4 appears as the remnant of the reduction from 6D to 4D spacetime symmetry induced by the
special orbifolding adopted. There are 4D branes at the four fixed points of the orbifolding and the
tetrahedral symmetry of A4 connects these branes. The standard model fields have components
on the fixed point branes while the scalar fields necessary for the A4 breaking are in the bulk.
In this paper, starting from a 6D field theory, we first introduce the specific orbifolding with
four fixed points on which the 4D Standard Model fields live (while a number of additional gauge
singlets are in the bulk) and specify how the A4 transformations relate the field components on
different branes or on the bulk. We then study the invariant interactions, local in 6D, constructed
out of the fields in the theory which are invariant under A4 . Finally we rederive the SUSY model
for tri-bimaximal neutrino mixing in this particular framework.
2. A4 as the isometry of T 2 /Z2
We consider a quantum field theory in 6 dimensions, with two extra dimensions compactified
on an orbifold T 2 /Z2 . We denote by z = x5 + ix6 the complex coordinate describing the extra
space. The torus T 2 is defined by identifying in the complex plane the points related by
$$z \to z + 1, \qquad z \to z + \gamma, \qquad \gamma = e^{i\pi/3}, \tag{1}$$
where our length unit, $2\pi R$, has been set to 1 for the time being. The parity $Z_2$ is defined by
$$z \to -z \tag{2}$$
and the orbifold $T^2/Z_2$ can be represented by the fundamental region given by the triangle
with vertices $0, 1, \gamma$, see Fig. 1. The orbifold has four fixed points, $(z_1, z_2, z_3, z_4) = (1/2, (1 + \gamma)/2, \gamma/2, 0)$. The fixed point $z_4$ is also represented by the vertices 1 and $\gamma$. In the orbifold,
the segments labelled by a in Fig. 1, $(0, 1/2)$ and $(1, 1/2)$, are identified and similarly for those
labelled by b, $(1, (1 + \gamma)/2)$ and $(\gamma, (1 + \gamma)/2)$, and those labelled by c, $(0, \gamma/2)$, $(\gamma, \gamma/2)$.
Therefore the orbifold is a regular tetrahedron with vertices at the four fixed points. The symmetry of the uncompactified 6D spacetime is broken by compactification. Here we assume that,
before compactification, the spacetime symmetry coincides with the product of 6D translations
Fig. 1. Orbifold T2 /Z2 . The regions with the same numbers are identified with each other. The four triangles bounded by
solid lines form the fundamental region, where also the edges with the same letters are identified. The orbifold T2 /Z2 is
exactly a regular tetrahedron with 6 edges a, b, c, d, e, f and four vertices z1 , z2 , z3 , z4 , corresponding to the four fixed
points of the orbifold.
and 6D proper Lorentz transformations. The compactification breaks part of this symmetry. However, due to the special geometry of our orbifold, a discrete subgroup of rotations and translations
in the extra space is left unbroken. This group can be generated by two transformations:
S:
T:
1
zz+ ,
2
z z, 2 .
(3)
Indeed $S$ and $T$ induce even permutations of the four fixed points:
$$ S:\ (z_1, z_2, z_3, z_4) \to (z_4, z_3, z_2, z_1), \qquad T:\ (z_1, z_2, z_3, z_4) \to (z_2, z_3, z_1, z_4), \tag{4} $$
thus generating the group $A_4$.$^1$ From the previous equations we immediately verify that $S$ and $T$
satisfy the characteristic relations obeyed by the generators of $A_4$:
$$ S^2 = T^3 = (ST)^3 = 1. \tag{6} $$
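The statement that the two permutations of Eq. (4) obey the relations (6) and generate a group of even permutations can be checked by brute force. The following is a minimal sketch, not part of the original derivation; the 0-based labelling of the fixed points and all function names are illustrative assumptions.

```python
# Fixed points are labelled 0..3 for z1..z4 (an illustrative convention).
# Each tuple lists, for every fixed point, the index of its image under Eq. (4).
S = (3, 2, 1, 0)  # z1->z4, z2->z3, z3->z2, z4->z1
T = (1, 2, 0, 3)  # z1->z2, z2->z3, z3->z1, z4->z4
IDENTITY = (0, 1, 2, 3)


def compose(p, q):
    """Permutation obtained by applying q first and then p."""
    return tuple(p[q[k]] for k in range(4))


def power(p, n):
    """n-fold composition of p with itself."""
    result = IDENTITY
    for _ in range(n):
        result = compose(p, result)
    return result


def is_even(p):
    """A permutation is even when its number of inversions is even."""
    inversions = sum(
        1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j]
    )
    return inversions % 2 == 0


def generated_group(generators):
    """Close the generators under composition (the group is finite, so this ends)."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            candidate = compose(h, g)
            if candidate not in group:
                group.add(candidate)
                frontier.append(candidate)
    return group


relations_hold = (
    power(S, 2) == IDENTITY
    and power(T, 3) == IDENTITY
    and power(compose(S, T), 3) == IDENTITY
)
group = generated_group([S, T])
print(relations_hold, len(group), all(is_even(g) for g in group))
```

The closure contains 12 elements, all even, which is the order of $A_4$; adding the odd permutation of footnote 1 as a third generator would enlarge it to the 24 elements of $S_4$.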
These relations are actually satisfied not only at the fixed points, but on the whole orbifold,
as can be easily checked from the general definitions of $S$ and $T$ in Eq. (3), with the help of
the orbifold defining rules in Eqs. (1) and (2). In our model the discrete group $A_4$, together
with 4D translations and 4D proper Lorentz transformations, can be seen as the subgroup of the
spacetime symmetry in six dimensions that survives compactification. In a similar context, the
compactification of two extra dimensions on an orbifold $T^2/Z_3$ and its relation to the flavour
symmetry $Z_3$ has been analyzed in Ref. [10] (see also Ref. [11] for other approaches to the
flavour symmetry in orbifold compactifications).
$^1$ Notice that an odd permutation of the four fixed points can be generated by the parity:
$$ z \to z^*, \tag{5} $$
that maps $(z_1, z_2, z_3, z_4)$ into $(z_1, z_3, z_2, z_4)$ and belongs to the full 6D Poincaré group, which, beyond 6D translations
and proper Lorentz transformations, also includes discrete symmetries. Therefore, had we assumed 6D Poincaré as
starting point in the uncompactified theory, we would have ended up with the product of 4D Poincaré times the discrete
group $S_4$.
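It is useful to represent the action of $S$ and $T$ on the fixed points by means of the four by four
matrices $S$ and $T^{-1}$, respectively,
$$ S = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \qquad T^{-1} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{7} $$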
The matrices $S$ and $T$ satisfy the relations (6), thus providing a representation of $A_4$. Since
the only irreducible representations of $A_4$ are a triplet and three singlets, the four-dimensional representation
described by $S$ and $T$ is not irreducible. It decomposes into the sum of the invariant singlet plus
the triplet representation. If we denote by $u = (u_1, u_2, u_3, u_4)^t$ (the suffix $t$ denotes transposition)
a multiplet transforming as
$$ u \to S u, \qquad u \to T u, \tag{8} $$
under $S$ and $T$ respectively, then the singlet corresponds to
$$ u_1 = u_2 = u_3 = u_4, \tag{9} $$
while the triplet is obtained by imposing the constraint
$$ \sum_{i=1}^{4} u_i = 0. \tag{10} $$
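Both conditions (9) and (10) are invariant under $A_4$. To better visualize this decomposition, we
consider the unitary matrix $U$ given by:
$$ U = \frac{1}{2} \begin{pmatrix} +1 & +1 & +1 & +1 \\ -1 & +1 & +1 & -1 \\ +1 & -1 & +1 & -1 \\ +1 & +1 & -1 & -1 \end{pmatrix}. \tag{11} $$
This matrix maps $S$ and $T$ into matrices that are block-diagonal:
$$ U S U^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & S_3 \end{pmatrix}, \qquad U T U^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & T_3 \end{pmatrix}, \tag{12} $$
where $S_3$ and $T_3$ are the generators of the three-dimensional representation:
$$ S_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad T_3 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \tag{13} $$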
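The block-diagonal form of Eq. (12) can be confirmed with exact rational arithmetic. The sketch below assumes the matrices as read from Eqs. (7), (11) and (13), with $T$ taken as the inverse (here the transpose) of the permutation matrix $T^{-1}$; the helper names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Matrix U of Eq. (11).
U = [[HALF * x for x in row] for row in (
    (+1, +1, +1, +1),
    (-1, +1, +1, -1),
    (+1, -1, +1, -1),
    (+1, +1, -1, -1),
)]
# Matrices S and T^{-1} of Eq. (7).
S = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
T_INV = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# The inverse of a permutation matrix is its transpose.
T = transpose(T_INV)


def rotate(m):
    """U m U^dagger; U is real, so the dagger is a plain transpose."""
    return matmul(matmul(U, m), transpose(U))


def block(three_by_three):
    """Embed a 3x3 matrix M3 as diag(1, M3)."""
    return [[1, 0, 0, 0]] + [[0] + list(row) for row in three_by_three]


S3 = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
T3 = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
IDENTITY = [[int(i == j) for j in range(4)] for i in range(4)]

print(rotate(IDENTITY) == IDENTITY)  # U is unitary
print(rotate(S) == block(S3), rotate(T) == block(T3))
```

Working with `Fraction` keeps every entry exact, so the comparisons are equalities rather than tolerance checks.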
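If $u = (u_1, u_2, u_3, u_4)^t$ transforms as specified in Eq. (8), then $v \equiv (v_0, v_1, v_2, v_3)^t = U u$ transforms as
$$ v \to U S U^\dagger v, \qquad v \to U T U^\dagger v, \tag{14} $$
respectively. Therefore, if we parametrize $u$ as
$$ \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} v_0 \\ v_0 \\ v_0 \\ v_0 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} -v_1 + v_2 + v_3 \\ +v_1 - v_2 + v_3 \\ +v_1 + v_2 - v_3 \\ -v_1 - v_2 - v_3 \end{pmatrix}, \tag{15} $$
the components $(v_1, v_2, v_3)^t$ transform with $S_3$ and $T_3$, whereas the component $v_0$ is left invariant
by $A_4$. It is useful to observe that $v_0$ is given by $v_0 = (u_1 + u_2 + u_3 + u_4)/2$ while the sum of
all components of the last multiplet in Eq. (15) vanishes, in agreement with the conditions (9)
and (10). Finally, if we restrict to the case of a pure triplet by taking $v_0 = 0$, then $v_1$, $v_2$ and $v_3$
are given by:
$$ \begin{pmatrix} 0 \\ v_1 \\ v_2 \\ v_3 \end{pmatrix} = U \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ -u_1 - u_2 - u_3 \end{pmatrix} = \begin{pmatrix} 0 \\ u_2 + u_3 \\ u_1 + u_3 \\ u_1 + u_2 \end{pmatrix}. \tag{16} $$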
3. Local interactions invariant under $A_4$
In this section we collect the rules to construct an $A_4$ invariant field theory in the 6D
spacetime $M \times T^2/Z_2$. The fields of this theory can be either 4D fields living at the fixed points,
in short brane fields, or bulk fields depending on both the uncompactified coordinates x and
the complex coordinate z. The new essential feature with respect to a 4D formalism is that in
general all particles have components over all four fixed points. Locality in 6D implies that at
each fixed point only products of components on that brane are allowed in the interaction terms.
This constraint reduces the number of invariant interactions that can be constructed out of brane
fields. We now discuss the structure of the invariants in this context.
3.1. Brane fields
We first consider the case of brane fields and we denote by:
$$ a = \big(a_1(x), a_2(x), a_3(x), a_4(x)\big) \tag{17} $$
a set of fields localized at the fixed points $(z_1, z_2, z_3, z_4)$, respectively. For the time being we
do not specify if $a$ is a scalar, a spinor or a vector under the 4D Lorentz group. We denote by
$\delta_i = \delta(z - z_i)$ the 2D Dirac deltas needed to construct an interaction term local in 6D, starting
from brane fields. We observe that, if $z$ undergoes the transformations (3), then the delta functions
$\delta = (\delta_1, \delta_2, \delta_3, \delta_4)^t$ are mapped into$^2$
$$ S:\ \delta \to S \delta, \qquad T:\ \delta \to T \delta, \tag{18} $$
where $S$ and $T$ are given in Eq. (7). Although the 6D spacetime symmetry does not fix the
$A_4$ transformations of $a$, it is natural to assume that these are given by:
$$ S:\ a \to S a, \qquad T:\ a \to T a. \tag{19} $$
$^2$ Notice that the action of $T$ on the Dirac deltas is described by $T$, the inverse of the matrix $T^{-1}$ that permutes the
four fixed points, Eq. (7).
According to our discussion in the previous section, the quadruplet $a$ decomposes into a triplet
plus the invariant singlet 1. If we introduce two such sets of brane fields, called $a$ and $b$, transforming as specified in Eq. (19), then it is easy to see that the only invariant under $A_4$, bilinear in
$a$ and $b$ and local in 6D is given by:
$$ J^{(2)} = \sum_{i=1}^{4} a_i b_i \delta_i. \tag{20} $$
In particular, if $a = (a_c/2, a_c/2, a_c/2, a_c/2)$ and $b = (b_c/2, b_c/2, b_c/2, b_c/2)$ are two
invariant singlets, then, after integrating over the $z$ coordinate, the invariant $J^{(2)}$ is given by $\int d^2z\, J^{(2)} = a_c b_c$.
If $a$ is a singlet and $b$ is a triplet, $J^{(2)}$ vanishes after integration over $z$, because of Eq. (10).
If $a$ and $b$ are two triplets transforming as in Eq. (19), they can be parametrized as shown in
Eq. (15):
$$ a = \frac{1}{2} \begin{pmatrix} -v_1 + v_2 + v_3 \\ +v_1 - v_2 + v_3 \\ +v_1 + v_2 - v_3 \\ -v_1 - v_2 - v_3 \end{pmatrix}, \qquad b = \frac{1}{2} \begin{pmatrix} -w_1 + w_2 + w_3 \\ +w_1 - w_2 + w_3 \\ +w_1 + w_2 - w_3 \\ -w_1 - w_2 - w_3 \end{pmatrix}. \tag{21} $$
In this case, after integration over $z$, the bilinear $J^{(2)}$ reads:
$$ \int d^2z\, J^{(2)} = v_1 w_1 + v_2 w_2 + v_3 w_3, \tag{22} $$
which is the familiar expression of the invariant under $A_4$ contained in the product of two triplet
representations [7].
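As a quick numerical illustration of Eq. (22), one can embed two triplets on the four branes as in Eq. (21) and sum the products of components brane by brane, which is what the integration over the delta functions amounts to. This is a sketch with arbitrarily chosen integer values; the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def embed(t1, t2, t3):
    """Brane components (a1..a4) of a pure triplet, following Eq. (21)."""
    half = Fraction(1, 2)
    return (
        half * (-t1 + t2 + t3),
        half * (+t1 - t2 + t3),
        half * (+t1 + t2 - t3),
        half * (-t1 - t2 - t3),
    )


def integrated_bilinear(a, b):
    """Integral of J^(2) over z: each delta function picks out one fixed point."""
    return sum(ai * bi for ai, bi in zip(a, b))


v = (2, -3, 5)     # arbitrary illustrative values
w = (7, 11, -13)
a, b = embed(*v), embed(*w)

lhs = integrated_bilinear(a, b)
rhs = sum(vi * wi for vi, wi in zip(v, w))
singlet = (Fraction(1, 2),) * 4  # invariant singlet with a_c = 1

print(lhs == rhs)                         # Eq. (22)
print(sum(a) == 0)                        # constraint (10) for a triplet
print(integrated_bilinear(singlet, b))    # singlet times triplet integrates to 0
```

The last line reproduces the remark that the bilinear of a singlet and a triplet vanishes after integration because of Eq. (10).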
Locality in 6D restricts the construction of interaction terms. For instance,
it will be important for the following discussion to note that if $a$ and $b$ are two triplets transforming as in (19), then it is not possible to construct a term bilinear in $a$ and $b$, local in 6D and
transforming as a $1'$ or a $1''$. This is easily seen by starting from the local bilinear
$$ J' = \sum_{i=1}^{4} y_i a_i b_i \delta_i, \tag{23} $$
where $y_i$ are constants to be determined by imposing that $J'$ transforms as a $1'$. In fact
only the trivial solution $y_i = 0$ is allowed. This is because $S$ imposes $y_4 = y_1$
and $y_3 = y_2$; while $T$ requires $y_4 = \omega y_4$, hence $y_4 = y_1 = 0$, and $y_1 = \omega y_2 = \omega^2 y_2$, so that also
$y_2 = y_3 = 0$. The same argument also shows that it is equally impossible to obtain $1''$.
To obtain a non-invariant singlet from two triplets one has two possibilities. The first one is
to exploit bulk fields, as we shall see in detail in the next subsection. The second one is to make
use of a freedom associated with the $A_4$ algebra, by generalizing the transformation properties of
the brane fields in the following way:
$$ S:\ a \to S a, \qquad T:\ a \to \omega^{r_a} T a, \tag{24} $$
where $\omega$ is a cubic root of unity, Eq. (3), and $r_a = (0, \pm 1)$.
Clearly these new transformations satisfy the $A_4$ algebra, Eq. (6). The only difference with
respect to the transformations in Eq. (19) is in the phase factor $\omega^{r_a}$. It is possible to show that,
once the delta function transformations are specified as in Eq. (18), then Eq. (24) provides the
only allowed generalization of Eq. (19). If we call $R_{0,-1,+1}$ these representations, we see that
they are all reducible: $R_0$ decomposes into a triplet plus the invariant singlet 1, $R_{+1}$ decomposes
into a triplet plus the singlet $1'$ and $R_{-1}$ decomposes into a triplet plus the singlet $1''$. It is immediate to see that $J^{(2)}$ is left invariant by $A_4$ only if $(a, b)$ transform as $R_{r_a}$, $R_{r_b}$ with $r_a + r_b = 0$.
To build a non-invariant singlet one has to assign $(a, b)$ to $(R_0, R_{\pm 1})$. For example, consider the
case $r_b = +1$ for $b$. Then the triplet $(w_1, w_2, w_3)$ can be embedded in $b$ in the following way:
$$ b = \frac{1}{2} \begin{pmatrix} -w_1 + \omega w_2 + \omega^2 w_3 \\ +w_1 - \omega w_2 + \omega^2 w_3 \\ +w_1 + \omega w_2 - \omega^2 w_3 \\ -w_1 - \omega w_2 - \omega^2 w_3 \end{pmatrix}. \tag{25} $$
Now the bilinear
$$ \sum_{i=1}^{4} a_i b_i \delta_i, \tag{26} $$
is invariant under $S$ and picks up a phase $\omega$ under $T$, that is it transforms as a singlet $1'$. After
integrating over the coordinate $z$, we find
$$ \int d^2z \sum_{i=1}^{4} a_i b_i \delta_i = v_1 w_1 + \omega v_2 w_2 + \omega^2 v_3 w_3. \tag{27} $$
This example shows that, although from the point of view of the group $A_4$ the triplet representations contained in $R_0$, $R_{+1}$, $R_{-1}$ are all equivalent (they can be seen as the result of the
multiplication of a triplet by the singlets $1$, $1'$, $1''$, respectively), in this 6D framework their
difference is not irrelevant when building up local interactions covariant under $A_4$.
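The transformation properties claimed for the bilinear of Eqs. (26) and (27) can be illustrated numerically. The sketch below assumes $a$ in $R_0$ and $b$ in $R_{+1}$, with $T$ acting on the brane components as the permutation $(a_1, a_2, a_3, a_4) \to (a_3, a_1, a_2, a_4)$ implied by Eqs. (7) and (19); the numerical values and helper names are illustrative.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)  # cubic root of unity, omega = gamma^2


def embed(c1, c2, c3):
    """Brane components of a triplet written as in Eq. (21)."""
    return [
        0.5 * (-c1 + c2 + c3),
        0.5 * (+c1 - c2 + c3),
        0.5 * (+c1 + c2 - c3),
        0.5 * (-c1 - c2 - c3),
    ]


def embed_r_plus(w):
    """Eq. (25): the triplet w placed inside an R_{+1} multiplet."""
    return embed(w[0], OMEGA * w[1], OMEGA ** 2 * w[2])


def act_S(a):
    """a -> S a, with S the matrix of Eq. (7)."""
    return [a[3], a[2], a[1], a[0]]


def act_T(a, r=0):
    """a -> omega^r T a, Eq. (24), with T the inverse of T^{-1} in Eq. (7)."""
    return [OMEGA ** r * x for x in (a[2], a[0], a[1], a[3])]


def bilinear(a, b):
    """Eq. (26) integrated over z."""
    return sum(x * y for x, y in zip(a, b))


def close(x, y, tol=1e-12):
    return abs(x - y) < tol


v = (0.3 + 0.1j, -1.2, 0.7j)          # arbitrary illustrative values
w = (1.5, 0.4 - 0.9j, -0.6 + 0.2j)
a, b = embed(*v), embed_r_plus(w)

J = bilinear(a, b)
expected = v[0] * w[0] + OMEGA * v[1] * w[1] + OMEGA ** 2 * v[2] * w[2]

print(close(J, expected))                                        # Eq. (27)
print(close(bilinear(act_S(a), act_S(b)), J))                    # invariant under S
print(close(bilinear(act_T(a, 0), act_T(b, 1)), OMEGA * J))      # phase omega under T
```

One can also check that acting with $\omega T$ on the embedded multiplet is the same as embedding the rotated triplet $(w_3, w_1, w_2)$, which is why $w$ still transforms with $T_3$.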
Generalizing the construction above, a local invariant $J^{(N)}$ of degree $N$, built out of $M$ brane
multiplets $a^{(I)}$ $(I = 1, \ldots, M)$ transforming as $R_{r_I}$ is given by:
$$ J^{(N)} = \sum_{i=1}^{4} \big(a_i^{(1)}\big)^{n_1} \cdots \big(a_i^{(M)}\big)^{n_M} \delta_i, \tag{28} $$
where $\sum_{I=1}^{M} n_I = N$ and $\sum_{I=1}^{M} r_I = 0 \pmod 3$.
3.2. Bulk and brane fields
Here we consider the coupling between a bulk multiplet $B(z) = (B_1(z), B_2(z), B_3(z))$, transforming as a triplet of $A_4$, and a brane multiplet $a = (a_1, a_2, a_3, a_4)$, transforming as $R_0$ under $A_4$. The dependence on the 4D spacetime coordinates $x$ is not made explicit in our notation.
For the time being, we assume that the three components $B_I(z)$ $(I = 1, 2, 3)$ are scalars in 6D.
The transformations of $B$ under $A_4$ are specified by:
$$ S:\ B'(z_S) = S_3 B(z), \quad z_S = z + \frac{1}{2}, \qquad T:\ B'(z_T) = T_3 B(z), \quad z_T = \omega z. \tag{29} $$
We write the most general local term bilinear in $a$ and $B$ as:
$$ J = \sum_{iK} \alpha_{iK}\, a_i B_K(z)\, \delta_i, \tag{30} $$
where $\alpha_{iK}$ is a four by three matrix of constant coefficients. It is not difficult to see that, in order
to have $J$ invariant under $A_4$, we should choose
$$ \alpha_{iK} = \frac{1}{2} \begin{pmatrix} -1 & +1 & +1 \\ +1 & -1 & +1 \\ +1 & +1 & -1 \\ -1 & -1 & -1 \end{pmatrix}, \tag{31} $$
up to an overall constant. If the brane multiplet $a$ is a $R_0$ triplet under $A_4$, parametrized as in
Eq. (21), by choosing $\alpha_{iK}$ as in (31), after integration over $z$ we get:
$$ \begin{aligned} J = {} & \frac{1}{4} (-v_1 + v_2 + v_3) \big[-B_1(z_1) + B_2(z_1) + B_3(z_1)\big] \\ & + \frac{1}{4} (+v_1 - v_2 + v_3) \big[+B_1(z_2) - B_2(z_2) + B_3(z_2)\big] \\ & + \frac{1}{4} (+v_1 + v_2 - v_3) \big[+B_1(z_3) + B_2(z_3) - B_3(z_3)\big] \\ & + \frac{1}{4} (+v_1 + v_2 + v_3) \big[+B_1(z_4) + B_2(z_4) + B_3(z_4)\big]. \end{aligned} \tag{32} $$
If the triplet $B(z)$ acquires a constant VEV $\langle B(z) \rangle = (B_1, B_2, B_3)$, essentially the only case that
will be relevant for the discussion in the next section, then the invariant $J$ becomes
$$ J = v_1 B_1 + v_2 B_2 + v_3 B_3. \tag{33} $$
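The reduction of Eq. (32) to Eq. (33) for a constant bulk VEV can be checked directly from the coefficient matrix of Eq. (31). The following is a small sketch with arbitrary integer inputs; the function names and sample values are illustrative assumptions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Coefficient matrix alpha_{iK} of Eq. (31): rows are fixed points i,
# columns are the components K of the bulk triplet.
ALPHA = [[HALF * x for x in row] for row in (
    (-1, +1, +1),
    (+1, -1, +1),
    (+1, +1, -1),
    (-1, -1, -1),
)]


def embed(v):
    """Brane components of an R_0 triplet, Eq. (21)."""
    v1, v2, v3 = v
    return [
        HALF * (-v1 + v2 + v3),
        HALF * (+v1 - v2 + v3),
        HALF * (+v1 + v2 - v3),
        HALF * (-v1 - v2 - v3),
    ]


def integrated_J(v, B_at_fixed_points):
    """Eq. (30) integrated over z: sum over i, K of alpha_{iK} a_i B_K(z_i)."""
    a = embed(v)
    return sum(
        ALPHA[i][K] * a[i] * B_at_fixed_points[i][K]
        for i in range(4)
        for K in range(3)
    )


v = (2, -3, 5)           # arbitrary illustrative values
B_const = (7, 11, -13)   # constant VEV: the same at all four fixed points
J = integrated_J(v, [B_const] * 4)
print(J == sum(x * y for x, y in zip(v, B_const)))  # Eq. (33)
```

Passing four different triples as `B_at_fixed_points` evaluates the general expression (32) instead of its constant-VEV limit.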
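Similarly, by requiring that $J'$ given by
$$ J' = \sum_{iK} \alpha'_{iK}\, a_i B_K(z)\, \delta_i, \tag{34} $$
transforms as a $1'$, we find that the matrix $\alpha'_{iK}$ should be given by
$$ \alpha'_{iK} = \frac{1}{2} \begin{pmatrix} -1 & +\omega & +\omega^2 \\ +1 & -\omega & +\omega^2 \\ +1 & +\omega & -\omega^2 \\ -1 & -\omega & -\omega^2 \end{pmatrix}. \tag{35} $$
In this case, after integration over $z$ and after substitution of the triplet $B(z)$ with its constant
VEV, the quantity $J'$ of Eq. (34) becomes
$$ J' = v_1 B_1 + \omega v_2 B_2 + \omega^2 v_3 B_3. \tag{36} $$
Finally, the singlet $1''$ is obtained from $J'$, by substituting $\alpha'_{iK}$ with its complex conjugate $\alpha'^{*}_{iK}$.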
4. Orbifold realization of the A4 model

Let us start by recalling the basic formulae for the baseline A4 model for lepton masses and
mixings in 4D with supersymmetry [8]. The full superpotential of the model is
$$ w = w_l + w_d \tag{37} $$
where $w_l$ is the term responsible for the Yukawa interactions in the lepton sector and $w_d$ is the
term responsible for the vacuum alignment. We now detail the structure of both in succession.
The term $w_l$ is given by
$$ w_l = y_e e^c (\varphi_T l) + y_\mu \mu^c (\varphi_T l)' + y_\tau \tau^c (\varphi_T l)'' + (x_a \xi + \tilde{x}_a \tilde{\xi})(ll) + x_b (\varphi_S ll) + \text{h.c.} + \cdots. \tag{38} $$
To keep our formulae compact, we omit the Higgs fields $h_{u,d}$ and the cut-off scale $\Lambda$.
For instance $y_e e^c (\varphi_T l)$ stands for $y_e e^c (\varphi_T l) h_d/\Lambda$, $x_a \xi (ll)$ stands for $x_a \xi (l h_u l h_u)/\Lambda^2$ and so on.
The superpotential $w_l$ contains the lowest order operators in an expansion in powers of $1/\Lambda$.
Dots stand for higher dimensional operators. The driving term $w_d$ reads:
$$ \begin{aligned} w_d = {} & M (\varphi_0^T \varphi_T) + g (\varphi_0^T \varphi_T \varphi_T) \\ & + g_1 (\varphi_0^S \varphi_S \varphi_S) + g_2 \tilde{\xi} (\varphi_0^S \varphi_S) + g_3 \xi_0 (\varphi_S \varphi_S) + g_4 \xi_0 \xi^2 + g_5 \xi_0 \xi \tilde{\xi} + g_6 \xi_0 \tilde{\xi}^2, \end{aligned} \tag{39} $$
where $\varphi_0^T$, $\varphi_0^S$ and $\xi_0$ are driving fields that allow us to build a non-trivial scalar potential in the
symmetry breaking sector. The superpotential $w$ is invariant not only with respect to the gauge
symmetry $SU(2) \times U(1)$ and the flavour symmetry $A_4$, but also under a discrete $Z_3$ symmetry
and a continuous $U(1)_R$ symmetry under which the fields transform as shown in Table 1.
We now show how this model can be derived from the 6D field theory with orbifolding. We
start from an N = 1 chiral supersymmetric 6D field theory, corresponding to N = 2 SUSY in
the 4D language. Such an extended SUSY is broken down to N = 1 SUSY by the Z2 parity in
the usual way. The Lagrangian of the theory is the sum of a bulk term, depending on bulk fields
and invariant under N = 2 SUSY, plus boundary terms localized at the four fixed points and
invariant under the less restrictive N = 1 SUSY. Moreover at the fixed points we are allowed to
localize brane N = 1 multiplets. In particular, we choose as brane fields the gauge bosons of the
SM gauge group, the SM fermions and two Higgs doublets hu and hd , together with their N = 1
superpartners. The remaining fields, namely the flavons and the driving fields are introduced as
bulk hypermultiplets. In this way we avoid 6D gauge anomalies. Due to the orbifolding, out of the
two N = 1 chiral supermultiplets contained in the generic bulk hypermultiplet only one possesses
a zero mode. Here we are interested in the brane interactions of this particular multiplet, and we
will use for it the N = 1 notation.
The dictionary between the 4D realization, specified by the superpotential $w_l$ and the present
6D version, is given in Table 2. We have denoted by $l_i$ the lepton doublet supermultiplets, which
are $A_4$-triplet brane fields parametrized as in Eq. (21):
$$ l = \frac{1}{2} \begin{pmatrix} -l_e + l_\mu + l_\tau \\ +l_e - l_\mu + l_\tau \\ +l_e + l_\mu - l_\tau \\ -l_e - l_\mu - l_\tau \end{pmatrix}. \tag{40} $$
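The charged leptons $e^c$, $\mu^c$ and $\tau^c$ are brane fields, having the same value at each fixed point.
As anticipated, the flavon fields $\varphi_S(z)$, $\varphi_T(z)$ and $\xi(z)$ are bulk fields, depending on the extra
coordinate $z$. In particular $\varphi_S(z)$ and $\varphi_T(z)$ are $A_4$ triplets, transforming as in Eq. (29), while
$\xi(z)$ is an $A_4$ invariant: $\xi(z + 1/2) = \xi(z)$ and $\xi(\omega z) = \xi(z)$. Each 4D superpotential term
is reproduced, up to an overall constant, from the corresponding 6D term of the dictionary by
integrating over the complex coordinate $z$ and by assuming a constant, that is $z$-independent,
background value for the bulk supermultiplets $\varphi_S(z)$, $\varphi_T(z)$ and $\xi(z)$. This last requirement is
justified by the fact that we only need to discuss the expansion of $w$ around the VEVs of the
flavon fields. Barring a peculiar behaviour of such VEVs, we will look for minima of the scalar
potential that do not depend on $z$ and in our final expressions the bulk fields will be replaced by
their constant VEVs. In this way the superpotential $w_l$ is completely reproduced.
Table 1
Fields and their transformation properties under $A_4$, $Z_3$ and $U(1)_R$
$$ \begin{array}{c|cccccccccccc} \text{Field} & l & e^c & \mu^c & \tau^c & h_{u,d} & \varphi_T & \varphi_S & \xi & \tilde{\xi} & \varphi_0^T & \varphi_0^S & \xi_0 \\ \hline A_4 & 3 & 1 & 1'' & 1' & 1 & 3 & 3 & 1 & 1 & 3 & 3 & 1 \\ Z_3 & \omega & \omega^2 & \omega^2 & \omega^2 & 1 & 1 & \omega & \omega & \omega & 1 & \omega & \omega \\ U(1)_R & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 2 & 2 & 2 \end{array} $$
Table 2
Realization of 4D superpotential terms for $w_l$ in terms of local 6D $A_4$ invariants. The 4D terms are obtained from the 6D
ones by integrating over the complex coordinate $z$ and by assuming a constant background value for the bulk multiplets
$\varphi_{S,T}(z) = \langle \varphi_{S,T} \rangle/\sqrt{V}$, $\xi(z) = \langle \xi \rangle/\sqrt{V}$
$$ \begin{array}{c|c} \text{4D} & \text{6D} \\ \hline \xi (ll) & \sum_{i=1}^{4} l_i l_i\, \xi(z)\, \delta_i \\ (\varphi_S ll) & \sum_{i=1}^{4} l_i l_i\, \alpha_{iK}\, \varphi_{SK}(z)\, \delta_i \\ e^c (\varphi_T l) & \sum_{i=1}^{4} e^c l_i\, \alpha_{iK}\, \varphi_{TK}(z)\, \delta_i \\ \mu^c (\varphi_T l)' & \sum_{i=1}^{4} \mu^c l_i\, \alpha'_{iK}\, \varphi_{TK}(z)\, \delta_i \\ \tau^c (\varphi_T l)'' & \sum_{i=1}^{4} \tau^c l_i\, \alpha'^{*}_{iK}\, \varphi_{TK}(z)\, \delta_i \end{array} $$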
To correctly establish the relation between the 6D superpotential and the 4D one we should
also pay attention to the overall normalization of $w_l$. The 6D superpotential $w_l$ is linear in the
bulk fields having mass dimension two and therefore carries an extra factor $1/\Lambda$ with respect to
the 4D superpotential. Moreover, the VEV of the generic bulk field $B$ can be parametrized as
$\langle B \rangle/\sqrt{V}$ where $\langle B \rangle$ is the VEV of the zero mode, of mass dimension one, and $V$ is the volume
of the extra compact space. Therefore, after spontaneous breaking of the $A_4$ symmetry, each bulk
field $B$ enters the superpotential in the dimensionless combination $\langle B \rangle/(\Lambda^2 \sqrt{V})$. Higher dimensional operators are suppressed by extra powers of this combination. To avoid large corrections
to the HPS mixing scheme, such a combination is required to be at most of order $\lambda^2$, $\lambda \approx 0.22$
being the Cabibbo angle. This is of no concern for the lepton sector of the theory, but it can be
a potential problem for the extension of the $A_4$ model, both in its 4D and 6D realizations, to the
quark sector. Indeed we expect that the mass of the top quark arises from an unsuppressed renormalizable operator, whereas a naive extension of the $A_4$ assignment to the quark sector of our
6D model would lead to a top mass depleted by an overall factor $\langle \varphi_T \rangle/(\Lambda^2 \sqrt{V})$ (with respect,
say, to the W mass), which as we have seen is expected to be of order $\lambda^2$.
Finally we need a similar dictionary for the driving part of the superpotential. It is easy to see
that each 4D term in wd can be reproduced starting from a corresponding 6D term, by assuming
constant field configurations and by integrating over the coordinate z. The new feature when
analyzing wd is that in general there is no one-to-one correspondence between 4D and 6D terms
as was the case for wl because the number of local 6D invariants we can build from bulk fields is
larger than the number of 4D invariants we have in wd . This is not an obstacle in deriving the 4D
theory. Since we are interested in constant field configurations of the flavon and driving fields,
after integration over z our 6D driving superpotential will indeed give rise to the most general
set of A4 invariants in 4D. The result is nothing but the superpotential wd given in Eq. (39). At
this point the discussion of the vacuum alignment proceeds exactly as in the 4D case, detailed in
Ref. [8].
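The scalar potential is minimum at:
$$ \begin{aligned} & \langle \varphi_T \rangle = \frac{1}{\sqrt{V}} (v_T, v_T, v_T), \qquad \frac{v_T}{\sqrt{V}} = -\frac{M}{g}, \\ & \langle \varphi_S \rangle = \frac{1}{\sqrt{V}} (v_S, 0, 0), \qquad v_S^2 = -\frac{g_4}{g_3} u^2, \\ & \langle \xi \rangle = \frac{1}{\sqrt{V}} u, \qquad u \ \text{undetermined}, \\ & \langle \tilde{\xi} \rangle = 0. \end{aligned} \tag{41} $$
At the leading order of the $1/\Lambda$ expansion, the mass matrix $m_l$ for charged leptons is given by:
$$ m_l = v_d \frac{v_T}{\Lambda^2 \sqrt{V}} \begin{pmatrix} y_e & y_e & y_e \\ y_\mu & y_\mu \omega & y_\mu \omega^2 \\ y_\tau & y_\tau \omega^2 & y_\tau \omega \end{pmatrix}, \tag{42} $$
and charged fermion masses are given by:
$$ m_e = \sqrt{3}\, y_e v_d \frac{v_T}{\Lambda^2 \sqrt{V}}, \qquad m_\mu = \sqrt{3}\, y_\mu v_d \frac{v_T}{\Lambda^2 \sqrt{V}}, \qquad m_\tau = \sqrt{3}\, y_\tau v_d \frac{v_T}{\Lambda^2 \sqrt{V}}. \tag{43} $$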
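The factor $\sqrt{3}$ in the charged lepton masses follows from the fact that the three rows of the mass matrix in Eq. (42) are mutually orthogonal, each with norm $\sqrt{3}$ times its Yukawa coupling, so that $m_l m_l^\dagger$ is diagonal. The sketch below illustrates this numerically; the Yukawa values are hypothetical, and the overall factor $v_d v_T/(\Lambda^2 \sqrt{V})$ is represented by a single assumed parameter `scale`.

```python
import cmath
import math

OMEGA = cmath.exp(2j * cmath.pi / 3)  # cubic root of unity


def charged_lepton_matrix(ye, ymu, ytau, scale=1.0):
    """Matrix of Eq. (42); 'scale' stands for v_d v_T / (Lambda^2 sqrt(V))."""
    rows = [
        [ye, ye, ye],
        [ymu, ymu * OMEGA, ymu * OMEGA ** 2],
        [ytau, ytau * OMEGA ** 2, ytau * OMEGA],
    ]
    return [[complex(scale * x) for x in row] for row in rows]


def times_dagger(m):
    """Return m m^dagger for a 3x3 complex matrix."""
    return [
        [sum(m[i][k] * m[j][k].conjugate() for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


yukawas = (0.002, 0.04, 0.7)  # hypothetical values, for illustration only
mm = times_dagger(charged_lepton_matrix(*yukawas))

# Largest off-diagonal entry of m m^dagger (zero up to rounding).
off_diagonal = max(abs(mm[i][j]) for i in range(3) for j in range(3) if i != j)
# Masses in units of 'scale': square roots of the diagonal entries.
masses = [math.sqrt(mm[i][i].real) for i in range(3)]

print(off_diagonal < 1e-12)
print([m / y for m, y in zip(masses, yukawas)])  # each ratio is close to sqrt(3)
```

Because $m_l m_l^\dagger$ is already diagonal, no rotation of the left-handed fields is involved in reading off the masses in this sketch.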
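We can easily obtain a natural hierarchy among $m_e$, $m_\mu$ and $m_\tau$ by introducing an additional
$U(1)_F$ flavour symmetry under which only the right-handed lepton sector is charged. In the
flavour basis the neutrino mass matrix reads:
$$ m_\nu = \frac{v_u^2}{\Lambda} \begin{pmatrix} a + 2d/3 & -d/3 & -d/3 \\ -d/3 & 2d/3 & a - d/3 \\ -d/3 & a - d/3 & 2d/3 \end{pmatrix}, \tag{44} $$
where
$$ a \equiv x_a \frac{u}{\Lambda^2 \sqrt{V}}, \qquad d \equiv x_d \frac{v_S}{\Lambda^2 \sqrt{V}}, \tag{45} $$
and is diagonalized by the transformation:
$$ U^T m_\nu U = \frac{v_u^2}{\Lambda}\, \mathrm{diag}(a + d,\ a,\ -a + d), \tag{46} $$
with
$$ U = \begin{pmatrix} \sqrt{2/3} & 1/\sqrt{3} & 0 \\ -1/\sqrt{6} & 1/\sqrt{3} & -1/\sqrt{2} \\ -1/\sqrt{6} & 1/\sqrt{3} & +1/\sqrt{2} \end{pmatrix}. \tag{47} $$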
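The diagonalization of the neutrino mass matrix by the tri-bimaximal matrix can be verified numerically for arbitrary complex parameters. The following sketch works in units of $v_u^2/\Lambda$; the sample values of $a$ and $d$ and the helper names are illustrative assumptions.

```python
import math


def neutrino_matrix(a, d):
    """Matrix of Eq. (44) in units of v_u^2 / Lambda."""
    return [
        [a + 2 * d / 3, -d / 3, -d / 3],
        [-d / 3, 2 * d / 3, a - d / 3],
        [-d / 3, a - d / 3, 2 * d / 3],
    ]


def transpose(x):
    return [list(col) for col in zip(*x)]


def matmul(x, y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
# Tri-bimaximal mixing matrix of Eq. (47); note sqrt(2/3) = 2/sqrt(6).
U = [
    [2 / s6, 1 / s3, 0.0],
    [-1 / s6, 1 / s3, -1 / s2],
    [-1 / s6, 1 / s3, +1 / s2],
]

a = 0.8 + 0.3j    # arbitrary illustrative complex values
d = -0.5 + 0.9j
rotated = matmul(matmul(transpose(U), neutrino_matrix(a, d)), U)
expected = (a + d, a, -a + d)

max_error = max(
    abs(rotated[i][j] - (expected[i] if i == j else 0))
    for i in range(3)
    for j in range(3)
)
print(max_error < 1e-12)  # U^T m U equals diag(a + d, a, -a + d)
```

The mixing matrix does not depend on $a$ and $d$, which is why the check succeeds for any choice of the two complex parameters.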
Fig. 2. On the left panel, sum of neutrino masses versus $\cos\Delta$, with $\Delta$ the phase difference between a and b. On the right panel,
the lightest neutrino mass, $m_1$ and the mass combination $m_{ee}$ versus $\cos\Delta$. To evaluate the masses, the parameters |a| and
|b| have been expressed in terms of $r \equiv \Delta m^2_{sol}/\Delta m^2_{atm} \equiv (|m_2|^2 - |m_1|^2)/(|m_3|^2 - |m_1|^2)$ and $\Delta m^2_{atm} \equiv |m_3|^2 - |m_1|^2$.
The bands have been obtained by varying $\Delta m^2_{atm}$ in its $3\sigma$ experimental range, from 0.0020 eV$^2$ to 0.0032 eV$^2$. There is a negligible sensitivity to the variations of $r$ within its current $3\sigma$ experimental range, and we have realized the plots by choosing
$r = 0.03$.
For the neutrino masses we obtain:
$$ \begin{aligned} |m_1|^2 &= \left[ -r + \frac{1}{8 \cos^2\Delta\, (1 - 2r)} \right] \Delta m^2_{atm}, \\ |m_2|^2 &= \frac{1}{8 \cos^2\Delta\, (1 - 2r)}\, \Delta m^2_{atm}, \\ |m_3|^2 &= \left[ 1 - r + \frac{1}{8 \cos^2\Delta\, (1 - 2r)} \right] \Delta m^2_{atm}, \end{aligned} \tag{48} $$
where $r \equiv \Delta m^2_{sol}/\Delta m^2_{atm} \equiv (|m_2|^2 - |m_1|^2)/(|m_3|^2 - |m_1|^2)$, $\Delta m^2_{atm} \equiv |m_3|^2 - |m_1|^2$ and $\Delta$ is
the phase difference between the complex numbers $a$ and $d$. For $\cos\Delta = -1$, we have a neutrino
spectrum close to hierarchical:
$$ |m_3| \approx 0.053\ \text{eV}, \qquad |m_1| \approx |m_2| \approx 0.017\ \text{eV}. \tag{49} $$
In this case the sum of neutrino masses is about 0.087 eV. If $\cos\Delta$ is accidentally small, the
neutrino spectrum becomes degenerate. The value of $|m_{ee}|$, the parameter characterizing the
violation of total lepton number in neutrinoless double beta decay, is given by:
$$ |m_{ee}|^2 = \left[ -\frac{1 + 4r}{9} + \frac{1}{8 \cos^2\Delta\, (1 - 2r)} \right] \Delta m^2_{atm}. \tag{50} $$
For $\cos\Delta = -1$ we get $|m_{ee}| \approx 0.005$ eV, at the upper edge of the range allowed for normal
hierarchy, but unfortunately too small to be detected in the near future. Independently of the
value of the unknown phase $\Delta$ we get the relation:
$$ |m_3|^2 = |m_{ee}|^2 + \frac{10}{9}\, \Delta m^2_{atm} \left( 1 - \frac{r}{2} \right), \tag{51} $$
which is a prediction of our model. In Fig. 2 we have plotted the neutrino masses predicted by
the model.
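The numbers quoted for the hierarchical case, and the phase-independent relation (51), can be reproduced with a few lines of arithmetic from Eqs. (48) and (50). In the sketch below $r = 0.03$ is the value used for the plots, while $\Delta m^2_{atm} = 2.6 \times 10^{-3}$ eV$^2$ is an assumption (the midpoint of the quoted $3\sigma$ range), so the results agree with the text only to the stated precision.

```python
import math


def spectrum(cos_delta, r, dm2_atm):
    """Return (|m1|, |m2|, |m3|, |m_ee|) in eV from Eqs. (48) and (50)."""
    common = 1.0 / (8.0 * cos_delta ** 2 * (1.0 - 2.0 * r))
    m1_sq = (-r + common) * dm2_atm
    m2_sq = common * dm2_atm
    m3_sq = (1.0 - r + common) * dm2_atm
    mee_sq = (-(1.0 + 4.0 * r) / 9.0 + common) * dm2_atm
    return tuple(math.sqrt(x) for x in (m1_sq, m2_sq, m3_sq, mee_sq))


def relation_51_gap(cos_delta, r, dm2_atm):
    """Difference between the two sides of Eq. (51); should vanish."""
    _, _, m3, mee = spectrum(cos_delta, r, dm2_atm)
    return m3 ** 2 - (mee ** 2 + (10.0 / 9.0) * dm2_atm * (1.0 - r / 2.0))


R = 0.03            # value of r used for the plots in the text
DM2_ATM = 2.6e-3    # eV^2, assumed: midpoint of the quoted 3-sigma range

m1, m2, m3, mee = spectrum(-1.0, R, DM2_ATM)
total = m1 + m2 + m3
print(round(m1, 3), round(m2, 3), round(m3, 3), round(total, 3), round(mee, 3))

# Eq. (51) holds for any value of the phase, not only for cos(Delta) = -1.
gaps = [abs(relation_51_gap(c, R, DM2_ATM)) for c in (-1.0, -0.6, -0.3)]
print(max(gaps) < 1e-15)
```

Lowering `cos_delta` in magnitude increases all three masses together, which is the approach to the degenerate spectrum mentioned in the text.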
In summary, we have obtained the baseline 4D A4 model starting from a 6D realization,
where all SM supermultiplets live at the fixed points of a T 2 /Z2 orbifold and the flavon and
driving fields live in the bulk.
5. Conclusion
We have shown that extra dimensional theories with orbifolding provide a natural framework
to interpret flavour symmetries as discrete permutational symmetries among fixed point branes.
In particular, starting from a 6D theory, we have discussed an orbifolding with 4 fixed points
leading to the A4 flavour symmetry. In this picture A4 together with 4D translations and 4D
proper Lorentz transformations represents the subgroup of 6D spacetime symmetry which is
left unbroken in the theory after orbifolding and after locating the SM particles on the fixed
point branes. Note that A4 and not the full permutation group S4 is the residual symmetry group
because only even permutations can be seen as the result of a rigid space rotation. Each brane
field, either a triplet or a singlet, has components on all of the four fixed points (in particular all
components are equal for a singlet) but the interactions are local, i.e. all vertices involve products of field components at the same spacetime point. This approach suggests a deep relation
between flavour symmetry in 4D and spacetime symmetry in extra dimensions. We have also
demonstrated that a SUSY model of neutrino tri-bimaximal mixing based on A4 , which we have
formulated in a recent work [8], can be directly reinterpreted in the orbifolding approach.
Acknowledgements
We thank L.E. Ibanez for several useful discussions. We recognize that this work has been
partly supported by the European Commission under contract MRTN-CT-2004-503369.
References
[1] T. Schwetz, Phys. Scr. T 127 (2006) 1, hep-ph/0606060;
M. Maltoni, T. Schwetz, M.A. Tortola, J.W.F. Valle, New J. Phys. 6 (2004) 122, hep-ph/0405172;
A. Strumia, F. Vissani, Nucl. Phys. B 726 (2005) 294, hep-ph/0503246;
G.L. Fogli, et al., hep-ph/0608060.
[2] P.F. Harrison, D.H. Perkins, W.G. Scott, Phys. Lett. B 530 (2002) 167, hep-ph/0202074;
P.F. Harrison, W.G. Scott, Phys. Lett. B 535 (2002) 163, hep-ph/0203209;
Z.z. Xing, Phys. Lett. B 533 (2002) 85, hep-ph/0204049;
P.F. Harrison, W.G. Scott, hep-ph/0402006;
P.F. Harrison, W.G. Scott, hep-ph/0403278.
[3] E. Ma, G. Rajasekaran, Phys. Rev. D 64 (2001) 113012, hep-ph/0106291.
[4] E. Ma, Mod. Phys. Lett. A 17 (2002) 627, hep-ph/0203238;
K.S. Babu, E. Ma, J.W.F. Valle, Phys. Lett. B 552 (2003) 207, hep-ph/0206292;
M. Hirsch, J.C. Romao, S. Skadhauge, J.W.F. Valle, A. Villanova del Moral, hep-ph/0312244;
E. Ma, Phys. Rev. D 70 (2004) 031901, hep-ph/0404199;
E. Ma, hep-ph/0409075;
E. Ma, New J. Phys. 6 (2004) 104;
S.L. Chen, M. Frigerio, E. Ma, Nucl. Phys. B 724 (2005) 423, hep-ph/0504181;
K.S. Babu, X.G. He, hep-ph/0507217;
A. Zee, Phys. Lett. B 630 (2005) 58, hep-ph/0508278;
E. Ma, Mod. Phys. Lett. A 20 (2005) 2601, hep-ph/0508099;
E. Ma, hep-ph/0511133;
S.K. Kang, Z.z. Xing, S. Zhou, Phys. Rev. D 73 (2006) 013001, hep-ph/0511157;
X.G. He, Y.Y. Keum, R.R. Volkas, JHEP 0604 (2006) 039, hep-ph/0601001;
B. Adhikary, B. Brahmachari, A. Ghosal, E. Ma, M.K. Parida, Phys. Lett. B 638 (2006) 345, hep-ph/0603059;
E. Ma, hep-ph/0607190;
L. Lavoura, H. Kuhbock, hep-ph/0610050.
[5] S.F. King, JHEP 0508 (2005) 105, hep-ph/0506297;
I. de Medeiros Varzielas, G.G. Ross, Nucl. Phys. B 733 (2006) 31, hep-ph/0507176;
S.F. King, M. Malinsky, hep-ph/0608021.
[6] For other approaches to the tri-bimaximal mixing see: J. Matias, C.P. Burgess, JHEP 0509 (2005) 052, hep-ph/0508156;
S. Luo, Z.z. Xing, hep-ph/0509065;
W. Grimus, L. Lavoura, hep-ph/0509239;
F. Caravaglios, S. Morisi, hep-ph/0510321;
I. de Medeiros Varzielas, S.F. King, G.G. Ross, hep-ph/0512313;
C. Hagedorn, M. Lindner, R.N. Mohapatra, JHEP 0606 (2006) 042, hep-ph/0602244;
P. Kovtun, A. Zee, Phys. Lett. B 640 (2006) 37, hep-ph/0604169;
R.N. Mohapatra, S. Nasri, H.B. Yu, Phys. Lett. B 639 (2006) 318, hep-ph/0605020;
Z.z. Xing, H. Zhang, S. Zhou, Phys. Lett. B 641 (2006) 189, hep-ph/0607091.
[7] G. Altarelli, F. Feruglio, Nucl. Phys. B 720 (2005) 64, hep-ph/0504165.
[8] G. Altarelli, F. Feruglio, Nucl. Phys. B 741 (2006) 215, hep-ph/0512103.
[9] See for instance A. Giveon, M. Porrati, E. Rabinovici, Phys. Rep. 244 (1994) 77, hep-th/9401139;
For more specific examples see: W. Lerche, D. Lust, N.P. Warner, Phys. Lett. B 231 (1989) 417;
E.J. Chun, J. Mas, J. Lauer, H.P. Nilles, Phys. Lett. B 233 (1989) 141;
S. Ferrara, D. Lust, S. Theisen, Phys. Lett. B 233 (1989) 147.
[10] T. Watari, T. Yanagida, Phys. Lett. B 532 (2002) 252, hep-ph/0201086;
T. Watari, T. Yanagida, Phys. Lett. B 544 (2002) 167, hep-ph/0205090.
[11] T. Kobayashi, S. Raby, R.J. Zhang, Nucl. Phys. B 704 (2005) 3, hep-ph/0409098;
J. Kubo, Phys. Lett. B 622 (2005) 303, hep-ph/0506043;
N. Haba, A. Watanabe, K. Yoshioka, Phys. Rev. Lett. 97 (2006) 041601, hep-ph/0603116;
M.T. Eisele, N. Haba, Phys. Rev. D 74 (2006) 073007, hep-ph/0603158.
Consistency of the two-Higgs-doublet model and CP violation in top production at the LHC
Abdul Wahab El Kaffas a, Wafaa Khater b, Odd Magne Ogreid c, Per Osland a
a Department of Physics and Technology, University of Bergen, PO Box 7803, N-5020 Bergen, Norway
b Department of Physics, Birzeit University, Palestine
c Bergen University College, Bergen, Norway
Received 26 July 2006; received in revised form 28 November 2006; accepted 30 March 2007
Abstract
It is important to provide guidance on whether CP violation may be measurable in top-quark production
at the Large Hadron Collider. The present work extends an earlier analysis of the non-supersymmetric two-Higgs-doublet model in this respect, by allowing a more general potential. Also, a more comprehensive
study of theoretical and experimental constraints on the model is presented. Vacuum stability, unitarity,
direct searches and electroweak precision measurements severely constrain the model. We explore, at low
$\tan\beta$, the allowed regions in the multidimensional parameter space that give a viable physical model. This
exploration is focused on the parameter space of the neutral sector rotation matrix, which is closely related
to the Yukawa couplings of interest. In most of the remaining allowed regions, the model violates CP.
We present a quantitative discussion of a particular CP-violating observable. This would be measurable in
semileptonically decaying top and antitop quarks produced at the LHC, provided the number of available
events is of the order of a million.
E-mail address: per.osland@ift.uib.no (P. Osland).
doi:10.1016/j.nuclphysb.2007.03.041
A.W. El Kaffas et al. / Nuclear Physics B 775 (2007) 45-77
1. Introduction
The Two-Higgs-Doublet Model (2HDM) is attractive as one of the simplest extensions of
the Standard Model that admits additional CP violation [1-3]. This is an interesting possibility,
given the unexplained baryon asymmetry of the Universe [4,5], and the possibility of exploring
relevant, new physics at the LHC [6]. In particular, the model can lead to CP violation in $t\bar{t}$
production, a process which has received considerable theoretical attention [7-9], since it will
become possible to severely constrain or even measure it.
CP violation can be induced in $t\bar{t}$ production at the one-loop level, by the exchange of neutral
Higgs bosons which are not eigenstates under CP. This effect is only large enough to be of
experimental interest if the neutral Higgs bosons are reasonably light, and have strong couplings
to the top quarks.
Within the 2HDM (II), where the top quark gets its mass from coupling to the Higgs field $\Phi_2$
[10] (see Section 3.5), the condition of having sizable $H t\bar{t}$ couplings forces us to consider small
values of $\tan\beta$. A first exploration of this limit was presented in [11]. In that paper, the general
conditions for measurability of CP violation in $gg \to t\bar{t}$ at the LHC [8] were found to be satisfied
in a certain region of the 2HDM parameter space. In addition to having small $\tan\beta$, in order to
have a measurable signal with a realistic amount of data (of the order of a million $t\bar{t}$ events), it
was found necessary that the lightest neutral Higgs boson be light, and that the spectrum not be
approximately degenerate. In fact, it was found that in the most favorable observable considered,
the effect would not reach the per mil level unless there is one and only one Higgs boson below
the $t\bar{t}$ threshold, and that $\tan\beta$ is at most of order unity. We here extend the analysis of [11] to
the more general case, allowing the most general quartic couplings in the potential.
At small $\tan\beta$, also certain Yukawa couplings to charged Higgs bosons are enhanced. Such
couplings contribute to effects that are known experimentally to very high precision. In particular, at low $\tan\beta$ the $B_d^0$-$\bar{B}_d^0$ oscillation data and the effective $Zb\bar{b}$ coupling, measured via $R_b$
[12,13] severely constrain the model, whereas the $b \to s\gamma$ data [14] constrain it at low $M_{H^\pm}$.
Furthermore, the high-precision measurement of the W and Z masses, as expressed via $\Delta\rho$ [13]
constrains the splitting of the Higgs mass spectrum. Unless there are cancellations, the charged
Higgs boson cannot be very much heavier than the lightest neutral one, and the lightest neutral
one cannot be far away from the mass scale of the W and the Z [15]. Also, the lightest one is
constrained by the direct searches at LEP [16,17]. We shall here study the interplay of these constraints, and estimate the amount of CP violation that may be measurable at the LHC in selected
favorable regions of the remaining parameter space.
An important characteristic of the 2HDM (as opposed to the MSSM [18-22]) is the fact that,
at the level of the mathematics, the masses of the neutral and the charged Higgs bosons are
rather independent (see Section 2). However, the experimental precision on $\Delta\rho$ (see Section 4.3)
forces the charged Higgs mass to be comparable in magnitude to the neutral Higgs masses.
Another important difference is that whereas small values of $\tan\beta$ are practically excluded in the
MSSM [23], in the 2HDM, which has more free parameters, they are not.
For a recent comprehensive discussion of the experimental constraints on the 2HDM (though
mostly restricted to the CP-conserving limit), see [24] and [25]. The latter study, which considers
the CP-conserving limit, concludes that the model is practically excluded, with the muon anomalous magnetic moment being very constraining. However, the interpretation of the data is now considered
less firm, and furthermore, that study focuses on large $\tan\beta$, and is thus less relevant for the
present work.
We present in Section 2 an overview of the 2HDM, with focus on the approach of Ref. [11],
and outline the present extensions. In Section 3 we discuss the model in more detail, in particular
the implications of stability and unitarity, and review the conditions for having CP violation. In
Section 4 we discuss various experimental constraints on the model, with particular attention to
small values of $\tan\beta$. In Section 5 we present an overview of allowed parameter regions, also
restricted to small $\tan\beta$. In Section 6 we discuss the implications of the model for a particular CP-violating observable involving the energies of positrons and electrons from the decays of $t$ and $\bar{t}$
produced in gluon-gluon collisions at the LHC. Section 7 contains a summary and conclusions.
2. Review of the two-Higgs-doublet model
The 2HDM may be seen as an unconstrained version of the Higgs sector of the MSSM. While
at tree level the latter can be parametrized in terms of only two parameters, conventionally taken
to be $\tan\beta$ and $M_A$, the 2HDM has much more freedom. In particular, the neutral and charged
Higgs masses are rather independent.
Traditionally, the 2HDM is defined in terms of the potential. The parameters of the potential
(quartic and quadratic couplings) determine the masses of the neutral and the charged Higgs
bosons. Alternatively, and this is the approach followed here and in Ref. [11], one can take masses
and mixing angles as input, and determine parameters of the potential as derived quantities. This
approach highlights the fact that the neutral and charged sectors are rather independent, as well as
masses being physically more accessible than quartic couplings. However, some choices of input
will lead to physically acceptable potentials, others will not. This way, the two sectors remain
correlated.
In addition, the 2HDM neutral sector may or may not lead to CP violation, depending on the
choice of potential. We shall here consider the so-called model II, where u-type quarks acquire
masses from a Yukawa coupling to one Higgs doublet, $\Phi_2$, whereas the d-type quarks couple to
the other, $\Phi_1$. This structure is the same as in the MSSM.
2.1. The approach of Ref. [11]
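The amount of CP violation that can be measured in $t\bar t$ production was related to the Higgs
mass spectrum and other model parameters in [11]. In that paper, the Higgs potential studied was
parametrized as [26]
\[
\begin{aligned}
V ={}& \frac{\lambda_1}{2}(\Phi_1^\dagger\Phi_1)^2 + \frac{\lambda_2}{2}(\Phi_2^\dagger\Phi_2)^2 + \lambda_3(\Phi_1^\dagger\Phi_1)(\Phi_2^\dagger\Phi_2) + \lambda_4(\Phi_1^\dagger\Phi_2)(\Phi_2^\dagger\Phi_1) \\
&+ \frac{1}{2}\left[\lambda_5(\Phi_1^\dagger\Phi_2)^2 + \text{h.c.}\right] \\
&- \frac{1}{2}\left\{m_{11}^2(\Phi_1^\dagger\Phi_1) + \left[m_{12}^2(\Phi_1^\dagger\Phi_2) + \text{h.c.}\right] + m_{22}^2(\Phi_2^\dagger\Phi_2)\right\}.
\end{aligned}
\tag{2.1}
\]
Expanding the Higgs-doublet fields as
\[
\Phi_i = \begin{pmatrix} \varphi_i^+ \\ \frac{1}{\sqrt{2}}(v_i + \eta_i + i\chi_i) \end{pmatrix}
\tag{2.2}
\]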
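and choosing phases of $\Phi_i$ such that $v_1$ and $v_2$ are both real [27], it is convenient to define
$\eta_3 = -\sin\beta\,\chi_1 + \cos\beta\,\chi_2$ orthogonal to the neutral Goldstone boson $G^0 = \cos\beta\,\chi_1 + \sin\beta\,\chi_2$.
In the basis $(\eta_1, \eta_2, \eta_3)$, the resulting mass-squared matrix $\mathcal{M}^2$ of the neutral sector can then
be diagonalized to physical states $(H_1, H_2, H_3)$ with masses $M_1 \le M_2 \le M_3$, via a rotation
matrix $R$:
\[
\begin{pmatrix} H_1 \\ H_2 \\ H_3 \end{pmatrix} = R \begin{pmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{pmatrix},
\tag{2.3}
\]
satisfying
\[
R\mathcal{M}^2R^{\rm T} = \mathcal{M}^2_{\rm diag} = {\rm diag}\left(M_1^2, M_2^2, M_3^2\right),
\tag{2.4}
\]
and parametrized as$^1$
\[
\begin{aligned}
R = R_3R_2R_1 &=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_3 & \sin\alpha_3 \\ 0 & -\sin\alpha_3 & \cos\alpha_3 \end{pmatrix}
\begin{pmatrix} \cos\alpha_2 & 0 & \sin\alpha_2 \\ 0 & 1 & 0 \\ -\sin\alpha_2 & 0 & \cos\alpha_2 \end{pmatrix}
\begin{pmatrix} \cos\alpha_1 & \sin\alpha_1 & 0 \\ -\sin\alpha_1 & \cos\alpha_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \\
&= \begin{pmatrix}
c_1c_2 & s_1c_2 & s_2 \\
-(c_1s_2s_3 + s_1c_3) & c_1c_3 - s_1s_2s_3 & c_2s_3 \\
-c_1s_2c_3 + s_1s_3 & -(c_1s_3 + s_1s_2c_3) & c_2c_3
\end{pmatrix}
\end{aligned}
\tag{2.5}
\]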
with $c_i = \cos\alpha_i$, $s_i = \sin\alpha_i$. The rotation angle $\alpha_1$ is chosen such that in the limit of no CP
violation ($s_2 \to 0$, $s_3 \to 0$) then $\alpha_1 \to \alpha + \frac{1}{2}\pi$, where $\alpha$ is the familiar mixing angle of the
CP-even sector, and the additional $\frac{1}{2}\pi$ provides the mapping $H_1 \leftrightarrow h$, instead of $H$ being in the
(1, 1) position of $\mathcal{M}^2_{\rm diag}$, as is used in the MSSM [10].
While the signs of $\eta_i$ and $\chi_i$ are fixed by our choice of taking the vacuum expectation values
real and positive [27], the phase of $H_i$ has no physical consequence. One may therefore freely
change the sign of one or more rows, e.g., let $R_{1i} \to -R_{1i}$ (see Section 3.1.1).
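As a numerical cross-check of the parametrization (2.5), the following minimal sketch builds $R$ as the product $R_3R_2R_1$ and compares it with the closed form. The function names and the sample angles are illustrative choices, not part of the original analysis.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_product(a1, a2, a3):
    """R = R3 R2 R1 built from the three elementary rotations of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    r1 = [[c1, s1, 0.0], [-s1, c1, 0.0], [0.0, 0.0, 1.0]]
    r2 = [[c2, 0.0, s2], [0.0, 1.0, 0.0], [-s2, 0.0, c2]]
    r3 = [[1.0, 0.0, 0.0], [0.0, c3, s3], [0.0, -s3, c3]]
    return matmul(r3, matmul(r2, r1))

def rotation_closed_form(a1, a2, a3):
    """Closed-form matrix elements quoted in Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def max_abs_diff(a, b):
    """Largest element-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def orthogonality_defect(r):
    """Largest deviation of R R^T from the unit matrix."""
    r_t = [list(col) for col in zip(*r)]
    ident = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    return max_abs_diff(matmul(r, r_t), ident)

# Illustrative angles (radians), chosen arbitrarily.
ANGLES = (0.3, -0.7, 1.1)
R_PRODUCT = rotation_product(*ANGLES)
R_CLOSED = rotation_closed_form(*ANGLES)
```

Both constructions should agree to rounding accuracy, and the result should be orthogonal, which is the property used in the cancellation arguments later in the text.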
Rather than describing the phenomenology in terms of the parameters of the potential (2.1),
in [11] the physical mass of the charged Higgs boson, as well as those of the two lightest neutral
ones, were taken as input, together with the rotation matrix $R$. Thus, the input can be summarized
as
\[
\text{Parameters:}\quad \tan\beta,\ (M_1, M_2),\ M_{H^\pm},\ \mu^2,\ (\alpha_1, \alpha_2, \alpha_3),
\tag{2.6}
\]
where $\tan\beta = v_2/v_1$ and $\mu^2 = v^2\nu$, with $\nu = {\rm Re}\,m_{12}^2/(2v_1v_2)$ and $v = 246$ GeV.
This approach provides better control of the physical content of the model. In particular, the
elements $R_{13}$ and $R_{23}$ of the rotation matrix must be non-zero in order to yield CP violation. For
consistency, this requires ${\rm Im}\,\lambda_5$ and ${\rm Im}\,m_{12}^2$ (as derived quantities) to be non-zero.
2.2. The general potential
For the potential, in this study, we take
\[
V = \text{Expression (2.1)} + \left\{\left[\lambda_6(\Phi_1^\dagger\Phi_1) + \lambda_7(\Phi_2^\dagger\Phi_2)\right](\Phi_1^\dagger\Phi_2) + \text{h.c.}\right\}.
\tag{2.7}
\]
The new terms proportional to $\lambda_6$ and $\lambda_7$ have to be carefully constrained, since this potential
does not satisfy natural flavour conservation [28], even if each doublet is coupled only to up-type
or only to down-type flavours.
The various coupling constants in the potential will of course depend on the choice of basis
$(\Phi_1, \Phi_2)$. Recently, there has been some focus [29] on the importance of formulating physical
observables in a basis-independent manner. Here, we shall adopt the so-called model II [10] for
the Yukawa couplings. This will uniquely identify the basis in the $(\Phi_1, \Phi_2)$ space.
1 In Ref. [11], these angles were referred to as $(\tilde\alpha, \alpha_b, \alpha_c) \leftrightarrow (\alpha_1, \alpha_2, \alpha_3)$.
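Minimizing the potential (2.7), we can rewrite it (modulo a constant) as
\[
\begin{aligned}
V ={}& \frac{\lambda_1}{2}\left[(\Phi_1^\dagger\Phi_1) - \frac{v_1^2}{2}\right]^2 + \frac{\lambda_2}{2}\left[(\Phi_2^\dagger\Phi_2) - \frac{v_2^2}{2}\right]^2 + \lambda_3(\Phi_1^\dagger\Phi_1)(\Phi_2^\dagger\Phi_2) \\
&+ \lambda_4(\Phi_1^\dagger\Phi_2)(\Phi_2^\dagger\Phi_1) + \left\{\frac{1}{2}\lambda_5(\Phi_1^\dagger\Phi_2)^2 + \left[\lambda_6(\Phi_1^\dagger\Phi_1) + \lambda_7(\Phi_2^\dagger\Phi_2)\right](\Phi_1^\dagger\Phi_2) + \text{h.c.}\right\} \\
&- \frac{1}{2}\left[{\rm Re}\,\lambda_{34567} - 2\nu\right]\left[v_2^2(\Phi_1^\dagger\Phi_1) + v_1^2(\Phi_2^\dagger\Phi_2)\right] - v_1v_2\,{\rm Re}\left[\lambda_6(\Phi_1^\dagger\Phi_1) + \lambda_7(\Phi_2^\dagger\Phi_2)\right] \\
&- v_1v_2\left\{2\nu\,{\rm Re}(\Phi_1^\dagger\Phi_2) - {\rm Im}\,\lambda_{567}\,{\rm Im}(\Phi_1^\dagger\Phi_2)\right\}.
\end{aligned}
\tag{2.8}
\]
Here and in the following, we adopt the abbreviations
\[
\lambda_{345} = \lambda_3 + \lambda_4 + \lambda_5, \qquad
\lambda_{34567} = \lambda_{345} + \frac{v_1}{v_2}\lambda_6 + \frac{v_2}{v_1}\lambda_7, \qquad
\lambda_{567} = \lambda_5 + \frac{v_1}{v_2}\lambda_6 + \frac{v_2}{v_1}\lambda_7.
\tag{2.9}
\]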
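The mass-squared matrix $\mathcal{M}^2$ of (2.4), corresponding to the neutral sector of the potential, is
found to be
\[
\begin{aligned}
\mathcal{M}^2_{11} &= v^2\left[c_\beta^2\lambda_1 + s_\beta^2\nu + \frac{s_\beta}{2c_\beta}\,{\rm Re}\left(3c_\beta^2\lambda_6 - s_\beta^2\lambda_7\right)\right],\\
\mathcal{M}^2_{22} &= v^2\left[s_\beta^2\lambda_2 + c_\beta^2\nu + \frac{c_\beta}{2s_\beta}\,{\rm Re}\left(-c_\beta^2\lambda_6 + 3s_\beta^2\lambda_7\right)\right],\\
\mathcal{M}^2_{33} &= v^2\,{\rm Re}\left[-\lambda_5 + \nu - \frac{1}{2c_\beta s_\beta}\left(c_\beta^2\lambda_6 + s_\beta^2\lambda_7\right)\right],\\
\mathcal{M}^2_{12} &= v^2\left[c_\beta s_\beta\left({\rm Re}\,\lambda_{345} - \nu\right) + \frac{3}{2}\,{\rm Re}\left(c_\beta^2\lambda_6 + s_\beta^2\lambda_7\right)\right],\\
\mathcal{M}^2_{13} &= -\frac{1}{2}v^2\,{\rm Im}\left[s_\beta\lambda_5 + 2c_\beta\lambda_6\right],\\
\mathcal{M}^2_{23} &= -\frac{1}{2}v^2\,{\rm Im}\left[c_\beta\lambda_5 + 2s_\beta\lambda_7\right],
\end{aligned}
\tag{2.10}
\]
with $\mathcal{M}^2_{ji} = \mathcal{M}^2_{ij}$.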
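Here, compared with the potential (2.1), we have two more complex parameters, $\lambda_6$ and $\lambda_7$
(four new real parameters), but rather than those, we take as additional parameters $M_3$, ${\rm Im}\,\lambda_5$,
${\rm Re}\,\lambda_6$ and ${\rm Re}\,\lambda_7$. Thus, the input will be
\[
\text{Parameters:}\quad \tan\beta,\ (M_1, M_2, M_3),\ M_{H^\pm},\ \mu^2,\ (\alpha_1, \alpha_2, \alpha_3),\ {\rm Im}\,\lambda_5,\ ({\rm Re}\,\lambda_6, {\rm Re}\,\lambda_7).
\tag{2.11}
\]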
3. Model properties
We want to explore regions of parameter space where there is significant CP violation. In order
to do that, we need to map out regions in the $\{\alpha_1, \alpha_2, \alpha_3\}$ space where the model is consistent
(figures are presented in Section 5).
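From Eq. (2.4), it follows that
\[
\mathcal{M}^2_{ij} = \sum_k R_{ki}M_k^2R_{kj}.
\tag{3.1}
\]
Here, it is evident that the signs of the rows of $R$ play no role.
Comparing the expressions (3.1) with (2.10), invoking also
\[
M_{H^\pm}^2 = \mu^2 - \frac{1}{2}v^2\left(\lambda_4 + {\rm Re}\,\lambda_{567}\right),
\tag{3.2}
\]
we can solve for the $\lambda$s. In particular, it follows from (2.10) that
\[
\begin{aligned}
{\rm Im}\,\lambda_6 &= -\frac{1}{c_\beta}\left[\frac{1}{v^2}\mathcal{M}^2_{13} + \frac{s_\beta}{2}\,{\rm Im}\,\lambda_5\right],\\
{\rm Im}\,\lambda_7 &= -\frac{1}{s_\beta}\left[\frac{1}{v^2}\mathcal{M}^2_{23} + \frac{c_\beta}{2}\,{\rm Im}\,\lambda_5\right].
\end{aligned}
\tag{3.3}
\]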
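The step from masses and angles to derived couplings can be sketched as follows. This is only an illustration under stated assumptions: the mass matrix is taken as $\mathcal{M}^2_{ij} = \sum_k R_{ki}M_k^2R_{kj}$, the relations for ${\rm Im}\,\lambda_6$ and ${\rm Im}\,\lambda_7$ are taken with an overall minus sign as obtained by inverting the off-diagonal entries $\mathcal{M}^2_{13}$ and $\mathcal{M}^2_{23}$ of (2.10), and the function names and sample angles are hypothetical.

```python
import math

V = 246.0  # GeV, vacuum expectation value quoted in the text

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def mass_squared_matrix(masses, angles):
    """M^2_ij = sum_k R_ki M_k^2 R_kj, i.e. R^T diag(M_k^2) R."""
    r = rotation(*angles)
    return [[sum(r[k][i] * masses[k] ** 2 * r[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]

def im_lambda67(m2, tan_beta, im_lambda5=0.0, v=V):
    """Im(lambda_6), Im(lambda_7) from M^2_13, M^2_23 and Im(lambda_5)."""
    beta = math.atan(tan_beta)
    cb, sb = math.cos(beta), math.sin(beta)
    im6 = -(m2[0][2] / v ** 2 + 0.5 * sb * im_lambda5) / cb
    im7 = -(m2[1][2] / v ** 2 + 0.5 * cb * im_lambda5) / sb
    return im6, im7

# Set A masses with arbitrary illustrative angles and tan(beta) = 0.5.
M2 = mass_squared_matrix((100.0, 300.0, 500.0), (0.4, 0.6, 0.2))
IM6, IM7 = im_lambda67(M2, 0.5, im_lambda5=0.1)
```

The trace of the matrix equals the sum of the squared masses, which gives a quick sanity check independent of the angles.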
3.1. Symmetries
By exploiting certain symmetries of the rotation matrix R, we can reduce the ranges of parameters that have to be explored.
3.1.1. Transformations of the rotation matrix
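The rotation matrix $R$ is invariant under the following transformation:
\[
{\rm A}:\quad \alpha_1 \to \pi + \alpha_1,\quad \alpha_2 \to \pi - \alpha_2,\quad \alpha_3 \to \pi + \alpha_3,
\tag{3.4}
\]
which leaves its elements unchanged.
Another class of transformations are those where two rows of $R$ (i.e., physical Higgs fields)
change sign, as discussed in Section 2.1. The transformations are [11]:
\[
\begin{aligned}
{\rm B1}:&\quad \alpha_1 \to \pi + \alpha_1,\ \alpha_2 \to \pi - \alpha_2,\ \alpha_3\ \text{fixed}: &
R_{1i} \to R_{1i},\quad R_{2i} \to -R_{2i},\quad R_{3i} \to -R_{3i},\\
{\rm B2}:&\quad \alpha_1\ \text{fixed},\ \alpha_2 \to \pi + \alpha_2,\ \alpha_3 \to -\alpha_3: &
R_{1i} \to -R_{1i},\quad R_{2i} \to R_{2i},\quad R_{3i} \to -R_{3i},\\
{\rm B3}:&\quad \alpha_1 \to \pi + \alpha_1,\ \alpha_2 \to -\alpha_2,\ \alpha_3 \to -\alpha_3: &
R_{1i} \to -R_{1i},\quad R_{2i} \to -R_{2i},\quad R_{3i} \to R_{3i}.
\end{aligned}
\tag{3.5}
\]
Actually, any one of these is a combination of the other two. For example, the transformation
B3 is the combination of B1 and B2. Other transformations exist that will yield the same symmetries, but they will be combinations of one of these three transformations followed by the
transformation A. In total we have 6 different transformations that yield symmetries of type B.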
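The third class of transformations we consider are those where two columns of $R$ change sign.
These transformations are:
\[
\begin{aligned}
{\rm C1}:&\quad \alpha_1 \to \pi - \alpha_1,\ \alpha_2 \to \pi + \alpha_2,\ \alpha_3\ \text{fixed}: &
R_{j1} \to R_{j1},\quad R_{j2} \to -R_{j2},\quad R_{j3} \to -R_{j3},\\
{\rm C2}:&\quad \alpha_1 \to -\alpha_1,\ \alpha_2 \to \pi + \alpha_2,\ \alpha_3\ \text{fixed}: &
R_{j1} \to -R_{j1},\quad R_{j2} \to R_{j2},\quad R_{j3} \to -R_{j3},\\
{\rm C3}:&\quad \alpha_1 \to \pi + \alpha_1,\ \alpha_2\ \text{fixed},\ \alpha_3\ \text{fixed}: &
R_{j1} \to -R_{j1},\quad R_{j2} \to -R_{j2},\quad R_{j3} \to R_{j3}.
\end{aligned}
\tag{3.6}
\]
The transformation C3 is the combination of the transformations C1 and C2. Other transformations exist that will yield the same symmetries, but they will be combinations of one of these
three transformations followed by the transformation A. In total we have 6 different transformations that yield symmetries of type C.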
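The sign patterns of the transformations A, B1–B3 and C1–C3 can be checked numerically. The sketch below assumes the angle shifts as listed in the code (read from (3.4)–(3.6), with the lost signs fixed so that rows or columns change sign as stated); the helper names and the test angles are illustrative.

```python
import math

PI = math.pi

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def transpose(m):
    return [list(col) for col in zip(*m)]

def sign_pattern(new, old, tol=1e-12):
    """Per-row signs s_i with new[i] = s_i * old[i]; None if no pattern fits."""
    signs = []
    for row_new, row_old in zip(new, old):
        if all(abs(x - y) < tol for x, y in zip(row_new, row_old)):
            signs.append(1)
        elif all(abs(x + y) < tol for x, y in zip(row_new, row_old)):
            signs.append(-1)
        else:
            return None
    return signs

# Angle maps assumed for the row-type (A, B) and column-type (C) transformations.
ROW_TRANSFORMS = {
    "A": lambda a1, a2, a3: (PI + a1, PI - a2, PI + a3),
    "B1": lambda a1, a2, a3: (PI + a1, PI - a2, a3),
    "B2": lambda a1, a2, a3: (a1, PI + a2, -a3),
    "B3": lambda a1, a2, a3: (PI + a1, -a2, -a3),
}
COLUMN_TRANSFORMS = {
    "C1": lambda a1, a2, a3: (PI - a1, PI + a2, a3),
    "C2": lambda a1, a2, a3: (-a1, PI + a2, a3),
    "C3": lambda a1, a2, a3: (PI + a1, a2, a3),
}

def row_signs(name, angles):
    """Signs picked up by the rows of R under a row-type transformation."""
    new = rotation(*ROW_TRANSFORMS[name](*angles))
    return sign_pattern(new, rotation(*angles))

def column_signs(name, angles):
    """Signs picked up by the columns of R under a column-type transformation."""
    new = rotation(*COLUMN_TRANSFORMS[name](*angles))
    return sign_pattern(transpose(new), transpose(rotation(*angles)))
```

Because $\mathcal{M}^2 = R^{\rm T}\mathcal{M}^2_{\rm diag}R$, row sign changes drop out of the mass-squared matrix, whereas column sign changes flip the sign of the corresponding off-diagonal elements.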
Under transformations of types A and B, the resulting mass-squared matrix $\mathcal{M}^2 = R^{\rm T}\mathcal{M}^2_{\rm diag}R$
will be invariant. We make use of this fact along with the symmetries A, B1 and B2 to reduce the
parameter space under consideration to
\[
-\pi/2 < \{\alpha_1, \alpha_2, \alpha_3\} \le \pi/2.
\tag{3.7}
\]
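Under transformations of type C, the mass-squared matrix will not be invariant: some of its
non-diagonal elements will change sign while the rest are unaltered.
\[
\begin{aligned}
{\rm C1}:&\quad \mathcal{M}^2_{12} \to -\mathcal{M}^2_{12},\quad \mathcal{M}^2_{13} \to -\mathcal{M}^2_{13},\quad \mathcal{M}^2_{23} \to +\mathcal{M}^2_{23},\\
{\rm C2}:&\quad \mathcal{M}^2_{12} \to -\mathcal{M}^2_{12},\quad \mathcal{M}^2_{13} \to +\mathcal{M}^2_{13},\quad \mathcal{M}^2_{23} \to -\mathcal{M}^2_{23},\\
{\rm C3}:&\quad \mathcal{M}^2_{12} \to +\mathcal{M}^2_{12},\quad \mathcal{M}^2_{13} \to -\mathcal{M}^2_{13},\quad \mathcal{M}^2_{23} \to -\mathcal{M}^2_{23}.
\end{aligned}
\tag{3.8}
\]
While a change of the sign of $\mathcal{M}^2_{12}$ implies changes in the physical content of the model,
a change of sign of $\mathcal{M}^2_{13}$ and/or $\mathcal{M}^2_{23}$ can be compensated for by adjusting the imaginary parts
of $\lambda_5$, $\lambda_6$ and $\lambda_7$. Thus, the most interesting transformation among the set (3.8) is C3.
The transformation B3 $\cdot$ C3 is physically equivalent to C3 since transformations of type B
leave the mass-squared matrix invariant:
\[
{\rm B3}\cdot{\rm C3}:\quad \alpha_1\ \text{fixed},\ \alpha_2 \to -\alpha_2,\ \alpha_3 \to -\alpha_3:\qquad
\mathcal{M}^2_{12} \to +\mathcal{M}^2_{12},\quad \mathcal{M}^2_{13} \to -\mathcal{M}^2_{13},\quad \mathcal{M}^2_{23} \to -\mathcal{M}^2_{23}.
\tag{3.9}
\]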
When ${\rm Im}\,\lambda_5 = 0$, it follows from (3.3) that a sign change of $\mathcal{M}^2_{13}$ and $\mathcal{M}^2_{23}$ can be compensated for by sign changes of ${\rm Im}\,\lambda_6$ and ${\rm Im}\,\lambda_7$. These signs play no role in the discussion
of stability (see Appendix A) and unitarity [30]. We shall therefore, when discussing the case
${\rm Im}\,\lambda_5 = 0$ (Sections 5.1 and 5.2), make use of (3.9) to restrict the angular range from (3.7) to the
smaller
\[
-\pi/2 < \{\alpha_1, \alpha_2\} \le \pi/2, \qquad 0 \le \alpha_3 \le \pi/2.
\tag{3.10}
\]
When ${\rm Im}\,\lambda_5 \ne 0$ we need to consider the angular range as given in (3.7).

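3.1.2. Inversion of $\tan\beta$
The Higgs sector is invariant under
\[
\tan\beta \leftrightarrow \cot\beta, \qquad
\alpha_1 \leftrightarrow \tfrac{1}{2}\pi - \alpha_1, \qquad
\alpha_2 \leftrightarrow -\alpha_2, \qquad
\alpha_3 \leftrightarrow \alpha_3,
\tag{3.11}
\]
accompanied by
\[
\lambda_1 \leftrightarrow \lambda_2, \qquad
\lambda_3, \lambda_4 \leftrightarrow \lambda_3, \lambda_4, \qquad
\lambda_5 \leftrightarrow \lambda_5^*, \qquad
\lambda_6 \leftrightarrow \lambda_7^*.
\tag{3.12}
\]
This is just the symmetry between $\Phi_1$ and $\Phi_2$, and will be violated by the introduction of model II
Yukawa couplings, which distinguish between the two Higgs doublets, i.e., between $\tan\beta$ and
$\cot\beta$.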
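The effect of interchanging the two doublets on the rotation matrix can be checked with a short sketch. It assumes that the interchange maps $(\eta_1, \eta_2, \eta_3) \to (\eta_2, \eta_1, -\eta_3)$, which follows from the definition of $\eta_3$ under $\beta \to \pi/2 - \beta$, and that the angles transform as $\alpha_1 \to \pi/2 - \alpha_1$, $\alpha_2 \to -\alpha_2$, $\alpha_3 \to \alpha_3$. These signs are an assumption tested here, and the function names are illustrative.

```python
import math

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def swap_doublets(r):
    """R P, where P exchanges eta_1 <-> eta_2 and flips the sign of eta_3."""
    return [[row[1], row[0], -row[2]] for row in r]

def inverted_angles(a1, a2, a3):
    """Angle map assumed to accompany tan(beta) <-> cot(beta)."""
    return (0.5 * math.pi - a1, -a2, a3)

def sign_pattern(new, old, tol=1e-12):
    """Per-row signs s_i with new[i] = s_i * old[i]; None if no pattern fits."""
    signs = []
    for row_new, row_old in zip(new, old):
        if all(abs(x - y) < tol for x, y in zip(row_new, row_old)):
            signs.append(1)
        elif all(abs(x + y) < tol for x, y in zip(row_new, row_old)):
            signs.append(-1)
        else:
            return None
    return signs

def inversion_row_signs(angles):
    """Compare R(inverted angles) with R P; a pure row-sign pattern means
    the two describe the same physical states."""
    return sign_pattern(rotation(*inverted_angles(*angles)),
                        swap_doublets(rotation(*angles)))
```

A result consisting only of row signs confirms the symmetry, since the overall sign of each physical field $H_i$ has no physical consequence.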
3.2. CP violation
In general, with all three rotation-matrix angles non-zero, the model will violate CP. However,
in certain limits, this is not the case. In order not to have CP violation, the mass-squared matrix
must be block diagonal, i.e., one must require
\[
\mathcal{M}^2_{13} = \mathcal{M}^2_{23} = 0.
\tag{3.13}
\]
Thus, CP conservation requires
\[
\begin{aligned}
\mathcal{M}^2_{13} &= M_1^2R_{11}R_{13} + M_2^2R_{21}R_{23} + M_3^2R_{31}R_{33} = 0,\\
\mathcal{M}^2_{23} &= M_1^2R_{12}R_{13} + M_2^2R_{22}R_{23} + M_3^2R_{32}R_{33} = 0.
\end{aligned}
\tag{3.14}
\]
One possible solution of (3.14) is that
\[
M_1 = M_2 = M_3.
\tag{3.15}
\]
The expressions (3.14) then vanish, by the orthogonality of $R$. There are additional limits of no
CP violation, as discussed below.
Expressed in terms of the angles of the rotation matrix, the above elements describing mixing
of the CP-even and CP-odd parts of $\mathcal{M}^2$ take the form
\[
\begin{aligned}
\mathcal{M}^2_{13} &= c_1c_2s_2\left(M_1^2 - s_3^2M_2^2 - c_3^2M_3^2\right) + s_1c_2c_3s_3\left(-M_2^2 + M_3^2\right),\\
\mathcal{M}^2_{23} &= c_1c_2c_3s_3\left(M_2^2 - M_3^2\right) + s_1c_2s_2\left(M_1^2 - s_3^2M_2^2 - c_3^2M_3^2\right).
\end{aligned}
\tag{3.16}
\]
In the mass-non-degenerate case, they vanish (there is thus no CP violation) if either:
\[
\begin{aligned}
\text{Case I}:&\quad \sin 2\alpha_2 = 0 \ \text{and}\ \sin 2\alpha_3 = 0, \quad\text{or}\\
\text{Case II}:&\quad \cos\alpha_2 = 0,\ \alpha_3\ \text{arbitrary}.
\end{aligned}
\tag{3.17}
\]
Note that $M_1^2 - s_3^2M_2^2 - c_3^2M_3^2 < 0$ for non-degenerate or partially degenerate masses, ordered
such that $M_1 \le M_2 \le M_3$ (where no more than two of the masses are equal). Thus, there are no
additional CP-conserving solutions for the vanishing of this factor. The cases of partial degeneracy, $M_1 = M_2 \ne M_3$, and $M_1 \ne M_2 = M_3$ will be discussed in Section 3.7.
It is thus natural to focus on the angles $\alpha_2$ and $\alpha_3$. In particular, since $R_{12}R_{13}$ is associated
with CP violation in the $H_1t\bar t$ coupling (see Section 3.5), we are interested in regions where
$|\sin(2\alpha_2)|$ is large.
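The closed forms for the CP-mixing elements can be compared with the defining sums over mass eigenstates. The sketch below is illustrative only: it assumes $\mathcal{M}^2_{i3} = \sum_k M_k^2R_{ki}R_{k3}$ and the angle expressions with the signs that follow from multiplying out the rotation matrix; names and sample values are hypothetical.

```python
import math

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def cp_mixing_sum(masses, angles):
    """(M^2_13, M^2_23) from the sums over mass eigenstates."""
    r = rotation(*angles)
    m13 = sum(masses[k] ** 2 * r[k][0] * r[k][2] for k in range(3))
    m23 = sum(masses[k] ** 2 * r[k][1] * r[k][2] for k in range(3))
    return m13, m23

def cp_mixing_closed(masses, angles):
    """(M^2_13, M^2_23) from the closed-form angle expressions."""
    a1, a2, a3 = angles
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    m1, m2, m3 = (m ** 2 for m in masses)
    common = m1 - s3 ** 2 * m2 - c3 ** 2 * m3
    m13 = c1 * c2 * s2 * common + s1 * c2 * c3 * s3 * (m3 - m2)
    m23 = c1 * c2 * c3 * s3 * (m2 - m3) + s1 * c2 * s2 * common
    return m13, m23

SET_A_MASSES = (100.0, 300.0, 500.0)  # GeV
```

Evaluating both functions at $\alpha_2 = \alpha_3 = 0$ (Case I) or at $\alpha_2 = \pi/2$ (Case II) should return values compatible with zero within rounding.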
3.3. Reference parameters
In order to search for parameters with large CP violation, we will assume $H_1$ is light, and
that $M_2$ is not close to $M_1$, as such degeneracy would cancel any CP violation.
For illustration, as a conservative default set of parameters, we take
\[
\text{Set A}:\quad
\begin{aligned}
&\tan\beta = \{0.5, 1.0, 2.0\},\quad (M_1, M_2, M_3) = (100, 300, 500)\ \text{GeV},\\
&M_{H^\pm} = 500\ \text{GeV},\quad \mu^2 = (200\ \text{GeV})^2,\quad {\rm Im}\,\lambda_5 = 0,\quad {\rm Re}\,\lambda_6 = {\rm Re}\,\lambda_7 = 0.
\end{aligned}
\tag{3.18}
\]
Here, the lightest neutral Higgs boson is compatible with the negative LEP searches [16,
17] provided it does not couple too strongly to the $Z$, and the charged Higgs boson mass is
compatible with the negative LEP [13] and Fermilab searches [31] as well as with the $B_d^0$–$\bar B_d^0$
oscillation, $R_b$ constraints [25] (see Section 4.1) and the $b \to s\gamma$ analysis at low $\tan\beta$ [14].
As a second set of parameters, we take
\[
\text{Set B}:\quad
\begin{aligned}
&\tan\beta = \{0.5, 1.0, 2.0\},\quad (M_1, M_2) = (80, 300)\ \text{GeV},\quad M_3 = \{400, 600\}\ \text{GeV},\\
&M_{H^\pm} = 300\ \text{GeV},\quad \mu^2 = \{0, (200\ \text{GeV})^2\},\quad {\rm Im}\,\lambda_5 = 0,\quad {\rm Re}\,\lambda_6 = {\rm Re}\,\lambda_7 = 0.
\end{aligned}
\tag{3.19}
\]
This set, which represents a light Higgs sector, is marginally in conflict with data (the combination of charged-Higgs mass and $\tan\beta$ values violates the $R_b$ constraints by up to $5\sigma$, see Table 2
in Section 4.5), but is chosen for a more optimistic comparison, since it could give more CP
violation due to a lower value of $M_1$ (which enhances the loop integrals).
3.4. Stability and unitarity
A necessary condition we must impose on the model is that the potential is positive when
$|\Phi_1|$ and $|\Phi_2| \to \infty$. This constraint, which is rather involved, is discussed in Appendix A. Two
obvious conditions are that
\[
\lambda_1 > 0, \qquad \lambda_2 > 0.
\tag{3.20}
\]
In general, the additional stability constraint is that $\lambda_3$ and $\lambda_4$ cannot be too large and negative,
and that $|\lambda_5|$, $|\lambda_6|$, $|\lambda_7|$ cannot be too large.
Furthermore, we shall impose tree-level unitarity on the Higgs-Higgs-scattering sector, as
formulated in [30,32] (see also Ref. [33]). This latter constraint is related to the perturbativity
constraint ($\lambda$s not allowed too large) adopted in Ref. [11], but actually turns out to be numerically more severe.
3.5. Yukawa couplings
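With the above notation, and adopting the so-called model II [10] for the Yukawa couplings,
where the down-type and up-type quarks are coupled only to $\Phi_1$ and $\Phi_2$, respectively, the couplings can be expressed (relative to the SM coupling) as
\[
\begin{aligned}
H_jb\bar b:&\quad \frac{1}{\cos\beta}\left[R_{j1} - i\gamma_5\sin\beta\,R_{j3}\right],\\
H_jt\bar t:&\quad \frac{1}{\sin\beta}\left[R_{j2} - i\gamma_5\cos\beta\,R_{j3}\right] \equiv a + i\tilde a\gamma_5.
\end{aligned}
\tag{3.21}
\]
Likewise, we have for the charged Higgs bosons [10]
\[
\begin{aligned}
H^+b\bar t:&\quad \frac{ig}{2\sqrt{2}\,m_W}\left[m_b(1 + \gamma_5)\tan\beta + m_t(1 - \gamma_5)\cot\beta\right],\\
H^-t\bar b:&\quad \frac{ig}{2\sqrt{2}\,m_W}\left[m_b(1 - \gamma_5)\tan\beta + m_t(1 + \gamma_5)\cot\beta\right].
\end{aligned}
\tag{3.22}
\]
With this Yukawa structure, the model is denoted as the 2HDM (II).
The product of the $H_jt\bar t$ scalar and pseudoscalar couplings,
\[
\gamma_{CP}^{(j)} = -a\,\tilde a = \frac{\cos\beta}{\sin^2\beta}\,R_{j2}R_{j3},
\tag{3.23}
\]
plays an important role in determining the amount of CP violation in the top-quark sector.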
As was seen in Ref. [11], unless the Higgs boson is resonant with the $t\bar t$ system, CP violation
is largest for small Higgs masses. For a first orientation, we shall therefore focus on the contributions of the lightest Higgs boson, $H_1$. (There will also be significant contributions from the
two heavier Higgs bosons, as discussed in Section 6.) For the lightest Higgs boson, the coupling
(3.21) becomes
\[
H_1t\bar t:\quad \frac{1}{\sin\beta}\left[\sin\alpha_1\cos\alpha_2 - i\gamma_5\cos\beta\sin\alpha_2\right],
\qquad\text{with}\quad
\gamma_{CP}^{(1)} = \frac{1}{2}\,\frac{\sin\alpha_1\sin(2\alpha_2)}{\tan\beta\sin\beta},
\tag{3.24}
\]
where $\alpha_1$ and $\alpha_2$ are mixing angles of the Higgs mass matrix as defined by Eqs. (2.4) and (2.5).
From (3.24), we see that low $\tan\beta$ values are required for having large CP violation in the top-quark
sector. However, according to (3.22), for low $\tan\beta$ the charged-Higgs Yukawa coupling is also
enhanced. Thus, for low $\tan\beta$, the $R_b$, $\Delta M_{B_d}$ [12,13] and $b \to s\gamma$ constraints [14] force $M_{H^\pm}$
to be high. For a quantitative discussion, see Section 4.1.
3.6. CP violation in the Yukawa sector
We shall in Section 6 study CP violation in the process
\[
pp \to t\bar tX \to e^+e^-X,
\tag{3.25}
\]
focusing on the sub-process
\[
gg \to t\bar t \to e^+e^-X.
\tag{3.26}
\]
Let the CP violating quantity of that process be given by [8,11]
\[
\sum_{j=1}^{3} R_{j2}R_{j3}\,f(M_j),
\tag{3.27}
\]
where $f(M_j)$ is some function of the neutral Higgs mass $M_j$, in general determined by loop
integrals.
When the three neutral Higgs bosons are light, they will all contribute to the CP-violating
effects. In fact, in the limit of three mass-degenerate Higgs bosons, the model may still be consistent in the sense that solutions can be found in some regions of parameter space, but the CP
violation will cancel, since [cf. Eq. (3.23)]
\[
\sum_{j=1}^{3}\gamma_{CP}^{(j)} = \frac{\cos\beta}{\sin^2\beta}\sum_{j=1}^{3}R_{j2}R_{j3} = 0
\tag{3.28}
\]
due to the orthogonality of $R$.
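The cancellation by orthogonality can be illustrated numerically. The sketch assumes $\gamma_{CP}^{(j)} = (\cos\beta/\sin^2\beta)\,R_{j2}R_{j3}$ and, for $j = 1$, the equivalent form $\frac{1}{2}\sin\alpha_1\sin(2\alpha_2)/(\tan\beta\sin\beta)$; the function names and the zero-based index convention are choices made for this example.

```python
import math

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def gamma_cp(j, tan_beta, angles):
    """gamma_CP^(j+1): j is a zero-based index, so j = 0 refers to H_1."""
    r = rotation(*angles)
    beta = math.atan(tan_beta)
    return math.cos(beta) / math.sin(beta) ** 2 * r[j][1] * r[j][2]

def gamma_cp_lightest(tan_beta, a1, a2):
    """Closed form for H_1 in terms of alpha_1 and alpha_2."""
    beta = math.atan(tan_beta)
    return 0.5 * math.sin(a1) * math.sin(2.0 * a2) / (tan_beta * math.sin(beta))

def gamma_cp_sum(tan_beta, angles):
    """Sum over the three neutral states; vanishes since R is orthogonal."""
    return sum(gamma_cp(j, tan_beta, angles) for j in range(3))
```

The closed form also makes the $\tan\beta$ dependence explicit: for fixed angles, halving $\tan\beta$ increases $\gamma_{CP}^{(1)}$ by more than a factor of two, because of the extra $1/\sin\beta$.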

3.7. Degenerate limits
The set of free parameters (2.11) permits all three neutral Higgs masses to be degenerate. As
discussed above, in this limit there is no CP violation, by orthogonality of the rotation matrix $R$.
However, in contrast to the case of $\lambda_6 = \lambda_7 = 0$ studied in [11], the partial degeneracies are
non-trivial and may lead to CP violation for certain choices of the angles $\alpha_i$:
$M_1 = M_2 \equiv m \ne M_3$
In this limit, the elements of $\mathcal{M}^2$ that induce CP violation are
\[
\begin{aligned}
\mathcal{M}^2_{13} &= R_{31}R_{33}\left(M_3^2 - m^2\right) = c_2c_3(s_1s_3 - c_1s_2c_3)\left(M_3^2 - m^2\right),\\
\mathcal{M}^2_{23} &= R_{32}R_{33}\left(M_3^2 - m^2\right) = -c_2c_3(c_1s_3 + s_1s_2c_3)\left(M_3^2 - m^2\right).
\end{aligned}
\tag{3.29}
\]
These both vanish when the conditions (3.17) are satisfied, or else, when
\[
\cos\alpha_3 = 0, \qquad \alpha_1, \alpha_2\ \text{arbitrary}.
\tag{3.30}
\]
By orthogonality, when the two lighter Higgs bosons are degenerate, the CP violation (3.27)
in the top-quark sector is proportional to
\[
R_{32}R_{33}\left[f(M_3) - f(m)\right] \propto \mathcal{M}^2_{23}.
\tag{3.31}
\]
Thus, even though the model violates CP in the limit $M_1 = M_2 \ne M_3$, by for example having $\mathcal{M}^2_{13} \ne 0$, the top-quark sector would not violate CP at the one-loop level unless $\mathcal{M}^2_{23} \propto
R_{32}R_{33} \ne 0$.
$M_1 \ne M_2 = M_3 \equiv M$
In this limit, the elements of $\mathcal{M}^2$ that induce CP violation are
\[
\begin{aligned}
\mathcal{M}^2_{13} &= -R_{11}R_{13}\left(M^2 - M_1^2\right) = -c_1c_2s_2\left(M^2 - M_1^2\right),\\
\mathcal{M}^2_{23} &= -R_{12}R_{13}\left(M^2 - M_1^2\right) = -s_1c_2s_2\left(M^2 - M_1^2\right).
\end{aligned}
\tag{3.32}
\]
We note that these both vanish for $\sin(2\alpha_2) = 0$, meaning $\alpha_2 = 0$ or $\alpha_2 = \pm\pi/2$. Thus, in the limit
$M_1 \ne M_2 = M_3$ and $\alpha_2 = 0$, but $\alpha_3$ arbitrary, the model does not violate CP, in agreement with
the results of [11].
In this limit of the two heavier Higgs bosons being degenerate, the CP violation in the top-quark sector is proportional to$^2$
\[
R_{12}R_{13}\left[f(M_1) - f(M)\right] \propto \mathcal{M}^2_{23}.
\tag{3.33}
\]
In our parametrization, this is non-zero for
\[
\sin\alpha_1 \ne 0, \qquad \sin 2\alpha_2 \ne 0,
\tag{3.34}
\]
but with $\alpha_3$ arbitrary.
In the more constrained model discussed in [11], the latter limits of only two masses being
degenerate do not exist. In that case, with $\lambda_6 = \lambda_7 = 0$, a degeneracy of two masses forces the
third one to have that same value.
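The two partially degenerate limits can be checked against the general mass-squared matrix. In this sketch the closed forms are taken as $\mathcal{M}^2_{13} = R_{31}R_{33}(M_3^2 - m^2)$ for $M_1 = M_2 = m$ and $\mathcal{M}^2_{13} = -c_1c_2s_2(M^2 - M_1^2)$ for $M_2 = M_3 = M$, with signs as implied by orthogonality of $R$. The masses and angles are illustrative and not taken from the parameter sets of the text.

```python
import math

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def m13_general(masses, angles):
    """M^2_13 = sum_k M_k^2 R_k1 R_k3 for arbitrary masses."""
    r = rotation(*angles)
    return sum(masses[k] ** 2 * r[k][0] * r[k][2] for k in range(3))

def m13_light_pair_degenerate(m, m3, angles):
    """Closed form for M_1 = M_2 = m != M_3."""
    r = rotation(*angles)
    return r[2][0] * r[2][2] * (m3 ** 2 - m ** 2)

def m13_heavy_pair_degenerate(m1, m, angles):
    """Closed form for M_1 != M_2 = M_3 = M."""
    a1, a2, _ = angles
    return (-math.cos(a1) * math.cos(a2) * math.sin(a2)
            * (m ** 2 - m1 ** 2))
```

In the second limit the result is independent of $\alpha_3$, which is the numerical counterpart of the statement that $\alpha_3$ is arbitrary there.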
2 Whereas both these degenerate limits yield CP violation in the $t$-quark sector proportional to $\mathcal{M}^2_{23}$, the corresponding
quantities in the $b$-quark sector are proportional to $\mathcal{M}^2_{13}$.
4. Experimental model constraints at low $\tan\beta$
It is convenient to split the experimental constraints on the 2HDM into two categories. There
are those involving only the charged Higgs boson, $H^\pm$, and those also involving the neutral ones.
The former, like the non-discovery of a charged Higgs boson, the $b \to s\gamma$ constraint [14], and the
$B_d^0$–$\bar B_d^0$ oscillations [25] do not depend on the rotation matrix $R$ and the amount of CP violation.
They are determined by $M_{H^\pm}$ and its coupling to quarks, (3.22), i.e., by $\tan\beta$. On the other hand,
constraints involving the neutral ones depend on the details of the couplings, i.e., they depend
sensitively on the rotation matrix $R$ as well as on the neutral Higgs mass spectrum. We shall first
review the constraints that depend only on the charged Higgs sector.
In Sections 4.2–4.5 we discuss constraints on the model that depend on the neutral sector.
For the purpose of determining these constraints, one has to generalize some predictions for the
CP-conserving case to the CP-violating case. Eqs. (4.3), (4.10), (4.12), (4.13) and (4.16) are the
results of such generalizations. In the CP-conserving limit, $R_{13} = R_{23} = R_{31} = R_{32} = 0$, and
these expressions simplify accordingly.

Table 1
2HDM contribution to $\Delta M_{B_d}$ for parameter sets A and B, relative to its absolute value

tan β                          0.5     1.0     2.0
Set B: M_{H±} = 300 GeV        270%    45%     10%
Set A: M_{H±} = 500 GeV        140%    24%     5%
4.1. Constraints on the charged-Higgs sector
There are three important indirect constraints on the charged-Higgs sector: the $B_d^0$–$\bar B_d^0$ oscillations, $R_b$ and $b \to s\gamma$.
The mass splitting in the neutral $B_d$ mesons is sensitive to contributions from box diagrams
with top quark and charged Higgs exchange [34–37], involving the Yukawa couplings (3.22). Indeed, the diagrams with one or two $H^\pm$ exchanges give contributions proportional to $(m_t\cot\beta)^2$
or $(m_t\cot\beta)^4$ multiplied by functions of $M_{H^\pm}$ that for large $M_{H^\pm}$ behave like $1/M_{H^\pm}^2$. These
contributions to $\Delta M_{B_d}$ will constrain low values of $M_{H^\pm}$, in particular at low values of $\tan\beta$.
While $\Delta M_{B_d}$ is known experimentally to considerable precision, $\Delta M_{B_d} = (3.304 \pm 0.046) \times
10^{-10}$ MeV [13], its theoretical understanding is more limited. The largest theoretical uncertainty
is related to the parameter combination $f_B^2B_B$ of the hadronic matrix element. This is only known
to a precision of 10–15%. Thus, we cannot exclude models which give predictions for $\Delta M_{B_d}$ that
deviate from the SM value by this order of magnitude, even if this deviation is large compared to
the experimental precision.
In Table 1 we show the contributions to $\Delta M_{B_d}$ that are due to the additional 2HDM fields, for the
two parameter sets considered. It is clear that $\tan\beta = 0.5$ is incompatible with the experimental
and theoretical constraints on $\Delta M_{B_d}$, whereas the $\tan\beta = 1.0$ case is marginal.
As mentioned above, the $b \to s\gamma$ constraints [14] also force $M_{H^\pm}$ to be high, in particular
for low $\tan\beta$. A recent analysis arrived at the bound $M_{H^\pm} \gtrsim 300$ GeV [38]. However, at the
very low values of $\tan\beta$ considered here, they are less severe than the $\Delta M_{B_d}$ and $R_b$ constraints.
The experimental constraints on $R_b$ (see Section 4.5) depend on the charged Higgs mass as well
as on the neutral Higgs spectrum. However, this constraint is for low values of $\tan\beta$ practically
independent of the neutral spectrum.
4.2. Higgs non-discovery at LEP
One might think that both parameter Set A and Set B would be in conflict with the negative
direct searches at LEP, because of the low values of $M_1$. However, these bounds are marginally
evaded by two facts which both dilute the experimental sensitivity. First, $H_1$ production through the $H_1ZZ$ vertex is
suppressed by the square of the Higgs-vector-vector coupling, which relative to the Standard
Model coupling is
\[
H_jZZ:\quad \left[\cos\beta\,R_{j1} + \sin\beta\,R_{j2}\right], \qquad\text{for } j = 1.
\tag{4.1}
\]
For large values of $|\sin\alpha_1|$ (which is of interest in order to maximize $\gamma_{CP}^{(1)}$ of (3.24)), $R_{11}$ will be
rather small, and the second term in (4.1), proportional to $R_{12}$, takes over. But this is suppressed
by the factor $\sin\beta$. For some quantitative studies of this suppression, which can easily be by
a factor of 2 or more, see Fig. 8 in [11]. Secondly, the typical decay channel, $H_1 \to b\bar b$, is
suppressed by the square of the Yukawa coupling, Eq. (3.21). For small values of $\tan\beta$, this is
approximately $\cos^2\alpha_1\cos^2\alpha_2 + \sin^2\beta\sin^2\alpha_2$. In the limits of interest, both terms are small.
In the analysis of LEP data by DELPHI,$^3$ a channel-specific dilution factor $C^2$ is defined
by [16]
\[
\sigma_{Z(h\to X)} = \sigma^{\rm ew}_{Zh} \times C^2_{Z(h\to X)},
\tag{4.2}
\]
where $\sigma^{\rm ew}_{Zh}$ is the Standard Model cross section for a particular Higgs mass $M$. In the 2HDM, for
an $H_1$ decaying to $b\bar b$ or $\tau\bar\tau$, the dilution is caused by the two effects discussed above: There is
a reduced coupling to the $Z$ boson [see (4.1)] and a modified (typically reduced) coupling to the
$b\bar b$ (or $\tau\bar\tau$) [see (3.21)]. Thus, we take
\[
C^2_{Z(H_1\to b\bar b)} = \left[\cos\beta\,R_{11} + \sin\beta\,R_{12}\right]^2\,\frac{1}{\cos^2\beta}\left[R_{11}^2 + \sin^2\beta\,R_{13}^2\right],
\tag{4.3}
\]
and consider as excluded parameter sets those where this quantity exceeds the LEP bounds,
roughly approximated as [16]
\[
C^2_{Z(H_1\to b\bar b)} = 0.2 \text{ at 100 GeV}, \quad\text{and } 0.1 \text{ at 80 GeV}.
\tag{4.4}
\]
The last term in (3.21), involving $R_{13}$, is absent in the CP-conserving case. However, at small
$\tan\beta$, it has little effect. Actually, similar results are obtained for both the $b\bar b$ and $\tau\bar\tau$ channels.
Presumably, when these are combined, a more strict limit would be obtained.
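A minimal sketch of how the dilution factor can be evaluated and compared with the rough LEP bounds is given below. It assumes the form $C^2 = [\cos\beta R_{11} + \sin\beta R_{12}]^2\,[R_{11}^2 + \sin^2\beta R_{13}^2]/\cos^2\beta$ and the two bound values quoted in the text; the function names, the dictionary of bounds and the example angles are illustrative.

```python
import math

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def dilution_factor(tan_beta, a1, a2, a3):
    """C^2 for Z H_1 production followed by H_1 -> b bbar, relative to the SM."""
    r = rotation(a1, a2, a3)
    beta = math.atan(tan_beta)
    cb, sb = math.cos(beta), math.sin(beta)
    zz_coupling_sq = (cb * r[0][0] + sb * r[0][1]) ** 2
    yukawa_sq = (r[0][0] ** 2 + (sb * r[0][2]) ** 2) / cb ** 2
    return zz_coupling_sq * yukawa_sq

# Rough bounds quoted in the text, keyed by M_1 in GeV.
LEP_BOUND = {100.0: 0.2, 80.0: 0.1}

def excluded_by_lep(tan_beta, angles, m1):
    """True if C^2 exceeds the rough bound for the given M_1 (100 or 80 GeV)."""
    return dilution_factor(tan_beta, *angles) > LEP_BOUND[m1]

def dilution_tan_beta_one(a1, a2):
    """Special case tan(beta) = 1, written out in terms of alpha_1, alpha_2."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    return (c1 + s1) ** 2 * c2 ** 2 * (c1 ** 2 * c2 ** 2 + 0.5 * s2 ** 2)
```

For a Standard-Model-like $H_1$ ($\alpha_1 = \beta$, $\alpha_2 = 0$) the factor equals one, so such a light state is excluded, while $\alpha_1 \to \pi/2$ with $\alpha_2 \to 0$ at $\tan\beta = 1$ drives it to zero.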
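It is instructive to consider this expression (4.3) and the corresponding constraints in three
simple limits:
\[
\tan\beta \ll 1:\quad C^2_{Z(H_1\to b\bar b)} \simeq \cos^4\alpha_1\cos^4\alpha_2 \ll 1.
\tag{4.5}
\]
This requires either $\alpha_1 \to \pm\pi/2$ or $\alpha_2 \to \pm\pi/2$.
\[
\tan\beta = 1:\quad C^2_{Z(H_1\to b\bar b)} = (\cos\alpha_1 + \sin\alpha_1)^2\cos^2\alpha_2\left[\cos^2\alpha_1\cos^2\alpha_2 + \tfrac{1}{2}\sin^2\alpha_2\right] \ll 1.
\tag{4.6}
\]
This requires either $\alpha_2 \to \pm\pi/2$ or $\alpha_1 \to -\pi/4$ or $\{\alpha_1 \to \pi/2$ and $\alpha_2 \to 0\}$.
\[
\tan\beta \gg 1:\quad C^2_{Z(H_1\to b\bar b)} \simeq \tan^2\beta\,\sin^2\alpha_1\cos^2\alpha_2\left[\cos^2\alpha_1\cos^2\alpha_2 + \sin^2\beta\sin^2\alpha_2\right] \ll 1.
\tag{4.7}
\]
This requires $\alpha_1 \to 0$ or $\alpha_2 \to \pm\pi/2$ or $\{\alpha_1 \to \pm\pi/2$ and $\alpha_2 \to 0\}$.
Furthermore, we note that at negative $\alpha_1$, the LEP bound is to some extent evaded for small
and medium $|\alpha_2|$ by cancellation among the two terms in the $H_1ZZ$ coupling.
3 There are such studies also by the other LEP collaborations (see, for example [17]), but limited to the CP-conserving
case.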
4.3. The $\rho$-parameter constraint

A very important constraint coming from electroweak precision data is the precise determination of the $\rho$-parameter [39,40]. The quantity
\[
\Delta\rho \equiv \frac{1}{M_W^2}\left[A_{WW}(q^2 = 0) - \cos^2\theta_W\,A_{ZZ}(q^2 = 0)\right]
\tag{4.8}
\]
measures how much the $W$ and $Z$ self-energies can deviate from the Standard Model value, being
zero at the tree level. The experiments (mostly at LEP) have put severe constraints on $\rho$ [41]:
\[
\rho = 1.0050 \pm 0.0010.
\tag{4.9}
\]
The measured deviation from unity is accommodated within the Standard Model, and mostly due
to the heavy top quark.
In the 2HDM additional contributions arise [15], which are determined by the couplings to
the $W$ and the $Z$ of the Higgs particles, and by the mass splittings within the Higgs sector, as
well as the mass splittings with respect to the $W$ and $Z$ bosons. The simplified forms provided
in [10] can easily be re-expressed in terms of the mass eigenvalues and the elements $R_{jk}$ of
the rotation matrix for the CP-violating basis. For the Higgs-Higgs contribution, we find (the
relevant couplings are given in Appendix B):
\[
\begin{aligned}
A^{HH}_{WW}(0) - \cos^2\theta_W\,A^{HH}_{ZZ}(0)
= \frac{g^2}{64\pi^2}\sum_j\Bigl\{ &\left[(\sin\beta\,R_{j1} - \cos\beta\,R_{j2})^2 + R_{j3}^2\right]F\left(M_{H^\pm}^2, M_j^2\right) \\
&- \sum_{k>j}\left[(\sin\beta\,R_{j1} - \cos\beta\,R_{j2})R_{k3} - (\sin\beta\,R_{k1} - \cos\beta\,R_{k2})R_{j3}\right]^2 F\left(M_j^2, M_k^2\right)\Bigr\},
\end{aligned}
\tag{4.10}
\]
where
\[
F\left(m_1^2, m_2^2\right) = \frac{1}{2}\left(m_1^2 + m_2^2\right) - \frac{m_1^2m_2^2}{m_1^2 - m_2^2}\log\frac{m_1^2}{m_2^2}.
\tag{4.11}
\]
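The function $F$ has a removable singularity at equal arguments, which a numerical implementation has to treat separately. The following sketch assumes the form $F(m_1^2, m_2^2) = \frac{1}{2}(m_1^2 + m_2^2) - \frac{m_1^2m_2^2}{m_1^2 - m_2^2}\log(m_1^2/m_2^2)$; the function name and the tolerance used to detect degeneracy are illustrative choices.

```python
import math

def f_rho(m1_sq, m2_sq):
    """F(m1^2, m2^2) of the rho-parameter formulas.

    The function is symmetric in its arguments, non-negative, and
    vanishes for equal masses; that limit is returned explicitly to
    avoid a 0/0 in floating point.
    """
    if math.isclose(m1_sq, m2_sq, rel_tol=1e-12):
        return 0.0
    return (0.5 * (m1_sq + m2_sq)
            - m1_sq * m2_sq / (m1_sq - m2_sq) * math.log(m1_sq / m2_sq))
```

Since $F$ grows with the mass splitting, this is the place where a charged Higgs mass far from the neutral masses feeds into $\Delta\rho$.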
For the Higgs-ghost contribution, we have to subtract the contribution from a Standard Model
Higgs of mass $M_0$, since this is already taken into account in the fits, and find:
\[
\begin{aligned}
A^{HG}_{WW}(0) - \cos^2\theta_W\,A^{HG}_{ZZ}(0)
= \frac{g^2}{64\pi^2}\Bigl\{ &\sum_j\left[\cos\beta\,R_{j1} + \sin\beta\,R_{j2}\right]^2\left[3F\left(M_Z^2, M_j^2\right) - 3F\left(M_W^2, M_j^2\right)\right] \\
&+ 3F\left(M_W^2, M_0^2\right) - 3F\left(M_Z^2, M_0^2\right)\Bigr\}.
\end{aligned}
\tag{4.12}
\]
From the electroweak fits, we take $M_0 = 129$ GeV, but note that this value is not very precise
[41]. In the CP-conserving limit, these expressions (4.10) and (4.12) simplify considerably, since
terms with $R_{13}$, $R_{23}$, $R_{31}$, and $R_{32}$ are absent.
In order to keep these additional contributions (4.10) and (4.12) small, the charged Higgs
boson should not be coupled too strongly to the $W$ if its mass is far from those of its neutral
partners. As a measure of the tolerance we take $3\sigma$, i.e., we impose $|\Delta\rho| \le 0.003$.
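The two contributions can be combined into a single estimate of $\Delta\rho$, sketched below. Several assumptions enter and should be kept in mind: the Higgs-Higgs term is taken with a positive charged-neutral part and a negative neutral-neutral part, the tree-level relation $g^2 = 4M_W^2/v^2$ is used so that the prefactor becomes $1/(16\pi^2v^2)$, and the $W$ and $Z$ masses are illustrative input values not quoted in the text.

```python
import math

V = 246.0    # GeV, as quoted in the text
M_W = 80.4   # GeV, illustrative input (assumption)
M_Z = 91.19  # GeV, illustrative input (assumption)

def rotation(a1, a2, a3):
    """Rotation matrix R of Eq. (2.5)."""
    c1, s1 = math.cos(a1), math.sin(a1)
    c2, s2 = math.cos(a2), math.sin(a2)
    c3, s3 = math.cos(a3), math.sin(a3)
    return [
        [c1 * c2, s1 * c2, s2],
        [-(c1 * s2 * s3 + s1 * c3), c1 * c3 - s1 * s2 * s3, c2 * s3],
        [-c1 * s2 * c3 + s1 * s3, -(c1 * s3 + s1 * s2 * c3), c2 * c3],
    ]

def f_rho(x, y):
    """F(m1^2, m2^2); vanishes for equal arguments."""
    if math.isclose(x, y, rel_tol=1e-12):
        return 0.0
    return 0.5 * (x + y) - x * y / (x - y) * math.log(x / y)

def delta_rho(tan_beta, masses, m_charged, angles, m0=129.0):
    """Higgs-Higgs plus Higgs-ghost contributions to Delta rho."""
    r = rotation(*angles)
    beta = math.atan(tan_beta)
    cb, sb = math.cos(beta), math.sin(beta)
    msq = [m * m for m in masses]
    # u_j enters the H+- H_j W and Z H_j H_k couplings, w_j the H_j VV coupling.
    u = [sb * r[j][0] - cb * r[j][1] for j in range(3)]
    w = [cb * r[j][0] + sb * r[j][1] for j in range(3)]
    hh = sum((u[j] ** 2 + r[j][2] ** 2) * f_rho(m_charged ** 2, msq[j])
             for j in range(3))
    hh -= sum((u[j] * r[k][2] - u[k] * r[j][2]) ** 2 * f_rho(msq[j], msq[k])
              for j in range(3) for k in range(j + 1, 3))
    hg = sum(w[j] ** 2 * 3.0 * (f_rho(M_Z ** 2, msq[j]) - f_rho(M_W ** 2, msq[j]))
             for j in range(3))
    hg += 3.0 * (f_rho(M_W ** 2, m0 ** 2) - f_rho(M_Z ** 2, m0 ** 2))
    return (hh + hg) / (16.0 * math.pi ** 2 * V ** 2)

def passes_rho_constraint(tan_beta, masses, m_charged, angles, tolerance=0.003):
    """Apply the 3-sigma requirement |Delta rho| <= 0.003 used in the text."""
    return abs(delta_rho(tan_beta, masses, m_charged, angles)) <= tolerance
```

Two limits are independent of the assumed signs: if all Higgs masses coincide and equal the reference mass $M_0$, the result vanishes, and a charged Higgs boson split from fully degenerate neutral states gives a positive $\Delta\rho$.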
4.4. The muon anomalous magnetic moment

The dominant contribution of the Higgs fields to the muon anomalous magnetic moment is,
according to Refs. [42] and [25], due to the two-loop Barr–Zee effect [43], with a photon and a
Higgs field connected to a heavy fermion loop. The contributions are given by [25] in terms of
scalar and pseudoscalar Yukawa couplings. Re-expressed in terms of the Yukawa couplings of
(3.21), assuming that the muon couples to the Higgs fields like a down quark, i.e., to $\Phi_1$, we find
for the top quark contribution:
\[
\Delta a_\mu = \frac{N_c\,\alpha_{\rm e.m.}}{4\pi^3v^2}\,m_\mu^2\,Q_t^2\sum_j\left[R_{j3}^2\,g\!\left(\frac{m_t^2}{M_j^2}\right) - \frac{1}{\cos\beta\sin\beta}\,R_{j1}R_{j2}\,f\!\left(\frac{m_t^2}{M_j^2}\right)\right],
\tag{4.13}
\]
with $N_c = 3$ the number of colours associated with the fermion loop, $\alpha_{\rm e.m.}$ the electromagnetic
fine-structure constant, $Q_t = 2/3$ and $m_t$ the top quark charge and mass, and $m_\mu$ the muon mass.
The functions $f$ and $g$ are given in [43]. It is worth noting that the $\tan\beta$ factor associated with
the pseudoscalar Yukawa coupling of the muon is cancelled by an opposite factor associated with
the top quark. While the first term gives a positive contribution, the second one may have either
sign.
The contribution of the $b$ quark can be obtained from (4.13) by trivial substitutions for $Q_t$ and
$m_t$ accompanied by
\[
R_{j3}^2 \to \tan^2\beta\,R_{j3}^2
\qquad\text{and}\qquad
\frac{1}{\cos\beta\sin\beta}\,R_{j1}R_{j2} \to \frac{1}{\cos^2\beta}\,R_{j1}^2
\tag{4.14}
\]
in the square bracket.
Earlier studies (see, for example, [25,42]) have focused on the contributions from rather light
pseudoscalars and large $\tan\beta$, where the $b$ and $\tau$ contributions are enhanced by the substitutions
(4.14). At high $\tan\beta$, the $b$-quark loop will indeed dominate. For small values of $\tan\beta$, as are
considered here, the $b$-quark contribution is completely negligible.
The experimental situation is somewhat unclear, depending on how the hadronic corrections
to the running of $\alpha_{\rm e.m.}$ are evaluated. The deviation from the Standard Model can be summarized
as [44]
\[
\Delta a_\mu = a_\mu^{\rm exp} - a_\mu^{\rm SM} =
\begin{cases}
(221\text{–}245) \times 10^{-11}, & e^+e^-,\\
(62\text{–}142) \times 10^{-11}, & \tau,\ \{e^+e^-\ \&\ \tau\},
\end{cases}
\tag{4.15}
\]
and represents 0.7 to 2.8 standard deviations with respect to the data [45]. Two distinct attitudes
are here possible. One may either fit this (positive) deviation with some new physics effect [25],
or one may restrict new physics contributions not to exceed this contribution (4.15). We shall
here follow this latter approach, and require the 2HDM contribution (4.13) to be less than $3\sigma$, i.e.,
$|\Delta a_\mu| < 300 \times 10^{-11}$. For the parameters considered here, $\tan\beta \lesssim O(2)$, the 2HDM contribution
to $a_\mu$ is at most (a few) $\times 10^{-11}$ and therefore plays no role in constraining the model.
4.5. $R_b$

Table 2
2HDM contribution to $R_b$ for parameter sets A and B, in units of the uncertainty, $\sigma(R_b) = 1.25$ MeV

tan β                          0.5     1.0     2.0
Set B: M_{H±} = 300 GeV        5.6     1.4     0.35
Set A: M_{H±} = 500 GeV        3.3     0.83    0.21

The one-loop contributions to the $Zb\bar b$ coupling influence the relative branching ratio of
$Z \to b\bar b$, given by $R_b$, which is known to 0.05%, or 1.25 MeV precision [13]. In the SM there are
significant contributions proportional to $m_t^2$. In the 2HDM there are additional one-loop contributions due to triangle diagrams involving charged and (non-standard) neutral-Higgs fields. For
the CP-conserving case, these were given in [46]. In the general CP-violating case, the charged-Higgs contribution, Eq. (4.2) of [46], remains unchanged, but we find that the neutral-Higgs part,
Eq. (4.4), gets modified to
3 3
2 N m
m2b
e.m
c Z
f
V (H ) =
(sin Rj 1 cos Rj 2 )Rk3
96 sin4 W cos2 W m2W j =1 k=1
(sin Rk1 cos Rk2 )Rj 3
tan

Rj 1 Rk3 4 m2Z , Mj2 , Mk2 , 0
cos

Rj 1 2
(cos Rj 1 + sin Rj 2 )
4 mZ , Mj2 , m2Z , 0
cos

Rj21
2

2
2
2
2
2
+ tan Rj 3 3 mZ , Mj , 0
+ 2Qb sin W 1 + 2Qb sin W
cos2

2 2
2
2
+ 2Qb sin W 1 + 2Qb sin W 3 mZ , mZ , 0 .
(4.16)
The functions $\rho_3$ and $\rho_4$ are various combinations of three-point and two-point loop integrals
[46]. For the numerical studies, we use the LoopTools package [47,48]. Again, this expression
(4.16) is more complicated than those of the CP-conserving limit, but the additional terms have
little quantitative importance at low $\tan\beta$.
At small $\tan\beta$, the charged-Higgs contribution, which behaves like $(m_t/\tan\beta)^2$ due to the
$H^+b\bar t$ and $H^-t\bar b$ couplings, dominates. For $M_{H^\pm} = 500$ GeV (Set A) and $M_{H^\pm} = 300$ GeV
(Set B) the 2HDM contributions to $R_b$ are given in Table 2. For the two larger values of $\tan\beta$,
this is compatible with the experimental uncertainty, whereas for $\tan\beta = 0.5$ it amounts to a substantial conflict, especially for the lower value of $M_{H^\pm}$ (Set B). The neutral-Higgs contribution,
as given by (4.16), is smaller, by three orders of magnitude.
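The $(m_t/\tan\beta)^2$ behaviour can be checked directly against the entries of Table 2: doubling $\tan\beta$ should reduce the contribution by roughly a factor of four. The sketch below assumes that the flattened table is read column by column, with the Set B value listed before the Set A value for each $\tan\beta$; the dictionary layout and function name are illustrative.

```python
# Table 2 entries in units of sigma(R_b), keyed by tan(beta).
# The assignment of values to rows is an assumption about the table layout.
TABLE_2 = {
    "Set B (charged Higgs mass 300 GeV)": {0.5: 5.6, 1.0: 1.4, 2.0: 0.35},
    "Set A (charged Higgs mass 500 GeV)": {0.5: 3.3, 1.0: 0.83, 2.0: 0.21},
}

def scaling_ratios(row):
    """Pairs (observed ratio, expected ratio) for successive tan(beta) values.

    The expected ratio is (tan_hi / tan_lo)^2, as implied by a contribution
    that behaves like (m_t / tan beta)^2.
    """
    tans = sorted(row)
    return [(row[lo] / row[hi], (hi / lo) ** 2)
            for lo, hi in zip(tans, tans[1:])]

def max_relative_deviation(table):
    """Largest relative deviation from the expected scaling over all rows."""
    return max(abs(observed - expected) / expected
               for row in table.values()
               for observed, expected in scaling_ratios(row))
```

With this reading of the table all ratios lie within a few per cent of four, which supports both the quoted scaling and the assumed assignment of the entries to the two parameter sets.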
5. Overview over allowed parameters
5.1. Variations of mass parameters around Set A
We start this discussion of model parameters by a survey of how the allowed regions of the
2 3 space depend on the mass parameters, in particular M3 , MH and 2 . It turns out that
while stability is readily satisfied for relevant mass parameters, unitarity excludes sizable regions of parameter space. Ignoring experimental constraints, a low-mass spectrum is in general
easier to accommodate than one where some Higgs particles are heavy. In many cases, non-zero
values of 6 and 7 also have a tendency to reduce the allowed parameter space.
In Fig. 1 we show for Set A the allowed regions in the $\alpha_2$–$\alpha_3$ plane, for a few representative values of $\alpha_1$, focusing on stability (or positivity) and unitarity. For the considered parameters, much of the $\alpha_2$–$\alpha_3$ plane is actually in violation of stability, as shown in white. Next, there are significant areas (yellow) where stability is satisfied, but (tree-level) unitarity is violated. Finally, in darker colours, we indicate where also unitarity holds. Part of these areas (typically, small $|\alpha_2|$, as discussed in Section 4.2) are in conflict with the direct search (LEP) data, and indicated in red. Some areas (purple, typically, larger $|\alpha_2|$) violate the $\Delta\rho$ constraint, whereas the remaining areas in green give viable models. The symmetry given by Eq. (3.11) is evident in the figure: the case of $\tan\beta = 2$ can be obtained from the case of $\tan\beta = 0.5$ by suitable reflections, apart from the experimental constraints, which in the 2HDM (II) distinguish $\tan\beta$ and $\cot\beta$. For $\tan\beta > 1$, the direct search constraints shift from those of (4.5) towards those of (4.7).
Fig. 1. Allowed and forbidden regions in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$, 1 and 2, parameter Set A and selected values of $\alpha_1$. Green (G): stability, unitarity, direct search and $\Delta\rho$ constraints satisfied; red (R) [purple (P)]: stability, unitarity satisfied, but direct search [and $\Delta\rho$] constraints not satisfied; yellow (Y): stability (but not unitarity) satisfied; white: stability violated. (For interpretation of the references to colour in all figure legends, the reader is referred to the web version of this paper.)
While this figure cannot be directly compared with Fig. 7 of Ref. [11], since we here keep $M_3$ fixed, it was found that the unitarity constraints of [30,32] are more restrictive than the order-of-magnitude estimate adopted in [11]. It should also be noted that with $\mathrm{Im}\,\lambda_5 = 0$, as given by the parameter Set A, $\mathrm{Im}\,\lambda_6$ and $\mathrm{Im}\,\lambda_7$ will be non-zero in the general CP-violating case.
Fig. 2. Physically allowed regions in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$ and 1, and selected values of $\alpha_1$. Left: Variations of $M_3$ around parameter Set A. Blue: $M_3 = 500$ GeV (Set A), vertical lines: $M_3 = 550$ GeV, yellow: $M_3 = 600$ GeV. Note that some regions overlap. In particular, green denotes regions allowed for both $M_3 = 500$ GeV and $M_3 = 600$ GeV. Right: Variations of $M_{H^\pm}$ around parameter Set A. Yellow: $M_{H^\pm} = 300$ GeV, vertical lines: $M_{H^\pm} = 400$ GeV, blue: $M_{H^\pm} = 500$ GeV (Set A).
Dependence on M3 and M2
The dependence of the allowed regions on $M_3$, the mass of the heaviest neutral Higgs boson, is illustrated in Fig. 2 (left), for the other parameters kept fixed at the values of parameter Set A. The figure shows the allowed regions for $M_3 = 500$ GeV (blue, default of Fig. 1), 550 GeV (vertical lines) and 600 GeV (yellow). Smaller allowed regions are also found at 450 and 650 GeV (not shown), but nothing is allowed at either 400 GeV or 700 GeV.
As M2 approaches M1 , there are still regions where stability is satisfied, but unitarity is only
satisfied in very small regions. Similarly, as discussed above, when M2 approaches M3 , the allowed regions tend to be restricted to small values of |2 | 0. Thus, by (3.34), there is in this
limit of M1
M2 M3 , little CP violation in the top-quark sector.
Dependence on $M_{H^\pm}$
The dependence of the allowed regions on $M_{H^\pm}$, the charged Higgs boson mass, is illustrated in Fig. 2 (right), for the other parameters kept fixed at the values of parameter Set A. The allowed region is seen to increase a bit for lower values of $M_{H^\pm}$, but such values are constrained by the $\Delta M_{B_d}$, $R_b$ [13] and $b \to s\gamma$ data [14], especially at low values of $\tan\beta$. On the other hand, the allowed regions shrink for high values of $M_{H^\pm}$. Nothing is allowed at $M_{H^\pm} \gtrsim 600$ GeV. The shrinking of the allowed regions for high values of $M_{H^\pm}$ is due to the $\Delta\rho$ constraint at large $|\alpha_2|$ and/or negative $\alpha_1$, as well as the direct search (LEP) constraint at small $|\alpha_2|$. Eventually (like for high values of $M_3$) also the unitarity constraint excludes high values of $M_{H^\pm}$.
Fig. 3. Physically allowed regions in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$ and 1, and selected values of $\alpha_1$. Variations of $\mu^2$ around parameter Set A; $\mu^2 \le 0$ and $\mu^2 > 0$ shown separately. $\mu^2 \le 0$: Yellow: $\mu^2 = 0$, green: $\mu^2 = -(100~\mathrm{GeV})^2$, vertical lines: $\mu^2 = -(200~\mathrm{GeV})^2$. $\mu^2 > 0$: Yellow: $\mu^2 = (100~\mathrm{GeV})^2$, blue: $\mu^2 = (200~\mathrm{GeV})^2$, vertical lines: $\mu^2 = (300~\mathrm{GeV})^2$. Note that some regions overlap. In particular, for $\mu^2 > 0$, green is allowed for both $\mu^2 = (100~\mathrm{GeV})^2$ and $\mu^2 = (200~\mathrm{GeV})^2$.
Dependence on $\mu^2$
Since $\lambda_4 + \mathrm{Re}\,\lambda_5$ is bounded from below by the stability requirement [see (A.23)], the parameter $\mu^2$ normally plays a role in pushing $M_{H^\pm}$ to values that are high enough to evade the $R_b$, $\Delta M_{B_d}$ [13] and $b \to s\gamma$ constraints. In Fig. 3 we show how the allowed region in the $\alpha_2$–$\alpha_3$ plane shrinks for low and high values of $\mu^2$, when the Higgs boson masses and other parameters are kept fixed. For the considered spectrum of masses, a range of negative values of $\mu^2$ is allowed (for $\mu^2 = -(400~\mathrm{GeV})^2$ there are no allowed regions). For increasing positive values of $\mu^2$, the allowed regions shrink away before $\mu^2 = (400~\mathrm{GeV})^2$ (which is not allowed). Of course, these critical values depend on the mass spectrum adopted in parameter Set A.
Fig. 4. Stability and unitarity in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$, 1 and 2, parameter Set B with $M_3 = 400$ GeV, $\mu^2 = 0$ and selected values of $\alpha_1$. Colour codes as in Fig. 1.
5.2. Variations of mass parameters around Set B

We shall here briefly review the light-Higgs scenario of parameter Set B, but recall that the
tan = 0.5 case is essentially ruled out by the Mb0 data.
In Fig. 4 we show for $\mu^2 = 0$ how stability can be satisfied in the whole $\alpha_2$–$\alpha_3$ plane. For $\tan\beta = 1$, stability and unitarity would actually allow any value of the rotation matrix $R$, i.e., any values of $\alpha_1$, $\alpha_2$ and $\alpha_3$. However, except for this special case, unitarity is only satisfied in parts of the plane.
The direct search (LEP) constraint severely cuts into the allowed regions of this light-Higgs (Set B) scenario, with $C^2 = 0.1$ [16], see Section 4.2. As noted there, this constraint tends to exclude the central range of small $|\alpha_2|$ and positive and small $\alpha_1$.
Increasing $\mu^2$ to $(200~\mathrm{GeV})^2$, the most striking change is perhaps the emergence of large regions where stability is not satisfied (indicated in white in Fig. 5). Another interesting observation is that for $\tan\beta = 1.0$ and negative values of $\alpha_1$, the picture is little changed from the case of $\mu^2 = 0$.
Fig. 5. Stability and unitarity in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$, 1 and 2, parameter Set B with $M_3 = 400$ GeV, $\mu^2 = (200~\mathrm{GeV})^2$ and selected values of $\alpha_1$. Colour codes as in Fig. 1.
With a higher value of $M_3$ ($M_3 = 600$ GeV, but $\mu^2 = 0$), the unitarity constraint excludes most of the $\alpha_2$–$\alpha_3$ plane. For $\tan\beta = 1.0$ the whole plane is excluded.
5.3. Non-zero values of $\mathrm{Im}\,\lambda_5$, $\mathrm{Re}\,\lambda_6$, $\mathrm{Re}\,\lambda_7$
This subsection will present a brief discussion of how the allowed regions get modified for non-zero values of parameters which are normally set to zero, in order to control the amount of flavour-changing neutral currents. Thus, adopting a non-zero value for any of them in a realistic model would have to be done with an eye to these effects.
5.3.1. Non-zero values of $\mathrm{Im}\,\lambda_5$
The cases of positive and negative values of $\mathrm{Im}\,\lambda_5$ can be related. This is seen as follows: by the transformation C3 of (3.6) and (3.8), the mass-squared elements $\mathcal{M}^2_{13}$ and $\mathcal{M}^2_{23}$ flip signs, and the auxiliary quantities $\mathrm{Im}\,\lambda_6$ and $\mathrm{Im}\,\lambda_7$ change signs without altering the stability or unitarity constraints. The only effect will be that CP-violating effects change sign. Thus, we may restrict the discussion of non-zero $\mathrm{Im}\,\lambda_5$ to $\mathrm{Im}\,\lambda_5 > 0$. However, the full range (3.7) of $\alpha_3$ now has to be considered.
We show in Fig. 6, for some values of $\tan\beta$ and $\alpha_1$, and for parameter Set A, how non-zero values of $\mathrm{Im}\,\lambda_5$ also provide allowed regions in the $\alpha_2$–$\alpha_3$ plane. But these are typically smaller for higher values of $\mathrm{Im}\,\lambda_5$, and rather scattered. However, they can lead to significant CP violation, since allowed ranges of $|\alpha_2|$ tend to be at intermediate values of $|\alpha_2|$ [see Eq. (3.17)].
Fig. 6. Physically allowed regions in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$, 1 and 2, and selected values of $\alpha_1$. Variations of $\mathrm{Im}\,\lambda_5$ around parameter Set A. Blue: $\mathrm{Im}\,\lambda_5 = 0$, vertical lines: $\mathrm{Im}\,\lambda_5 = 2$, yellow: $\mathrm{Im}\,\lambda_5 = 4$.
Fig. 7. Physically allowed regions in the $\alpha_2$–$\alpha_3$ plane, for $\tan\beta = 0.5$, 1 and 2, and selected values of $\alpha_1$. Parameters correspond to Set A, except that $\mathrm{Re}\,\lambda_6 = -1$. Colour codes as in Fig. 1.
5.3.2. Non-zero values of $\mathrm{Re}\,\lambda_6$ or $\mathrm{Re}\,\lambda_7$
Up to this point, we have kept $\mathrm{Re}\,\lambda_6 = \mathrm{Re}\,\lambda_7 = 0$. However, we have treated $\mathrm{Im}\,\lambda_6$ and $\mathrm{Im}\,\lambda_7$ as auxiliary quantities derived from the spectrum, the rotation matrix and $\mathrm{Im}\,\lambda_5$ via Eq. (3.3). In general, they will be non-zero. This might lead to too large flavour-changing neutral couplings, due to the violation of the $Z_2$ symmetry [2,27,28]. We have not investigated this constraint quantitatively, but note that in many cases the imaginary parts of $\lambda_6$ and $\lambda_7$ can be shifted to the imaginary part of $\lambda_5$ [see (2.10)]. This will however lead to a modification of the rotation matrix $R$ and/or the spectrum, by for example a shift in $M_3$ according to the approach of [11].
The Yukawa interactions (see Section 3.5) couple the Higgs fields to a left-handed doublet and a right-handed singlet quark field. However, these need not be in the flavour basis in which the mass matrices are diagonal. The $Z_2$ symmetry, which is imposed to stabilize model II [2,28], is broken by the $m_{12}^2$ and $\mathrm{Im}\,\lambda_5$ terms, as well as by the $\lambda_6$ and $\lambda_7$ terms. However, one may adopt the attitude that these terms, which arise naturally in the MSSM [18–22], are constrained and subdominant. Additionally, one may argue that the FCNCs are suppressed by powers of the quark masses [49,50], thus evading the experimental constraints involving the first two fermion generations.
We shall here only discuss the case of either $\mathrm{Re}\,\lambda_6$ or $\mathrm{Re}\,\lambda_7$ being non-zero, the other being zero. Because of the symmetry (3.11), (3.12), we may restrict this discussion to $\mathrm{Re}\,\lambda_6 \neq 0$, $\mathrm{Re}\,\lambda_7 = 0$. Analogous results for $\mathrm{Re}\,\lambda_6 = 0$, $\mathrm{Re}\,\lambda_7 \neq 0$ can be obtained from these by the inversion $\tan\beta \leftrightarrow 1/\tan\beta$ according to (3.11) and (3.12).
Even though $\mathrm{Re}\,\lambda_6$ (or $\mathrm{Re}\,\lambda_7$) significantly different from zero may turn out to be ruled out by the constraints on FCNCs, we find it instructive to see how the otherwise allowed regions change when these parameters are introduced.
In Fig. 7 we show the allowed regions corresponding to parameter Set A, except that $\mathrm{Re}\,\lambda_6 = -1$. The allowed regions are qualitatively rather similar to those of Fig. 1. However, with $\mathrm{Re}\,\lambda_6 = +1$, stability tends to be violated in most of the $\alpha_2$–$\alpha_3$ plane. Also, when $\mathrm{Re}\,\lambda_6$ decreases to $-2$ or $-3$, there is nothing allowed at $\tan\beta = 1$ and $\tan\beta = 0.5$, respectively.
6. CP violation in $t\bar t$ production at the LHC
In order to illustrate the CP-violating effects that can be observed at the LHC, resulting from mixing in the Higgs sector, we consider the process
$$pp \to t\bar t + X, \tag{6.1}$$
which at high energies is dominated by the underlying process
$$gg \to t\bar t. \tag{6.2}$$
In the presence of CP violation, correlations will then be induced at the parton level involving the $t$ and $\bar t$ spins, denoted $\mathbf{s}_1$ and $\mathbf{s}_2$, and their c.m. momentum $\mathbf{p}$ [8]:
$$\mathbf{p}\cdot(\mathbf{s}_1 - \mathbf{s}_2) \quad\text{and}\quad \mathbf{p}\cdot(\mathbf{s}_1 \times \mathbf{s}_2). \tag{6.3}$$
These correlations are determined by the CP-violating combination (3.23) of Yukawa couplings, multiplied by certain loop integrals, and convoluted over the gluon–gluon c.m. energy [8,11].
The $t$ and $\bar t$ quarks decay fast enough that hadronization effects do not smear out these CP-odd correlations. They can thus be treated perturbatively. Consider the semileptonic decays:
$$t \to l^+ \nu_l b, \qquad \bar t \to l^- \bar\nu_l \bar b, \tag{6.4}$$
with $l = e$ or $\mu$. The lepton energies will inherit an asymmetry from the first correlation given in Eq. (6.3), accessible via the observable:
$$A_1 = E_+ - E_-. \tag{6.5}$$
An important question is whether or not this can be large enough to be measurable.

In order to have a significant observation, the expectation value $\langle A_1\rangle$ must compare favourably with the statistical fluctuations, which behave like $\sqrt{N}$, where $N$ is the number of events. In order to assess this, it is convenient to consider the signal to noise ratio [8],
$$\frac{S}{N} = \frac{\langle A_1\rangle}{\sqrt{\langle A_1^2\rangle - \langle A_1\rangle^2}}, \tag{6.6}$$
where the denominator gives the statistical width of the observable (6.5).
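As a rough illustration of how the ratio (6.6) might be used, the following Python sketch estimates S/N from a list of lepton energy differences and converts it into the number of events required for a chosen significance. The function names, the toy Gaussian sample and the default 3-sigma target are illustrative assumptions; they are not taken from the analysis of [8,11], which evaluates (6.6) from loop integrals and parton distributions.

```python
import math
import random

def signal_to_noise(a1_values):
    """Sample estimate of Eq. (6.6): <A1> / sqrt(<A1^2> - <A1>^2)."""
    n = len(a1_values)
    mean = sum(a1_values) / n
    mean_sq = sum(a * a for a in a1_values) / n
    return mean / math.sqrt(mean_sq - mean * mean)

def events_needed(s_over_n, n_sigma=3.0):
    """Smallest N with |S/N| * sqrt(N) >= n_sigma (assumed significance measure)."""
    return math.ceil((n_sigma / abs(s_over_n)) ** 2)

# Toy sample (assumption): A1 = E+ - E- in GeV, small mean shift, wide spread.
rng = random.Random(1)
sample = [rng.gauss(0.05, 40.0) for _ in range(50_000)]
print(signal_to_noise(sample))
print(events_needed(1.0e-3))  # of order 10^7 events for a 3-sigma effect
```

With this reading, a ratio of order $10^{-3}$ corresponds to a one-sigma effect for about $10^6$ events, since the significance grows like $\sqrt{N}$.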
Fig. 8. Signal to noise ratio vs. $M_1$, otherwise parameter Set A. Rotation matrix $R$ defined by angles $\{\alpha_1, \alpha_2, \alpha_3\}$ as indicated. Actually, in the left panel, we show S/N for $-A_1$. Dotted curves: contribution of the lightest Higgs boson only.
Other limitations of this approach include the need for very good lepton energy calibration,
and the assumption of no (anomalous) right-handed couplings in the tbW coupling (see [51]).
We display the ratio S/N in Fig. 8, as a function of the mass of the lightest Higgs boson, $M_1$, for two rotation matrices $R$, given by the angles $\{\alpha_1, \alpha_2, \alpha_3\}$, and for two values of $\tan\beta$. These parameters are chosen such that the model is consistent (stability and unitarity) and satisfies the experimental constraints (see the discussion in Section 5.1 and Fig. 1). The actual evaluation of this quantity (6.6) follows the approach of [11], using the LoopTools package for the loop evaluations [47,48] and the CTEQ6 parton distribution functions [52] to describe the gluon content of the proton.
While the allowed regions in Fig. 1 for $\tan\beta = 0.5$ are rather small and scattered, they remain allowed as $M_1$ is increased towards $M_2$. For these two rotation matrices, the contribution of the lightest Higgs boson, $H_1$, is actually the same (apart from the over-all sign), as shown by the dotted curves. In terms of the quantities (3.23), $|\gamma_{CP}^{(1)}| = 1.936$ is the same for both cases. However, for the case shown on the left, the contribution of $H_2$ (with $M_2 = 300$ GeV), is with $\gamma_{CP}^{(2)}/\gamma_{CP}^{(1)} = 0.75$ more efficient in reducing the effect of $H_1$ than for the case on the right, where the corresponding ratio is 0.44.
In general, the values fall with increasing $M_1$, since the loop integrals decrease. However, there is a resonance in one diagram at $M_j = 2m_t \simeq 350$ GeV (for more details, see [11]); this is the reason for the increase beyond 250 GeV.
In Fig. 9 we show, for the same rotation matrices as in Fig. 8, how S/N varies with $M_2$. Here, some of the curves are cut off at low or high values of $M_2$, since the model becomes inconsistent or experimentally excluded, as discussed in Section 5.1. In some cases, an enhanced negative interference occurs as $M_2 \simeq 2m_t$.
In this study, the parameter $\mu^2$ was allowed to float, as compared with parameter Set A, in order to extend the range of allowed values of $M_2$. In all cases, except the right panel, with $\tan\beta = 1$, the value of $\mu^2$ had to be adjusted as follows: At low values of $M_2$, a lower value of $\mu^2$ was needed, and at high values of $M_2$, a higher value was needed.
Finally, in Fig. 10, we show the variation of S/N with $M_3$, keeping $M_1$ and $M_2$ fixed. There is a tendency for the value to increase with $M_3$ (because of reduced destructive interference), but variations of $M_3$ are only allowed within a rather restricted range (see again Section 5.1 and Fig. 2).
Fig. 9. Signal to noise ratio vs. $M_2$, otherwise parameter Set A. Rotation matrices $R$ as in Fig. 8. Dashed line: value of $M_2$ in parameter Set A.
Fig. 10. Signal to noise ratio vs. $M_3$, otherwise parameter Set A. Rotation angles $\{\alpha_1, \alpha_2, \alpha_3\}$ as indicated. Dashed line: value of $M_3$ in parameter Set A.
7. Summary and conclusions
The constraints of stability and tree-level unitarity exclude most of the multidimensional 2HDM (II) parameter space. Furthermore, the direct searches (LEP) [16] and the $\Delta\rho$ [13] constraints exclude certain domains of the parameters. Finally, the direct searches, as well as the $\Delta M_{B_d}$, $R_b$ [13] and $b \to s\gamma$ [14] constraints exclude a light charged Higgs boson, in particular at low values of $\tan\beta$. The remaining pockets of allowed parameters will for low values of $\tan\beta$ allow CP violation in the top-quark sector, due to the exchange of Higgs bosons that are mixed with respect to CP.
In order to maximize the CP violation in the top-quark sector that might be measurable at the LHC, we focus on parameters where (i) the lightest Higgs boson is rather light, in order to maximize the relevant loop integrals, and (ii) where the product of the CP-violating Yukawa couplings, parametrized by $\gamma_{CP}^{(1)}$ [see Eq. (3.23)], is large. The latter constraint requires $\tan\beta$ to be small, and $|\sin\alpha_1 \sin 2\alpha_2|$ to be large.
In summary, we note that even in the face of a variety of experimental constraints, the model is consistent in a number of regions in parameter space. Apart from exceptional points, these allowed regions yield CP violation in $t\bar t$ final states produced at the LHC, at a level which can be explored with a data sample of the order of $10^6$ semileptonic events.
Acknowledgements
It is a pleasure to thank Okan Camursoy and Levent Selbuz for their contributions in the early
stages of this work. This research has been supported in part by the Mission Department of Egypt
and the Research Council of Norway.
Appendix A
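In this appendix we describe the method used for formulating the necessary and sufficient conditions for the potential to satisfy stability. For earlier approaches to stability, all restricted to simpler potentials, see [53–56]. We start by rewriting the Higgs doublets as:
$$\Phi_1 = \|\Phi_1\|\,\hat\Phi_1, \qquad \Phi_2 = \|\Phi_2\|\,\hat\Phi_2, \tag{A.1}$$
where $\|\Phi_i\|$ is the norm of the spinor $\Phi_i$, and $\hat\Phi_i$ is a unit spinor. By SU(2) invariance, only four combinations of fields may appear:
$$\Phi_1^\dagger\Phi_1 = \|\Phi_1\|^2, \qquad \Phi_2^\dagger\Phi_2 = \|\Phi_2\|^2, \qquad \Phi_2^\dagger\Phi_1 = \|\Phi_1\|\,\|\Phi_2\|\,\hat\Phi_2^\dagger\hat\Phi_1, \qquad \Phi_1^\dagger\Phi_2 = \bigl[\Phi_2^\dagger\Phi_1\bigr]^*.$$
We let the norms $\|\Phi_i\|$ of Eq. (A.1) be parametrized as follows:
$$\|\Phi_1\| = r\cos\gamma, \qquad \|\Phi_2\| = r\sin\gamma. \tag{A.2}$$
The complex product $\hat\Phi_2^\dagger\hat\Phi_1$ between the two unit spinors will be a complex number $x + iy$ with $|x + iy| \le 1$.
Using this parametrization, we can write:
$$\Phi_1^\dagger\Phi_1 = r^2\cos^2\gamma, \qquad \Phi_2^\dagger\Phi_2 = r^2\sin^2\gamma,$$
$$\Phi_2^\dagger\Phi_1 = r^2\cos\gamma\sin\gamma\,(x + iy), \qquad \Phi_1^\dagger\Phi_2 = r^2\cos\gamma\sin\gamma\,(x - iy), \tag{A.3}$$
where $r \ge 0$, $\gamma \in [0, \pi/2]$ and $x^2 + y^2 \le 1$.
The potential can now be written as
$$V = r^4 V_4 + r^2 V_2, \tag{A.4}$$
with the quartic part: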
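$$V_4 = \lambda_1 A_1 + \lambda_2 A_2 + \lambda_3 A_3 + \lambda_4 A_4 + \mathrm{Re}\,\lambda_5\, A_5 + \mathrm{Im}\,\lambda_5\, A_6 + \mathrm{Re}\,\lambda_6\, A_7 + \mathrm{Im}\,\lambda_6\, A_8 + \mathrm{Re}\,\lambda_7\, A_9 + \mathrm{Im}\,\lambda_7\, A_{10}, \tag{A.5}$$
where
$$A_1 = \tfrac{1}{2}\cos^4\gamma, \qquad A_2 = \tfrac{1}{2}\sin^4\gamma, \qquad A_3 = \cos^2\gamma\sin^2\gamma,$$
$$A_4 = (x^2 + y^2)\cos^2\gamma\sin^2\gamma, \qquad A_5 = (x^2 - y^2)\cos^2\gamma\sin^2\gamma, \qquad A_6 = 2xy\cos^2\gamma\sin^2\gamma,$$
$$A_7 = 2x\cos^3\gamma\sin\gamma, \qquad A_8 = 2y\cos^3\gamma\sin\gamma, \qquad A_9 = 2x\cos\gamma\sin^3\gamma, \qquad A_{10} = 2y\cos\gamma\sin^3\gamma. \tag{A.6}$$
By introducing polar coordinates $x = \rho\cos\theta$ and $y = \rho\sin\theta$, these coefficients can be written
$$A_1 = \tfrac{1}{2}\cos^4\gamma, \qquad A_2 = \tfrac{1}{2}\sin^4\gamma, \qquad A_3 = \cos^2\gamma\sin^2\gamma,$$
$$A_4 = \rho^2\cos^2\gamma\sin^2\gamma, \qquad A_5 = \rho^2\cos^2\gamma\sin^2\gamma\cos(2\theta), \qquad A_6 = \rho^2\cos^2\gamma\sin^2\gamma\sin(2\theta),$$
$$A_7 = 2\rho\cos^3\gamma\sin\gamma\cos\theta, \qquad A_8 = 2\rho\cos^3\gamma\sin\gamma\sin\theta,$$
$$A_9 = 2\rho\cos\gamma\sin^3\gamma\cos\theta, \qquad A_{10} = 2\rho\cos\gamma\sin^3\gamma\sin\theta, \tag{A.7}$$
where $\gamma \in [0, \pi/2]$, $\rho \in [0, 1]$ and $\theta \in [0, 2\pi)$.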

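A.1. Symmetries of $V_4$
We note some symmetries of the quartic potential under the parametrization (A.2). They are conditional symmetries, where the potential is invariant with some compensating interchange of $\lambda$s:
Symmetry I:
The interchange
$$y \to -y \quad\text{or}\quad \theta \to -\theta \tag{A.8}$$
can be compensated by
$$\{\mathrm{Im}\,\lambda_5, \mathrm{Im}\,\lambda_6, \mathrm{Im}\,\lambda_7\} \to \{-\mathrm{Im}\,\lambda_5, -\mathrm{Im}\,\lambda_6, -\mathrm{Im}\,\lambda_7\}. \tag{A.9}$$
This is of course nothing but the reality condition of the potential.
Symmetry II:
Under
$$\gamma \to \frac{\pi}{2} - \gamma \tag{A.10}$$
together with
$$\lambda_1 \leftrightarrow \lambda_2 \quad\text{and}\quad \lambda_6 \leftrightarrow \lambda_7 \tag{A.11}$$
the potential is invariant. This is the symmetry under the interchange $\Phi_1 \leftrightarrow \Phi_2$.
Symmetry III:
Under
$$(x, y) \to (-x, -y) \quad\text{or}\quad \theta \to \theta + \pi \bmod 2\pi \tag{A.12}$$
together with
$$\{\lambda_6, \lambda_7\} \to \{-\lambda_6, -\lambda_7\} \tag{A.13}$$
the potential is invariant. This is related to the well-known $Z_2$ symmetry [2,28].
Symmetry IV:
Under
$$(x, y) \to (y, x) \quad\text{or}\quad \theta \to \frac{\pi}{2} - \theta \bmod 2\pi \tag{A.14}$$
together with
$$\mathrm{Re}\,\lambda_5 \to -\mathrm{Re}\,\lambda_5 \quad\text{and}\quad \{\mathrm{Re}\,\lambda_6, \mathrm{Re}\,\lambda_7\} \leftrightarrow \{\mathrm{Im}\,\lambda_6, \mathrm{Im}\,\lambda_7\} \tag{A.15}$$
the potential is invariant.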
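The four conditional symmetries can be checked numerically. The sketch below is only an illustration: it assumes the form of $V_4$ in which every term of (A.5) enters with a plus sign and the coefficients are $A_5 \propto x^2 - y^2$, $A_6 \propto 2xy$, $A_7, A_9 \propto 2x$ and $A_8, A_{10} \propto 2y$, and it samples random couplings and points. The names `v4`, `SYMMETRIES` and `max_violation` are hypothetical.

```python
import math
import random

def v4(lam, gamma, x, y):
    """Quartic part V4 in the (gamma, x, y) variables; lam[4:] are complex."""
    l1, l2, l3, l4, l5, l6, l7 = lam
    c, s = math.cos(gamma), math.sin(gamma)
    return (0.5 * l1 * c**4 + 0.5 * l2 * s**4
            + (l3 + l4 * (x * x + y * y)) * c * c * s * s
            + (l5.real * (x * x - y * y) + 2 * l5.imag * x * y) * c * c * s * s
            + 2 * (l6.real * x + l6.imag * y) * c**3 * s
            + 2 * (l7.real * x + l7.imag * y) * c * s**3)

def swap_re_im(z):
    """Exchange real and imaginary parts of a complex number."""
    return complex(z.imag, z.real)

# Each entry: (map on the couplings, map on the point (gamma, x, y)).
SYMMETRIES = {
    "I": (lambda l: l[:4] + tuple(z.conjugate() for z in l[4:]),
          lambda g, x, y: (g, x, -y)),
    "II": (lambda l: (l[1], l[0], l[2], l[3], l[4], l[6], l[5]),
           lambda g, x, y: (math.pi / 2 - g, x, y)),
    "III": (lambda l: l[:5] + (-l[5], -l[6]),
            lambda g, x, y: (g, -x, -y)),
    "IV": (lambda l: l[:4] + (complex(-l[4].real, l[4].imag),
                              swap_re_im(l[5]), swap_re_im(l[6])),
           lambda g, x, y: (g, y, x)),
}

def max_violation(name, trials=1000, seed=0):
    """Largest |V4(lam', point') - V4(lam, point)| over random samples."""
    rng = random.Random(seed)
    map_lam, map_pt = SYMMETRIES[name]
    worst = 0.0
    for _ in range(trials):
        r = [rng.uniform(-3, 3) for _ in range(10)]
        lam = (r[0], r[1], r[2], r[3],
               complex(r[4], r[5]), complex(r[6], r[7]), complex(r[8], r[9]))
        g = rng.uniform(0, math.pi / 2)
        rho, th = rng.uniform(0, 1), rng.uniform(0, 2 * math.pi)
        x, y = rho * math.cos(th), rho * math.sin(th)
        diff = v4(map_lam(lam), *map_pt(g, x, y)) - v4(lam, g, x, y)
        worst = max(worst, abs(diff))
    return worst

for name in SYMMETRIES:
    print(name, max_violation(name))  # differences at rounding-error level
```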

A.2. Stability
For the stability condition to be satisfied, $V_4$ must be positive for all combinations of $\gamma \in [0, \pi/2]$, $\rho \in [0, 1]$ and $\theta \in [0, 2\pi)$. This is both a necessary and a sufficient condition.
Whenever $\lambda_4 \le 0$, the potential will have its global maximum and minimum when $\rho = 1$. Thus, it is sufficient to check that $V_4(\gamma, \theta; \rho = 1)$ satisfies stability when $\lambda_4$ is non-positive. In order to see this, we return to Eqs. (A.5) and (A.6) and rewrite the potential as
$$V_4 = \lambda_4 (x^2 + y^2)\cos^2\gamma\sin^2\gamma + h(x, y),$$
where $h(x, y)$ is a harmonic function. The maximum principle tells us that $h(x, y)$ will attain its global minimum at the boundary where $x^2 + y^2 = 1$ ($\rho = 1$). Whenever $\lambda_4 \le 0$, the term $\lambda_4 (x^2 + y^2)\cos^2\gamma\sin^2\gamma$ will also attain its minimum whenever $\rho = 1$, and so will $V_4$.
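A brute-force version of this positivity requirement can be sketched as follows. This is a grid scan under stated assumptions, not the procedure used for the figures: the grid sizes are arbitrary, a finite grid can miss narrow negative regions, and the function names `v4` and `is_stable_scan` are hypothetical. The coefficients are those of (A.7), with all terms of (A.5) taken with a plus sign, and the scan is restricted to $\rho = 1$ when $\lambda_4 \le 0$, as argued above.

```python
import math

def v4(lam, gamma, rho, theta):
    """Quartic part V4 in polar variables; lam = (l1, ..., l7), l5..l7 may be complex."""
    l1, l2, l3, l4, l5, l6, l7 = lam
    l5, l6, l7 = complex(l5), complex(l6), complex(l7)
    c, s = math.cos(gamma), math.sin(gamma)
    quad = c * c * s * s
    return (0.5 * l1 * c**4 + 0.5 * l2 * s**4 + l3 * quad
            + rho**2 * quad * (l4 + l5.real * math.cos(2 * theta)
                               + l5.imag * math.sin(2 * theta))
            + 2 * rho * c**3 * s * (l6.real * math.cos(theta)
                                    + l6.imag * math.sin(theta))
            + 2 * rho * c * s**3 * (l7.real * math.cos(theta)
                                    + l7.imag * math.sin(theta)))

def is_stable_scan(lam, n_gamma=61, n_rho=21, n_theta=72):
    """Grid check that V4 > 0 for gamma in [0, pi/2], rho in [0, 1], theta in [0, 2 pi)."""
    # For lambda_4 <= 0 only the boundary rho = 1 needs to be scanned.
    rhos = [1.0] if lam[3] <= 0 else [i / (n_rho - 1) for i in range(n_rho)]
    for i in range(n_gamma):
        gamma = 0.5 * math.pi * i / (n_gamma - 1)
        for rho in rhos:
            for k in range(n_theta):
                theta = 2 * math.pi * k / n_theta
                if v4(lam, gamma, rho, theta) <= 0:
                    return False
    return True

print(is_stable_scan((1, 1, 0, 0, 0, 0, 0)))   # positive everywhere on the grid
print(is_stable_scan((1, 1, -2, 0, 0, 0, 0)))  # negative at gamma = pi/4
```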
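Some points from the parameter space give us some rather simple stability conditions. We now turn our attention towards these special points.
$\gamma = 0$ or $\gamma = \pi/2$
First we consider the boundary points $\gamma = 0$ and $\gamma = \pi/2$:
$$V_4(\gamma = 0) = \frac{\lambda_1}{2}, \qquad V_4(\gamma = \pi/2) = \frac{\lambda_2}{2}.$$
This leaves us with the trivial stability conditions of (3.20):
$$\lambda_1 > 0 \quad\text{and}\quad \lambda_2 > 0. \tag{A.16}$$
$\rho = 0$
Considering the points where $\rho = 0$, we find that
$$V_4(\rho = 0) = \frac{\lambda_1}{2}\cos^4\gamma + \frac{\lambda_2}{2}\sin^4\gamma + \lambda_3\cos^2\gamma\sin^2\gamma. \tag{A.17}$$
Thus, we must require that
$$\lambda_3 > -\frac{1}{2}\left(\frac{\lambda_1}{\tan^2\gamma} + \lambda_2\tan^2\gamma\right). \tag{A.18}$$
The bracket has its minimum for $\tan^2\gamma = \sqrt{\lambda_1/\lambda_2}$, where the right-hand side is largest, and we obtain the following necessary constraint on $\lambda_3$:
$$\lambda_3 > -\sqrt{\lambda_1\lambda_2}, \tag{A.19}$$
in agreement with [53].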
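The step from (A.18) to (A.19) can be verified numerically. The short sketch below, with illustrative values of $\lambda_1$ and $\lambda_2$ and hypothetical function names, compares the right-hand side of (A.18) at $\tan^2\gamma = \sqrt{\lambda_1/\lambda_2}$ with the closed-form bound and with a scan over $\tan^2\gamma$.

```python
import math

def rhs_a18(l1, l2, t):
    """Right-hand side of Eq. (A.18) with t = tan^2(gamma) > 0."""
    return -0.5 * (l1 / t + l2 * t)

def lambda3_bound(l1, l2):
    """Closed-form bound of Eq. (A.19): lambda_3 > -sqrt(lambda_1 * lambda_2)."""
    return -math.sqrt(l1 * l2)

l1, l2 = 2.0, 0.5                       # illustrative positive couplings (assumption)
t_star = math.sqrt(l1 / l2)             # stationary point of the bracket
grid = [10 ** (k / 100 - 3) for k in range(601)]  # t from 1e-3 to 1e3
best = max(rhs_a18(l1, l2, t) for t in grid)

# The scan never exceeds the closed-form bound and approaches it near t_star.
print(t_star, rhs_a18(l1, l2, t_star), lambda3_bound(l1, l2), best)
```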

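A.3. The limit $\lambda_6 = \lambda_7 = 0$
It is instructive to consider the simple limit
$$\lambda_6 = \lambda_7 = 0. \tag{A.20}$$
Then the quartic part of the potential can be written
$$V_4 = \frac{\lambda_1}{2}\cos^4\gamma + \frac{\lambda_2}{2}\sin^4\gamma + \left[\lambda_3 + \rho^2\left(\lambda_4 + \mathrm{Re}\,\lambda_5\cos 2\theta + \mathrm{Im}\,\lambda_5\sin 2\theta\right)\right]\cos^2\gamma\sin^2\gamma. \tag{A.21}$$
This expression has the same structure as (A.17), with
$$\lambda_3 \to \lambda_3 + \rho^2\left[\lambda_4 + \mathrm{Re}\,\lambda_5\cos 2\theta + \mathrm{Im}\,\lambda_5\sin 2\theta\right]. \tag{A.22}$$
Thus, the condition for stability can be adapted from (A.19), leading to:
$$\lambda_3 + \min\left(0, \lambda_4 - |\lambda_5|\right) > -\sqrt{\lambda_1\lambda_2}, \tag{A.23}$$
as obtained by [53]. However, we stress that this constraint only applies when $\lambda_6 = \lambda_7 = 0$.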
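In this limit the stability test reduces to a closed-form check. A minimal sketch, combining (A.16) with (A.23), is given below; the function name and the example couplings are illustrative assumptions.

```python
import math

def stable_no_l6_l7(l1, l2, l3, l4, l5):
    """Stability for lambda_6 = lambda_7 = 0, from Eqs. (A.16) and (A.23).

    l5 may be complex; only its modulus enters.
    """
    if l1 <= 0 or l2 <= 0:
        return False                      # violates Eq. (A.16)
    return l3 + min(0.0, l4 - abs(l5)) > -math.sqrt(l1 * l2)

print(stable_no_l6_l7(1.0, 1.0, 0.0, 0.0, 0.0))        # True
print(stable_no_l6_l7(1.0, 1.0, 0.0, -0.5, 1 + 1j))    # False: l4 - |l5| < -1
print(stable_no_l6_l7(1.0, 1.0, 0.5, 2.0, 1.0))        # True: the min(...) term is 0
```

The minimum over $\rho$ and $\theta$ of the bracket in (A.22) is either 0 (at $\rho = 0$) or $\lambda_4 - |\lambda_5|$ (at $\rho = 1$), which is what the `min` expression encodes.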
Appendix B
The Higgs–vector-boson couplings can be extracted from the covariant derivatives:

D 1 (D 1 ) + D 2 (D 2 )

ig
Z cos 2W 1 1+ 1+ 1 + 2 2+ 2+ 2
=
2 cos W

+ i[1 1 1 1 + 2 2 2 2 ]

1
igW (1 i1 ) 1+ (1 i1 )1+ + (2 i2 ) 2+
2

(2 i2 )2+

1
+ igW (1 + i1 ) 1 1 (1 + i1 ) + (2 + i2 ) 2
2

2 (2 + i2 ) + .
(B.1)
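These Higgs fields have to be transformed into the physical basis:
$$\eta_i = R_{ji} H_j, \qquad i = 1, 2, 3, \tag{B.2}$$
with
$$\begin{pmatrix}\varphi_1^\pm\\ \varphi_2^\pm\end{pmatrix} = \begin{pmatrix}\cos\beta & -\sin\beta\\ \sin\beta & \cos\beta\end{pmatrix}\begin{pmatrix}G^\pm\\ H^\pm\end{pmatrix}, \qquad \begin{pmatrix}\chi_1\\ \chi_2\end{pmatrix} = \begin{pmatrix}\cos\beta & -\sin\beta\\ \sin\beta & \cos\beta\end{pmatrix}\begin{pmatrix}G^0\\ \eta_3\end{pmatrix}. \tag{B.3}$$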
With all momenta incoming (in an obvious notation), we find

ig cos 2W +
p p ,
2 cos W

ig
cos 2W +
p p ,
ZG+ G :
2 cos W

g
(sin Rj 1 cos Rj 2 )Rk3 (sin Rk1 cos Rk2 )Rj 3 pj pk ,
ZHj Hk :
2 cos W

g
0
(cos Rj 1 + sin Rj 2 ) pj p0 ,
ZHj G :
(B.4)
2 cos W
ZH + H :
and
W H Hj :
W G Hj :
W G G0 :

g
i(sin Rj 1 cos Rj 2 ) + Rj 3 pj p ,
2

ig
(cos Rj 1 + sin Rj 2 ) pj p ,
2

g 0
p p .
2
(B.5)
There are no $ZH^\pm G^\mp$ or $W^\pm H^\mp G^0$ couplings. The CP-conserving limit is obtained by evaluating $R$ for $\alpha_2 = 0$, $\alpha_3 = 0$, $\alpha_1 = \alpha + \pi/2$, with the mapping $H_1 \to h$, $H_2 \to -H$ and $H_3 \to A$. In that limit, we recover the results of [10].
References
[1] T.D. Lee, Phys. Rev. D 8 (1973) 1226.
[2] S. Weinberg, Phys. Rev. Lett. 37 (1976) 657.
[3] G.C. Branco, M.N. Rebelo, Phys. Lett. B 160 (1985) 117;
J. Liu, L. Wolfenstein, Nucl. Phys. B 289 (1987) 1;
S. Weinberg, Phys. Rev. D 42 (1990) 860;
Y.L. Wu, L. Wolfenstein, Phys. Rev. Lett. 73 (1994) 1762, hep-ph/9409421.
[4] A. Riotto, M. Trodden, Ann. Rev. Nucl. Part. Sci. 49 (1999) 35, hep-ph/9901362.
[5] M. Dine, A. Kusenko, Rev. Mod. Phys. 76 (2004) 1, hep-ph/0303065.
[6] J.R. Ellis, Eur. Phys. J. C 34 (2004) 51.
[7] C.R. Schmidt, M.E. Peskin, Phys. Rev. Lett. 69 (1992) 410;
D. Atwood, A. Aeppli, A. Soni, Phys. Rev. Lett. 69 (1992) 2754.
[8] W. Bernreuther, A. Brandenburg, Phys. Rev. D 49 (1994) 4481, hep-ph/9312210;
W. Bernreuther, A. Brandenburg, M. Flesch, CERN-TH/98-390, PITHA 98/41, hep-ph/9812387.
[9] D. Atwood, S. Bar-Shalom, G. Eilam, A. Soni, Phys. Rep. 347 (2001) 1, hep-ph/0006032;
W. Bernreuther, A. Brandenburg, Z.G. Si, P. Uwer, Phys. Rev. Lett. 87 (2001) 242002, hep-ph/0107086;
E. Accomando, et al., hep-ph/0608079.
[10] J.F. Gunion, H.E. Haber, G. Kane, S. Dawson, The Higgs Hunter's Guide, Addison-Wesley, Reading, 1990.
[11] W. Khater, P. Osland, Nucl. Phys. B 661 (2003) 209, hep-ph/0302004.
[12] A.K. Grant, Phys. Rev. D 51 (1995) 207, hep-ph/9410267.
[13] S. Eidelman, et al., Particle Data Group, Phys. Lett. B 592 (2004) 1.
[14] P. Gambino, M. Misiak, Nucl. Phys. B 611 (2001) 338, hep-ph/0104034.
[15] S. Bertolini, Nucl. Phys. B 272 (1986) 77.
[16] M. Boonekamp, Eur. Phys. J. C 33 (2004) S720;
P. Achard, et al., L3 Collaboration, Phys. Lett. B 583 (2004) 14, hep-ex/0402003;
J. Abdallah, et al., DELPHI Collaboration, Eur. Phys. J. C 38 (2004) 1, hep-ex/0410017.
[17] G. Abbiendi, et al., OPAL Collaboration, Eur. Phys. J. C 18 (2001) 425, hep-ex/0007040.
[18] J. Wess, B. Zumino, Nucl. Phys. B 70 (1974) 39.
[19] P. Fayet, S. Ferrara, Phys. Rep. 32 (1977) 249.
[20] S. Dimopoulos, H. Georgi, Nucl. Phys. B 193 (1981) 150.
[21] H.P. Nilles, Phys. Rep. 110 (1984) 1.
[22] H.E. Haber, G.L. Kane, Phys. Rep. 117 (1985) 75.
[23] G. Abbiendi, et al., OPAL Collaboration, Eur. Phys. J. C 37 (2004) 49, hep-ex/0406057.
[24] R.A. Diaz, PhD thesis, hep-ph/0212237.
[25] K. Cheung, O.C.W. Kong, Phys. Rev. D 68 (2003) 053003, hep-ph/0302111.
[26] I.F. Ginzburg, M. Krawczyk, P. Osland, hep-ph/0101208;
I.F. Ginzburg, M. Krawczyk, P. Osland, Nucl. Instrum. Methods A 472 (2001) 149, hep-ph/0101229;
I.F. Ginzburg, M. Krawczyk, P. Osland, hep-ph/0211371.
[27] G.C. Branco, L. Lavoura, J.P. Silva, CP Violation, Oxford Univ. Press, 1999.
[28] S.L. Glashow, S. Weinberg, Phys. Rev. D 15 (1977) 1958.
[29] S. Davidson, H.E. Haber, Phys. Rev. D 72 (2005) 035004, hep-ph/0504050;
S. Davidson, H.E. Haber, Phys. Rev. D 72 (2005) 099902, Erratum.
[30] I.F. Ginzburg, I.P. Ivanov, hep-ph/0312374;
I.F. Ginzburg, I.P. Ivanov, Phys. Rev. D 72 (2005) 115010, hep-ph/0508020.
[31] A. Abulencia, et al., CDF Collaboration, hep-ex/0510065;
CDF Collaboration, FERMILAB-PUB-05-485-E.
[32] A.G. Akeroyd, A. Arhrib, E.M. Naimi, Phys. Lett. B 490 (2000) 119, hep-ph/0006035;
A. Arhrib, hep-ph/0012353.
[33] S. Kanemura, T. Kubota, E. Takasugi, Phys. Lett. B 313 (1993) 155, hep-ph/9303263.
[34] L.F. Abbott, P. Sikivie, M.B. Wise, Phys. Rev. D 21 (1980) 1393.
[35] G.G. Athanasiu, P.J. Franzini, F.J. Gilman, Phys. Rev. D 32 (1985) 3010.
[36] S.L. Glashow, E. Jenkins, Phys. Lett. B 196 (1987) 233.
[37] C.Q. Geng, J.N. Ng, Phys. Rev. D 38 (1988) 2857;
C.Q. Geng, J.N. Ng, Phys. Rev. D 41 (1990) 1715, Erratum.
[38] M. Misiak, M. Steinhauser, hep-ph/0609241.
[39] D.A. Ross, M.J.G. Veltman, Nucl. Phys. B 95 (1975) 135;
M.J.G. Veltman, Nucl. Phys. B 123 (1977) 89.
[40] M.B. Einhorn, D.R.T. Jones, M.J.G. Veltman, Nucl. Phys. B 191 (1981) 146.
[41] S. Schael, et al., ALEPH Collaboration, DELPHI Collaboration, L3 Collaboration, OPAL Collaboration, SLD Collaboration, LEP Electroweak Working Group, SLD Electroweak Group, SLD Heavy Flavour Group, CERN-PH-EP/2005-041;
S. Schael, et al., ALEPH Collaboration, DELPHI Collaboration, L3 Collaboration, OPAL Collaboration, SLD Collaboration, LEP Electroweak Working Group, SLD Electroweak Group, SLD Heavy Flavour Group, SLAC-R-774, September 2005, Phys. Rep. 427 (2006) 257, hep-ex/0509008.
[42] D. Chang, W.F. Chang, C.H. Chou, W.Y. Keung, Phys. Rev. D 63 (2001) 091301, hep-ph/0009292.
[43] S.M. Barr, A. Zee, Phys. Rev. Lett. 65 (1990) 21;
S.M. Barr, A. Zee, Phys. Rev. Lett. 65 (1990) 2920, Erratum.
[44] S. de Jong, Plenary talk at EPS International Europhysics Conference on High Energy Physics (HEP-EPS 2005), Lisbon, Portugal, 21–27 July 2005, hep-ex/0512043.
[45] G.W. Bennett, et al., Muon g-2 Collaboration, Phys. Rev. Lett. 92 (2004) 161802, hep-ex/0401008.
[46] A. Denner, R.J. Guth, W. Hollik, J.H. Kuhn, Z. Phys. C 51 (1991) 695.
[47] T. Hahn, M. Perez-Victoria, Comput. Phys. Commun. 118 (1999) 153, hep-ph/9807565. See also http://www.feynarts.de/looptools/.
[48] G.J. van Oldenborgh, J.A. Vermaseren, Z. Phys. C 46 (1990) 425.

[49] T.P. Cheng, M. Sher, Phys. Rev. D 35 (1987) 3484;
M. Sher, Y. Yuan, Phys. Rev. D 44 (1991) 1461;
M. Sher, in: Proceedings of 29th International Conference on High-Energy Physics (ICHEP 98), Vancouver, Canada, 23–29 July 1998, hep-ph/9809590.
[50] D. Atwood, L. Reina, A. Soni, Phys. Rev. D 55 (1997) 3156, hep-ph/9609279.
[51] A.C. Kraan, CDF Run II Collaboration, hep-ex/0611017.
[52] J. Pumplin, D.R. Stump, J. Huston, H.L. Lai, P. Nadolsky, W.K. Tung, JHEP 0207 (2002) 012, hep-ph/0201195.
[53] N.G. Deshpande, E. Ma, Phys. Rev. D 18 (1978) 2574.
[54] S. Nie, M. Sher, Phys. Lett. B 449 (1999) 89, hep-ph/9811234.
[55] S. Kanemura, T. Kasai, Y. Okada, Phys. Lett. B 471 (1999) 182, hep-ph/9903289.
[56] P.M. Ferreira, R. Santos, A. Barroso, Phys. Lett. B 603 (2004) 219, hep-ph/0406231;
P.M. Ferreira, R. Santos, A. Barroso, Phys. Lett. B 629 (2005) 114, Erratum.
On the one-loop corrections to inflation II: The consistency relation
Martin S. Sloth
Department of Physics and Astronomy, University of Aarhus, DK-8000 Aarhus C, Denmark
Received 2 January 2007; received in revised form 1 March 2007; accepted 2 April 2007
Abstract
In this paper we extend our previous treatment of the one-loop corrections to inflation. Previously we calculated the one-loop corrections to the background and the two-point correlation function of inflaton fluctuations in a specific model of chaotic inflation. We showed that the loop corrections depend on the total number of e-foldings and estimated that the effect could be as large as a few percent in a $\lambda\phi^4$ model of chaotic inflation. In the present paper we generalize the calculations to general inflationary potentials. We find that the effect can be as large as 70% in the simplest model of chaotic inflation with a quadratic $m^2\phi^2$ inflationary potential. We discuss the physical interpretation of the effect in terms of the tensor-to-scalar consistency relation. Finally, we discuss the relation to the work of Weinberg on quantum contributions to cosmological correlators.
1. Introduction
From the point of view of inflationary model building, we are entering a very interesting era. While in the nineties a plethora of inflationary models was constructed, we are only now beginning to be able to discriminate between them experimentally. One of the important statements in the announcement of the WMAP 3 data set was the claim that $\lambda\phi^4$ inflation is ruled out [1–3]. Although this turns out not to be true if one allows for a non-vanishing neutrino fraction in the universe [4], it illustrates the precision with which WMAP and the upcoming PLANCK satellite can probe inflationary physics. As a consequence, we will soon need to take sub-leading effects into account in our theoretical predictions from various models.
E-mail address: sloth@phys.au.dk.
doi:10.1016/j.nuclphysb.2007.04.012
M.S. Sloth / Nuclear Physics B 775 (2007) 78–94
The scope of this paper is to consider the one-loop effects in simple monomial models of chaotic inflation. If inflation has lasted only a short time, not much more than 65 e-foldings, one can for most practical purposes treat the Hubble parameter, $H$, and the slow-roll parameters, $\epsilon$, $\eta$, as constant during inflation. In this approximation one will find that the one-loop effects on the power spectrum are suppressed by a factor $H^2/M_p^2$, making them completely irrelevant for observations. However, if inflation has lasted very long this approximation breaks down and there is an enhancement of the loop effects in the infrared. If $H_i$ is the initial Hubble parameter at the beginning of inflation and it is much larger than the Hubble parameter, $H$, when the physically observable modes exit the horizon during the last 65 e-foldings or so, then the loop effects can be enhanced by a factor $(H_i^2/H M_p)^n$, where $n$ is some model dependent power [5]. This is due to the fact that the infrared cutoff on the loop-momentum is given by the initial size of the inflating patch, which is determined essentially by the initial value of the Hubble parameter.
In models of chaotic inflation [6], where one typically has a very large total number of e-foldings of inflation, it turns out that the enhancement of the one-loop effects can be so large that it may be observationally relevant. The back-reaction of one-loop effects on the classical background has been calculated earlier in models of chaotic inflation, and shown to be large [7–17], although it is not clear that this has any physical relevance in single field models, since the effect on the local expansion rate can be gauged away [18–20]. The type of infrared divergence that leads to the logarithmic growth of the one-loop correction with the scale factor has also been claimed to induce an effective mass for the photon during inflation in scalar quantum electrodynamics and thus to lead to the generation of magnetic fields [21–26]. Another approach to loop effects, the stochastic approach, has been applied by Linde [27] (see also [28]), in order to understand the global structure of spacetime in eternal inflation.
Despite the significant amount of work on the back-reaction of one-loop effects on the classical background, the one-loop correction to the two-point quantum correlation function of inflaton field fluctuations was only recently evaluated explicitly in Ref. [5] for the first time. The one-loop correction to the two-point function is dominated by the seagull diagram, and thus the calculation requires the fourth order action of the field perturbations, which was not available until it was calculated in the appendix of Ref. [5]. In Ref. [5] it was furthermore estimated that the one-loop correction to the two-point function of inflaton field perturbations in a model of $\lambda\phi^4$ chaotic inflation can be as large as a few percent. Subsequently, the fourth order action, with the inclusion of vector modes, was also given by Seery, Lidsey and Sloth [29], who derived the primordial tri-spectrum of curvature perturbations. Although the one-loop correction to the two-point correlation function of inflaton field fluctuations was not calculated explicitly before Ref. [5], a general estimate of the effect of loop contributions on cosmological correlation functions appeared earlier in the work of Weinberg [30,31].
In the present paper we generalize the calculations of Ref. [5] and investigate the effects in
more detail. We find that in simple models of chaotic inflation the one-loop effects can be very
large and may have important effects on cosmological observables, such as the tensor-to-scalar
ratio.
The paper is organized in the following way. In Section 2, we take the super-horizon limit
of the third order action, which was derived by Maldacena [32], and of the fourth order action
given in Refs. [5,29]. In Section 3.1 we apply the super-horizon limit of the third order action to
calculate the one-loop back-reaction on the classical background. In Section 3.2 we calculate the
one-loop correction to the two-point correlation function of inflaton fluctuations, using the super-horizon limit of the fourth order action. In Section 3.3, we discuss the physical implications in
terms of the tensor-to-scalar consistency relation. In Section 3.4, we discuss the relation to the
work of Weinberg [30,31]. Finally, in Section 4 we conclude with a discussion.
2. Effective action of perturbations
The one-loop correction to the two-point function of inflaton quantum fluctuations is dominated by the seagull diagram, which contains a vertex with four legs. Thus, in order to calculate
it, we need the effective action of inflaton perturbations to fourth order in the field fluctuations.
Below we will review the calculation of the fourth order action and use it to calculate the effective
interaction for a general inflaton potential in the super-horizon limit.
2.1. ADM formalism
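It is convenient to use the ADM formalism [33] to derive the action for the inflaton perturbations. Let us consider the scalar action of the inflaton field

$$ S = \frac{1}{2}\int \sqrt{-g}\,\left[ R - (\nabla\phi)^2 - 2V(\phi) \right], \tag{1} $$

in the ADM metric, given by

$$ ds^2 = -N^2 dt^2 + h_{ij}\left(dx^i + N^i dt\right)\left(dx^j + N^j dt\right). \tag{2} $$

In this metric, the action becomes [33]

$$ S = \frac{1}{2}\int \sqrt{h}\,\Big[ N R^{(3)} - 2NV + N^{-1}\left(E_{ij}E^{ij} - E^2\right) + N^{-1}\left(\dot\phi - N^i\partial_i\phi\right)^2 - N h^{ij}\partial_i\phi\,\partial_j\phi \Big], \tag{3} $$

where

$$ E_{ij} = \frac{1}{2}\left(\dot h_{ij} - \nabla_i N_j - \nabla_j N_i\right). \tag{4} $$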
We find it convenient to discuss the effective action of the inflaton perturbations in the uniform
curvature gauge where, ignoring vector and tensor modes (the tensor modes are expected
to be suppressed),$^1$ we have

$$ \phi = \phi_c + \varphi, \qquad h_{ij} = a^2\delta_{ij}, \qquad N = 1 + \alpha, \qquad N^i = \partial_i\chi. \tag{5} $$

In Section 3.4, we will elaborate on the physical motivations for choosing this particular gauge
for our calculations.
The benefit of the ADM formalism is that the constraint equations are easily obtained by
varying the action with respect to $N$ and $N_i$, which act as Lagrange multipliers. In this way the constraint
equations in the uniform curvature gauge become

$$ -a^{-2}\delta^{ij}\partial_i\phi\,\partial_j\phi - 2V - N^{-2}\left[ E_{ij}E^{ij} - E^2 + \left(\dot\phi - N^i\partial_i\phi\right)^2 \right] = 0, \tag{6} $$

and

$$ \nabla_j\left[ N^{-1}\left(E^j_i - \delta^j_i E\right) \right] = N^{-1}\left(\dot\phi - N^j\partial_j\phi\right)\partial_i\phi. \tag{7} $$

$^1$ In general one should also include vector and tensor contributions in the calculations, since they can seed scalar perturbations at higher order in perturbation theory. However, the contribution from the tensor modes tends to be suppressed
[32] and the gradient structure of the vector modes [29] appears to exclude leading-order IR-divergent contributions of
the kind we are looking for.

If one perturbs the action by taking

$$ \phi = \phi_c + \varphi, \qquad \alpha = \alpha_1 + \alpha_2 + \cdots, \qquad \chi = \chi_1 + \chi_2 + \cdots, \tag{8} $$

and solves the constraint equations order by order, one finds to first order [32]
1 c
1 c
1 H
1
c +
,
2 1 =
.
2H
2H
2 H
2H
It is now trivial to obtain the second order action,

1
2
a 3 2 + i i V 2 + 2 c2 V 2 .
S2 =
2
H
1 =
(9)
(10)
Generally, in order to obtain the action to order $n$, one only needs to derive the constraint
equations to order $n-1$, since the $n$th order terms multiply the constraint equation to zeroth
order. In fact, it turns out in practice that the terms in the constraint equation to order $n-2$
cancel out as well. This implies that one only needs the first order terms in Eq. (9) in order to
obtain the action to third order in perturbations. One obtains

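$$
\begin{aligned}
S_3 = \int a^3 \Big[ & -\frac{1}{4}\frac{\dot\phi_c}{H}\,\varphi\,\dot\varphi^2 - \frac{a^{-2}}{4}\frac{\dot\phi_c}{H}\,\varphi\,(\partial\varphi)^2 - \dot\varphi\,\partial_i\chi_1\,\partial_i\varphi \\
& + \frac{3}{8H}\dot\phi_c^3\,\varphi^3 - \frac{1}{4H}\dot\phi_c V_{,\phi\phi}\,\varphi^3 - \frac{1}{6}V_{,\phi\phi\phi}\,\varphi^3 + \frac{1}{4}\frac{\dot\phi_c^3}{H^2}\,\varphi^2\dot\varphi + \frac{1}{4H}\dot\phi_c^2\,\varphi^2\,\partial^2\chi_1 \\
& + \frac{1}{4H}\dot\phi_c\left( -\varphi\,\partial_i\partial_j\chi_1\,\partial_i\partial_j\chi_1 + \varphi\,\partial^2\chi_1\,\partial^2\chi_1 \right) \Big],
\end{aligned} \tag{11}
$$

as first derived by Maldacena [32], and subsequently generalized in Refs. [34–36]. In Eq. (11),
we have inserted the expression for $\alpha_1$ in order to bring it into the same form as in Ref. [32].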
By going one order further, one can in a similar fashion obtain the action to fourth order in
perturbations [5,29]

1
1
j 2 j
S4 = a 3 V, 4 + j 1 j m 1 m
24
2

1

1
1
V, 3 212 V,
+ 12 2 22 6H 2 + 2 +
2
2
3

+ 1 i i V, 2 2i j 2 i j 1 + 2 2 2 2 1

j
j
+ 2j 2 + 2 j 1 .
(12)
We note that 3 has cancelled out of the action, so we only need the solution to the constraint
equations to second order [5,29],
2 =
and
c2
2 + F (, ),
8H 2
(13)
3 c2 2 3 c 2
a2
1
2 + c i 1 i

+
()2
8H
4 c
4H
4H
2H

V
1 2 2
1 (i j 1 )2 F (, ),
+
4H
H
where we have for convenience defined [5]

1 2 2
2 .
i +
=
1 2 1 i j 1 i j 1 + i
F (, )
2H
2 2 =
(14)
(15)
2.2. The infrared limit

The dominant one-loop contribution to the two-point correlation function of inflaton fluctuations comes from the infrared part of the seagull-diagram shown in Fig. 1. In order to evaluate
it, we therefore only need the infrared part of the fourth order action, which is equivalent to the
super-horizon limit of it.
To linear order in perturbations, the perturbation equation yields

$$ \ddot\varphi + 3H\dot\varphi - \frac{1}{a^2}\partial^2\varphi + \left(V'' - 6\epsilon H^2\right)\varphi = 0. \tag{16} $$
Normalized to the Bunch–Davies vacuum in the infinite past, the solution for the mode function
in Fourier space is given by
H 3/2 H(2) (k),

k () =
(17)
2
where $\nu = 3/2 + 3\epsilon - \eta$, and $\epsilon \equiv \dot\phi_c^2/2H^2$, $\eta \equiv V''/3H^2$ are the slow-roll parameters. In the
super-horizon limit, where we can neglect the gradient terms, it becomes

k 3
3/2
.
AH k
(18)
aH
+ 3H

For convenience we have defined A ei/2 23/2 ()/ (3/2). It is easy to see that in this
limit, we have

$$ \dot\varphi_k = -(\eta - 2\epsilon)\,H\,\varphi_k + O(\epsilon^2). \tag{19} $$
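As a quick consistency check (a sketch under the slow-roll assumptions already stated, not a step taken from the original derivation), this relation can be read off from Eq. (16). On super-horizon scales the gradient term drops out, and in slow-roll the $\ddot\varphi_k$ term is subleading, so that

```latex
\begin{align}
3H\dot\varphi_k &\simeq -\left(V'' - 6\epsilon H^2\right)\varphi_k
                 = -\left(3\eta H^2 - 6\epsilon H^2\right)\varphi_k , \\
\dot\varphi_k   &\simeq -(\eta - 2\epsilon)\,H\,\varphi_k ,
\end{align}
```

where $V'' = 3\eta H^2$ follows from the definition of $\eta$ given after Eq. (17). Corrections to this estimate are of second order in the slow-roll parameters.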
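Let us consider the interaction terms of the effective action in the super-horizon limit. By neglecting higher order gradient terms, the third order action $S_3$ in the super-horizon limit becomes

$$
\begin{aligned}
S_3^{SH} = \int a^3 \Big[ & -\frac{1}{4H}\dot\phi_c\,\varphi\,\dot\varphi^2 - \dot\varphi\,\partial_i\chi_1\,\partial_i\varphi + \frac{3}{8H}\dot\phi_c^3\,\varphi^3 \\
& - \frac{1}{4H}\dot\phi_c V_{,\phi\phi}\,\varphi^3 - \frac{1}{6}V_{,\phi\phi\phi}\,\varphi^3 + \frac{1}{4}\frac{\dot\phi_c^3}{H^2}\,\varphi^2\dot\varphi + \frac{1}{4H}\dot\phi_c^2\,\varphi^2\,\partial^2\chi_1 \Big].
\end{aligned} \tag{20}
$$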
To leading order in the slow-roll expansion, we can use the approximation

$$ \partial^2\chi_1 \simeq -\frac{1}{2H}\dot\phi_c\,\dot\varphi + O(\epsilon). \tag{21} $$
Fig. 1. The seagull diagram, which dominates the one-loop contribution to the two-point correlation function of inflaton
field fluctuations.

One has to be careful when estimating the slow-roll order of the terms involving $\dot\varphi$. Naively,
after partial integrations it appears that the order of the terms involving two factors of $\dot\varphi$ is lower
than if we applied Eq. (19) directly in the action. One can show that, after a couple of partial
spacetime integrations, we would have, up to a total time derivative,

1
1 c 2
S3SH a 3
H (3 4) 3 V, 3
2H
6

1 c 2
L
i i
.
2H
1
(22)
The term proportional to the first order equation of motion can then be eliminated by a field
redefinition

1 c 2
i i .
+
(23)
2H
However, the theory is not invariant under this point transformation of variables. In the transformation we have ignored a total time derivative term involving factors of the canonical conjugate
field $\dot\varphi$ and, as we show below in Section 3.4, this is inconsistent when calculating quantum
correlation functions. This implies that terms in the action which are non-linear in the canonically conjugate field cannot in general be simplified by partial time integrations, and the order
of magnitude of their contribution is actually given by applying Eq. (19) directly in the action
when $\dot\varphi$ appears in the loop integral. When $\dot\varphi$ appears on an external leg, it does not introduce
any further slow-roll suppression.$^2$ Thus, the correct expression for the super-horizon action to
leading order in slow-roll is

1
1 c 2
1 c 2
SH
3

H (3 2) 3 V, 3
S3 a
4H
6
4H

1 c 2
i i .
+
(24)
2H
In the same spirit, we can calculate the super-horizon limit of the S4 action. Neglecting the
higher order gradient terms, the fourth order action reduces to

1
SH
3
i 2 i
S4 = a V, 4
24

1
1
1 2
2
2
2
+ 21 2 2 6H + c + 1 V, 3 212 V,
2
2
3

2
2
i j
i
2
.
+ 2 1 2 i j 1 2 + 2 2 i 1 V,
(25)
2 We thank D. Seery for pointing this out.
When evaluating the terms involving gradients one by one, we note that it is not possible to
eliminate any time-derivative fields by time-like partial integrations, because the time-like partial
integration will introduce new surface terms and terms involving higher order time derivatives of
the perturbation fields, which cannot be eliminated by a field redefinition without changing the
perturbation theory. In this case, a $\dot\varphi$ term contributes effectively $O(\epsilon)$ when it appears on an
internal leg, as can be seen from Eq. (19). In this way, we obtain to leading order in slow-roll in
the super-horizon approximation that the important terms are

1 c2 2
3 2
2 j
j ) +
S4SH a 3 c 2 2 j (
j
2
8H
8H

2
1 2
1 c
F VF2 ,
2
2
+
(26)
16 H 2
2
where F is given in Eq. (15). To leading order it becomes
F
1 2 j
(j ).
2H
(27)
3. One-loop corrections to inflation

The Schwinger–Keldysh real-time formalism [37,38] is appropriate for evaluating the one-loop corrections to expectation values self-consistently. It has also been extended to curved space
and expanding backgrounds [39–44] and used to study infrared divergences [45–49]. For a review
of the formalism, see the appendix of Ref. [50]. In this approach the expectation value of some
operator $O$ is given by

$$
\langle 0|O|0\rangle = \frac{\langle 0|\,T\{ O\, e^{-i\int_{-\infty}^{0} d\tau\,[H_I(\phi_c,\varphi^+) - H_I(\phi_c,\varphi^-)]} \}\,|0\rangle}{\langle 0|\,T\{ e^{-i\int_{-\infty}^{0} d\tau\,[H_I(\phi_c,\varphi^+) - H_I(\phi_c,\varphi^-)]} \}\,|0\rangle} \tag{28}
$$

if the initial state is the vacuum state $|0\rangle$. A step function $\theta(\tau - \tau_{\rm infl})$ is absorbed in $H_I$, such
that the time integral effectively has $\tau_{\rm infl}$ as its lower limit.
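This matrix element describes a system prepared in its initial state at $\tau_{\rm infl}$, evolved from conformal
time $-\infty$ to $0$ with an operator inserted at $\tau$, and back again from $0$ to $-\infty$, with a set of $\varphi^+$
fields on the increasing-time contour and a set of $\varphi^-$ fields on the decreasing-time contour.
The contractions between different pairs of the two types of fields now yield four kinds of
propagators

$$ \langle 0|\,T\{\varphi^{\pm}(x)\,\varphi^{\pm}(x')\}\,|0\rangle = -iG^{\pm\pm}(x,x') = -i\int\frac{d^3k}{(2\pi)^3}\,e^{i\vec k\cdot(\vec x - \vec x')}\,G_k^{\pm\pm}(\tau,\tau'). \tag{29} $$

The time-ordering of the contractions then yields

$$
\begin{aligned}
G_k^{++}(\tau,\tau') &= G_k^{>}(\tau,\tau')\,\theta(\tau-\tau') + G_k^{<}(\tau,\tau')\,\theta(\tau'-\tau), \\
G_k^{--}(\tau,\tau') &= G_k^{>}(\tau,\tau')\,\theta(\tau'-\tau) + G_k^{<}(\tau,\tau')\,\theta(\tau-\tau'), \\
G_k^{-+}(\tau,\tau') &= G_k^{>}(\tau,\tau'), \\
G_k^{+-}(\tau,\tau') &= G_k^{<}(\tau,\tau'),
\end{aligned} \tag{30}
$$

where

$$ G_k^{>}(\tau,\tau') = iU_k(\tau)\,U_k^*(\tau'), \qquad G_k^{<}(\tau,\tau') = iU_k^*(\tau)\,U_k(\tau'). \tag{31} $$

One can of course also define $G^{>}(x,x')$, $G^{<}(x,x')$, from which $G_k^{>}(\tau,\tau')$, $G_k^{<}(\tau,\tau')$ can be obtained by a Fourier transform.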
From the previous section, it follows with , that the effective interaction Hamiltonian
in the super-horizon limit to leading order in slow-roll is

d 3y
c + 3H c + V (c )
HI c ,
4 H 4

1
1 c 2
3
+
V +
H (3 2)
6
4H

1 c 1 2 1 c 1 2

+
+
i i
4 H a2
2 H a2

1 c 2 1 2
1 c 2 1 2 2

2
+
i
i
16 H2 a 2
8 H2 a 2

2 F + V F 2 ,
(32)
where $\mathcal{H} \equiv a'/a = aH$ and we have suppressed the arguments of $\varphi(\tau,\vec y)$. The integration measure is given by the determinant of the de Sitter metric in conformal coordinates.
3.1. One-loop corrections to the background
The one-loop effective background equation of motion for the classical background field, $\phi_c$,
follows directly from the tadpole renormalization condition. When we split the inflaton field into
the classical background field, $\phi_c$, and the quantum fluctuation, $\varphi$, the tadpole condition defines
what we mean by the classical background. The definition of the classical background field as
the vacuum expectation value of the inflaton,

$$ \langle\phi\rangle = \langle\phi_c + \varphi\rangle = \phi_c, \tag{33} $$

requires that the tadpole condition $\langle\varphi\rangle = 0$ is satisfied to all orders. This implies that the one-loop effective background equation of motion is actually defined by the tadpole renormalization
condition,

0 = (x) 0

0
>

d 3 y
G (x, y) G< (x, y)
=
d
4 H 4 ()

1
3 c 2
H (3 2) G> (y, y) .
c + 2Hc + V i V +
(34)
2
4H
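For this relation to be satisfied we must have

$$ \ddot\phi_c + 3H\dot\phi_c + V' + \left[ \frac{1}{2}V''' + \frac{3}{4}\frac{\dot\phi_c}{H}H^2(3\epsilon - 2\eta) \right]\langle\varphi^2\rangle = 0, \tag{35} $$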
where the last term on the left-hand side is the one-loop correction to the tree-level background
equation of motion. Any divergent part of this, or any piece that appears as a time-independent
coupling, can be cancelled by counterterms in the effective action, but a finite time-dependent
piece will generally be left over from such a procedure and give rise to a small non-vanishing
one-loop correction.
It is useful to consider a generic monomial type of inflation with the generic potential

$$ V(\phi) = \lambda M_p^{4-\alpha}\phi^{\alpha}, \tag{36} $$

such that the tree-level slow-roll parameters become

$$ \epsilon = \frac{\alpha^2}{2}\frac{M_p^2}{\phi_c^2}, \qquad \eta = \alpha(\alpha - 1)\frac{M_p^2}{\phi_c^2}. \tag{37} $$
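The closed-form parameters in Eq. (37) can be cross-checked numerically. The following is a minimal sketch, assuming the usual slow-roll identifications $\epsilon \simeq (M_p^2/2)(V'/V)^2$ and $\eta \simeq M_p^2 V''/V$ (which follow from $3H^2M_p^2 \simeq V$ and $3H\dot\phi_c \simeq -V'$); the function names and the finite-difference step are illustrative choices, not part of the original calculation.

```python
def monomial_potential(phi, lam, alpha, m_p=1.0):
    """V(phi) = lam * M_p^(4 - alpha) * phi^alpha, cf. Eq. (36)."""
    return lam * m_p ** (4 - alpha) * phi ** alpha


def slow_roll_analytic(phi_c, alpha, m_p=1.0):
    """Tree-level slow-roll parameters of Eq. (37)."""
    eps = 0.5 * alpha ** 2 * m_p ** 2 / phi_c ** 2
    eta = alpha * (alpha - 1) * m_p ** 2 / phi_c ** 2
    return eps, eta


def slow_roll_numeric(phi_c, lam, alpha, m_p=1.0, h=1e-3):
    """Same parameters from central finite differences of V.

    Assumes eps = (M_p^2 / 2) (V'/V)^2 and eta = M_p^2 V''/V.
    """
    v_0 = monomial_potential(phi_c, lam, alpha, m_p)
    v_plus = monomial_potential(phi_c + h, lam, alpha, m_p)
    v_minus = monomial_potential(phi_c - h, lam, alpha, m_p)
    d1 = (v_plus - v_minus) / (2.0 * h)          # V'
    d2 = (v_plus - 2.0 * v_0 + v_minus) / h ** 2  # V''
    return 0.5 * m_p ** 2 * (d1 / v_0) ** 2, m_p ** 2 * d2 / v_0


if __name__ == "__main__":
    for alpha in (2, 4):
        print(alpha, slow_roll_analytic(15.0, alpha),
              slow_roll_numeric(15.0, 1.0, alpha))
```

The coupling $\lambda$ drops out of both ratios, so the comparison depends only on $\alpha$ and $\phi_c/M_p$.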
It is easy to verify that in the case $\alpha = 4$, the one-loop correction appears as an effective mass
term [5], but in general the form of the one-loop correction is non-trivial. In the slow-roll limit
the one-loop effective equation of motion becomes

V
1 V 1 c
c
+
(3 2) 2 .
(38)
H
V
2 V
4H
Using the definition $\epsilon_{\rm eff} \equiv \dot\phi_c^2/(2H^2)$ of the one-loop effective slow-roll parameter [5], we obtain
from Eq. (38)

V V 1
2
eff + , =
(39)
.
(3
2)
2
2V 2
Mp2
The tree-level slow-roll parameters are evaluated at the time the observable modes exit the horizon, while the quantum two-point correlator
$\langle\varphi^2\rangle$ may contain information on the full history
of inflation in a subtle way. In Ref. [5] we reviewed the evaluation of the two-point correlation
function in detail for the specific case of $\alpha = 4$. The generalization is given in [10]

2
1
c 2+ i 4+ 2
Mp ,
=
(40)
Mp
c
12(4 + ) 2
where $\phi_i$ denotes the value of the classical field $\phi_c$ at the initial time $t_i$, which we can take to
be the beginning of inflation. The initial value of the background inflaton field appears through
the infrared cutoff on the loop momentum, which
is given by $k_{\rm IR} = a_i H_i$ and can be expressed in terms of $\phi_i$.
Fig. 2. The relative one-loop correction $\delta\epsilon/\epsilon$ to the slow-roll parameter $\epsilon$. In the left panel we have plotted the one-loop
correction in the case of a low reheating temperature for $N = 45$. In the right panel we have shown the same plot, but
now with $N = 60$.

Using the slow-roll condition, we can write the total number of e-foldings of inflation as

$$ N \equiv \ln\frac{a(t)}{a(t_i)} = \int_{t_i}^{t} H\,dt \simeq -\frac{1}{M_p^2}\int_{\phi_i}^{\phi}\frac{V}{V'}\,d\phi \simeq \frac{1}{2\alpha M_p^2}\,\phi_i^2, \tag{41} $$

where we also applied the assumption $\phi_i \gg \phi$. We conclude that $\langle\varphi^2\rangle \propto N^{(4+\alpha)/2}$, which is
consistent with the statement that the quantum correlator grows like a power of $\ln a(t)$ [30] (we
will return to a more detailed discussion of this in Section 3.4). By considering the scenario of
chaotic inflation and using the value of $\phi_i$ at the end of the self-reproduction regime, and of $\phi_c$ when
the observable modes exit the horizon, we can estimate the largest possible loop correction.
When the observable modes exit the horizon $N \simeq 60$ e-foldings before the end of inflation, one
can easily show that $\phi_c = \sqrt{2\alpha N}\,M_p$, while the field value at the end of the self-reproduction
era is
2 2
1
2 +2
i =
(42)
Mp .
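The relation between the field value and the number of e-foldings used above can be illustrated numerically. This is a minimal sketch, assuming the slow-roll integral $N \simeq M_p^{-2}\int (V/V')\,d\phi$ with $V/V' = \phi/\alpha$ for the monomial potential of Eq. (36); the function names, the midpoint rule, and the sample field values are illustrative assumptions.

```python
import math


def efolds_numeric(phi_i, phi_end, alpha, m_p=1.0, steps=100_000):
    """N = (1/M_p^2) * integral from phi_end to phi_i of V/V' dphi.

    For V ~ phi^alpha the integrand is phi/alpha; a midpoint rule is used.
    """
    dphi = (phi_i - phi_end) / steps
    total = 0.0
    for j in range(steps):
        phi = phi_end + (j + 0.5) * dphi
        total += (phi / alpha) * dphi
    return total / m_p ** 2


def efolds_leading(phi_i, alpha, m_p=1.0):
    """Leading term phi_i^2 / (2 alpha M_p^2), valid for phi_i >> phi."""
    return phi_i ** 2 / (2.0 * alpha * m_p ** 2)


def phi_at_horizon_exit(n_efolds, alpha, m_p=1.0):
    """phi_c = sqrt(2 alpha N) M_p for modes exiting N e-foldings before the end."""
    return math.sqrt(2.0 * alpha * n_efolds) * m_p


if __name__ == "__main__":
    phi_c = phi_at_horizon_exit(60, 2)
    print(phi_c, efolds_numeric(1000.0, phi_c, 2), efolds_leading(1000.0, 2))
```

For an initial field value much larger than $\phi_c$, the two estimates of $N$ differ only by the small term $\phi_c^2/(2\alpha M_p^2)$, which is the content of the approximation $\phi_i \gg \phi$.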
In order to match the observed level of CMB anisotropies we must further require
12 2 1010 (2N )/2 . Combining these theoretical and phenomenological constraints yields
+4
+2
2+8
V 2 1
2 1010 4 3
2
Mp (3 2)

(2N ) 2+4 ,
2
10
V
2
2 4 + 6 10
Mp
(43)
where we have inserted $\phi_c$ for $\phi$. In Fig. 2, we have plotted the relative one-loop correction $\delta\epsilon/\epsilon$
to the slow-roll parameter $\epsilon$ for $N = 45$ and $N = 60$. The effect is a few percent for $\alpha = 4$. For
$\alpha = 2$, the effect is of order 10–15%.

=
3.2. One-loop corrections to the two-point function

The two-point function, $T(\tau_0,k) \equiv \langle\varphi_k(\tau_0)\,\varphi_{-k}(\tau_0)\rangle$, evaluated to one-loop order can be organized in terms of contributions of zeroth, $T^{(0)}$, first, $T^{(1)}$, and second, $T^{(2)}$, order in $\lambda$, where
$\lambda$ is a generic coupling constant counting the number of vertices. Thus, the tree-level contribution is given by $T^{(0)}$, and to one-loop order we have $T = T^{(0)} + T^{(1)} + T^{(2)}$. By using
Eq. (32) in Eq. (28), with four copies of $\varphi^+$ inserted as the operator and doing the appropriate
contractions, we can compute $T$. The leading one-loop contribution to the two-point function
comes from the seagull diagram in Fig. 1. From the seagull diagram, we obtain the following
contribution to the two-point correlation function of inflaton fluctuations in Fourier space
0
T
(1)
(0 , k) = 2
infl

d
>
Im G>
k (0 , ) Gk (0 , )
2 H 2

d 3k
1
1
(i)G>
+ (2 )
k (, )
8
4
(2)3

0
2

d
>
>
Im
G
(
,
)G
(
,
)
0
0
k
k
3 H 3
infl
3
(2 )H
16
d 3k
(i)G>
k (, ).
(2)3
(44)
The time $\tau_{\rm infl}$ that appears in the lower limit of the integral is the end of the self-reproduction regime,
but for most practical purposes we can take $\tau_{\rm infl} \to -\infty$. We have also used the approximation
where the momenta integrated over in the loop are much smaller than the external momentum, $k' \ll k$,
such that 2 i ( i ) in Fourier space becomes |k + k |2 (k i + k i )( ki ) k /k , if
the conjugate field appears as an external leg and the other as an internal, which yields a loop
integral that is not divergent in the infrared and can be ignored. Similarly if they both appear
internal or external, the contribution becomes (1/2) , due to momentum conservation at the
vertex. By subsequently applying Eq. (19) when the conjugate field appears on an internal leg,
this leads to the last term in Eq. (44) above.
Now, we can follow the calculation of [5]. In terms of the mode functions $U_k(\tau)$, we have
0
T
(1)
(0 , k) = 2

2 2
1

d
1
Im Uk () Uk (0 )
+ (2 ) 2
2
2
8
4
H
infl
0
+2
d
3 H 3

3

Im Uk () Uk ()Uk2 (0 )
(2 )H 2 ,
16
(45)
infl
where $\langle\varphi^2\rangle$ is given in Eq. (40). For the physically observable modes, which have spent only
a short time outside the horizon, it is a good approximation to assume in Eq. (45) [5]

$$ U_k(\tau) = \frac{iH}{k\sqrt{2k}}\,(1 + ik\tau)\,e^{-ik\tau}. \tag{46} $$
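As a sanity check on the mode function in Eq. (46), one can verify numerically that it solves the massless mode equation in exact de Sitter space and has the expected late-time amplitude. This is a minimal sketch, assuming $a = -1/(H\tau)$ so that the equation in conformal time reads $U'' - (2/\tau)U' + k^2 U = 0$ (the leading-order slow-roll limit); the function names and the finite-difference step are illustrative.

```python
import cmath


def mode_function(tau, k, hubble):
    """U_k(tau) = i H / (k sqrt(2k)) * (1 + i k tau) * exp(-i k tau), cf. Eq. (46)."""
    prefactor = 1j * hubble / (k * (2.0 * k) ** 0.5)
    return prefactor * (1.0 + 1j * k * tau) * cmath.exp(-1j * k * tau)


def mode_equation_residual(tau, k, hubble, h=1e-4):
    """Finite-difference residual of U'' - (2/tau) U' + k^2 U.

    Assumes a massless field in exact de Sitter space, a = -1/(H tau).
    """
    u_0 = mode_function(tau, k, hubble)
    u_plus = mode_function(tau + h, k, hubble)
    u_minus = mode_function(tau - h, k, hubble)
    d1 = (u_plus - u_minus) / (2.0 * h)
    d2 = (u_plus - 2.0 * u_0 + u_minus) / h ** 2
    return d2 - (2.0 / tau) * d1 + k ** 2 * u_0


if __name__ == "__main__":
    print(abs(mode_equation_residual(-2.0, 1.0, 1.0)))
    print(abs(mode_function(0.0, 2.0, 3.0)) ** 2)  # late-time limit H^2 / (2 k^3)
```

The residual vanishes up to discretization error, and at $\tau \to 0$ the squared amplitude approaches $H^2/(2k^3)$, the standard tree-level result for a massless field.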
The conformal time integrals in Eq. (45) turn out to get their dominant contributions from the
integration from $\tau_*$ to $\tau_0$, where $\tau_*$ is the conformal time at which the physically observable
comoving momentum $k$ crosses outside the horizon. In the interval $[\tau_*, \tau_0]$, we can treat the two-point correlation
$\langle\varphi^2\rangle$, the slow-roll parameters $\epsilon$ and $\eta$, and the potential $V$ and its field derivatives
as constant. The integral then simplifies, and can easily be evaluated analytically. In the limit
$x_0 = k\tau_0 \to 0$ we obtain
0

2 2

d
H2
Im
U
()
U
(
)
k
0
k
2 H 2
8k 3
(47)
infl
and
0
d
3 H 3

H
Im Uk () Uk ()Uk2 (0 ) 3 Ci(2k0 ) 2 .
4k
(48)
infl
We finally obtain on super-horizon scales, the following one-loop corrected two-point function

2
H2
1
3
1
P(0 , k)
(49)
+ (2 ) (2 ) Ci(2k0 ) ,
1
16
2
8
4 2
which can be regarded as the generalization of Eq. (63) in Ref. [5]. The expression in Ref. [5]
is, however, one order higher in the slow-roll expansion, because we used the approximation
in Eq. (19) also when $\dot\varphi_k$ appeared on an external leg. It is reassuring that the corrections in
Eqs. (49) and (43) are of the same order in the slow-roll expansion.

Fig. 3. The relative one-loop correction $\delta P/P$ to the power spectrum of inflaton fluctuations. In the left panel we have
plotted the one-loop correction for $N = 45$. In the right panel we have shown the same plot, but with $N = 60$. We see
that in the present approximation the maximal one-loop correction is of the order 5–15% for $\lambda\phi^4$, while it is 50–70% for
a typical model of $m^2\phi^2$ chaotic inflation.

In Fig. 3, we have plotted the maximal relative one-loop correction to the power spectrum,
$\delta P/P$, by considering again the scenario of chaotic inflation with the potential given in Eq. (36).
In particular, we remark that for a model of chaotic inflation with a potential of the type $m^2\phi^2$,
the one-loop corrections significantly influence our predictions for the two-point function of
inflaton fluctuations. The effect appears to be of the order of 50% with $N = 60$ and as large as
70% if $N = 45$.
3.3. Physical interpretation and the consistency relation
With the WMAP data, it has been possible to start to severely constrain various inflationary models. In fact, it was claimed with the release of the WMAP 3 data that the simple $\lambda\phi^4$
inflationary model is ruled out. It later turned out not to be true, and $\lambda\phi^4$ is marginally allowed if the universe is composed with a non-vanishing neutrino fraction [4]. This illustrates how
sensitive the data are becoming to the exact theoretical predictions from various models of inflation.
Crucial for ruling out different polynomial models of chaotic inflation is the predicted tensor-to-scalar ratio. Taking the chaotic model of $\lambda\phi^4$ or $m^2\phi^2$ inflation seriously, we must ask for
its precise prediction for the tensor-to-scalar ratio including loop effects, before we can even
start to constrain it with data. In the previous sections we have seen that the loop effects can be
rather significant. On the other hand, one could turn the argument around and view the one-loop
corrections from a low-energy effective point of view, and claim that the results just imply that a
pure $\lambda\phi^4$ model is not likely in the effective framework of chaotic inflation. However, the low-energy effective potential will have to be extremely finely tuned, if it is constructed such that
it exactly mimics the loop effects. In other words, the $\lambda\phi^4$ model, or other monomial chaotic
inflationary models, are very simple and well defined theoretical models when loop effects are
included, and it is important to experimentally constrain them if possible.
At present, we have not yet calculated the loop corrections to the tensor perturbations. Before
we can do this, we must have the fourth order action for tensor perturbations, which has not
yet been calculated. Assuming that the loop corrections to the tensor perturbations are of the same
order of magnitude as the loop corrections to the scalar perturbations or smaller, and do not
exactly cancel the loop corrections to the scalar power spectrum when the tensor-to-scalar ratio is
calculated, we can still estimate the size of the loop correction. From the left panel of Fig. 3, we
estimate that the correction to the tensor-to-scalar ratio is as large as 70% for the simple $m^2\phi^2$
chaotic inflation model, while it is only a few percent for $\lambda\phi^4$. This is consistent with the result
of Ref. [5].

Fig. 4. The likelihood contours of the tensor-to-scalar ratio vs. the scalar spectral index [4]. Two models are indicated. On
the edge of the 95% exclusion likelihood contour are the predictions from the $\lambda\phi^4$ model, while in the middle of the 68%
exclusion likelihood contour the predictions of the $m^2\phi^2$ model are indicated. We have indicated the model predictions
with $45 < N < 60$. If we did not take into account one-loop corrections, the predictions would be line-shaped between
the squares. The full polygons indicate qualitatively the theoretical uncertainty when the one-loop correction to the
two-point correlation function of inflaton fluctuations is included.

When constraining models of monomial chaotic inflation, this means that the predicted theoretical
model lines in plots of the spectral index, $n_s$, vs. the tensor-to-scalar ratio, $r$, are no longer line-shaped
but become blurred over a small region due to the loop effects. We have illustrated this in Fig. 4,
where the small polygon-shaped regions would have been line-shaped if we did not take into
account the one-loop correction to the two-point correlation function of inflaton fluctuations. We
see that this can turn out to be important in the future, making it possible to discriminate between
a model of $m^2\phi^2$ inflation with a minimum number of total e-foldings of inflation and a model of
$m^2\phi^2$ chaotic inflation with a huge total number of e-foldings. We find that this is an interesting
possibility that deserves further investigation.
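For orientation, the tree-level model lines that the loop corrections would blur can be tabulated directly. This is a minimal sketch, assuming the standard tree-level slow-roll expressions $n_s - 1 = 2\eta - 6\epsilon$ and $r = 16\epsilon$ (these are not written out in the text and are taken here as an assumption), together with Eq. (37) evaluated at $\phi_c^2 = 2\alpha N M_p^2$; the function name is illustrative and no loop correction is included.

```python
def tree_level_predictions(alpha, n_efolds):
    """Tree-level (n_s, r) for V ~ phi^alpha, N e-foldings before the end of inflation.

    Uses eps = alpha / (4 N) and eta = (alpha - 1) / (2 N), i.e. Eq. (37) at
    phi_c^2 = 2 alpha N M_p^2, with the assumed relations
    n_s = 1 - 6 eps + 2 eta and r = 16 eps.
    """
    eps = alpha / (4.0 * n_efolds)
    eta = (alpha - 1.0) / (2.0 * n_efolds)
    n_s = 1.0 - 6.0 * eps + 2.0 * eta
    r = 16.0 * eps
    return n_s, r


if __name__ == "__main__":
    for alpha in (2, 4):
        for n_efolds in (45, 60):
            n_s, r = tree_level_predictions(alpha, n_efolds)
            print(f"alpha={alpha} N={n_efolds}: n_s={n_s:.4f} r={r:.4f}")
```

Each value of $\alpha$ traces a short line segment in the $(n_s, r)$ plane as $N$ runs from 45 to 60; a relative correction to the scalar power spectrum would shift $r$ away from these tree-level values.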
3.4. Quantum contributions to general cosmological correlations
The analysis carried out in Ref. [5] and in the present work is closely related to a slightly
more general question recently addressed by Weinberg [30,31]. At tree-level it is well known that
quantum correlations in cosmology only depend on the behavior of the unperturbed background
field near the time of horizon exit. Weinberg asked whether this is still true when loop effects are
included, or if the contribution of loop graphs can depend on the whole history of the unperturbed
universe. It was found that in general the correlations can depend on the whole history of the
universe, although at most through powers of the logarithm of the scale factor [31]. As we will discuss
in this subsection, this is consistent with our findings [5].
In [30,31], the expectation value of any product of operators in Eq. (28) is expanded in the
form

$$ \langle O(t)\rangle = \sum_{N=0}^{\infty} i^N \int_{-\infty}^{t} dt_N \int_{-\infty}^{t_N} dt_{N-1} \cdots \int_{-\infty}^{t_2} dt_1 \left\langle \left[ H_I(t_1), \left[ H_I(t_2), \ldots \left[ H_I(t_N), O(t) \right] \ldots \right] \right] \right\rangle. \tag{50} $$
Until now we have consistently worked in the zero-curvature gauge, with the scalar perturbations
given by the Sasaki–Mukhanov variable, which is identical to the field fluctuations themselves in
this particular gauge. In [30,31] the comoving curvature gauge was chosen. In the comoving
curvature gauge the field fluctuation vanishes and instead the scalar perturbations are given
by the comoving curvature perturbation $\zeta$, which is defined as the perturbation of the spatial
section of the metric

$$ g_{ij} = a^2 e^{2\zeta}\delta_{ij}, \tag{51} $$

where for simplicity we ignore tensor perturbations and only consider scalar perturbations
of the metric in single field inflation. In the interaction picture, the curvature perturbation field is

$$ \zeta(\vec x, t) = \int\frac{d^3k}{(2\pi)^{3/2}}\left[ \zeta_k(t)\,e^{i\vec k\cdot\vec x}\,a_{\vec k} + \zeta_k^*(t)\,e^{-i\vec k\cdot\vec x}\,a_{\vec k}^{\dagger} \right]. \tag{52} $$
In single field inflation $\zeta_k$ is conserved on super-horizon scales, because on these scales $\dot\zeta_k$ vanishes very fast, as $a^{-2}$. So if $\zeta_k^0$ is the time-independent limit of $\zeta_k$, the difference $\zeta_k - \zeta_k^0$ again goes
essentially like $a^{-2}$. The crucial observation in [30,31] is that this implies that the commutator of any two combinations of $\zeta$-fields always goes as $a^{-3}$.
One can now evaluate the time dependence of any correlation function of $\zeta$s outside the
horizon by observing that the Lagrangian density maximally carries three powers of the scale
factor. Since in Eq. (50) there are just as many commutators as there are interactions, and each
commutator carries three powers of the inverse scale factor, there can at most be zero factors of
$a(t)$ in any of the integrals over time in Eq. (50). This implies that the integrands can at most
grow like a power of $t$, which is similar to a power of $\ln a(t)$ [30,31]. It is thus concluded that the
quantum correlation function can at most grow like a power of $\ln a(t)$, and without a huge number
of e-foldings the loop corrections can never become large.
This conclusion is in agreement with the results of Ref. [5], which indeed did find large loop
corrections to the quantum correlations in chaotic inflation only with a very long period of total
inflation. In fact it was found that the correction grows like a power of the total number of e-foldings $N$, which is equivalent to a power of $\ln a(t)$ (see for instance Eq. (29) of [5]).
However, there is a fundamental difference between the two approaches. In one approach it is
the field perturbation (which is the Sasaki–Mukhanov variable in that gauge) which is quantized,
while in the other approach it is the curvature perturbation which is quantized. The two approaches do not appear to be equivalent at the quantum level, since the transformation between
the two variables is a non-trivial point transformation. To see this, consider the general subclass
of canonical transformations given by point transformations, which transform the canonically
conjugate field, $\pi$, in the following way [51]

$$ \pi = \frac{\partial L}{\partial\dot\phi} \to \pi + \frac{\partial F[\phi,t]}{\partial\phi}, \tag{53} $$

where the generating function for the transformation, $F$, appears as a total time derivative in the
Lagrangian after the transformation, i.e.

$$ L[\phi,\dot\phi] \to L[\phi,\dot\phi] + \frac{d}{dt}F[\phi,t]. \tag{54} $$

This implies that the action transforms as

$$ S[\phi] \to S[\phi] - F[\phi_i,t_i] + F[\phi_f,t_f], \tag{55} $$

and clearly does not change the classical tree-level equation of motion, which is derived by
requiring that the variation of the action with respect to the field vanishes. On the other hand, the evolution
operator and the states transform as

$$ U(t_f,t_i) \to e^{iF(\phi_f,t_f)}\,U(t_f,t_i)\,e^{-iF(\phi_i,t_i)}, \qquad \Psi(t) \to \Psi(t)\,e^{iF(\phi,t)}, \tag{56} $$

such that, if we consider the expectation value of some composite operator $O(\phi,\pi)$ in the transformed vacuum state, it yields

$$ \langle O(t)\rangle = \left\langle U^{\dagger}(t,t_i)\,e^{-iF(\phi,t)}\,O(t)\,e^{iF(\phi,t)}\,U(t,t_i) \right\rangle. \tag{57} $$
Now, if $O$ is only a function of $\phi$, everything commutes and the phases cancel out, and when
$O$ depends on $\pi$, one can use Eq. (53) to see that the phases will cancel out in general [51].
However, the same argument does not apply if $F$ depends on $\pi$, in which case the theory is
not invariant beyond tree-level. In fact, if $F$ is linear in $\pi$, the transformation is still a point
transformation, which will take $\phi \to \phi + \partial F/\partial\pi$. This is exactly the kind of transformation
needed to transform from the variable $\varphi$ to the variable $\zeta$ on super-horizon scales, up to a trivial
time-dependent scale transformation [32]. We may note that when we are evaluating the S-matrix, we take the expectation value in a sandwich of states in the past and future infinity. This
implies that the S-matrix is still invariant under the type of point transformation with $F$ linear
in $\pi$, provided the field, $\phi$, vanishes fast enough in future and past infinity [52,53].
The tree-level correlation functions to $n$th order are still independent of the phase $F$ on super-horizon scales, since, as discussed above, on those scales the commutator vanishes very fast, as
$a^{-3}$, and the fields essentially become classical [30,31]. However, on sub-horizon scales or for
loop corrections where quantum effects become important, the two variables do not appear to
yield equivalent results. In fact, this seems to raise a question regarding the physical interpretation
during inflation of the non-linear quantity constructed to be conserved on all scales in Ref. [54].
One could argue for calculating the loop corrections in terms of $\zeta$, since it is really the observable quantity, which is conserved on super-Hubble scales [30]. However, it is not clear that
the classically conserved $\zeta$ is also conserved at the quantum level [30]. This implies that we would
first have to calculate the loop corrections in order to find the conserved $\zeta$ in the one-loop effective theory. In addition, this argument would only be strictly valid in single field models, since
in multi-field inflationary models $\zeta$ is in any case not conserved on super-Hubble scales. On the
other hand, the Sasaki–Mukhanov variable really represents the fundamental degrees of freedom,
which appear to be the fluctuations of the matter fields. In terms of the matter field fluctuations,
the splitting between the background and the fluctuations is very clear and well defined in terms
of the tadpole renormalization condition. It thus appears to be more appropriate to calculate the
loop corrections in terms of the Sasaki–Mukhanov variable.
4. Discussion
We have generalized and expanded the analysis in Ref. [5] of the one-loop correction to the two-point correlation function of inflaton fluctuations. It is shown that in chaotic inflation, where inflation starts just below the self-reproduction regime and gives rise to a huge total number of e-foldings, the one-loop corrections may be physically significant. For the model of $\lambda\phi^4$ chaotic inflation investigated in Ref. [5], our results are consistent with a maximum correction of a few percent if the observable modes exited the horizon about 60 e-foldings before the end of inflation (N = 60) [5]. The effect can be slightly larger, of the order 15%, if the modes exited 45 e-foldings before the end of inflation (N = 45). Having generalized our result to a general inflationary potential, and especially to general models of monomial chaotic inflation, we find it very intriguing that in a model of $m^2\phi^2$ chaotic inflation the one-loop correction to the two-point correlation function of inflaton fluctuations can be as large as 70% for N = 45.
This seems to imply that the one-loop effects might have important consequences when the cosmological data reach the level where they can in principle rule out the $m^2\phi^2$ model of inflation. In fact, we are intrigued by the fact that cosmological data in the near future will reach the level where one appears to be able to discriminate between a pure low-energy effective $m^2\phi^2$ model of inflation with only the minimal total number of e-foldings, and the chaotic $m^2\phi^2$ model of inflation with a huge total number of e-foldings. This possibility deserves further study.
However, one should be aware of some shortcomings in our treatment of one-loop corrections so far. For practical reasons, we have only calculated the one-loop correction to the scalar power-spectrum of inflaton fluctuations, while we have not yet been in a position to calculate the correction to the tensor spectrum. The calculation of the one-loop corrections to the scalar power-spectrum already required us to calculate the fourth order action of scalar perturbations. If we desired to calculate the one-loop correction to the tensor power-spectrum, we would also need the fourth order action for the tensor modes, which has not yet been calculated.
In addition, one should note that in the case of $m^2\phi^2$ chaotic inflation the one-loop effects can be so large that the perturbative approach is on the verge of breaking down. This indicates that, in principle, we should also include two-loop effects in a precise treatment. However, this would require us to calculate the action of scalar perturbations beyond fourth order.
Acknowledgements
I would like to thank Kari Enqvist, Steen Hannestad, Nemanja Kaloper, Shinsuke Kawai,
James Lidsey, David Lyth, Sami Nurmi, David Seery and Filippo Vernizzi for interesting and
motivating discussions. I would also like to thank Kari Enqvist and Helsinki Institute of Physics
for the hospitality during the completion of parts of this work, and David Lyth and the Department of Physics at Lancaster University for the hospitality during the workshop Non-Gaussianity
from Inflation, June 2006.
References
[1] D.N. Spergel, et al., astro-ph/0603449.
[2] W.H. Kinney, E.W. Kolb, A. Melchiorri, A. Riotto, Phys. Rev. D 74 (2006) 023502, astro-ph/0605338.
[3] M. Tegmark, et al., astro-ph/0608632.
[4] J. Hamann, S. Hannestad, M.S. Sloth, Y.Y.Y. Wong, astro-ph/0611582.
[5] M.S. Sloth, Nucl. Phys. B 748 (2006) 149, astro-ph/0604488.
[6] A.D. Linde, Phys. Lett. B 129 (1983) 177.
[7] V.F. Mukhanov, L.R.W. Abramo, R.H. Brandenberger, Phys. Rev. Lett. 78 (1997) 1624, gr-qc/9609026.
[8] L.R.W. Abramo, R.H. Brandenberger, V.F. Mukhanov, Phys. Rev. D 56 (1997) 3248, gr-qc/9704037.
[9] L.R.W. Abramo, R.P. Woodard, Phys. Rev. D 60 (1999) 044010, astro-ph/9811430.
[10] N. Afshordi, R.H. Brandenberger, Phys. Rev. D 63 (2001) 123505, gr-qc/0011075.
[11] L.R. Abramo, R.P. Woodard, Phys. Rev. D 65 (2002) 063515, astro-ph/0109272.
[12] L.R. Abramo, R.P. Woodard, Phys. Rev. D 65 (2002) 063516, astro-ph/0109273.
[13] F. Finelli, G. Marozzi, G.P. Vacca, G. Venturi, Phys. Rev. D 65 (2002) 103521, gr-qc/0111035.
[14] F. Finelli, G. Marozzi, G.P. Vacca, G. Venturi, Phys. Rev. D 69 (2004) 123508, gr-qc/0310086.
[15] B. Losic, W.G. Unruh, Phys. Rev. D 72 (2005) 123510, gr-qc/0510078.
[16] P. Martineau, R.H. Brandenberger, Phys. Rev. D 72 (2005) 023507, astro-ph/0505236.
[17] C.H. Wu, K.W. Ng, W. Lee, D.S. Lee, Y.Y. Charng, astro-ph/0604292.
[18] W. Unruh, astro-ph/9802323.
[19] G. Geshnizjani, R. Brandenberger, Phys. Rev. D 66 (2002) 123507, gr-qc/0204074.
[20] G. Geshnizjani, R. Brandenberger, JCAP 0504 (2005) 006, hep-th/0310265.
[21] K. Dimopoulos, T. Prokopec, O. Tornkvist, A.C. Davis, Phys. Rev. D 65 (2002) 063505, astro-ph/0108093.
[22] T. Prokopec, E. Puchwein, Phys. Rev. D 70 (2004) 043004, astro-ph/0403335.
[23] T. Prokopec, O. Tornkvist, R.P. Woodard, Phys. Rev. Lett. 89 (2002) 101301, astro-ph/0205331.
[24] T. Prokopec, O. Tornkvist, R.P. Woodard, Ann. Phys. 303 (2003) 251, gr-qc/0205130.
[25] T. Prokopec, R.P. Woodard, Am. J. Phys. 72 (2004) 60, astro-ph/0303358.
[26] E.O. Kahya, R.P. Woodard, Phys. Rev. D 74 (2006) 084012, gr-qc/0608049.
[27] A.D. Linde, Phys. Lett. B 175 (1986) 395.
[28] A.S. Goncharov, A.D. Linde, V.F. Mukhanov, Int. J. Mod. Phys. A 2 (1987) 561.
[29] D. Seery, J.E. Lidsey, M.S. Sloth, astro-ph/0610210.
[30] S. Weinberg, Phys. Rev. D 72 (2005) 043514.
[31] S. Weinberg, Phys. Rev. D 74 (2006) 023508, hep-th/0605244.
[32] J. Maldacena, JHEP 0305 (2003) 013, astro-ph/0210603.
[33] R. Arnowitt, S. Deser, C.W. Misner, gr-qc/0405109.
[34] P. Creminelli, JCAP 0310 (2003) 003, astro-ph/0306122.
[35] D. Seery, J.E. Lidsey, JCAP 0506 (2005) 003, astro-ph/0503692.
[36] D. Seery, J.E. Lidsey, JCAP 0509 (2005) 011, astro-ph/0506056.
[37] J.S. Schwinger, J. Math. Phys. 2 (1961) 407.
[38] L.V. Keldysh, Zh. Eksp. Teor. Fiz. 47 (1964) 1515, Sov. Phys. JETP 20 (1965) 1018.
[39] E. Calzetta, B.L. Hu, Phys. Rev. D 35 (1987) 495.
[40] E. Calzetta, B.L. Hu, Phys. Rev. D 37 (1988) 2878.
[41] D. Boyanovsky, H.J. de Vega, Phys. Rev. D 47 (1993) 2343, hep-th/9211044.
[42] D. Boyanovsky, D. Cormier, H.J. de Vega, R. Holman, Phys. Rev. D 55 (1997) 3373, hep-ph/9610396.
[43] D. Boyanovsky, D. Cormier, H.J. de Vega, R. Holman, S.P. Kumar, Phys. Rev. D 57 (1998) 2166, hep-ph/9709232.
[44] D. Boyanovsky, H.J. de Vega, astro-ph/0006446.
[45] D. Boyanovsky, H.J. de Vega, Phys. Rev. D 70 (2004) 063508, astro-ph/0406287.
[46] D. Boyanovsky, H.J. de Vega, N.G. Sanchez, Phys. Rev. D 71 (2005) 023509, astro-ph/0409406.
[47] D. Boyanovsky, H.J. de Vega, N.G. Sanchez, Nucl. Phys. B 747 (2006) 25, astro-ph/0503669.
[48] D. Boyanovsky, H.J. de Vega, N.G. Sanchez, Phys. Rev. D 72 (2005) 103006, astro-ph/0507596.
[49] D. Boyanovsky, H.J. de Vega, N.G. Sanchez, astro-ph/0601132.
[50] H. Collins, R. Holman, Phys. Rev. D 71 (2005) 085009, hep-th/0501158.
[51] A.L. Matacz, Phys. Rev. D 49 (1994) 788, gr-qc/9212008.
[52] J.S.R. Chisholm, Nucl. Phys. 26 (1961) 469.
[53] S. Kamefuchi, L. O'Raifeartaigh, A. Salam, Nucl. Phys. 28 (1961) 529.
[54] K. Enqvist, J. Hogdahl, S. Nurmi, F. Vernizzi, gr-qc/0611020.
The disjointed thermodynamics of rotating black holes with a NUT twist
A.M. Ghezelbash a,*, R.B. Mann a, Rafael D. Sorkin b,c
a Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
b Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, ON N2L 2Y5, Canada
c Department of Physics, Syracuse University, Syracuse, NY 13244-1130, USA
Received 5 March 2007; received in revised form 29 March 2007; accepted 2 April 2007
Abstract
We study the solutions of the Euclidean-signature Einstein equations (gravitational instantons) whose line-element takes the Kerr–Bolt form, characterized by three real parameters: a size, a NUT charge, and a spin rate. The exclusion of singularities eliminates most combinations of these parameters, leaving only a disconnected family of solution-manifolds between which continuous transitions are impossible. (The angular velocity divided by the temperature is forced to be a rational multiple of $2\pi$.) This quantization prevents the free variations presupposed by an equation like $T\,dS = dM + \Omega\,dJ$, and thereby renders the first law true, false, meaningless, or tautological, depending on how one approaches it.
1. Introduction
Black hole thermodynamics has been a subject of intense study for more than 30 years. It has provided us with crucial insights into quantum gravity, and has raised some basic questions concerning quantum theory [1]. Most of this research has concentrated on spacetimes whose topology at infinity is a product of some spatial manifold with time, which in the (Euclidean-signature) thermodynamic path integral becomes $S^1$.
E-mail addresses: amasoud@sciborg.uwaterloo.ca (A.M. Ghezelbash), rbmann@sciborg.uwaterloo.ca (R.B. Mann), rsorkin@perimeterinstitute.ca (R.D. Sorkin).
doi:10.1016/j.nuclphysb.2007.04.004
A.M. Ghezelbash et al. / Nuclear Physics B 775 (2007) 95–119
In recent years spacetimes with NUT charge and their applications have been studied extensively [2–12]. Such spacetimes differ from ordinary asymptotically flat spacetimes because, near infinity, their topology is that of a circle fibered non-trivially over the spatial manifold. In the 4-dimensional case the topology at infinity has most often been the Hopf-fibration of the 3-sphere or one of its quotients (the spatial topology being maintained as that of a 2-sphere).
In spite of these topological complications, the path-integral formulation of gravitational thermodynamics has been carried through in some cases, although, as we will see in detail below, the calculations are subject to ambiguities. Without fully resolving these ambiguities, one has computed conserved quantities and gravitational entropy. Unlike when the topology is that of a product, the entropy of NUT-charged spacetimes has not turned out to be proportional to the area of the event horizon. NUT-charged spacetimes have also provided interesting tests of the AdS–CFT correspondence and have furnished counterexamples to conjectured maximal masses and maximal entropies of asymptotically de Sitter spacetimes [13,14]. In addition, these metrics have played an important role in the Kaluza–Klein setting, where their nontrivial bundle structure corresponds to nonzero magnetic charge [15].
Comparatively little is known about the properties of NUT-charged spacetimes with nonzero rotation. These Kerr–Bolt spacetimes are exact solutions to the Einstein equations, and are known both with and without a cosmological constant. Although a computation of their conserved mass and angular momentum was carried out several years ago [16] (with their entropy being inferred from the relation (15) whose derivation we recapitulate below), there has been no investigation of the gravitational thermodynamics of this class of spacetimes.
The purpose of this paper is to carry out such an investigation in the asymptotically flat case. In Section 2, we review some of the basic thermodynamic and statistical mechanical relationships that one would like to reproduce in the black hole situation, highlighting the central role of the thermodynamic potential that corresponds to the statistical mechanical partition function. In Section 3, we recall the Kerr–Bolt metric, and in Section 4 we find, unexpectedly, that its singularities can be removed only for certain, very special values of the parameters. In Section 5 we find that asymptotic regularity in the presence of NUT charge implies a closely related quantization of the parameters of any thermodynamic reservoir that can be represented adequately by boundary conditions on the gravitational path integral. In Section 6, we calculate the action-integral and the resulting expressions for energy and angular momentum of the Kerr–Bolt spacetimes, pointing out some unresolved ambiguities entering into the calculation. In Section 7, we discuss the status of the first law of thermodynamics with respect to these spacetimes, especially in light of the parameter quantization found in Sections 4 and 5.
2. Review of thermodynamics: Everything from log Z
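First let us recall some thermodynamic relationships from flat spacetime. The expression
$$\rho = \frac{1}{Z}\exp\{-\beta\hat H + \omega\hat J\}, \tag{1}$$
where $\omega = \beta\Omega$, describes the Gibbs equilibrium state of a system in contact with an environment/reservoir of temperature $T = 1/\beta$ that rotates with angular velocity $\Omega$ around a specified axis. Here, $\hat H$ is the Hamiltonian operator and $\hat J$ is the angular momentum operator, or rather its component about the specified axis. In addition, of course, the partition function $Z$ is given by
$$Z = \mathrm{Tr}\,\exp\{-\beta\hat H + \omega\hat J\}. \tag{2}$$
It follows that, under variations of $\beta$ and $\omega$, one has
$$d\log Z = -\langle\hat H\rangle\,d\beta + \langle\hat J\rangle\,d\omega. \tag{3}$$
Defining the thermodynamic mass $M = \langle\hat H\rangle$ and the thermodynamic angular momentum $J = \langle\hat J\rangle$, and setting $\Psi = \log Z^{-1}$, we can write this as a relation among one-forms¹
$$\nabla\Psi = M\,\nabla\beta - J\,\nabla\omega \tag{4}$$
or equivalently
$$M = \frac{\partial\Psi}{\partial\beta}, \qquad J = -\frac{\partial\Psi}{\partial\omega}. \tag{5}$$
For the entropy, we have $S = \mathrm{Tr}\,\rho\log\rho^{-1} = \langle\log\rho^{-1}\rangle = \log Z + \beta\langle\hat H\rangle - \omega\langle\hat J\rangle$, or equivalently
$$S = \beta M - \omega J - \Psi. \tag{6}$$
This states that $S$ is the Legendre transform of $\Psi$ with respect to $\beta$ and $\omega$:
$$S = \beta\,\frac{\partial\Psi}{\partial\beta} + \omega\,\frac{\partial\Psi}{\partial\omega} - \Psi. \tag{7}$$
Also,
$$\nabla\Psi = M\,\nabla\beta - J\,\nabla(\beta\Omega) = (M - \Omega J)\,\nabla\beta - \beta J\,\nabla\Omega, \tag{8}$$
whence
$$\left.\frac{\partial\Psi}{\partial\beta}\right|_{\Omega} = M - \Omega J, \tag{9}$$
whence
$$S = \beta M - \omega J - \Psi = \beta(M - \Omega J) - \Psi = \beta\left.\frac{\partial\Psi}{\partial\beta}\right|_{\Omega} - \Psi, \tag{10}$$
an alternative expression for the entropy that is useful in understanding the origin of the area law [17]. Taking the gradient of (6), we find immediately a final well-known equation,
$$\nabla S = \beta\,\nabla M - \omega\,\nabla J. \tag{11}$$
Although the simple derivations rehearsed above take as their point of departure the Gibbs state (or in other words the canonical ensemble), the resulting relations among the thermodynamic variables $\Psi$, $S$, $M$, $J$, $\beta$, $\omega$ (or $\Omega$) can be valid more generally, and we will assume that this is the case, specifically, for the relations (4), (6), (7), (10) and (11). (Remember in this connection that a canonical ensemble does not adequately describe systems in which two different thermodynamic phases are present simultaneously, nor can it be taken literally for systems containing black holes.)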
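The relations (5) and (6) can be checked numerically on a toy system. The sketch below is only an illustration: the five-state spectrum `STATES` is hypothetical (it does not come from the text), and the symbol names `psi`, `beta` and `omega` follow the reading $\Psi = -\log Z$, $\omega = \beta\Omega$ used above. It compares the directly computed averages and entropy with finite-difference derivatives of $\Psi$.

```python
import math

# Hypothetical toy spectrum: each state is (energy E, angular momentum j).
STATES = [(0.0, 0), (1.0, 1), (1.0, -1), (2.5, 2), (2.5, -2)]


def log_Z(beta, omega):
    """log Z with Z = sum over states of exp(-beta*E + omega*j), cf. (2)."""
    return math.log(sum(math.exp(-beta * E + omega * j) for E, j in STATES))


def psi(beta, omega):
    """Thermodynamic potential Psi = log(1/Z) = -log Z."""
    return -log_Z(beta, omega)


def gibbs_averages(beta, omega):
    """Return (M, J, S): <H>, <J> and the entropy -sum p log p of the Gibbs state."""
    Z = math.exp(log_Z(beta, omega))
    probs = [math.exp(-beta * E + omega * j) / Z for E, j in STATES]
    M = sum(p * E for p, (E, _) in zip(probs, STATES))
    J = sum(p * j for p, (_, j) in zip(probs, STATES))
    S = -sum(p * math.log(p) for p in probs)
    return M, J, S


def check_relations(beta, omega, h=1e-5):
    """Residuals of (5) and (6), using central differences for the derivatives."""
    M, J, S = gibbs_averages(beta, omega)
    dpsi_dbeta = (psi(beta + h, omega) - psi(beta - h, omega)) / (2 * h)
    dpsi_domega = (psi(beta, omega + h) - psi(beta, omega - h)) / (2 * h)
    return {
        "eq5_M": abs(M - dpsi_dbeta),          # M = dPsi/dbeta
        "eq5_J": abs(J + dpsi_domega),         # J = -dPsi/domega
        "eq6_S": abs(S - (beta * M - omega * J - psi(beta, omega))),
    }


if __name__ == "__main__":
    print(check_relations(0.7, 0.3))
```

All three residuals should be small, limited only by the finite-difference step `h`.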
Now in a path integral calculation of the partition function, one is formally summing over configurations which are periodic under a shift of $i\beta$ in time and $i\omega$ in angle. One might perhaps accomplish this by an analytic continuation in both time and angle, but often it is more convenient to continue in time alone, while keeping the angle real. However, one must then do a subsidiary continuation in $\tilde\omega$, and the final result can be expressed as follows. Let $\tau = it$ be the Euclidean time, let $I_E = I_E(\gamma)$ be the Wick-rotated action of the path $\gamma$, and let $Z_E(\beta, \tilde\omega)$ be the partition function, defined as the value of the path integral with integrand $\exp\{-I_E(\gamma)\}$, taken over all periodic paths whose period is given by the Killing vector $\xi = \beta\,\partial/\partial\tau + \tilde\omega\,\partial/\partial\phi$. The thermodynamic partition function $Z$ that figures in the general formulas given above is then an analytic continuation of $Z_E$ to imaginary $\tilde\omega$:
$$Z = Z_E(\beta, i\omega). \tag{12}$$
¹ In the following equations we denote the gradient operator on the manifold of equilibria by the symbol $\nabla$. We reserve the symbol $d$ for infinitesimal variations (or for its role in expressing integration).
Let us turn now to thermodynamic systems for which gravity is important. For isolated gravitating systems, one customarily omits any explicit environment, idealizing it instead as the region at infinity in an asymptotically flat spacetime. In this idealization the temperature, angular velocity, etc., will be encoded in the asymptotic metric, and the partition function $Z_E(\beta, \tilde\omega)$ will be given by a gravitational path integral in which the metric is required to satisfy boundary conditions corresponding to the desired values of the intensive parameters $\beta$ and $\tilde\omega$. We will return to the correct definition of these boundary conditions in a later section.²
Once the boundary conditions are in place, the gravitational path integral can be approximated using the saddle point method. To zeroth order the saddle point approximation is just the value of the integrand at the saddle point, which in this case means that we approximate $Z$ as $\exp(-I)$, where $I$ is the value of the Wick rotated action at the saddle point or instanton solution, analytically continued in $\tilde\omega$ as described above. Strictly speaking, one should include the one loop terms in the saddle point approximation, since they are actually comparable in magnitude to the contribution from $I$. However we will follow the usual practice of ignoring them, or rather of attributing their contribution to ambient thermal radiation, and attributing the contribution from $I$ to the horizon per se. We thus approximate, for the isolated black hole,
$$\Psi \approx I. \tag{13}$$
With this approximation, our basic thermodynamic formulas become
$$\nabla I = M\,\nabla\beta - J\,\nabla\omega, \tag{14}$$
and
$$S = \beta M - \omega J - I = \beta\,\frac{\partial I}{\partial\beta} + \omega\,\frac{\partial I}{\partial\omega} - I = \beta\left.\frac{\partial I}{\partial\beta}\right|_{\Omega} - I. \tag{15}$$
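As a quick sanity check of (14) and (15), one can work through the familiar non-rotating case without NUT charge. The following is only an illustrative sketch: the action $I = \beta^2/16\pi$ for the Schwarzschild instanton (units $G = 1$) is assumed from the standard literature and is not derived in this paper, and the relation $\beta = 4\pi r_0$ is what Eq. (22) of Section 3 gives for $n = a = 0$.

```latex
% Illustrative check of (14)-(15) in the limit n = a = 0 (Schwarzschild).
% Assumption, not derived in the text: I = \beta^2/(16\pi) in units G = 1.
\begin{align*}
\beta &= 4\pi r_0 = 8\pi m
      && \text{(Eqs. (22) and (20) with } n = a = 0\text{)},\\
I &= \frac{\beta^2}{16\pi}
      && \text{(assumed)},\\
M &= \frac{\partial I}{\partial\beta} = \frac{\beta}{8\pi} = m
      && \text{(from (14))},\\
S &= \beta\,\frac{\partial I}{\partial\beta} - I = \frac{\beta^2}{16\pi}
   = 4\pi m^2 = \frac{A}{4},
      && A = 4\pi r_0^2 = 16\pi m^2 .
\end{align*}
```

In this limit (15) reproduces the area law, which is the behaviour that the NUT-charged instantons discussed in the rest of the paper fail to share.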
² The path integral used in the gravitational case is basically just copied from the non-relativistic case, even though the analytic continuation it presupposes has never really been understood, as far as we know.
3. The Kerr–Bolt line element before identifications
The spatio-temporal line element from which we will build our instantons is given by
$$ds^2 = -\frac{\Delta_L}{\Xi_L^2}\left[dt + (2n\cos\theta - a\sin^2\theta)\,d\phi\right]^2 + \frac{\sin^2\theta}{\Xi_L^2}\left[a\,dt - (r^2 + n^2 + a^2)\,d\phi\right]^2 + \frac{\Xi_L^2\,dr^2}{\Delta_L} + \Xi_L^2\,d\theta^2, \tag{16}$$
where $\Delta_L = r^2 - 2mr + a^2 - n^2$ and $\Xi_L^2 = r^2 + (n + a\cos\theta)^2$. The corresponding line element of Euclidean signature is obtained by analytically continuing the $t$ coordinate, the NUT charge $n$ and the rotation parameter $a$ to imaginary values:
$$ds^2 = \frac{\Delta_E(r)}{\Xi_E^2}\left[dt - (2n\cos\theta - a\sin^2\theta)\,d\phi\right]^2 + \frac{\sin^2\theta}{\Xi_E^2}\left[a\,dt - (r^2 - n^2 - a^2)\,d\phi\right]^2 + \Xi_E^2\left[\frac{dr^2}{\Delta_E(r)} + d\theta^2\right], \tag{17}$$
where
$$\Xi_E^2 = r^2 - (n + a\cos\theta)^2, \tag{18}$$
$$\Delta_E(r) = r^2 - 2mr - a^2 + n^2. \tag{19}$$
These metrics are actually special cases of a more general family of metrics with cosmological
constant, which we give for completeness in Appendix A. However, since we will only study the
asymptotically flat setting, we will only need the metrics with zero cosmological constant.
We also remark here that (17) can be expressed more felicitously in terms of the spinor coordinates of (C.1) below, if one takes advantage of invariantly defined one-forms (see [18]). Using that technique, one should be able to carry out the removal of the singularities more perfectly than we do below, but in this paper we will work with the more familiar polar coordinates $r$, $\theta$, $\phi$, and $t$.
The above line element depends on three parameters, $n$, $m$ and $a$. Equivalently, we can employ instead of the mass parameter $m$ the horizon radius $r_0 = m + \sqrt{m^2 + a^2 - n^2}$, the outermost root of $\Delta_E(r) = 0$, which (for Euclidean signature) is the lower limit of the allowed range of variation of $r$. In terms of $r_0$ we have
$$m = \frac{r_0^2 - a^2 + n^2}{2r_0}. \tag{20}$$
Our parameters are then $n$, $r_0$ and $a$. Notice that one parameter (say $n$) can be identified with an overall scale, whence the distinct shapes are determined by only two parameters.
For future reference we define also the angular velocity of the horizon
$$\Omega_H = -\left.\frac{g_{t\phi}}{g_{\phi\phi}}\right|_{r=r_0} = \left.\frac{a}{r^2 - n^2 - a^2}\right|_{r=r_0}, \tag{21}$$
and the inverse horizon temperature
$$\beta_H = \frac{4\pi(r_0^2 - n^2 - a^2)\,r_0}{r_0^2 - n^2 + a^2} \tag{22}$$
related to the surface gravity
$$\kappa = \frac{1}{2(r_0^2 - n^2 - a^2)}\left.\frac{d\Delta_E}{dr}\right|_{r=r_0} = \frac{r_0 - m}{r_0^2 - n^2 - a^2} = \frac{r_0^2 - n^2 + a^2}{2(r_0^2 - n^2 - a^2)\,r_0}, \tag{23}$$
by $\beta_H/2\pi = 1/\kappa$.
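The horizon quantities (20)–(23) are simple enough to evaluate directly. The following sketch is an illustration only: the function and key names are hypothetical, and the formulas are transcribed from (19)–(23) as read above. It also confirms numerically that $r_0$ is a root of $\Delta_E$ and that $\beta_H \kappa = 2\pi$.

```python
import math


def delta_E(r, m, n, a):
    """Euclidean Delta_E(r) = r^2 - 2 m r - a^2 + n^2, cf. (19)."""
    return r * r - 2.0 * m * r - a * a + n * n


def horizon_quantities(n, r0, a):
    """Evaluate m, Omega_H, beta_H and kappa from (20)-(23).

    n, r0, a are the (Euclidean) NUT charge, horizon radius and rotation
    parameter. Raises ValueError where the formulas are singular.
    """
    x = r0 * r0 - n * n - a * a              # recurring combination r0^2 - n^2 - a^2
    y = r0 * r0 - n * n + a * a              # recurring combination r0^2 - n^2 + a^2
    if r0 == 0 or x == 0 or y == 0:
        raise ValueError("formulas (20)-(23) are singular for these parameters")
    m = (r0 * r0 - a * a + n * n) / (2.0 * r0)   # (20)
    omega_h = a / x                              # (21)
    beta_h = 4.0 * math.pi * x * r0 / y          # (22)
    kappa = (r0 - m) / x                         # (23)
    return {"m": m, "Omega_H": omega_h, "beta_H": beta_h, "kappa": kappa}


if __name__ == "__main__":
    q = horizon_quantities(n=1.0, r0=3.0, a=0.5)
    print(q)
    print("Delta_E(r0) =", delta_E(3.0, q["m"], 1.0, 0.5))
    print("beta_H * kappa / (2 pi) =", q["beta_H"] * q["kappa"] / (2 * math.pi))
```

Setting `n = a = 0` reduces `beta_H` to $4\pi r_0$, the familiar Schwarzschild value.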
All the instantons we will consider will be derived from (17) by identifications. As we will see, these identifications will not only twist the imaginary-time direction near infinity, but also modify even the topology of the spatial two-sphere at infinity, somewhat in the manner of the so-called "Asymptotically Locally Euclidean" instantons. That is, the topology at infinity will (unfortunately) not necessarily be that of a circle bundle over $S^2$.
4. The non-singular instantons

As written, our line element is singular along the polar axes $\theta = 0, \pi$ and at the horizon $r = r_0$. The string singularities at the poles have, as is well known, a more complicated structure than that belonging to the familiar spherical coordinates for $S^2$, and they cannot be removed merely by the imposition of periodicity in azimuth $\phi$. Rather two separate identifications are required to remove the two strings. Moreover, one needs a further identification to remove the singularity at the horizon, and all of the identifications must be compatible. This compatibility condition is crucial, but it seems to have been overlooked in the literature on these metrics. To analyze it, we may refer everything to a lattice in the time-azimuth plane.
4.1. Lattice analysis
That $\partial/\partial t$ and $\partial/\partial\phi$ are both Killing vectors makes it possible to quotient our metric by any linear combination of the two (with constant coefficients). More generally, we can quotient by any lattice $L$ of vectors which is closed under vector sum and difference. The resulting spacetime is entirely determined by $L$. Let us consider how $L$ must be chosen in order to remove the two types of singularity we have to deal with.
To take advantage of the scale invariance of our family of metrics, we will introduce in place of $t$ the dimensionless coordinate
$$\psi = t/2n, \tag{24}$$
and the corresponding Killing vector
$$\partial/\partial\psi = 2n\,\partial/\partial t. \tag{25}$$
Now look at the neighborhood of the north polar axis, $\theta = 0$. At the pole, the line-element acquires a degenerate direction. In order to compensate, we must quotient by a vector $\xi_+$ that is parallel to this degenerate direction and whose length is chosen so that the quotient metric will not exhibit a conical singularity. That is, the length of $\xi_+$ at a proper distance $\epsilon$ from the north pole must be $2\pi\epsilon$ to first order in $\epsilon$. Some algebra shows that the required vector is
$$\xi_+ = 2\pi(\partial/\partial\psi + \partial/\partial\phi). \tag{26}$$
This therefore is one of the vectors of $L$, but we can say more. It is also necessary that no submultiple of $\xi_+$ belong to $L$; otherwise we would really be quotienting by a smaller vector. That is, $\xi_+$ must be a minimal element of the lattice. (By minimal we mean that $\lambda\xi_+ \notin L$ for $0 < \lambda < 1$.) The same analysis at the south pole furnishes a second minimal lattice vector,
$$\xi_- = 2\pi(\partial/\partial\psi - \partial/\partial\phi). \tag{27}$$
The singularity at $r = r_0$ can be analyzed similarly. Although the algebra is a bit messier, it proceeds along the same lines and reveals a third minimal lattice vector,
$$\xi = \beta_H(\partial/\partial t + \Omega_H\,\partial/\partial\phi), \tag{28}$$
where $\beta_H$ and $\Omega_H$ were defined in (22) and (21).

Finally, there are the special points where the polar axes meet the horizon. When the angular
velocity vanishes, no extra conditions on L are imposed by regularity at these special points; by
analyticity there is no reason to suppose that extra conditions arise when rotation is turned on.
Fig. 1. The vectors $\xi_+$, $\xi_-$, $\xi$, and the lattice $L$ for the case $p = 2$, $q_+ = 5$, $q_- = 1$.
What does impose a crucial extra condition, however, is the requirement that our quotient spacetime be a manifold. If the lattice $L$ were dense in the $\phi$-$t$-plane, this obviously would not be the case, as the quotient $\mathbb{R}^2/L$ would be pathologically non-Hausdorff. In order to preclude this, one must arrange that the three vectors $\xi_\pm$ and $\xi$ be commensurate. (In other words, $\xi$ must be a linear combination of $\xi_\pm$ with rational coefficients.) Bringing in the minimality constraints as well (see Appendix B), one can see that the complete condition is
$$p\,\xi = q_+\xi_+ + q_-\xi_-, \tag{29}$$
where $p$, $q_+$ and $q_-$ are three relatively prime integers: $(p, q_+) = (p, q_-) = (q_+, q_-) = 1$. See Fig. 1.
Notice that these conditions have the exceptional solution, $p = q_+ = 1$, $q_- = 0$. For this solution, $\xi$ coincides with $\xi_+$, but in all other cases (except for the mirror image case $\xi = \xi_-$) all three vectors point in distinct directions. This exceptional solution will also be exceptional in its thermodynamic properties.
From (29), we find
$$\beta_H = 4\pi n\,\frac{q_+ + q_-}{p}, \tag{30}$$
$$\Omega_H = \frac{1}{2n}\,\frac{q_+ - q_-}{q_+ + q_-}, \tag{31}$$
and therefore
$$\beta_H\Omega_H = 2\pi\,\frac{q_+ - q_-}{p}. \tag{32}$$
We emphasize that the parameters $p$, $q_\pm$ are integers and therefore not continuously variable. For this reason, and in sharp contrast to instantons without NUT charge, the solutions considered here fall into disconnected families between which no continuous transition is possible.³
³ One might suspect that this is related to the analogous findings of [19] in the Lorentzian setting.
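The quantization conditions (30)–(32) can be tabulated exactly with rational arithmetic. The sketch below is illustrative only: the function name and the returned keys are hypothetical, the restriction $p \ge 1$ is an assumed sign convention, and the formulas are taken from (30)–(32) as read above. It returns the dimensionless combinations $\beta_H/4\pi n$, $2n\Omega_H$ and $\beta_H\Omega_H/2\pi$ for a given triple of integers.

```python
from fractions import Fraction
from math import gcd


def quantized_parameters(p, q_plus, q_minus):
    """Dimensionless horizon data fixed by the integers (p, q+, q-), cf. (30)-(32).

    Returns exact Fractions for beta_H/(4 pi n), 2 n Omega_H and
    beta_H Omega_H/(2 pi). Raises ValueError if the integers are not
    pairwise relatively prime, as required below (29).
    """
    if p < 1:
        raise ValueError("this sketch assumes the convention p >= 1")
    if gcd(p, q_plus) != 1 or gcd(p, q_minus) != 1 or gcd(q_plus, q_minus) != 1:
        raise ValueError("p, q+, q- must be pairwise relatively prime")
    if q_plus + q_minus == 0:
        raise ValueError("q+ + q- = 0 would give beta_H = 0 in (30)")
    return {
        "beta_H_over_4pi_n": Fraction(q_plus + q_minus, p),                # (30)
        "two_n_Omega_H": Fraction(q_plus - q_minus, q_plus + q_minus),     # (31)
        "beta_H_Omega_H_over_2pi": Fraction(q_plus - q_minus, p),          # (32)
    }


if __name__ == "__main__":
    # The case drawn in Fig. 1 and the exceptional solution.
    print(quantized_parameters(2, 5, 1))
    print(quantized_parameters(1, 1, 0))
```

For the exceptional solution $p = q_+ = 1$, $q_- = 0$ all three combinations equal 1, i.e. $\beta_H = 4\pi n$ and $\Omega_H = 1/2n$.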
4.2. What is the topology near infinity?
The identifications induced by the vectors $\xi_+$, $\xi_-$, $\xi$ produce a Riemannian manifold that, near infinity, has (for fixed radius) the topology of a lens space, or more particularly of the lens space $L(p; q_+, q_-)$.
We review some lens space lore in Appendix C. By the expression $L(p; q_1, q_2)$ we denote the quotient of the unit sphere $S^3 \subset \mathbb{C}^2$ by the action of the matrix $\mathrm{diag}(1^{q_1/p}, 1^{q_2/p})$, where $1^y \equiv e^{2\pi i y}$ and the $q_i$ can be construed as belonging to $\mathbb{Z}_p$, the integers modulo $p$. In fact the lens space $L(p; q_1, q_2)$ depends only on the ratio $q = q_1/q_2$ taken in $\mathbb{Z}_p$ and for this reason it is commonly denoted simply by $L(p, q)$. (In the exceptional case, $p = q_+ = 1$, $q_- = 0$, we would have $q = 1/0 = 0/0$, since $\mathbb{Z}_1$ is itself zero, and $1 = 0$ therein. If we make the obvious convention that $0/0 = 0$ in $\mathbb{Z}_1$, then our notation tells us that $L(1; 1, 0) = L(1, 0)$, which of course is just $S^3$, since one is quotienting by the identity.) To grasp why $L(p; q_+, q_-)$ is the result of quotienting the $\psi$-$\phi$ coordinates by our lattice $L$, note first that an initial quotienting by $\xi_+$ and $\xi_-$ produces simply the 3-sphere, $S^3$, as explained, for example in [15]. (The integral curves of the vector field $\xi_+ + \xi_-$ then constitute the Hopf fibration of $S^3$ over $S^2$.) Translation through the vector $\xi = (q_+/p)\,\xi_+ + (q_-/p)\,\xi_-$ can then be seen to be equivalent to multiplication by $\mathrm{diag}(1^{q_+/p}, 1^{q_-/p})$ when $S^3$ is represented in terms of complex coordinates as above. (These relationships appear with particular clarity when expressed in terms of the spinor coordinates alluded to earlier.)
For us, the important point is that different choices of the discrete parameters $p$, $q_\pm$ yield in general different topologies near infinity, because the lens spaces $L(p, q)$ are in general not homeomorphic. (This makes it obvious that our collection of instantons could not have formed a single continuous family.) But that is not all. Except when $q = 1$, $L(p, q)$ is not even a circle bundle over $S^2$. It follows that near infinity the corresponding instantons do not even possess the spatial topology of an $\mathbb{R}^3$; that is, they are not asymptotically flat, even in the topological sense. Only for $q = q_+ = q_- = 1$, meaning $\xi = \frac{1}{p}(\xi_+ + \xi_-)$, do we have a full 2-sphere at infinity. (This solution would look near infinity like a monopole of charge $p$ if interpreted as a spatial hypersurface in a Kaluza–Klein spacetime.) For all other $q$, the fraction of $4\pi$ that remains turns out to be $2/(q_+ + q_-)$, independently of the value of $p$.
Remark. We have seen that not all values of $\beta$ and $\tilde\omega$ are accessible. Correspondingly, where $I_E$ is defined, it fails to be a single-valued function of $\beta$ and $\tilde\omega$, because different $L(p, q_+, q_-)$ can belong to the same $\tilde\omega$ (cf. (41) below).
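The reduction from $L(p; q_1, q_2)$ to the two-index label $L(p, q)$ with $q = q_1/q_2$ in $\mathbb{Z}_p$ is a modular division, which can be sketched as follows. The function name is hypothetical, the $p = 1$ branch implements the convention $0/0 = 0$ stated above, and the three-argument `pow` with exponent `-1` requires Python 3.8 or later.

```python
from math import gcd


def lens_space_label(p, q1, q2):
    """Return (p, q) with q = q1/q2 taken in Z_p, labelling L(p; q1, q2) as L(p, q).

    Requires q2 to be invertible modulo p, which holds whenever
    gcd(p, q2) = 1 (as for the relatively prime integers of (29)).
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    if p == 1:
        return (1, 0)                      # convention 0/0 = 0 in Z_1, i.e. S^3
    if gcd(p, q2) != 1:
        raise ValueError("q2 is not invertible modulo p")
    q2_inverse = pow(q2 % p, -1, p)        # modular inverse of q2 in Z_p
    return (p, (q1 * q2_inverse) % p)


if __name__ == "__main__":
    print(lens_space_label(2, 5, 1))   # the case of Fig. 1
    print(lens_space_label(1, 1, 0))   # the exceptional solution
    print(lens_space_label(5, 1, 2))
```

Note that the label alone does not decide homeomorphism type; distinct values of $q$ can still describe the same manifold, which this sketch does not attempt to detect.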
4.3. Parameter counting
We have found in the last two subsections that the exclusion of singularities rules out a large fraction of our original family of (local) solutions (17). How many free parameters remain? Originally, we began with three continuous parameters, $n$, $r_0$ and $a$, but our lattice analysis revealed that a more appropriate starting point would have been the three integer parameters $p$, $q_\pm$. Once these are chosen, two combinations of $n$, $r_0$ and $a$ are determined, as we have seen, to wit: $\beta_H/n$ and $\beta_H\Omega_H$. Thus only a single continuous parameter remains.
The one exception occurs when $\xi = \xi_+$ (or $\xi_-$). In that case, it turns out that (21) and (22) are not independent conditions on $n$, $r_0$ and $a$, and there survives a 2-dimensional manifold of instantons, which can be parameterized by, for example, $\beta_H$ and $r_0$.
In no case, though, can the parameters $\beta_H$ and $\Omega_H$ vary independently, since their product is fixed to $2\pi(q_+ - q_-)/p$ by Eq. (32). For any other combination of horizon temperature and angular velocity, no instanton is available to contribute to the thermodynamic partition function $Z$ in the saddle point approximation. Although this suggests a pathological variation of $Z$ with thermodynamic temperature and/or angular velocity, we will see next that this would probably be an incorrect way to read the situation. The correct conclusion seems to be rather that the thermodynamic reservoirs corresponding to the forbidden points fail to exist altogether.
5. Boundary conditions at infinity for a rotating, heated environment with NUT charge
As reviewed in Section 2, a partition function $Z(\beta, \omega)$ is meant to describe a system immersed in an environment which is rotating with the angular velocity $\Omega = \omega/\beta$. Whether such a picture really makes sense in a spacetime of NUT type can certainly be doubted, because the string singularities associated with the NUT charge (respectively, the closed causal curves that result if one makes identifications to remove the strings) might be taken to render the spacetime itself unphysical. If this were the case, then the Wick rotation leading to the Euclidean-signature path integral invoked earlier (whose justification is obscure in the best of cases, when gravity is involved) would also be meaningless, even if the instanton itself were free of singularities. Despite such doubts, however, one can still hope that something of the usual thermodynamic schema will continue to make sense in the presence of NUT charge.
Proceeding in this vein, let us ask what boundary conditions at infinity could represent the type of reservoir we want. In an asymptotically flat spacetime (no NUT charge), the answer is clear-cut and derives from the flat-space identifications described earlier. But since the line element we are working with is also flat near infinity (at least locally), we can hope to arrive at a unique set of boundary conditions for it as well by analogy with the asymptotically flat case. For large radii, our line element (17) takes the form
$$ds^2 = (dt - 2n\cos\theta\,d\phi)^2 + r^2\left(d\theta^2 + \sin^2\theta\,d\phi^2\right). \tag{33}$$
The singularities at $\theta = 0, \pi$ are essentially the same as before, and the same analysis goes through, telling us that in order to render the metric nonsingular, we must divide out by the same two vectors $\xi_+$ and $\xi_-$ that we found earlier. (With zero NUT charge, we would at this stage simply have divided out by $2\pi\,\partial/\partial\phi$, i.e. by nothing at all had our coordinates been Cartesian.)
This having been done, we can consider what further identifications will express the thermodynamic boundary conditions we want. We saw earlier that in the absence of curvature (i.e. in Wick rotated Minkowski spacetime) one obtains $Z_E$ by identifying with respect to the vector⁴
$$\xi = \beta\,\partial/\partial t + \tilde\omega\,\partial/\partial\phi, \tag{34}$$
which says that the manifold is to be made periodic with respect to simultaneous shifts of the Euclidean time by $\beta$ and of angle by $\tilde\omega$. Here, we will simply assume that the same expression applies in our situation. In respect of the first term, this assumption would seem to be on safe ground, since the Killing vector $\partial/\partial t$ is singled out as the unique one whose norm goes to unity at infinity. We have no equally compelling argument for the second term, but we will assume it to be correct nonetheless.
The above equation may be taken as the definition of the thermodynamic parameters $\beta$ and $\tilde\omega$. Physically the latter represent intensive variables conjugate respectively to the energy and the angular momentum, and they would of course be interpreted as specifying the temperature and (after the analytic continuation (12)) the angular velocity of the reservoir. Given this interpretation, it is the requirement of global non-singularity that makes the connection between the metric parameters $\beta_H$ and $\Omega_H$ defined at the horizon, and the thermodynamic parameters $\beta$ and $\tilde\omega$ defined at infinity (because it forces the vector $\xi$ to be the same everywhere).
⁴ In Section 2 we wrote $\tau$ for $t$ to avoid confusion between Lorentzian and Euclidean time.
Remark. This is actually an oversimplification, because all we really know is that the lattice of identifications $L$ must be constant, and this leaves open the possibility that $\xi_H$ and $\xi$ could be different minimal vectors in $L$. In this way, one and the same instanton can apparently contribute to many different values of the partition function. In this connection, we remark also that the definition of the Euclidean $\tilde\omega$ suggests that, unlike its Lorentzian counterpart, it is only defined modulo $2\pi$.
With the additional identification with respect to $\xi$ we are now back in exactly the same situation as before, characterized by a lattice $L = \mathrm{span}\{\xi, \xi_+, \xi_-\}$. The commensurability conditions are then also as before, and they lead to the analogous conclusion that the manifold of possible equilibria is not connected: we do not have a reservoir for every possible choice of temperature and rotation rate! In particular, we learn from (34), (29), and (26)–(27) that $\tilde\omega = \beta\tilde\Omega$ cannot vary continuously at all; it is fixed at the value
$$\tilde\omega = 2\pi(q_+ - q_-)/p. \tag{35}$$
Obviously, this will make it difficult to define a thermodynamic mass, angular momentum and entropy as we have done above, since the partial derivatives occurring in the definitions cannot all be taken without going outside the manifold of equilibria.
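To see concretely which reservoirs survive, one can invert the commensurability conditions. The sketch below is an illustration under stated assumptions: it takes $\beta/4\pi n = (q_+ + q_-)/p$ and $\tilde\omega/2\pi = (q_+ - q_-)/p$ (the analogues at infinity of (30) and (35) as read above), requires exact rational inputs, and uses a hypothetical function name. It returns the integers $(p, q_+, q_-)$ when they are pairwise relatively prime and `None` otherwise.

```python
from fractions import Fraction
from math import gcd


def lattice_integers(beta_over_4pi_n, omega_over_2pi):
    """Recover (p, q+, q-) from exact rational reservoir data, or None.

    Inputs must be ints or Fractions (floats would be read as exact dyadic
    rationals, which is rarely what is meant). Solves
        (q+ + q-)/p = beta/(4 pi n),   (q+ - q-)/p = omega~/(2 pi)
    in lowest terms, then checks the pairwise coprimality required by (29).
    """
    x = Fraction(beta_over_4pi_n)
    y = Fraction(omega_over_2pi)
    r_plus = (x + y) / 2                   # q+/p
    r_minus = (x - y) / 2                  # q-/p
    d1, d2 = r_plus.denominator, r_minus.denominator
    p = d1 * d2 // gcd(d1, d2)             # least common denominator
    q_plus = (r_plus * p).numerator
    q_minus = (r_minus * p).numerator
    coprime = (gcd(p, q_plus) == 1 and gcd(p, q_minus) == 1
               and gcd(q_plus, q_minus) == 1)
    return (p, q_plus, q_minus) if coprime else None


if __name__ == "__main__":
    print(lattice_integers(3, 2))                 # the case of Fig. 1
    print(lattice_integers(1, 1))                 # the exceptional solution
    print(lattice_integers(6, 2))                 # not pairwise coprime
```

Irrational values of either ratio cannot be passed at all, which mirrors the statement that no reservoir exists for them.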
Remark. If we were willing to let vary, then we could introduce a second free parameter.
However, the resulting singularities at the poles (which would be conical singularities or worse)
would result in divergent expressions for the Euclidean action I .
Finally, it is worth asking what went wrong. Why, for example, is it possible to vary $\beta$ and $\Omega$ freely and independently for the Kerr instanton? The answer, as intimated earlier, lies in the nature of the lattice L. Without NUT charge, the coordinate singularity of polar coordinates can be canceled by a single identification with respect to the lattice vector $2\pi\,\partial/\partial\phi$, leaving the vector that carries the thermodynamic information free to roam unmolested through the entire $t$-$\phi$ plane, because, in a lattice with only two generating vectors, no incommensurability can arise. (One could equally well say that there are still three generating vectors, but they reduce effectively to two, since the identifications needed at the north and south poles are the same. In this way of looking at it, turning off the NUT charge "flattens" these two lattice vectors so that they become proportional, and in fact equal.) For all choices of the thermodynamic vector, the quotient topology is then simply $S^2 \times S^1$. But with NUT charge, the angular singularities of the line element are altered, and two independent identifications are needed to cancel them. The thermodynamic vector is then restricted to a discrete set of values, each yielding one of the lens space topologies.
6. The Euclidean action, conserved mass and angular momentum
In order to apply the thermodynamic definitions of Section 2 to our family of instantons, we need to define the Euclidean action $I$. Once we have done so, we can also appeal to Noether's method to derive a second pair of definitions of mass $\mathfrak{M}$ and angular momentum $\mathfrak{J}$, logically distinct from the thermodynamic quantities defined earlier. We use fraktur letters to distinguish these quantities from the corresponding thermodynamic quantities $M$ and $J$. We will call the first set of quantities "thermodynamic" and the second set "conserved" (despite the doubts about conservation expressed below). For consistency of the analysis, one would hope to obtain the same values, no matter which definition one used, but we will see below that things are not quite that simple when NUT charge is present.
In Minkowski spacetime, Noether's theorem furnishes us with a set of quantities derived directly from the action principle underlying the dynamics, and conserved in the sense of taking the same value no matter which Cauchy surface they are evaluated on. In a gravitational setting both the action and the conserved quantities must be defined relative to an environment at infinity. The corresponding boundary conditions are delicate, however, and people have devised several distinct versions of the definitions for application to curved spacetimes. In addition to the counterterm method that we will employ here, we mention in particular the method of [20] and [21]. In [22] and [23], the earlier form [20] of this method was successfully applied to the Kaluza-Klein monopole, a spacetime whose topology resembles that of the gravitational instantons we are studying in the present paper. It would be interesting to apply its later form [21] to some of the solutions with NUT charge.
Unfortunately, the Noetherian method, when carried over to spacetimes with NUT charge,
has trouble rooted in the topological twist characterizing such spacetimes. Either a string singularity is present or else closed timelike curves. In the former case, one encounters divergent
integrals, which have to be renormalized somehow; in the latter case, the integrands are finite,
but the surface over which one wishes to integrate gets torn. For present purposes, we need
to set up mass and angular momentum integrals for unphysical metrics of Euclidean signature
(instantons), but it turns out that the topological obstruction to doing so is essentially the same as
in the Lorentzian case.
In defining the gravitational Euclidean action, one integrates over the entire spacetime, and the fact that the topology at infinity is that of a twisted circle bundle presents no particular problem of principle. But the topology does present a problem when one tries to define mass and angular momentum (see footnote 5). The integrals defining these quantities ought to extend only over a section of the bundle, but in a spacetime with NUT charge, no such section exists. If one ignores this problem and integrates over a surface with boundaries, then the result depends on which surface one chooses, because flux can "leak through the gaps". This means that the mass $\mathfrak{M}$ and angular momentum $\mathfrak{J}$ one ends up with are not really conserved in the sense of being independent of integration surface.
In light of the above, it seems fair to say that the definitions of mass and angular momentum in
NUT-type spacetimes are inherently ambiguous (mirroring, perhaps, the ambiguities of definition
in the thermodynamic angular momentum). In the following we will use the values that result
from a seemingly natural choice of integration surface, namely a surface of constant t .
5 See [19] for related remarks.
In applying the counterterm method, we will temporarily allow for a nonzero cosmological constant, which we will then send to zero in order to obtain values appropriate to the asymptotically flat case. We refer the reader to Appendix A for the notation and the explicit form of the metric. To compute $\mathfrak{M}$ and $\mathfrak{J}$, we consider the four dimensional action that yields the Einstein equations with a negative cosmological constant $\Lambda$,
$$ I = -\frac{1}{16\pi}\int_{\mathcal{M}} d^4x\,\sqrt{g}\,(R - 2\Lambda) - \frac{1}{8\pi}\int_{\partial\mathcal{M}} d^3x\,\sqrt{h}\,K + I_{ct}, \tag{36} $$
where $\partial\mathcal{M}$ is the boundary for constant (large) r, and $\int_{\partial\mathcal{M}} d^3x$ indicates an integral over this boundary, whose induced metric and extrinsic curvature are respectively $h_{ab}$ and $K_{ab}$, induced from the bulk spacetime metric $g_{\mu\nu}$. $I_{ct}$ is the counter-term action, calculated to cancel the divergences from the first two terms, and we have set the gravitational constant G = 1. The associated boundary stress-energy tensor $T^{ab}$ is obtained by the variation of the action with respect to the boundary metric, $T^{ab} = \frac{2}{\sqrt{h}}\,\frac{\delta I}{\delta h_{ab}}$, the explicit form of which can be found in [24,25].
If the boundary geometries admit an isometry generated by a Killing vector $\xi^a$, then it is straightforward to show that $T^{ab}\xi_b$ is divergenceless, from which it follows (at least locally) that there will be a conserved charge Q between surfaces of constant t. If $\partial/\partial t$ is itself the Killing vector, then $Q = \mathfrak{M}$, the conserved mass associated with the boundary surface at any given time.
For the Killing vector $\partial/\partial t$ we can obtain the conserved mass from the Kerr-Bolt-AdS metric near infinity. [The Kerr-Bolt-AdS metric can be obtained by analytic continuation of the cosmological parameter in the metric (A.1) or (A.4) from positive to negative values.] For vanishing cosmological constant, this reduces (in the limit of infinite radius) simply to the mass parameter m. The other conserved charge, associated with the axial Killing vector $\partial/\partial\phi$, is the conserved angular momentum, $\mathfrak{J} = am/\Xi^2$, which vanishes when a = 0, and reduces to am in the limit of vanishing cosmological constant.
Finally, the action itself for the Kerr-Bolt metric is given by
$$ I_E = \frac{4\pi n m}{p}. \tag{37} $$
The values (37), (38) and (39) can also be obtained directly in the asymptotically flat setting, without introducing a nonzero cosmological constant.
Remark. In arriving at (37), one needs to do a trivial integral over $\phi$ and t. We have used for the value of that integral not $2\pi\beta$, as one might naively expect to obtain, but the area of a unit cell of our lattice, namely $16\pi^2 n/p$.
As explained above, we have computed $\mathfrak{M}$ and $\mathfrak{J}$ ignoring the fact that our surfaces of constant t are both multivalued and singular at the poles. The latter difficulty we have agreed to ignore for now, but the multivaluedness can and should be corrected for by introducing the factor $2/(q_+ + q_-)$ obtained in Section 4.2 above. When we do so, we find
$$ \mathfrak{M} = \frac{2}{q_+ + q_-}\, m, \tag{38} $$
and
$$ \mathfrak{J} = \frac{2}{q_+ + q_-}\, a m. \tag{39} $$
In computing $I$, there was no ambiguity of the sort that affected $\mathfrak{M}$ and $\mathfrak{J}$, but unfortunately, there is a different kind of ambiguity. As explained earlier in connection with Eq. (12), the action $I_E$ belongs in effect to a purely imaginary value of the angular velocity. In order to obtain the true thermodynamic potential $\Phi$, we would need to continue from $\Omega$ to $i\Omega$, but the desired analytic continuation is ambiguous because $I_E(\beta, \Omega)$ is defined only for a discontinuous set of $\Omega$, and where it is defined, it is a multi-valued function of $\beta$ and $\Omega$ (as follows from (41) below). Clearly, this ambiguity will also affect the definition of the true thermodynamic M and J. Once again the topology is getting in the way of the thermodynamics!
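For now, let us just record the value of the pre-continuation action (37). Expressed as a function of p, $q_\pm$, and n, it is
$$ I_E = 4\pi n^2\, \frac{p + 4 q_+ q_-/p}{p\,(q_+ + q_-) + \sqrt{4 q_+ q_- \left(p^2 - (q_+ - q_-)^2\right)}}, \tag{40} $$
which can also be written simply in terms of the rationalized dimensionless variables $\hat\beta = \beta/4\pi n$, $\hat\omega = \beta\Omega/2\pi$ as
$$ I_E = \frac{4\pi n^2}{p}\, \frac{1 + \hat\beta^2 - \hat\omega^2}{\hat\beta + \sqrt{(1 - \hat\omega^2)(\hat\beta^2 - \hat\omega^2)}}. \tag{41} $$
Notice that for our exceptional solution (where $p = \hat\beta = \hat\omega = 1$) this reduces to
$$ I_E = 4\pi n^2 = \frac{\beta^2}{4\pi}. \tag{42} $$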
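As a numerical cross-check, the two forms (40) and (41) of the action can be evaluated side by side. The following Python sketch is ours and purely illustrative: it assumes the identifications $\hat\beta = (q_+ + q_-)/p$ and $\hat\omega = (q_+ - q_-)/p$ (the second follows from (35), the first is the value of $\beta/4\pi n$ implied by (30)), and the function names as well as the label `omega_hat` for the rescaled product $\beta\Omega/2\pi$ are hypothetical.

```python
import math


def action_from_integers(p, q_plus, q_minus, n=1.0):
    """Pre-continuation action I_E written as in Eq. (40)."""
    num = p + 4.0 * q_plus * q_minus / p
    den = p * (q_plus + q_minus) + math.sqrt(
        4.0 * q_plus * q_minus * (p**2 - (q_plus - q_minus) ** 2)
    )
    return 4.0 * math.pi * n**2 * num / den


def hats(p, q_plus, q_minus):
    """Assumed dictionary between the integers and the rescaled variables."""
    beta_hat = (q_plus + q_minus) / p
    omega_hat = (q_plus - q_minus) / p
    return beta_hat, omega_hat


def action_from_hats(p, beta_hat, omega_hat, n=1.0):
    """The same action written as in Eq. (41)."""
    num = 1.0 + beta_hat**2 - omega_hat**2
    den = beta_hat + math.sqrt((1.0 - omega_hat**2) * (beta_hat**2 - omega_hat**2))
    return 4.0 * math.pi * n**2 / p * num / den


if __name__ == "__main__":
    # (p, q_+, q_-) triples obeying |q_+ - q_-| <= p <= q_+ + q_-
    for triple in [(1, 1, 0), (1, 1, 1), (2, 3, 1), (3, 2, 1)]:
        b, w = hats(*triple)
        print(triple, action_from_integers(*triple), action_from_hats(triple[0], b, w))
```

For the exceptional triple (1, 1, 0) both functions return $4\pi n^2$, in agreement with (42); for (1, 1, 1), i.e. a = 0, they return $5\pi n^2$, corresponding to m = 5n/4 in (37).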
7. Do the thermodynamic relationships hold?

Does the first law hold for rotating black holes with NUT charge? If it is meant to include the normal relationship between horizon area and entropy, namely
$$ S = A/4G = 2\pi A/\kappa, \tag{43} $$
(where $\kappa = 8\pi G$ is the rationalized gravitational constant), then it fails already in the non-rotating case, a = 0 [16]. With nonzero a, it seems at first sight to fail even more abjectly, because one cannot even define a consistent entropy on the basis of the formula (11) together with the formulas for the conserved mass and angular momentum given in Section 6. However, the question is complicated by the unusual nature of the manifold of equilibrium states, as it emerged in Section 4 above, which is not simply parameterized by the quantities $\beta$ and $\Omega$. Given this, it seems best to begin by reconsidering the meaning of the first law in as general a setting as possible.
To that end, let E be the manifold of thermodynamic equilibrium states of some system for which the only relevant environmental parameters are the (inverse) temperature $\beta$ and the angular velocity $\Omega$. (Notice that this does not imply that dim E = 2. The case dim E > 2 is perfectly possible for a system undergoing a phase transition; and dim E = 1 also makes sense if $\beta$ and $\Omega$ fail to be independent.) Normally, we would expect a number of state functions to be defined on E, including the thermodynamic potentials $\Phi = \log Z^{-1}$ and S, the intensive quantities $\beta$ and $\Omega$, and the corresponding extensive quantities M (or E) and J. Among these six scalar fields on E there subsist, normally, three basic relationships recalled earlier as Eqs. (4), (6) and (11):
$$ d\Phi = M\,d\beta - J\,d(\beta\Omega), \tag{44} $$
$$ S = \beta M - \beta\Omega J - \Phi, \tag{45} $$
$$ dS = \beta\,dM - \beta\Omega\,dJ. \tag{46} $$
We may thus interpret the question whether the first law holds as asking whether these three equations are verified. Or we may interpret it more broadly to require, in addition, the area law (43).
Note that in order for the points of E to represent genuine equilibria, it is necessary that (46)
hold not only on tangent vectors within E but also for variations that lead off of E; i.e. it must
hold as an equation among 1-forms in the larger state-manifold, not just when pulled back to E.
In addition, the equilibrium will be unstable unless certain subsidiary conditions of a quadratic
nature are fulfilled. (For discussion of some of these points see for example [26] and [27].) To
our knowledge these secondary conditions have never been verified for the black hole case, not
even for a geometry as simple as that of the Schwarzschild metric. Doing this would place black
hole thermodynamics on an even firmer footing than it already rests on.
Of course these equations are not all independent, because (44) and (46) are equivalent in the presence of (45). For example, suppose we begin with a family of Euclidean-signature instantons parameterized by $\beta$ and $\Omega$ and set $\Phi = I_E(\beta, i\Omega)$. If we then define M and J from (44) [assuming this is possible] and S from (45), then (46) becomes a tautology; there is nothing to check except possibly (43). On the other hand, if we know independently any of S, M and J, then there exist consistency conditions to be satisfied. In addition, Eq. (44) itself implies a certain consistency condition since it implies that $\Phi$ is (locally) a function of $\beta$ and $\beta\Omega$ alone, and this is not automatic when dim E > 2 or when $d\beta$ and $d(\beta\Omega)$ fail to span the cotangent space to E.
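The equivalence of (44) and (46) in the presence of (45) can be made explicit in one line. The following worked step is a sketch added for convenience; it assumes only the standard forms $d\Phi = M\,d\beta - J\,d(\beta\Omega)$, $S = \beta M - \beta\Omega J - \Phi$ and $dS = \beta\,dM - \beta\Omega\,dJ$ for the three relations.

```latex
\begin{align*}
dS &= d\bigl(\beta M - \beta\Omega J - \Phi\bigr) \\
   &= \beta\,dM + M\,d\beta - \beta\Omega\,dJ - J\,d(\beta\Omega) - d\Phi \\
   &= \bigl[\beta\,dM - \beta\Omega\,dJ\bigr]
      + \bigl[M\,d\beta - J\,d(\beta\Omega) - d\Phi\bigr].
\end{align*}
```

The second bracket vanishes exactly when (44) holds, and the remaining terms then reproduce (46); conversely, if (46) holds, the second bracket must vanish, which is (44).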
Now let us turn to the case of interest in this paper. As we have seen, the possible reservoirs at infinity, and therefore the possible equilibria, fall into families parameterized by the relatively prime integers p, $q_\pm$. Let $E(p, q_\pm)$ be the equilibrium submanifold belonging to given p, $q_\pm$, or rather the set of instantons belonging to them, since we are working in the approximation (13). What is peculiar about our situation is that the product $\beta\Omega$ is constant on each submanifold $E(p, q_\pm)$, as we learned in Section 4. Its gradient is thus zero and the basic equation (44) reduces simply to
$$ d\Phi = M\,d\beta, \tag{47} $$
while (46) also simplifies to
$$ d(S + \beta\Omega J) = \beta\,dM. \tag{48} $$
Eq. (45) is not affected as such, although J can no longer be read off from $\Phi$ and must come from somewhere else, assuming it is defined at all. The conditions we have to consider in connection with the first law are thus (47), (45), (48), (43). Implicit in these as well are the conditions of agreement between the thermodynamic quantities M and J (when they are defined) and their analogs $\mathfrak{M}$ and $\mathfrak{J}$ as given by (38) and (39):
$$ M = \mathfrak{M}, \qquad J = \mathfrak{J}. \tag{49} $$
For any given equilibrium family $E(p, q_\pm)$, we may thus pose a sequence of questions to bring out the extent to which the above thermodynamic relationships are verified. Beginning with $\Phi$ as approximated (modulo analytic continuation!) by (40) or (42), we can ask whether (47) holds (see footnote 6) for some M. If it does, we can take this to define the thermodynamic mass, and we can further solve (48) to give meaning to the combination $S + \beta\Omega J$. But this is as far as we can go on the basis of $I$ alone, since (47) obviously does not let us define a thermodynamic angular momentum J. Consequently, we cannot define a separate S either, and therefore we cannot give meaning to (46) or check the area law for entropy.
6 That (47) holds says basically that $\Phi$ (and consequently M) is a function of $\beta$ alone.
We pause to note that we could possibly define a thermodynamic angular momentum J if we could somehow extend $\Phi$ off of E (or extend E itself), say by admitting conical singularities in the metric near infinity, but that would likely introduce divergences into $I$. Even if this failed, it might still be possible to bypass the definition of J and define an entropy S directly, if we could
carry over to the KerrNUT case the technique of [17], based on shrinking the metric along .
In order to proceed further, we can bring in our independent knowledge of mass and angular momentum in the role of conserved quantities. Specifically, we can ask whether the thermodynamic mass M (assuming it was well-defined) agrees with the conserved mass $\mathfrak{M}$. If this consistency check succeeds, we can go further and simply define J to be $\mathfrak{J}$. Having done so, we can obtain S from (45). [Eq. (48) will be guaranteed to hold.] We can then ask whether (43) holds.
Finally we can proceed in the opposite direction, starting from (46), with M and J now taken to be $\mathfrak{M}$ and $\mathfrak{J}$, respectively. When (46) is integrable (a nontrivial condition only in the 2-dimensional subcase), we can deduce an entropy S from it, this being, of course, the most venerable order in which to proceed. And we can then obtain $\Phi$ from (45). Having done so, we can check for consistency with $I$, and we can compare S with the area.
An analysis along these lines falls naturally into two subcases, the generic case in which $E(p, q_\pm)$ has only a single degree of freedom, and the exceptional case of E(1, 1, 0), where two independent parameters remain free. Let us begin with the latter, since it seems to hold the greater interest.
The exceptional family of instantons is characterized by the thermodynamic lattice vector coinciding with one of the two polar vectors (or equivalently by $\hat\beta = \hat\omega$, where $\hat\beta = \beta/4\pi n$ and $\hat\omega = \beta\Omega/2\pi$), corresponding to the parameter values p = 1, $q_+ = 1$, $q_- = 0$. For these values, we see from (32) and (30) that
$$ \beta\Omega = 2\pi, \qquad \beta = 4\pi n, \qquad \Omega = 1/2n, \tag{50} $$
and the solution of (21) and (22) is
$$ r_0 = n + a, \tag{51} $$
(which solves both (21) and (22) simultaneously, leaving free both of the parameters n and a). We thus obtain a two-parameter family of solutions (see footnote 7). Substituting (51) into (20) yields for the mass parameter m, simply m = n. According to (42), the Euclidean action is
$$ I_E = 4\pi n^2, \tag{52} $$
and if we ignore the issue of analytic continuation in $\Omega$, we can equate this to $\Phi$, following Eq. (13). The conserved mass and angular momentum reduce for these solutions to (see footnote 8)
$$ \mathfrak{M} = 2n, \qquad \mathfrak{J} = 2an. \tag{53} $$
Finally the horizon area, computed from $ds^2$ with $r = r_0$, t = constant, and $\phi$ in the range $[0, 4\pi/(q_+ + q_-)]$, is
$$ A = 16\pi n a. \tag{54} $$
7 Assuming nothing goes wrong where the horizon meets the polar axes.
8 Strangely, $\mathfrak{M}$ does not take on the Taub-Bolt value of $\frac{5}{4} n$ when $a \to 0$.
We remind the reader of the ambiguities exposed above which affect the values of M, J and A,
especially the value of J, which involves both the question of integration surface and the question
of analytic continuation. The ambiguity in A is similar to that in M but milder thanks to the fact
that = 0 on the horizon (in both the Lorentzian and Euclidean settings).
Now to what extent are these values compatible with our thermodynamic formulas? To start with, the proportionality of $d\Phi$ to $d\beta$ (Eq. (47)) is not a foregone conclusion, since our solution manifold is 2-dimensional. However, this first test is clearly passed, because both $\Phi$ and $\beta$ depend on n alone. The thermodynamic mass M is thus well-defined and given by
$$ d\Phi = M\,d\beta \;\Rightarrow\; M = 2n. \tag{55} $$
As remarked above, we cannot deduce from $\Phi$ a thermodynamic angular momentum J or an entropy S, but we can conclude that
$$ d(S + \beta\Omega J) = \beta\,dM = 4\pi n\,d(2n) = 8\pi n\,dn, \tag{56} $$
from which it follows that, up to a constant of integration which we may set to zero,
$$ S + \beta\Omega J = 4\pi n^2. \tag{57} $$
The first law (11) then holds trivially, for any choice of J, if we set $S = 4\pi n^2 - \beta\Omega J$. The first law is effectively a tautology if approached from this direction.
What is not tautological, though, is the agreement (or lack thereof) between the thermodynamic and conserved masses. Somewhat remarkably, they do agree: $M = \mathfrak{M} = 2n$. (Notice that this is twice the value of the parameter m.) This means that (as a two-line computation corroborates) we would have arrived at the same S had we begun from the first law (46) and plugged in $M = \mathfrak{M}$, $J = \mathfrak{J}$. If we adopt these values, then we can also obtain S from (45), with the result
$$ S = \beta M - \beta\Omega J - \Phi = 4\pi n\,(2n - a - n) = 4\pi n\,(n - a). \tag{58} $$
By way of comparison, the area given in (54) is $A = 16\pi n a$. Clearly, $S \neq \frac{1}{4}A$. Rather their ratio is
$$ \frac{S}{A/4} = \frac{n - a}{a}, \tag{59} $$
which varies from $-1$ to $\infty$ depending on the ratio n/a.
In summary, only three of our tests were nontrivial in this case: that $d\Phi \propto d\beta$, that $M = \mathfrak{M}$, and (if we wish to include it) that S = A/4G. The first two were passed, but not the third.
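The arithmetic of the exceptional family can be replayed numerically. The sketch below is ours and rests on stated assumptions: the mass parameter is taken from the root condition $r_0^2 - 2 m r_0 + n^2 - a^2 = 0$ (our reading of the vanishing-cosmological-constant limit of the Euclidean horizon function of Appendix A), the values (50) to (54) are taken at face value, and all names are hypothetical.

```python
import math


def exceptional_family(n, a):
    """Quantities of the p = 1, q_+ = 1, q_- = 0 family at given n and a."""
    r0 = n + a                                   # Eq. (51)
    # Assumed horizon condition: r0^2 - 2 m r0 + n^2 - a^2 = 0
    m = (r0**2 + n**2 - a**2) / (2.0 * r0)
    beta = 4.0 * math.pi * n                     # Eq. (50)
    beta_omega = 2.0 * math.pi                   # Eq. (50)
    phi = 4.0 * math.pi * n**2                   # Eq. (52), ignoring continuation
    mass = 2.0 * m                               # Eq. (38) with q_+ + q_- = 1
    ang_mom = 2.0 * a * m                        # Eq. (39) with q_+ + q_- = 1
    entropy = beta * mass - beta_omega * ang_mom - phi   # Eq. (45)
    area = 16.0 * math.pi * n * a                # Eq. (54)
    return {"m": m, "beta": beta, "phi": phi, "mass": mass,
            "ang_mom": ang_mom, "entropy": entropy, "area": area}


def first_law_residual(n, a, h=1e-6):
    """Central-difference estimate of dPhi/dn - M dbeta/dn, cf. Eq. (47)."""
    up = exceptional_family(n + h, a)
    down = exceptional_family(n - h, a)
    here = exceptional_family(n, a)
    dphi = (up["phi"] - down["phi"]) / (2.0 * h)
    dbeta = (up["beta"] - down["beta"]) / (2.0 * h)
    return dphi - here["mass"] * dbeta


if __name__ == "__main__":
    s = exceptional_family(2.0, 0.5)
    print("m =", s["m"])
    print("S / (A/4) =", s["entropy"] / (s["area"] / 4.0))
    print("first-law residual =", first_law_residual(2.0, 0.5))
```

With n = 2 and a = 0.5 the sketch returns m = n, a ratio S/(A/4) equal to (n - a)/a as in (59), and a residual of (47) that vanishes to rounding accuracy.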
Remark. That S can be negative suggests that something is wrong. If the long sought analytic continuation were to reverse the sign of $\Omega J$, as it normally does (and if it had no further effect on $\Phi$!), then the second term in (58) would change sign and we would have instead of (59),
$$ \frac{S}{A/4} = \frac{n + a}{a} \geq 1. \tag{60} $$
Now let us turn to the thermodynamics of the generic solutions. Here even fewer nontrivial tests remain, since there is a single free parameter and almost all the relevant results follow from dimension counting. The only relationship that could go wrong (aside from (43) whose failure is already familiar) would be equality of the thermodynamic and conserved masses. However, this agreement follows very simply from the form of Eqs. (41), (38) and (30). According to the first of these equations, $I_E$ takes the form
$$ I_E = \frac{4\pi}{p}\, f(\hat\beta, \hat\omega)\, n^2, \tag{61} $$
which, except for the prefactor $4\pi/p$, is determined by the fact that the action has dimensions of length squared (here, as in (41), $\hat\beta = \beta/4\pi n$ and $\hat\omega = \beta\Omega/2\pi$). From (38) we have similarly
$$ \mathfrak{M} = \frac{2}{q_+ + q_-}\, f(\hat\beta, \hat\omega)\, n, \tag{62} $$
for the same function f, as one sees from (37) and (38). And for $\beta$ we have from (30),
$$ \beta = \frac{4\pi}{p}\,(q_+ + q_-)\, n. \tag{63} $$
These three equations imply immediately that
$$ d\Phi = \mathfrak{M}\, d\beta, \tag{64} $$
independently of the detailed form of $f(\hat\beta, \hat\omega)$. We see in particular that the same equality would hold if we analytically continued f to $f(\hat\beta, i\hat\omega)$.
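Following along as before, we can deduce
$$ S + \beta\Omega J = \frac{4\pi n^2}{p}\, f(\hat\beta, \hat\omega), \tag{65} $$
where the explicit form of the function f (with $\hat\beta$ and $\hat\omega$ the rescaled variables of (41)) is
$$ f(\hat\beta, \hat\omega) = \frac{1 + \hat\beta^2 - \hat\omega^2}{\hat\beta + \sqrt{(1 - \hat\omega^2)(\hat\beta^2 - \hat\omega^2)}} \tag{66} $$
as can be read off from the equations referred to above. If we identify J with $\mathfrak{J}$, this yields for the entropy
$$ S = \frac{4\pi n^2 f(\hat\beta, \hat\omega)}{p} \left( 2 - \frac{1}{\hat\beta} \sqrt{\frac{\hat\beta^2 - \hat\omega^2}{1 - \hat\omega^2}} \right). \tag{67} $$
The entropy is positive for $0 \leq \hat\omega \leq \sqrt{3}/2$ (for $\hat\beta \geq 1$) while for $\hat\omega > \sqrt{3}/2$ it takes positive as well as negative values, depending on the value of $\hat\beta$ (Fig. 2). (Once again the analytic continuation in $\hat\omega$ could change the story. If we naively substitute $i\hat\omega$ for $\hat\omega$, the resulting expression for S is manifestly positive.) By way of comparison, the horizon area is given by
$$ A = \frac{16\pi n^2}{p\,\hat\omega^2} \left( -\hat\beta + \sqrt{\frac{\hat\beta^2 - \hat\omega^2}{1 - \hat\omega^2}} \right). \tag{68} $$
For $\hat\beta = \hat\omega = 1$, corresponding to the exceptional family of instantons with $p = q_+ = 1$ and $q_- = 0$, the area must be computed separately. One finds $16\pi n a$.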
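The sign behaviour of the generic entropy, before and after the naive continuation, is easy to probe numerically. The sketch below is ours and assumes the following readings of (66) to (68): $f = (1 + \hat\beta^2 - \hat\omega^2)/(\hat\beta + \sqrt{(1-\hat\omega^2)(\hat\beta^2-\hat\omega^2)})$, $S = (4\pi n^2 f/p)(2 - R/\hat\beta)$ and $A = (16\pi n^2/p\hat\omega^2)(R - \hat\beta)$ with $R = \sqrt{(\hat\beta^2-\hat\omega^2)/(1-\hat\omega^2)}$. The functions take $\hat\omega^2$ as argument so that the substitution $\hat\omega \to i\hat\omega$ amounts to a sign flip; all names are hypothetical.

```python
import math


def f(beta_hat, omega_sq):
    """Assumed form of f in Eq. (66); omega_sq stands for omega_hat**2."""
    root = math.sqrt((1.0 - omega_sq) * (beta_hat**2 - omega_sq))
    return (1.0 + beta_hat**2 - omega_sq) / (beta_hat + root)


def radius_ratio(beta_hat, omega_sq):
    """R = sqrt((beta_hat^2 - omega_hat^2) / (1 - omega_hat^2))."""
    return math.sqrt((beta_hat**2 - omega_sq) / (1.0 - omega_sq))


def entropy(p, beta_hat, omega_sq, n=1.0):
    """Assumed form of Eq. (67)."""
    prefactor = 4.0 * math.pi * n**2 * f(beta_hat, omega_sq) / p
    return prefactor * (2.0 - radius_ratio(beta_hat, omega_sq) / beta_hat)


def area(p, beta_hat, omega_sq, n=1.0):
    """Assumed form of Eq. (68); requires omega_sq != 0."""
    return 16.0 * math.pi * n**2 / (p * omega_sq) * (
        radius_ratio(beta_hat, omega_sq) - beta_hat
    )


if __name__ == "__main__":
    for beta_hat, omega_hat in [(1.5, 0.8), (3.0, 0.95)]:
        before = entropy(1, beta_hat, omega_hat**2)
        after = entropy(1, beta_hat, -(omega_hat**2))  # omega_hat -> i omega_hat
        print(beta_hat, omega_hat, before, after)
```

In this sketch the entropy at $(\hat\beta, \hat\omega) = (3, 0.95)$ is negative before continuation and positive after it, while for $\hat\omega \to 0$ at $\hat\beta = 2$ the area tends to $12\pi n^2$.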
Fig. 2. Entropy as function of $\hat\beta$ and $\hat\omega$, where we set n = p = 1.
Fig. 3. Entropy after analytic continuation as function of $\hat\beta$ and $\hat\omega$, where we set n = p = 1.
In connection with the above, we note that one can show that $0 \leq \hat\omega^2 \leq 1 \leq \hat\beta$, with $\hat\beta = 1$ only if $\hat\omega = 1$; inequalities which can also be expressed as $|q_+ - q_-| \leq p \leq q_+ + q_-$. Strictly speaking, however, these inequalities only apply to $\beta_H$ and $\Omega_H$, not directly to the corresponding thermodynamic parameters as such. See the Remark between Eqs. (34) and (35).
In the generic case, then, only two of our tests have proved to be nontrivial: that $M = \mathfrak{M}$, and (if we wish to include it) that S = A/4G. The first succeeds, the second fails.
In Figs. 2 and 3 the behaviours of entropy before and after analytic continuation are presented.
8. Open questions and closed ones

To claim that black hole thermodynamics in the presence of NUT charge n is in satisfactory shape would be to ignore a whole host of difficulties that stem from the topological twisting that arises when $n \neq 0$. One of these difficulties (not previously appreciated as far as we know) is that suitable boundary conditions do not exist for every possible value of the thermodynamic parameters $\beta$ and $\Omega$. And even when such boundary conditions do exist, there is not always an instanton to satisfy them, if we exclude singularities like Misner strings and conical points (e.g. there are no instantons for p > 2, $q_\pm = 1$). This suggests that new instantons might be found corresponding to these "unfulfilled boundary conditions". However, the question is confused by the fact that different choices of the thermodynamic vector can give rise to the same lattice L. (See the Remark between Eqs. (34) and (35).)
However, if we limit ourselves to the Kerr-Bolt line element (17), we can classify all nonsingular instantons. We find solutions only for certain triples of relatively prime integers p, $q_\pm$. In particular, the product $\beta\Omega$, with $\beta$ and $\Omega$ defined by (34), is restricted to rational multiples of $2\pi$, as followed already from the asymptotic analysis. These (purely classical) "quantization rules" are perhaps the most surprising outcome of the above analysis.
The mathematical problem of finding all Kerr-Bolt instantons is thus largely solved. The thermodynamic meaning of these solutions, on the other hand, is still in the dark to a significant extent. The obscurities begin with the correct identification of the parameters $\beta$ and $\Omega$ in terms of the asymptotic metric, and they end with the unknown reason for the failure of the normal area law for the entropy. In between are numerous other unanswered questions: how to define conserved quantities like energy when the t = 0 hypersurface is not closed; how to perform the analytic continuation in $\Omega$ that would transform $I_E$ into (an approximation to) the seminal thermodynamic potential $\Phi$; how to recover a thermodynamic angular momentum J when $\beta\Omega$ is apparently not free to vary; how to interpret a lens space topology at infinity (especially when it is not a fibration over $S^2$); etc.
This paper had its origins in the failure of the first law to hold when one simplistically carries over the n = 0 formulas (for the Kerr instanton) to the $n \neq 0$ situation. We have seen that this failure was misleading, because the simplistic formulas are incorrect on several counts. For example, the correct value of the $\int d\phi\, dt$ integral that enters into $I_E$ is not $2\pi\beta$, but $16\pi^2 n/p$. When these corrections are made, and when the limitations on $\beta$ and $\Omega$ are taken into account, the various thermodynamic relationships (including the first law) appear to be satisfied to the extent that they continue to make sense.
However, the obscurities, some of which we enumerated above, leave our results in the previous section somewhat in limbo. Perhaps a clearer insight into the analytic continuation question could be had, and could improve one's understanding. Or perhaps the attempt to do black hole thermodynamics with $n \neq 0$ is a case of pushing concepts outside their valid domain of application. One way to help decide which is correct would be to study non-rotating but charged instantons with $n \neq 0$, where analogous difficulties with the naive thermodynamic relationships are known to arise.
Another interesting question is why the equality S = A/4G breaks down when $n \neq 0$. In light of [17], this failure appears to be bound up with the failure of a Cauchy surface to exist, but it would be nice to have a more quantitative understanding of the discrepancy.
In concluding, we would like to return to our exceptional solution in order to point out a suggestive analogy it bears to a thermodynamic system undergoing a first order phase transition like the melting of ice. In such a situation two phases are present simultaneously and a new equilibrium parameter arises in addition to the intensive thermodynamic variables characterizing the reservoir, namely a parameter whose variation signifies the conversion of one phase into the other. For a reservoir characterized by a temperature and an angular velocity, such a transition will typically occur along a critical line in the $\beta$-$\Omega$ plane. Now in our situation the analogous line is given by $\beta\Omega = 2\pi$, which is a hyperbola in the $\beta$-$\Omega$ plane or a horizontal line in the plane of $\beta$ and $\beta\Omega$. (In fact, the instantons we have found yield a bounded, and totally disconnected, subset of the plane, and our critical line lies on the boundary of the region occupied by this subset.) Along this line, there arises, as we have seen, a new parameter that becomes independent of n. Let us take this parameter to be a, varying between 0 and $\infty$. [Regularity of the line-element (17) (in the given coordinates) restricts $r_0$ to the range $(n, \infty)$, while $a = r_0 - n$ for our solution.] In the spirit of our analogy, we could think of a as analogous to the amount of ice present in a mass of freezing water. But does some substance actually disappear when a = 0? In a sense, the answer is yes and the substance is the black hole horizon! With $r_0 = n + a$, Eq. (17) becomes spherically symmetric, and the area of the horizon reduces simply to $4\pi(r_0^2 - n^2) = 4\pi a(a + 2n)$. Thus, a is essentially just measuring the horizon radius in this case. When $a \to 0$ the horizon shrinks away like a chunk of ice in a warming lake. We have here, perhaps, one more instance in which the formation or disappearance of a black hole bears all the marks of a thermodynamic phase transition.
Acknowledgements
This work was supported by the Natural Sciences and Engineering Research Council of
Canada and by the National Science Foundation of the United States under grant PHY-0404646.
Research at Perimeter Institute for Theoretical Physics is supported in part by the Government of
Canada through NSERC and by the Province of Ontario through MRI. We thank Sean Hartnoll
for suggesting Ref. [12] and an anonymous referee for suggesting some of the references in [1].
Appendix A. Kerr-Bolt-(A)dS spacetimes
The Lorentzian geometry of Kerr spacetimes with NUT charge and nonzero cosmological
constant is given by the line element
ds 2 =

2
L (r)
2
dt
+
2n
cos
a
sin
d
L2 L2

2 2 dr 2 L2 d 2
L ( ) sin2
a dt r 2 + n2 + a 2 d + L
+
,
2
2
L (r)
L ( )
L L
(A.1)
where
L2 = r 2 + (n + a cos )2 ,
r 2 (r 2 + 6n2 + a 2 )
(3n2 2 )(a 2 n2 )
+ r 2 2mr
,
2

2
a cos (4n + a cos )
,
L ( ) = 1 +
2
a2
L = 1 + 2 ,

L (r) =
(A.2)
which are exact solutions of the Einstein equations. The event horizons of the spacetime are
given by the singularities of the metric function which are the real roots of L (r) = 0. These are
determined by the solutions of the equation

4
2 2
r+
(A.3)
6n2 a 2 + 2m2 r+ + 3n2 2 a 2 n2 = 0.
r+
The Euclidean section for this class of metrics is given by
ds 2 =

2
E (r)
dt 2n cos a sin2 d
2
2
E E
+
2
2 E2 dr 2 E2 d 2
E ( ) sin2
2
2
a
dt
r
d +
a
+
,
E (r)
E ( )
E2 E2
(A.4)
where we have analytically continued the t coordinate, the NUT charge and the rotation parameter to imaginary values, yielding
E2 = r 2 (n + a cos )2 ,
r 2 (r 2 6n2 a 2 )
(3n2 + 2 )(a 2 n2 )
2
+
r
2mr
,
2
2
a cos (4n + a cos )
,
E ( ) = 1
2
a2
E = 1 2 .
(A.5)

The Euclidean section exists only for values of r such that the function $\Delta_E(r)$ is positive valued. The horizons are located at the zeros of $\Delta_E(r)$, which we shall denote by $r = r_0$. Moreover, the range of $\theta$ depends strongly on the values of the NUT charge n, the rotational parameter a and the cosmological constant $\Lambda = 3/\ell^2$, taken here to be positive (for the solution with $\Lambda < 0$, replace $\ell^2$ with $-\ell^2$ in the preceding equations). The angular velocity of the horizon in the
Lorentzian geometry is given by

gt
a
H =
(A.6)
=
,
g r=r0 r02 + n2 + a 2
E (r) =
and so the surface gravity of the cosmological horizon can be calculated to give

dL
1
,
=
2(r02 + n2 + a 2 )L dr r=r0
(A.7)
where the Killing vector = + H is normal to the horizon surface r = r0 .

We first note that there are no pure NUT solutions for nonzero values of a. We demonstrate this
as follows.9 Since is a Killing vector, for any constant -slice near the horizon, additional
conical singularities will be introduced in the (t, r) section unless the period of t is t = 2
|| .
Furthermore, there are string-like singularities along the = 0 and = axes for general values
of the parameters. These can be removed by making distinct shifts of the coordinate t in the
direction near each of these locations. These must be geometrically compatible [25], yielding the
requirement that the period of t should be t = 4n = q16n
. Demanding the absence of both
+ +q
9 It also follows purely topologically from the fact that the surfaces of constant r are not homeomorphic to S 3 .
conical and DiracMisner singularities, we get the relation

k
8n
,
=
q + + q
(A.8)
where k is any non-zero positive integer. Demanding the existence of a pure NUT solution at
r = r0 is equivalent to the requirement that the area of the surface of the fixed point set of the
Killing vector /t,

2
(A.9)
E (r0 ) + 2 r02 n2 a 2 ,
E
2
A sin d . In other words, this surface is of zero dimension. This
vanish, where = 0 2n cos
E ()
can only occur for special values of the mass parameter. However, if we select such a value for this parameter, we find an inconsistency with the relation (A.8), which must hold for the spacetime with NUT charge.
Hence we conclude that the only spacetimes with NUT charge and rotation are Taub-Bolt-Kerr-(A)dS spacetimes, where the term "Bolt" refers to the fact that the dimensionality of the fixed point set of $\partial/\partial t$ is two.
A=
Appendix B. Commensurability conditions

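Let $\vec u$, $\vec v$, $\vec w$ be three vectors such that
$$p\,\vec u + q\,\vec v + r\,\vec w = 0, \tag{B.1}$$
with $p, q, r \in \mathbb{Z}$. Without loss of generality, $\gcd(p, q, r) = 1$. Assume that $\vec u$, $\vec v$, $\vec w$ are minimal members of the lattice $L = \mathrm{span}_{\mathbb{Z}}\{\vec u, \vec v, \vec w\}$. Then $p$, $q$ and $r$ are pairwise relatively prime.
To prove this, assume the opposite. Say for example that $p$ and $q$ have a common factor $\lambda > 1$: $p = \lambda a$, $q = \lambda b$; then $r/\lambda \notin \mathbb{Z}$, since $\gcd(p, q, r) = 1$. From Eq. (B.1), we conclude $(r/\lambda)\,\vec w \in L$ because $a, b \in \mathbb{Z}$. On the other hand, since $r/\lambda \notin \mathbb{Z}$, we can set $r/\lambda = c + \epsilon$ where $c \in \mathbb{Z}$ and $0 < \epsilon < 1$. From $(r/\lambda)\,\vec w \in L$ and $c\,\vec w \in L$, it follows that $(r/\lambda)\,\vec w - c\,\vec w \in L$, whence $\epsilon\,\vec w \in L$, contradicting our assumption that $\vec w$ was minimal.
Conversely, let $\vec u$, $\vec v$ and $\vec w$ be three vectors not all parallel to each other and assume that they satisfy (B.1) with $p$, $q$, $r$ pairwise relatively prime. Then $\vec u$, $\vec v$ and $\vec w$ are minimal in $L$.
Let us prove for example that $\vec v$ is minimal. Either $\vec u$ or $\vec w$ is independent of $\vec v$, say it is the former. Now take any general vector $\vec x$ in $L$ as
$$\vec x = a\,\vec u + b\,\vec v + c\,\vec w = \left(a - \frac{pc}{r}\right)\vec u + \left(b - \frac{qc}{r}\right)\vec v,$$
where $a, b, c \in \mathbb{Z}$. If this vector is proportional to $\vec v$ then necessarily $r \mid pc$, because $\vec u$ and $\vec v$ are linearly independent. Since $r$ and $p$ are relatively prime, $r \mid c$, whence $qc/r \in \mathbb{Z}$, whence $\vec x$ is an integer multiple of $\vec v$. This shows $\vec x$ is not a submultiple of $\vec v$. Hence $\vec v$ is minimal.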
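The statement can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes the concrete choice $\vec u = (1, 0)$, $\vec v = (0, 1)$ in the plane, takes "minimal" to mean that no $\vec x/k$ with integer $k \ge 2$ lies in $L$, and replaces the proof by a bounded brute-force search. The function names and the search bounds are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import gcd


def third_vector(p, q, r, u, v):
    """Solve p*u + q*v + r*w = 0 for w, using exact rational arithmetic."""
    return tuple(-(p * ui + q * vi) / Fraction(r) for ui, vi in zip(u, v))


def pairwise_coprime(p, q, r):
    """True if p, q and r are pairwise relatively prime."""
    return gcd(p, q) == 1 and gcd(q, r) == 1 and gcd(p, r) == 1


def is_minimal(x, generators, coeff_bound=6, max_divisor=6):
    """Bounded search (an illustration, not a proof).

    Returns False if some x/k, k = 2..max_divisor, equals an integer
    combination of the generators with coefficients in
    [-coeff_bound, coeff_bound]; returns True otherwise.
    """
    submultiples = {tuple(xi / k for xi in x) for k in range(2, max_divisor + 1)}
    dim = len(x)
    coeff_range = range(-coeff_bound, coeff_bound + 1)
    for coeffs in product(coeff_range, repeat=len(generators)):
        point = tuple(
            sum(c * g[i] for c, g in zip(coeffs, generators)) for i in range(dim)
        )
        if point in submultiples:
            return False
    return True


def minimality_report(p, q, r):
    """Minimality of u, v, w in span_Z{u, v, w} for the assumed u, v."""
    u = (Fraction(1), Fraction(0))
    v = (Fraction(0), Fraction(1))
    w = third_vector(p, q, r, u, v)
    generators = (u, v, w)
    return [is_minimal(x, generators) for x in generators]
```

For instance, `minimality_report(1, 2, 3)` can be compared with `minimality_report(2, 2, 1)`: in the second case $p$ and $q$ share the factor 2 and $\vec w/2 = -\vec u - \vec v$ lies in the lattice, so $\vec w$ is not minimal.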
Appendix C. Some information on Lens spaces
A lens space is a 3-manifold that can be obtained as the quotient of the 3-sphere S 3 by the
action of a cyclic group Zp = Z mod p = Z/pZ. It can also be obtained by sewing together two
solid tori, but the presentation as a quotient is more useful in the present context.
Let $S^3$ be represented as the unit sphere in $\mathbb{R}^4$, or rather as the set of pairs of complex numbers $\zeta = (\zeta_1, \zeta_2)$ such that $|\zeta_1|^2 + |\zeta_2|^2 = 1$. If we think of $\zeta_1$ and $\zeta_2$ as the components of a spinor $\zeta$, then $S^3$ is simply the set of unit spinors, i.e. spinors such that $\zeta^\dagger \zeta = 1$. We can relate this type of
coordinatization to polar coordinates r, , , by setting

= rei/2 cos(/2)ei/2 , sin(/2)ei/2 .

(C.1)
Here ranges from 0 to 2 and an elementary domain for and can be taken to be [0, 2]
[0, 4], or more symmetrically, the square in the -plane spanned by the vectors, =
2(/ /), introduced in (26), (27).
Now let $q_1, q_2 \in \mathbb{Z}_p$ be two integers taken modulo $p$, and let $1^x$ denote the exponential $\exp(2\pi i x)$. The transformation $g$ defined by
$$\zeta_j \to 1^{q_j/p}\,\zeta_j \qquad (j = 1, 2), \tag{C.2}$$
generates the action of a cyclic group G on S 3 . In terms of the angles, and , it is clear from
(C.1) that g is equivalent to translation by the vector
=
q2
q1
+ + .
p
p
(C.3)
Let $L(p; q_1, q_2)$ be the quotient of the 3-sphere by this action:
$$L(p; q_1, q_2) = S^3/G. \tag{C.4}$$
Plainly, $G$ is a cyclic group of order at most $p$, since $g^p = 1$. Without modifying $g$ we can clearly divide out any common factor in $p$, $q_1$, $q_2$; then $\gcd(p, q_1, q_2) = 1$. Having done so, we claim further that both $q_1$ and $q_2$ must be relatively prime to $p$ (that is, the phases $1^{q_j/p}$ must be primitive $p$th roots of unity); for in the contrary case we would have (say) $r \mid p$, $r \mid q_1$ for some $r > 1$, whence
$$g^{p/r} = \mathrm{diag}\left(1^{\frac{p}{r}\frac{q_1}{p}},\, 1^{\frac{p}{r}\frac{q_2}{p}}\right) = \mathrm{diag}\left(1^{q_1/r},\, 1^{q_2/r}\right) = \mathrm{diag}\left(1,\, 1^{q_2/r}\right), \tag{C.5}$$
since $r \mid q_1$. But this transformation has the north pole $(\zeta_1, \zeta_2) = (1, 0)$ as a fixed point, while it is not the identity in $G$ because $r$ cannot divide $q_2$ if it divides $p$ and $q_1$. Thus, $S^3/G$ would not be a manifold. Conversely, if $q_1$ and $q_2$ are both relatively prime to $p$ then $G$ has no fixed points and then $S^3/G$ is a manifold. We thus obtain a lens space $L(p; q_1, q_2)$ for every triple of integers $(p, q_1, q_2)$, where the $q_j$ are taken modulo $p$ and must be prime to it. Comparing with (28) then shows that the instanton constructed in Section 4 with parameters $p$, $q_\pm$ is topologically the lens space $L(p; q_-, q_+)$.
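The freeness criterion above can be checked by direct enumeration. The sketch below is illustrative only and assumes the diagonal action of (C.2): the power $g^k$ multiplies the two complex components by $\exp(2\pi i k q_1/p)$ and $\exp(2\pi i k q_2/p)$, so it fixes a point of $S^3$ exactly when one of these two phases equals 1. The function names are hypothetical.

```python
from math import gcd


def nontrivial_fixed_point_exists(p, q1, q2):
    """True if some power g^k that is not the identity fixes a point of S^3."""
    for k in range(1, p):
        phase1_trivial = (k * q1) % p == 0
        phase2_trivial = (k * q2) % p == 0
        if phase1_trivial and phase2_trivial:
            continue  # g^k acts as the identity, so it is trivial in G
        if phase1_trivial or phase2_trivial:
            return True  # fixed points lie on one of the coordinate circles
    return False


def acts_freely(p, q1, q2):
    """True if every non-identity element of G moves every point of S^3."""
    return not nontrivial_fixed_point_exists(p, q1, q2)


def both_coprime_to_p(p, q1, q2):
    """The criterion stated in the text: q1 and q2 both prime to p."""
    return gcd(q1, p) == 1 and gcd(q2, p) == 1
```

For example, `acts_freely(5, 1, 2)` holds, while `acts_freely(6, 2, 1)` fails because $g^3$ fixes the circle $\zeta_2 = 0$.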
In fact, we can go further and require the qj to be relatively prime to each other as well. If
they were not, they would have a common factor r, so without changing G, we could replace qj
by qj /r, which modulo p coincides with (1/r)qj , where (1/r) is the reciprocal of r in Zp . (This
reciprocal exists since r and p are relatively prime.) It follows that we exhaust the lens spaces by
letting q1 and q2 run over all pairs of integers in Zp that are relatively prime to each other and
to p. In fact, the same reasoning demonstrates that we leave nothing out by limiting ourselves to
the form L(p; q, 1). This in turn allows a simpler notation L(p, q) for lens spaces, related to the
earlier notation by
$$L(p; q_1, q_2) = L(p; q_1/q_2, 1) \equiv L(p, q_1/q_2), \tag{C.6}$$
where, of course, the division is done in Zp . (This equation breaks down when p = 1 because
0 and 1 are not distinct elements in Z1 and 0/0 is not defined a priori. For this special case, the
natural convention to adopt is 0/0 = 0 so that L(1; 0, 0) and L(1, 0) both denote the 3-sphere
S 3 .)
Finally, we can ask which of the $L(p; q_1, q_2)$ are homeomorphic to each other. First of all, we can observe that $L(p; q_1, q_2)$ depends only on the ratio $q = q_1/q_2$, as we have already seen in effect, and as follows explicitly from the simple calculation $L(p; q_1, q_2) = L(p; rq_1, rq_2) = L(p; q, 1) = L(p, q)$, where $r = 1/q_2$, the reciprocal of $q_2$ in $\mathbb{Z}_p$. Now by exchanging the two complex components $\zeta_1$ and $\zeta_2$ (a diffeomorphism of $S^3$), we learn that $L(p, q) = L(p, 1/q)$. Similarly we have the equivalence $L(p, q) = L(p, -q)$, by complex conjugation of $\zeta_2$ (say). Combining these yields the set of equivalences,
$$L(p, q) = L\left(p, \pm q^{\pm 1}\right). \tag{C.7}$$
According to [28], there are no other equivalences among the $L(p, q)$. That is, $L(p, q)$ is homeomorphic to $L(p', q')$ if and only if $p' = p$ and $q' = \pm q^{\pm 1}$, where the latter equality is modulo $p$, i.e. $q$ and $q'$ are regarded as elements of $\mathbb{Z}_p$. If $L(p, q)$ is treated as an oriented manifold, then the second equality should be replaced by simply $q' = q^{\pm 1}$.
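The reduction (C.6) and the equivalences (C.7) are easy to tabulate for small $p$. The following is a rough sketch under stated assumptions: $p \ge 2$, all $q$ prime to $p$, and Python 3.8 or later so that the built-in `pow(q, -1, p)` returns the reciprocal in $\mathbb{Z}_p$. The function names are hypothetical.

```python
from math import gcd


def reduce_lens(p, q1, q2):
    """Return q = q1/q2 in Z_p, so that L(p; q1, q2) = L(p, q)."""
    return (q1 * pow(q2, -1, p)) % p


def equivalent_parameters(p, q):
    """All q' in Z_p of the form +q, -q, +1/q, -1/q (unoriented case)."""
    q_inv = pow(q, -1, p)
    return {q % p, (-q) % p, q_inv, (-q_inv) % p}


def are_homeomorphic(p1, q1, p2, q2):
    """Criterion quoted from [28]: equal p, and q' = +-q^(+-1) modulo p."""
    return p1 == p2 and (q2 % p2) in equivalent_parameters(p1, q1)


def homeomorphism_classes(p):
    """Partition the admissible q (those prime to p) into classes."""
    remaining = sorted(q for q in range(1, p) if gcd(q, p) == 1)
    classes = []
    while remaining:
        cls = sorted(equivalent_parameters(p, remaining[0]))
        classes.append(cls)
        remaining = [q for q in remaining if q not in cls]
    return classes
```

With these definitions, `homeomorphism_classes(7)` groups the six admissible values of $q$ into two classes, so that up to homeomorphism only $L(7, 1)$ and $L(7, 2)$ occur for $p = 7$.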
The lens space $L(1; 1, 1)$ is simply $S^3$ itself, and the above vector in this case generates (as a continuous symmetry, not a discrete one) the Hopf fibration of $S^3$ over $S^2$. Similarly the $L(p; 1, 1) = L(p, 1)$ are circle bundles over $S^2$ with $p$ twists, that is, monopoles of charge $p$ in the Kaluza-Klein interpretation. Since there are no other circle bundles over $S^2$, it follows that all of the other lens spaces are something else; and the corresponding instantons cannot possess a full $S^2$'s worth of directions at infinity.
References
[1] S.W. Hawking, C.J. Hunter, Phys. Rev. D 59 (1999) 044025;
G.W. Gibbons, S.W. Hawking, Phys. Rev. D 15 (1977) 2752;
S.W. Hawking, G.T. Horowitz, Class. Quantum Grav. 13 (1996) 1487;
J.D. Brown, J. Creighton, R.B. Mann, Phys. Rev. D 50 (1994) 6394;
R.D. Sorkin, D. Sudarsky, Class. Quantum Grav. 16 (1999) 3835;
R.D. Sorkin, Ten theses on black hole entropy, Stud. History Philos. Mod. Phys. 36 (2005) 291, hep-th/0504037.
[2] M.H. Dehghani, R.B. Mann, Phys. Rev. D 72 (2005) 124006.
[3] H.G. Svendsen, Phys. Rev. D 71 (2005) 044027.
[4] R.B. Mann, C. Stelea, Phys. Lett. B 632 (2006) 537;
R.B. Mann, C. Stelea, Phys. Rev. D 72 (2005) 084032.
[5] D. Astefanesei, R.B. Mann, E. Radu, JHEP 0501 (2005) 049;
D. Astefanesei, R.B. Mann, E. Radu, Phys. Lett. B 620 (2005) 1.
[6] Z.-W. Chong, G.W. Gibbons, H. Lu, C.N. Pope, Phys. Lett. B 609 (2005) 124.
[7] R. Clarkson, L. Fatibene, R.B. Mann, Nucl. Phys. B 652 (2003) 348.
[8] K. Zoubos, JHEP 0212 (2002) 037.
[9] A. Awad, A. Chamblin, Class. Quantum Grav. 19 (2002) 2051.
[10] S.W. Hawking, C.J. Hunter, D.N. Page, Phys. Rev. D 59 (1999) 044033.
[11] A. Chamblin, R. Emparan, C.V. Johnson, R.C. Myers, Phys. Rev. D 59 (1999) 064010.
[12] S.A. Hartnoll, S.P. Kumar, JHEP 0506 (2005) 012.
[13] R. Clarkson, A.M. Ghezelbash, R.B. Mann, Phys. Rev. Lett. 91 (2003) 061301.
[14] R. Clarkson, A.M. Ghezelbash, R.B. Mann, Nucl. Phys. B 674 (2003) 329.
[15] R.D. Sorkin, A Kaluza-Klein monopole, Phys. Rev. Lett. 51 (1983) 87;
R.D. Sorkin, Phys. Rev. Lett. 54 (1985) 86, Erratum, reprinted in: T. Appelquist, A. Chodos, P.G.O. Freund (Eds.), Modern Kaluza-Klein Theories, Benjamin, 1986.
[16] R.B. Mann, Phys. Rev. D 61 (2000) 084013.
[17] R.D. Sorkin, S. Surya, Why do instantons always yield area as entropy?, in preparation.
[18] R.D. Sorkin, The Kaluza-Klein Monopole and Positive Energy in Five Dimensions, Instituto de Ciencias Nucleares, Universidad Nacional Autónoma de México, Mexico City, unpublished lectures.
[19] G. Holzegel, Class. Quantum Grav. 23 (2006) 3951.
[20] R.D. Sorkin, Conserved quantities as action variations, in: J.W. Isenberg (Ed.), Proceedings of a Conference on
Mathematics and General Relativity, Santa Cruz, California, June 1986, in: American Mathematical Society Contemporary Mathematics Series, vol. 71, American Mathematical Society, Providence, 1998;
R.D. Sorkin, The gravitational-electromagnetic Noether-operator and the second-order energy flux, Proc. R. Soc.
London A 435 (1991) 635.
[21] R.D. Sorkin, M. Varadarajan, Class. Quantum Grav. 13 (1996) 1949.
[22] J. Lee, R.D. Sorkin, Commun. Math. Phys. 116 (1988) 353.
[23] L. Bombelli, R.K. Koul, G. Kunstatter, J. Lee, R.D. Sorkin, Nucl. Phys. B 289 (1987) 735.
[24] S. Das, R.B. Mann, JHEP 0008 (2000) 033.
[25] A.M. Ghezelbash, R.B. Mann, JHEP 0201 (2002) 005.
[26] R.D. Sorkin, Astrophys. J. 249 (1981) 254;
R.D. Sorkin, Astrophys. J. 257 (1982) 847.
[27] R.D. Sorkin, Two topics concerning black holes: Extremality of the energy, fractality of the horizon, in: S.A. Fulling (Ed.), Proceedings of the Conference on Heat Kernel Techniques and Quantum Gravity, Winnipeg, Canada, August 1994, in: Discourses in Mathematics and Its Applications, vol. 4, University of Texas Press, 1995, pp. 387-407, gr-qc/9508002.
[28] Encyclopedic Dictionary of Mathematics, second ed., MIT Press, 1993, 91.C Lens Spaces;
A. Hatcher, The classification of 3-manifolds: A brief overview, http://www.math.cornell.edu/~hatcher/Papers/3Msurvey.pdf.
Tri-bimaximal neutrino mixing and quark masses from a discrete flavour symmetry
Ferruccio Feruglio a, Claudia Hagedorn b, Yin Lin a,*, Luca Merlo a
a Dipartimento di Fisica 'G. Galilei', Università di Padova, INFN, Sezione di Padova, Via Marzolo 8, I-35131 Padua, Italy
b Max-Planck-Institut für Kernphysik, Postfach 10 39 80, 69029 Heidelberg, Germany
Received 4 March 2007; accepted 3 April 2007

Abstract
We build a supersymmetric model of quark and lepton masses based on the discrete flavour symmetry group $T'$, the double covering of $A_4$. In the lepton sector our model is practically indistinguishable from recent models based on $A_4$ and, in particular, it predicts a nearly tri-bimaximal mixing, in good agreement with present data. In the quark sector a realistic pattern of masses and mixing angles is obtained by exploiting the doublet representations of $T'$, not available in $A_4$. To this purpose, the flavour symmetry $T'$ should be broken spontaneously along appropriate directions in flavour space. In this paper we fully discuss the related vacuum alignment problem, both at the leading order and by accounting for small effects coming from higher-order corrections. As a result we get the relations $\sqrt{m_d/m_s} \approx |V_{us}|$ and $\sqrt{m_d/m_s} \approx |V_{td}/V_{ts}|$.
1. Introduction
It is a remarkable fact that, despite an intense and continuous theoretical effort over many
years, our understanding of fermion masses and mixing angles remains at a very primitive level.
Our theoretical constructions clearly suffer from the lack of a guiding principle and we can only
replicate variants of very few basic ideas. Perhaps one of the most fruitful ideas in the field is
the one by Froggatt and Nielsen (FN) [1], who advocated a spontaneously broken flavour symmetry to account for the small mass ratios and the small mixing angles characterizing the quark
E-mail addresses: feruglio@pd.infn.it (F. Feruglio), hagedorn@mpi-hd.mpg.de (C. Hagedorn), yin.lin@pd.infn.it

(Y. Lin), merlo@pd.infn.it (L. Merlo).
doi:10.1016/j.nuclphysb.2007.04.002
F. Feruglio et al. / Nuclear Physics B 775 (2007) 120-142
sector. This idea was originally formulated with a U(1) flavour symmetry group, broken by the
vacuum expectation value (VEV) of a single scalar field, but it has been subsequently extended
and adapted to a large variety of symmetry groups, with more elaborate symmetry breaking patterns. While the original U(1) flavour symmetry is still a viable candidate to reproduce, at least at the order-of-magnitude level, both quark and lepton masses and mixing angles [2], models based on the discrete symmetry group $A_4$ [3-8] have been recently singled out as good candidates to explain the approximate tri-bimaximal (TB) mixing [9] observed in the lepton sector. As a matter of fact the range for the lepton mixing angles, as determined from a global fit to neutrino oscillation data, is given by ($2\sigma$ errors) [10]:
$$\sin^2\theta_{12} = 0.314\left(1^{+0.18}_{-0.15}\right), \qquad \sin^2\theta_{23} = 0.45\left(1^{+0.35}_{-0.20}\right), \qquad \sin^2\theta_{13} = \left(0.8^{+2.3}_{-0.8}\right)\times 10^{-2}, \tag{1}$$
and it is perfectly compatible with a TB mixing:
$$\sin^2\theta^{TB}_{12} = \frac{1}{3}, \qquad \sin^2\theta^{TB}_{23} = \frac{1}{2}, \qquad \sin^2\theta^{TB}_{13} = 0. \tag{2}$$
Given the present experimental uncertainties, the agreement is not so impressive as far as $\theta_{23}$ and $\theta_{13}$ are concerned, but it is certainly striking for the angle $\theta_{12}$:
$$\theta^{TB}_{12} = 35.3^\circ, \qquad \theta_{12} = \left(34.1^{+1.7}_{-1.6}\right)^\circ \quad (1\sigma), \tag{3}$$
given the small $1\sigma$ error, less than 0.04 rad $\approx \lambda^2$, $\lambda \approx 0.22$ being the Cabibbo angle. Measurements in the near future will bring down the precision/sensitivity on $\theta_{23}$ and $\theta_{13}$ to a similar level [11], thus providing a stringent test of the TB mixing scheme.
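The numbers quoted for the TB angles follow from elementary trigonometry. As a small illustrative sketch (the function names are hypothetical, and the default best-fit value and error are simply the $1\sigma$ figures for $\theta_{12}$ quoted above), one may evaluate the TB angles in degrees and the distance of $\theta^{TB}_{12}$ from the fitted value in units of the quoted error:

```python
import math

CABIBBO = 0.22  # value of lambda quoted in the text


def tb_angles_deg():
    """TB mixing angles in degrees, from sin^2 = 1/3, 1/2 and 0."""
    return {
        "theta12": math.degrees(math.asin(math.sqrt(1 / 3))),
        "theta23": math.degrees(math.asin(math.sqrt(1 / 2))),
        "theta13": 0.0,
    }


def theta12_pull(best_fit_deg=34.1, err_up_deg=1.7):
    """Distance of the TB value of theta12 from the fit, in units of the upper 1-sigma error."""
    return (tb_angles_deg()["theta12"] - best_fit_deg) / err_up_deg


def error_in_radians(err_deg=1.7):
    """The quoted 1-sigma error converted to radians, to compare with lambda^2."""
    return math.radians(err_deg)
```

Under these inputs the TB value of $\theta_{12}$ lies within one quoted standard deviation of the best fit, and the error converted to radians stays below $\lambda^2$.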
It is well known by now that, by working in the framework of a FN model, the only way to
achieve TB lepton mixing is to set up an appropriate vacuum alignment mechanism. The starting point is a flavour symmetry group F , spontaneously broken by the VEVs of a set of scalar
fields. Neutrino masses and charged lepton masses should be mainly sensitive to two separate
sets of scalar fields, whose VEVs break the symmetry F down to two different subgroups. This
VEV misalignment can produce the TB mixing, at the leading order. A complete separation between VEVs affecting neutrino masses and charged lepton masses is however impossible, since
a mixing is generally induced at some level by higher dimensional operators allowed by the
symmetries of the model. Such a mixing can be kept under control if the typical expansion parameters, the dimensionless ratios between the VEVs and the cut-off of the theory, are sufficiently
small. Along these lines it is possible to construct a model reproducing the TB mixing, perturbed
by small corrections, below the $\lambda^2$ level.
Perhaps the simplest examples of this kind of models are those based on the flavour symmetry group $A_4$, the group of even permutations of four objects, also equal to the group of proper rotations leaving a regular tetrahedron invariant. This group is small: it has only 12 elements and four inequivalent irreducible representations, a triplet and three independent singlets $1$, $1'$ and $1''$. Lepton SU(2) doublets $l_i$ ($i = e, \mu, \tau$) are assigned to the triplet $A_4$ representation, while the lepton singlets $e^c$, $\mu^c$ and $\tau^c$ are assigned to $1$, $1''$ and $1'$, respectively. The symmetry breaking sector consists of scalar fields neutral under the SM gauge group: $(\varphi_T, \varphi_S, \xi)$, transforming as $(3, 3, 1)$ of $A_4$. Additional ingredients are needed in order to reproduce the desired alignment. The simplest version of the model is supersymmetric (though SUSY is not really necessary to achieve the alignment) and possesses an additional $Z_3$ discrete symmetry that eliminates unwanted operators. The key feature of the model is that the minimization of the scalar potential at the leading order, that is by neglecting higher dimensional operators, leads to the following VEVs:
$$\langle\varphi_T\rangle \propto (1, 0, 0), \qquad \langle\varphi_S\rangle \propto (1, 1, 1), \qquad \langle\xi\rangle \neq 0. \tag{4}$$
In the basis chosen for the generators of $A_4$, these VEVs imply a diagonal mass matrix in the charged lepton sector. In this sector the relative hierarchy between $m_e$, $m_\mu$ and $m_\tau$ can be controlled by a FN $U(1)_{FN}$ flavour symmetry. At the same time, the neutrino mass matrix gives rise to the TB mixing, independently of the values of the free parameters, which, in a finite portion of the parameter space, only affect the neutrino mass eigenvalues. Of course, while the specific form of neutrino and charged lepton mass matrices does depend on the basis chosen, the physical properties of the system, such as the mass eigenvalues and the mixing angles, are basis independent features.
It would be desirable to extend this construction to the quark sector, thus realizing a coherent
description of all fermion masses, but the simplest extrapolations explored so far turn out to be
unrealistic. In the simplest possible extension, quark SU(2) doublets $q_i$ are assigned to the triplet $A_4$ representation, while the quark SU(2) singlets $(u^c, d^c)$, $(c^c, s^c)$ and $(t^c, b^c)$ are assigned to $1$, $1''$ and $1'$, respectively. Then, given the VEVs in (4), and taking into account the additional $Z_3$ symmetry, at the leading order the quark mass matrices in the up and down sectors are both diagonal and the quark mixing matrix $V_{CKM}$ is the unit matrix. Subleading corrections, coming
from higher dimensional operators contributing to quark masses were analyzed in Ref. [7] and
are too small. A possible way out might be to consider new sources of symmetry breaking. This
can consist in an explicit breaking [12], which however does not allow a complete control of the
model and introduces a high degree of arbitrariness. Otherwise it can be realized by extending
the symmetry breaking sector, by allowing for some new scalar fields, whose VEVs could substantially contribute to the quark sector, giving rise to a realistic Cabibbo angle $\lambda$. The difficulty
with such an option is that these new scalar fields tend to affect also the lepton sector, giving rise
to too large, unacceptable, corrections to the TB mixing pattern. Finally, a disturbing feature of
this construction is that the mass of the top quark comes from a non-renormalizable operator, as
for all the other fermions. This is against expectation, since in the SM the Yukawa coupling of
the top quark is of order one. To reproduce such a Yukawa coupling from a non-renormalizable
operator we should introduce large dimensionless couplings, which is unnatural. A discussion
about the symmetry breaking pattern suitable to produce realistic mixing angles in the quark
sector can be found in Ref. [13].
In the present paper we explore a different possibility, by considering as the starting point of our FN construction the $T'$ group, the double covering of $A_4$. The relation between $T'$ and $A_4$ is quite similar to the familiar relation between SU(2) and SO(3). In particular, SU(2) and SO(3) possess the same Lie algebra, but SO(3) has only integer representations, while SU(2) possesses both integer and half-integer representations. Similarly, the representations of $T'$ are those of $A_4$ plus three independent doublets $2$, $2'$ and $2''$. By working only with the triplet and singlet representations, $T'$ is indistinguishable from $A_4$. This allows us to replicate with $T'$ the successful construction realized within $A_4$ in the lepton sector. At the same time, the presence of the doublet representations can be exploited to describe the quark sector. It is natural to assign quarks to a reducible singlet plus doublet representation. This has several advantages. By assigning the third generation to a singlet representation the top and bottom quarks can acquire mass already at the renormalizable level. Moreover, by using doublets to describe quarks of the first two generations, the VEVs in (4) provide masses for the charm and the strange quarks, which are conveniently suppressed with respect to the top and bottom masses. The mixing between second and third generations can be induced by the VEV of a $T'$ scalar doublet, whose effects do not modify the TB mixing in the lepton sector. First generation masses and mixing angles can arise through subleading effects. Actually, the whole picture in the quark sector is very similar to that detailed in a series of papers [14] exploiting the $T'$ group and it is also very close to that emerging from the study of another popular FN group, U(2) [15]. In our paper we will combine the known good features of the old U(2) and $T'$ constructions for the quark sector with the ability of $T'$ to reproduce the TB mixing in the lepton sector [16]. Such a combination is far from trivial and is not exhausted by a list of particle representations. On the contrary, the key point is represented by the study of the vacuum alignment problem. With an enlarged symmetry breaking sector we will analyze, both at the leading and at the subleading level, the symmetry breaking pattern of $T'$ and we will explicitly check the existence of a finite portion of the parameter space that gives rise to TB lepton mixing and to a realistic pattern of quark masses and mixing angles. Eventually we get a very appealing group theoretical interpretation of the difference between quark and lepton mixing angles. Large lepton mixing angles correspond to a breaking of the flavour symmetry group down to two different subgroups in the neutrino and in the charged lepton sectors. Small quark mixing angles arise from the breaking of the flavour group along the same subgroup both in the up and in the down sectors. In our model the gauge group is that of the Standard Model. For recent attempts to incorporate quarks in a grand unified picture with flavour symmetry $A_4$, see Ref. [17].
2. The group $T'$
Our model is based on the flavour group $F = T' \times \cdots$, where $T'$ is the binary tetrahedral group [18] that we will describe in this section and dots denote some additional group factor that we will specify later on. The key role in our construction is played by the $T'$ group that is literally the double covering of the tetrahedral group $A_4$. The relation between $T'$ and $A_4$ can be understood by thinking of $A_4$, the group of proper rotations in three-dimensional space leaving a regular tetrahedron invariant, as a subgroup of SO(3). Thus the 12 elements of $A_4$ are in a one-to-one correspondence with 12 sets of Euler angles. Now consider SU(2), the double covering of SO(3), possessing twice as many elements as SO(3). There is a correspondence from SU(2) to SO(3) that maps two distinct elements of SU(2) into the same set of Euler angles of SO(3). The group $T'$ can be defined as the inverse image under this map of the group $A_4$.
The group $T'$ has 24 elements and has two kinds of representations. It contains the representations of $A_4$: one triplet $3$ and three singlets $1$, $1'$ and $1''$. When working with these representations there is no way to distinguish the group $T'$ from the group $A_4$. In particular, in these representations, the elements of $T'$ coincide two by two and can be described by the same matrices that represent the elements in $A_4$. The other representations are three doublets $2$, $2'$ and $2''$. The representations $1'$, $1''$ and $2'$, $2''$ are complex conjugated to each other. Note that $A_4$ is not a subgroup of $T'$, since the two-dimensional representations cannot be decomposed into representations of $A_4$. The character table is shown in Table 1. The generators $S$ and $T$ fulfill the relations:
$$S^2 = R, \qquad T^3 = 1, \qquad (ST)^3 = 1, \qquad R^2 = 1, \tag{5}$$
where $R = 1$ in case of the odd-dimensional representations and $R = -1$ for $2$, $2'$ and $2''$, such that $R$ commutes with all elements of the group. From Table 1 we see that, beyond the center of the group, generated by the elements $E$ and $R$, there are other Abelian subgroups: $Z_3$, $Z_4$ and $Z_6$. In particular, there is a $Z_4$ subgroup here denoted by $G_S$, generated by the element $TST^2$, and a $Z_3$ subgroup here called $G_T$, generated by the element $T$. As we will see $G_S$ and $G_T$ are
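Table 1
Character table of the group $T'$ taken from [14]. $\omega$ is the third root of unity, i.e. $\omega = e^{2\pi i/3} = -\frac{1}{2} + i\frac{\sqrt{3}}{2}$. $\mathcal{C}_i$ are the classes of the group, $^\circ\mathcal{C}_i$ is the order of the $i$th class, i.e. the number of distinct elements contained in this class, $^\circ h_{\mathcal{C}_i}$ is the order of the elements $A$ in the class $\mathcal{C}_i$, i.e. the smallest integer ($> 0$) for which the equation $A^{^\circ h_{\mathcal{C}_i}} = 1$ holds. Furthermore the table contains one representative for each class $\mathcal{C}_i$ given as product of the generators $S$ and $T$ of the group

| Classes | $\mathcal{C}_1$ | $\mathcal{C}_2$ | $\mathcal{C}_3$ | $\mathcal{C}_4$ | $\mathcal{C}_5$ | $\mathcal{C}_6$ | $\mathcal{C}_7$ |
|---|---|---|---|---|---|---|---|
| | | | $C_2$, $C_2R$ | $C_3$ | $C_3^2$ | $C_3R$ | $C_3^2R$ |
| $G$ | $1$ | $R$ | $S$ | $STR$ | $T^2$ | $T$ | $(ST)^2R$ |
| $^\circ\mathcal{C}_i$ | 1 | 1 | 6 | 4 | 4 | 4 | 4 |
| $^\circ h_{\mathcal{C}_i}$ | 1 | 2 | 4 | 6 | 3 | 3 | 6 |
| $1$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| $1'$ | 1 | 1 | 1 | $\omega$ | $\omega^2$ | $\omega$ | $\omega^2$ |
| $1''$ | 1 | 1 | 1 | $\omega^2$ | $\omega$ | $\omega^2$ | $\omega$ |
| $2$ | 2 | $-2$ | 0 | 1 | $-1$ | $-1$ | 1 |
| $2'$ | 2 | $-2$ | 0 | $\omega$ | $-\omega^2$ | $-\omega$ | $\omega^2$ |
| $2''$ | 2 | $-2$ | 0 | $\omega^2$ | $-\omega$ | $-\omega^2$ | $\omega$ |
| $3$ | 3 | 3 | $-1$ | 0 | 0 | 0 | 0 |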
of great importance for the structure of our model. Realizations of $S$ and $T$ for $2$, $2'$, $2''$ and $3$ can be found in Appendix A and are taken from [14].
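The relations (5) and the class data of Table 1 (number of elements per class and order of the elements) can be cross-checked in a concrete realization. The sketch below does not use the matrices of Appendix A; it assumes instead the standard unit-quaternion realization of the binary tetrahedral group, with $S$ represented by the quaternion $i$, $T$ by $(-1 + i + j + k)/2$ and $R$ by $-1$. All names in the code are hypothetical.

```python
from fractions import Fraction as Fr

# Quaternions are stored as (real, i, j, k) with exact rational entries.
ONE = (Fr(1), Fr(0), Fr(0), Fr(0))
R = (Fr(-1), Fr(0), Fr(0), Fr(0))
S = (Fr(0), Fr(1), Fr(0), Fr(0))             # assumed realization: S = i
T = (Fr(-1, 2), Fr(1, 2), Fr(1, 2), Fr(1, 2))  # assumed: T = (-1 + i + j + k)/2


def qmul(a, b):
    """Hamilton product of two quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )


def word(*factors):
    """Product of the given factors, taken from left to right."""
    result = ONE
    for f in factors:
        result = qmul(result, f)
    return result


def inverse(a):
    """Inverse of a unit quaternion (its conjugate)."""
    return (a[0], -a[1], -a[2], -a[3])


def order(a):
    """Smallest n > 0 with a^n = 1."""
    n, x = 1, a
    while x != ONE:
        x = qmul(x, a)
        n += 1
    return n


def generate(generators):
    """Closure of the generators under multiplication (finite group)."""
    group = {ONE}
    frontier = [ONE]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = qmul(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group


def conjugacy_classes(group):
    """Split the group into conjugacy classes."""
    remaining = set(group)
    classes = []
    while remaining:
        x = next(iter(remaining))
        cls = {qmul(qmul(g, x), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes
```

In this realization `generate([S, T])` has 24 elements, and the pairs (class size, element order) obtained from `conjugacy_classes` can be compared with the two corresponding rows of Table 1, including the orders of the representatives $STR$, $T^2$ and $(ST)^2R$.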
The multiplication rules of the representations are as follows:
$$\begin{aligned}
&1^a \otimes r^b = r^b \otimes 1^a = r^{a+b} \quad \text{for } r = 1, 2,\\
&1^a \otimes 3 = 3 \otimes 1^a = 3,\\
&2^a \otimes 2^b = 3 \oplus 1^{a+b},\\
&2^a \otimes 3 = 3 \otimes 2^a = 2 \oplus 2' \oplus 2'',\\
&3 \otimes 3 = 3 \oplus 3 \oplus 1 \oplus 1' \oplus 1'',
\end{aligned} \tag{6}$$
where $a, b = 0, \pm 1$ and we have denoted $1^0 \equiv 1$, $1^1 \equiv 1'$, $1^{-1} \equiv 1''$ and similarly for the doublet representations. On the right-hand side the sum $a + b$ is modulo 3. The Clebsch-Gordan coefficients for the decomposition of product representations are shown in Appendix A and were already calculated in [14]. Further synonyms of $T'$ are Type 24/13 [18] and $SL_2(F_3)$ [16].
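The multiplication rules (6) can be recovered from the characters by the usual orthogonality formula. The following sketch is only a consistency check, not the derivation used in [14]: the character values are transcribed by hand from our reading of Table 1, with the classes ordered as $\mathcal{C}_1, \ldots, \mathcal{C}_7$ and representatives $1, R, S, STR, T^2, T, (ST)^2R$, and the names in the code are hypothetical.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 3)  # omega, the primitive third root of unity
GROUP_ORDER = 24
CLASS_SIZES = [1, 1, 6, 4, 4, 4, 4]
CHARACTERS = {
    "1": [1, 1, 1, 1, 1, 1, 1],
    "1'": [1, 1, 1, W, W**2, W, W**2],
    "1''": [1, 1, 1, W**2, W, W**2, W],
    "2": [2, -2, 0, 1, -1, -1, 1],
    "2'": [2, -2, 0, W, -W**2, -W, W**2],
    "2''": [2, -2, 0, W**2, -W, -W**2, W],
    "3": [3, 3, -1, 0, 0, 0, 0],
}


def inner(chi_a, chi_b):
    """Class-weighted inner product of two characters."""
    total = sum(
        n * x * complex(y).conjugate()
        for n, x, y in zip(CLASS_SIZES, chi_a, chi_b)
    )
    return total / GROUP_ORDER


def decompose(rep_a, rep_b):
    """Multiplicities of the irreducible representations in rep_a x rep_b."""
    chi_product = [x * y for x, y in zip(CHARACTERS[rep_a], CHARACTERS[rep_b])]
    result = {}
    for name, chi in CHARACTERS.items():
        multiplicity = round(inner(chi_product, chi).real)
        if multiplicity:
            result[name] = multiplicity
    return result
```

For example, `decompose("2", "2'")` returns one copy each of `3` and `1'`, in agreement with $2^a \otimes 2^b = 3 \oplus 1^{a+b}$ for $a = 0$, $b = 1$.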
3. Outline of the model

In this section we introduce our model and we illustrate its main features. We choose the model to be supersymmetric, which will help us when discussing the vacuum selection and the symmetry breaking pattern of $T'$. The model is required to be invariant under a flavour symmetry group $F = T' \times Z_3 \times U(1)_{FN}$. The group factor $T'$ is the one responsible for the TB lepton mixing. The group $T'$ is unable to produce all the necessary mass suppressions for the fermions of the first and second generations. These suppressions originate in part from a spontaneously broken $U(1)_{FN}$, according to the original FN proposal. Finally, the $Z_3$ factor helps in keeping separate the contributions to neutrino masses and to charged fermion masses, and it is an important ingredient in the vacuum alignment analysis. The fields of the model, together with their transformation properties under the flavour group, are listed in Table 2.
Table 2
The transformation rules of the fields under the symmetries associated to the groups $T'$, $Z_3$ and $U(1)_{FN}$. We denote $D_q = (q_1, q_2)^t$ where $q_1 = (u, d)^t$ and $q_2 = (c, s)^t$ are the electroweak SU(2)-doublets of the first two generations, $D_u^c = (u^c, c^c)^t$ and $D_d^c = (d^c, s^c)^t$. $D_q$, $D_u^c$ and $D_d^c$ are doublets of $T'$. $q_3 = (t, b)^t$ is the electroweak SU(2)-doublet of the third generation. $q_3$, $t^c$ and $b^c$ are all singlets under $T'$
Field
ec
Dq
Duc
T
Z3
U(1)FN
1
2
2n
1
2
n
1
2
0
2
2
2
n
Ddc
2
2
0
q3
tc
bc
hu,d

1
2
0
1
2
0
1
1
0
3
1
0
2
1
0
1
1
0
3.1. Pattern of symmetry breaking

The most important feature of our model is the pattern of symmetry breaking of the flavour group $T'$. We will see that, at the leading order, $T'$ is broken down to the subgroup $G_S$, generated by the element $TST^2$, in the neutrino sector and to the subgroup $G_T$, generated by $T$, in the charged fermion sector. This pattern of symmetry breaking is achieved dynamically and corresponds to a local minimum of the scalar potential of the model. This result is already sufficient to understand the predicted pattern of fermion mixing angles. Indeed, given the $T'$ assignment of the matter fields displayed in Table 2 and the explicit expressions of the generators $S$ and $T$ for the various representations (see Appendix A), specific mass textures are obtained from the requirement of invariance under $TST^2$ or $T$. For instance, neutrinos are in a triplet of $T'$ and the element $TST^2$ in the triplet representation is given by:
$$TST^2 = \frac{1}{3}\begin{pmatrix} -1 & 2 & 2\\ 2 & -1 & 2\\ 2 & 2 & -1 \end{pmatrix}. \tag{7}$$
The most general mass matrix for neutrinos invariant under $G_S$, in arbitrary units, is given by:
$$m_\nu = \begin{pmatrix} a + c & -b/3 - c + d & -b/3\\ -b/3 - c + d & c & a - b/3\\ -b/3 & a - b/3 & d \end{pmatrix}, \tag{8}$$
where $a$, $b$, $c$ and $d$ are arbitrary parameters. Similarly, the most general mass matrices for charged fermions invariant under $G_T$ have the following structure:
$$m_e = \begin{pmatrix} \times & 0 & 0\\ 0 & \times & 0\\ 0 & 0 & \times \end{pmatrix}, \tag{9}$$
$$m_{u,d} = \begin{pmatrix} 0 & 0 & 0\\ 0 & \times & \times\\ 0 & \times & \times \end{pmatrix}, \tag{10}$$
where a cross denotes a non-vanishing entry. The lepton mixing originates completely from $m_\nu$ and, with an additional requirement, reproduces the TB scheme. This additional requirement is the condition $c = d$, which is not generically implied by the invariance under $G_S$. In our model the fields that break $T'$ along the $G_S$ direction are a triplet $\varphi_S$ and an invariant singlet $\xi$. There are no further scalar singlets, transforming as $1'$ or $1''$, that couple to the neutrino sector. We will see in a moment that due to this restriction our model gives rise to a particular version of the neutrino mass matrix in Eq. (8), where $c = d = 2b/3$, which implies directly a TB mixing. It is interesting to note that, while the requirement of $G_T$ invariance implies a diagonal mass matrix in the
charged lepton sector, this is not the case for the quark sector, due to the different $T'$ assignment. At the leading order, in both up and down sectors, we get mass matrices with vanishing first row and column, Eq. (10). Moreover, the element 33 of both mass matrices is larger than the other elements, since it is invariant under the full $T'$ group, not only the $G_T$ subgroup. The other non-vanishing elements carry a suppression factor originating from the breaking of $T'$ down to $G_T$. This pattern of quark mass matrices, while not yet fully realistic, is however encouraging, since it reproduces correctly masses and mixing angle of the second and third generations. As we will see, the textures in Eqs. (8)-(9) are modified by subleading effects. These effects are sufficiently small to keep the good feature of the leading order approximation, and large enough to provide a realistic description of the quark sector.
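The $G_S$ invariance of the neutrino texture can be verified with exact arithmetic. The sketch below is an illustration with hypothetical function names: it takes the triplet matrix of $TST^2$ as $\frac{1}{3}$ times the matrix with $-1$ on the diagonal and $2$ elsewhere, takes the four-parameter texture as read from Eq. (8), and tests the condition $G^T m_\nu G = m_\nu$ for rational values of $a$, $b$, $c$, $d$.

```python
from fractions import Fraction as Fr

# Triplet representation of T S T^2, as read from Eq. (7).
G_S = [
    [Fr(-1, 3), Fr(2, 3), Fr(2, 3)],
    [Fr(2, 3), Fr(-1, 3), Fr(2, 3)],
    [Fr(2, 3), Fr(2, 3), Fr(-1, 3)],
]


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def general_mass_matrix(a, b, c, d):
    """Four-parameter texture of Eq. (8), in arbitrary units."""
    a, b, c, d = Fr(a), Fr(b), Fr(c), Fr(d)
    return [
        [a + c, -b / 3 - c + d, -b / 3],
        [-b / 3 - c + d, c, a - b / 3],
        [-b / 3, a - b / 3, d],
    ]


def is_invariant(m):
    """Invariance of a mass matrix under G_S: G^T m G = m."""
    return matmul(matmul(transpose(G_S), m), G_S) == m
```

A generic choice such as `is_invariant(general_mass_matrix(1, 2, 3, 5))` passes the test, whereas a generic diagonal matrix with unequal entries does not.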
Fermion masses are generated by the superpotential $w$:
$$w = w_l + w_q + w_d, \tag{11}$$
where $w_l$ is the term responsible for the Yukawa interactions in the lepton sector, $w_q$ is the analogous term for quarks and $w_d$ is the term responsible for the vacuum alignment, which will be discussed in the next section. We will consider the expansion of $w$ in inverse powers of the cut-off scale $\Lambda$ and we will write down only the first non-trivial terms of this expansion. This will provide a leading order approximation, here analyzed in detail. Corrections to this approximation are produced by higher dimensional operators contributing to $w$, which will be studied subsequently. As we will see in Section 4, at the leading order, the scalar components of the supermultiplets $\varphi_T$, $\varphi_S$, $\xi$, $\tilde\xi$, $\eta$ and $\xi''$ develop VEVs
$$\langle\varphi_T\rangle = (v_T, 0, 0), \qquad \langle\varphi_S\rangle = (v_S, v_S, v_S), \qquad \langle\xi\rangle = u, \qquad \langle\tilde\xi\rangle = 0, \tag{12}$$
$$\langle\eta\rangle = (v_1, 0), \qquad \langle\xi''\rangle = 0. \tag{13}$$
These VEVs can be very large, much larger than the electroweak scale. In Section 4 we will see that it is reasonable to choose:
$$\frac{VEV}{\Lambda} \approx \lambda^2, \tag{14}$$
where VEV stands for the generic non-vanishing VEV in Eqs. (12), (13). Since the ratio (14) represents the typical expansion parameter when including higher dimensional operators, the choice in Eq. (14) keeps all the leading order results stable, up to corrections of relative order $\lambda^2$.
There is a neat misalignment in flavour space between $\langle\varphi_T\rangle$, $\langle\eta\rangle$ and $\langle\varphi_S\rangle$: $\langle\varphi_T\rangle = (v_T, 0, 0)$, $\langle\eta\rangle = (v_1, 0)$ and $\langle\xi''\rangle = 0$ break $T'$ down to the subgroup $G_T$, while $\langle\varphi_S\rangle = (v_S, v_S, v_S)$ breaks $T'$ down to the subgroup $G_S$. It is precisely this misalignment that is the origin of the mass textures (8)-(10).
A certain freedom is present in our formalism and this can lead to models that are physically equivalent though different at a superficial level, when comparing VEVs or mass matrices. One source of freedom is related to the possibility of working with different bases for the generators $S$ and $T$. Another source of freedom is related to the fact that vacua that break $T'$ are degenerate and lie in orbits of the flavour group. For instance, when we say that the set of VEVs in Eq. (13) breaks $T'$ leaving invariant the $Z_3$ subgroup generated by $T$, VEVs obtained from this set by acting with elements of $T'$ are degenerate and they preserve other $Z_3$ subgroups of $T'$. Both these sources of freedom can lead to mass matrices different from those explicitly shown in Eqs. (8)-(10). It is however easy to show that the different pictures are related by field redefinitions and the physical properties of the system, such as the mass eigenvalues and the physical mixing angles, are always the same. Thus it is not restrictive to work in a particular basis and to choose a single representative VEV configuration, as we will do in the following.
3.2. Leptons
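Lepton masses are described by $w_l$, given by:
$$w_l = y_e\, e^c (\varphi_T l)\, h_d/\Lambda + y_\mu\, \mu^c (\varphi_T l)'\, h_d/\Lambda + y_\tau\, \tau^c (\varphi_T l)''\, h_d/\Lambda + (x_a \xi + \tilde x_a \tilde\xi)(ll)\, h_u h_u/\Lambda^2 + x_b (\varphi_S ll)\, h_u h_u/\Lambda^2 + \text{h.o.}, \tag{15}$$
where here and in the following formulae h.o. stands for higher dimensional operators. After electroweak symmetry breaking, $\langle h_{u,d}\rangle = v_{u,d}$, given the specific orientation of $\langle\varphi_T\rangle \propto (1, 0, 0)$, $w_l$ gives rise to diagonal mass terms for charged leptons: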
yl v T
ml = v d
2
(l = e, , ).
(16)
The $T'$ symmetry, as was the case for $A_4$, is unable to produce the required hierarchy among $m_e$,
$m_\mu$ and $m_\tau$ and, to this purpose, we make use of an additional spontaneously broken $U(1)_{FN}$
flavour symmetry. We introduce a new supermultiplet $\theta$, carrying $U(1)_{FN}$ charge $-1$ and neutral
under all other symmetries. Its non-vanishing VEV, $\langle\theta\rangle/\Lambda < 1$, breaks $U(1)_{FN}$ and provides an
expansion parameter for charged lepton masses. We also assign $U(1)_{FN}$ charges $(2n, n)$ to the
fields $(e^c, \mu^c)$. All other lepton fields are taken neutral under this Abelian symmetry. In this way
$y_\tau \approx 1$, $y_\mu \approx (\langle\theta\rangle/\Lambda)^n$, $y_e \approx (\langle\theta\rangle/\Lambda)^{2n}$ and the mass hierarchy can be reproduced by choosing
$$\left(\frac{\langle\theta\rangle}{\Lambda}\right)^n \approx \lambda^2. \tag{17}$$
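All the information about lepton mixing angles is encoded in the neutrino mass matrix that can
be easily evaluated from Eq. (15):
$$m_\nu = \frac{v_u^2}{\Lambda}\begin{pmatrix} a + 2b/3 & -b/3 & -b/3 \\ -b/3 & 2b/3 & a - b/3 \\ -b/3 & a - b/3 & 2b/3 \end{pmatrix}, \tag{18}$$
where
$$a \equiv x_a \frac{u}{\Lambda}, \qquad b \equiv x_b \frac{v_S}{\Lambda}. \tag{19}$$
Notice that $m_\nu$ is not the most general mass matrix invariant under $G_S$, Eq. (8). This is due to
the absence of fields transforming as $1'$ and $1''$ under $T'$, developing non-vanishing VEVs and
directly contributing to $m_\nu$. The neutrino mass matrix is diagonalized by the transformation:
$$U^T m_\nu U = \frac{v_u^2}{\Lambda}\,\mathrm{diag}(a + b,\ a,\ -a + b), \tag{20}$$
with
$$U = \begin{pmatrix} \sqrt{2/3} & 1/\sqrt{3} & 0 \\ -1/\sqrt{6} & 1/\sqrt{3} & -1/\sqrt{2} \\ -1/\sqrt{6} & 1/\sqrt{3} & +1/\sqrt{2} \end{pmatrix}. \tag{21}$$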
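As a numerical cross-check of Eqs. (18), (20) and (21), the following minimal sketch builds the leading-order texture for arbitrary complex $a$ and $b$ and verifies that the tri-bimaximal matrix diagonalizes it. The numerical values of `a` and `b` are illustrative assumptions, not fitted parameters, and the helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Plain-Python matrix product for small nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def neutrino_matrix(a, b):
    """Texture of Eq. (18), in units of v_u^2 / Lambda."""
    return [[a + 2 * b / 3, -b / 3, -b / 3],
            [-b / 3, 2 * b / 3, a - b / 3],
            [-b / 3, a - b / 3, 2 * b / 3]]

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
# Tri-bimaximal matrix of Eq. (21); sqrt(2/3) is written as 2/sqrt(6).
U_TB = [[2 / s6, 1 / s3, 0.0],
        [-1 / s6, 1 / s3, -1 / s2],
        [-1 / s6, 1 / s3, 1 / s2]]

# Illustrative complex inputs (assumed here, not values from the text).
a, b = 0.7 + 0.2j, -0.4 + 0.5j

D = matmul(transpose(U_TB), matmul(neutrino_matrix(a, b), U_TB))
eigenvalues = [D[i][i] for i in range(3)]
max_offdiag = max(abs(D[i][j]) for i in range(3) for j in range(3) if i != j)

# Mixing angles read off in the standard parametrization.
sin2_theta13 = U_TB[0][2] ** 2
sin2_theta12 = U_TB[0][1] ** 2 / (1 - sin2_theta13)
sin2_theta23 = U_TB[1][2] ** 2 / (1 - sin2_theta13)

print(eigenvalues, max_offdiag)
print(sin2_theta12, sin2_theta23, sin2_theta13)
```

The diagonal entries come out as $a+b$, $a$ and $-a+b$ for any choice of the two complex inputs, which is the statement that the mixing angles do not depend on the mass parameters at this order.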
Therefore the TB mixing of Eq. (2) is reproduced, at the leading order. For the neutrino masses
we obtain:
$$\begin{aligned}
|m_1|^2 &= \left[-r + \frac{1}{8\cos^2\Delta\,(1 - 2r)}\right]\Delta m^2_{atm},\\
|m_2|^2 &= \frac{1}{8\cos^2\Delta\,(1 - 2r)}\,\Delta m^2_{atm},\\
|m_3|^2 &= \left[1 - r + \frac{1}{8\cos^2\Delta\,(1 - 2r)}\right]\Delta m^2_{atm},
\end{aligned} \tag{22}$$
where $r \equiv \Delta m^2_{sol}/\Delta m^2_{atm} \equiv (|m_2|^2 - |m_1|^2)/(|m_3|^2 - |m_1|^2)$, $\Delta m^2_{atm} \equiv |m_3|^2 - |m_1|^2$ and $\Delta$ is
the phase difference between the complex numbers $a$ and $b$. For $\cos\Delta = -1$, we have a neutrino
spectrum close to hierarchical:
$$|m_3| \approx 0.053\ \text{eV}, \qquad |m_1| \approx |m_2| \approx 0.017\ \text{eV}. \tag{23}$$
In this case the sum of neutrino masses is about 0.087 eV. If $\cos\Delta$ is accidentally small, the
neutrino spectrum becomes degenerate. The value of $|m_{ee}|$, the parameter characterizing the
violation of total lepton number in neutrinoless double beta decay, is given by:
$$|m_{ee}|^2 = \left[-\frac{1 + 4r}{9} + \frac{1}{8\cos^2\Delta\,(1 - 2r)}\right]\Delta m^2_{atm}. \tag{24}$$
For $\cos\Delta = -1$ we get $|m_{ee}| \approx 0.005$ eV, at the upper edge of the range allowed for normal
hierarchy, but unfortunately too small to be detected in the near future. Independently of the
value of the unknown phase $\Delta$ we get the relation:
$$|m_3|^2 = |m_{ee}|^2 + \frac{10}{9}\,\Delta m^2_{atm}\left(1 - \frac{r}{2}\right), \tag{25}$$
which is a prediction of our model. In Fig. 1 we have plotted the neutrino masses predicted by the
model. All the results listed above coincide with those obtained in the $A_4$ models of Refs. [6–8],
at the leading order in the VEV expansion. We will see that, when higher order effects are included, these results are modified by terms of relative order VEV/$\Lambda$. If this parameter is of order
$\lambda^2$, as assumed in Eq. (14), we have a stable TB mixing in the lepton sector.

Fig. 1. On the left panel, sum of neutrino masses versus $\cos\Delta$, where $\Delta$ is the phase difference between $a$ and $b$. On the right panel,
the lightest neutrino mass, $m_1$, and the mass combination $m_{ee}$ versus $\cos\Delta$. To evaluate the masses, the parameters $|a|$ and
$|b|$ have been expressed in terms of $r \equiv \Delta m^2_{sol}/\Delta m^2_{atm} \equiv (|m_2|^2 - |m_1|^2)/(|m_3|^2 - |m_1|^2)$ and $\Delta m^2_{atm} \equiv |m_3|^2 - |m_1|^2$.
The bands have been obtained by varying $\Delta m^2_{atm}$ in its $3\sigma$ experimental range, 0.0020 eV$^2$–0.0032 eV$^2$. There is a negligible sensitivity to the variations of $r$ within its current $3\sigma$ experimental range, and we have realized the plots by choosing
$r = 0.03$.
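The numbers quoted in Eqs. (23)–(25) can be reproduced with a few lines of arithmetic. The sketch below evaluates Eqs. (22) and (24) for $\cos\Delta = -1$ and $r = 0.03$; the value $\Delta m^2_{atm} = 0.0025$ eV$^2$ is an assumption made here (it lies inside the quoted $3\sigma$ range and gives the rounded figures in the text), and the function name is hypothetical.

```python
import math

def spectrum(cos_delta, r=0.03, dm2_atm=0.0025):
    """Return (|m1|, |m2|, |m3|, |m_ee|) in eV from Eqs. (22) and (24).

    dm2_atm is in eV^2; its default is an assumed illustrative value.
    """
    x = 1.0 / (8.0 * cos_delta ** 2 * (1.0 - 2.0 * r))
    m1 = math.sqrt((-r + x) * dm2_atm)
    m2 = math.sqrt(x * dm2_atm)
    m3 = math.sqrt((1.0 - r + x) * dm2_atm)
    m_ee = math.sqrt((-(1.0 + 4.0 * r) / 9.0 + x) * dm2_atm)
    return m1, m2, m3, m_ee

r, dm2_atm = 0.03, 0.0025
m1, m2, m3, m_ee = spectrum(-1.0, r, dm2_atm)
total = m1 + m2 + m3

# Right-hand side of the phase-independent relation, Eq. (25).
rhs_25 = m_ee ** 2 + (10.0 / 9.0) * dm2_atm * (1.0 - r / 2.0)

print(m1, m2, m3, total, m_ee)
print(m3 ** 2, rhs_25)
```

Calling `spectrum` with other values of `cos_delta` in $[-1, 0)$ shows the spectrum becoming more degenerate as $|\cos\Delta|$ decreases, while Eq. (25) continues to hold.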
3.3. Quarks
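The contribution to the superpotential in the quark sector is given by
$$\begin{aligned}
w_q ={}& y_t (t^c q_3) h_u + y_b (b^c q_3) h_d + y_1 (\varphi_T D_u^c D_q) h_u/\Lambda + y_5 (\varphi_T D_d^c D_q) h_d/\Lambda \\
&+ y_2\, \xi'' (D_u^c D_q)' h_u/\Lambda + y_6\, \xi'' (D_d^c D_q)' h_d/\Lambda + \left\{ y_3\, t^c (\eta D_q) + y_4 (\eta D_u^c) q_3 \right\} h_u/\Lambda \\
&+ \left\{ y_7\, b^c (\eta D_q) + y_8 (\eta D_d^c) q_3 \right\} h_d/\Lambda + \text{h.o.}
\end{aligned} \tag{26}$$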
Observe that the supermultiplets $\varphi_S$, $\xi$ and $\tilde\xi$, which control the neutrino mass matrices, do not
couple to the quark sector, at the leading order. Conversely, the supermultiplets $\varphi_T$, $\eta$ and $\xi''$,
which give masses to the charged fermions, do not couple to neutrinos at the leading order. This
separation is partly due to the discrete $Z_3$ symmetry, described in Table 2. By recalling the VEVs
of Eqs. (12), (13), we can write down the mass matrices for the up and down quarks:
$$m_u = \begin{pmatrix} 0 & 0 & 0 \\ 0 & y_1 v_T/\Lambda & y_4 v_1/\Lambda \\ 0 & y_3 v_1/\Lambda & y_t \end{pmatrix} v_u + \dots, \tag{27}$$
$$m_d = \begin{pmatrix} 0 & 0 & 0 \\ 0 & y_5 v_T/\Lambda & y_8 v_1/\Lambda \\ 0 & y_7 v_1/\Lambda & y_b \end{pmatrix} v_d + \dots, \tag{28}$$
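where dots stand for higher order corrections. These mass matrices are the most general ones
that are invariant under $G_T$, see Eq. (10). The following quark masses and mixing angles are
predicted, at the leading order:
$$\begin{aligned}
m_u &= 0, & m_c &\approx y_1 v_u v_T/\Lambda, & m_t &\approx y_t v_u,\\
m_d &= 0, & m_s &\approx y_5 v_d v_T/\Lambda, & m_b &\approx y_b v_d,\\
V_{us} &= 0, & V_{ub} &= 0, & V_{cb} &\approx \left(\frac{y_7}{y_b} - \frac{y_3}{y_t}\right)\frac{v_1}{\Lambda}.
\end{aligned} \tag{29}$$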
The masses of the top and bottom quarks are expected to be of the order of the VEVs $v_u$ and $v_d$,
respectively. Their relative hierarchy can be explained if $v_d \ll v_u$. For values of order one of the
dimensionless coefficients $y_b$ and $y_5$, the ratio $m_s/m_b$ is correctly reproduced since it is approximately given by $v_T/\Lambda$ and this ratio has been chosen of order $\lambda^2$, see Eq. (14). To reproduce
$m_c/m_t$ a further suppression is needed. This can be achieved by exploiting the $U(1)_{FN}$ symmetry. For instance, we can take vanishing $U(1)_{FN}$ charge for all quarks but the $T'$ doublet $D_u^c$, to
which we assign charge $n$. In this way the parameters $y_1$ and $y_4$ are of order $(\langle\theta\rangle/\Lambda)^n \approx \lambda^2$, and
$$\frac{m_c}{m_t} \approx \frac{y_1}{y_t}\frac{v_T}{\Lambda} \approx \left(\frac{\langle\theta\rangle}{\Lambda}\right)^n \frac{v_T}{\Lambda} \approx \lambda^4. \tag{30}$$
With this assignment of the FN charges, the element $V_{cb}$ is of order $v_1/\Lambda \approx \lambda^2$. Masses and
mixing angles are still unrealistic, since $m_u/m_c$, $m_d/m_s$, $V_{ub}$ and $V_{us}$ are vanishing, at this level.
We will see that all these parameters can be generated by higher order corrections, in particular
those affecting the VEVs in Eqs. (12), (13).
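The leading-order estimates for $m_c/m_t$, $m_s/m_b$ and $V_{cb}$ can be checked numerically on the non-vanishing $2\times 2$ blocks of the up and down mass matrices. The sketch below is illustrative only: it assumes real order-one couplings chosen by hand, $v_T/\Lambda = v_1/\Lambda = \lambda^2$ with $\lambda = 0.22$, and $y_1 = y_4 = \lambda^2$ from the FN suppression; the helper name is hypothetical.

```python
import math

def light_heavy_angle(block):
    """Singular values (light, heavy) and left-handed mixing angle of a real
    2x2 block whose rows are singlet fields and whose columns are doublets."""
    (A, B), (C, D) = block
    # Entries of M^T M, which is diagonalized by the left-handed rotation.
    h11, h12, h22 = A * A + C * C, A * B + C * D, B * B + D * D
    theta = 0.5 * math.atan2(2 * h12, h22 - h11)
    tr, det = h11 + h22, abs(A * D - B * C)
    heavy = math.sqrt(0.5 * (tr + math.sqrt(tr * tr - 4 * det * det)))
    return det / heavy, heavy, theta

lam = 0.22
eps = lam ** 2                      # assumed v_T/Lambda = v_1/Lambda
y1 = y4 = lam ** 2                  # FN-suppressed couplings
y3, yt = 0.5, 1.0                   # illustrative order-one values
y5, y7, y8, yb = 1.0, 1.2, 0.8, 1.0

up = [[y1 * eps, y4 * eps], [y3 * eps, yt]]      # in units of v_u
down = [[y5 * eps, y8 * eps], [y7 * eps, yb]]    # in units of v_d

m_c, m_t, theta_u = light_heavy_angle(up)
m_s, m_b, theta_d = light_heavy_angle(down)

V_cb = math.sin(theta_d - theta_u)
V_cb_leading = (y7 / yb - y3 / yt) * eps         # leading-order formula

print(m_c / m_t, lam ** 4)
print(m_s / m_b, eps)
print(V_cb, V_cb_leading)
```

The exact and leading-order values agree up to corrections of relative order $\lambda^2$, as expected from the expansion.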
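Table 3
The transformation rules of the driving fields under the symmetries associated to the groups $T'$ and $Z_3$
Field | $\varphi_0^T$ | $\varphi_0^S$ | $\xi_0$ | $\eta_0$ | $\xi'_0$
$T'$ | 3 | 3 | 1 | $2''$ | $1'$
$Z_3$ | 1 | $\omega$ | $\omega$ | 1 | 1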
4. The vacuum alignment at the leading order

In this section we will discuss the minimization of the scalar potential of the model. To this
purpose we should complete the definition of the superpotential $w$ by specifying the last term
in Eq. (11). This is the term responsible for the spontaneous symmetry breaking of $T'$ and it
includes a new set of fields, the driving fields, whose transformation properties are shown in
Table 3. Along the same lines described in Ref. [7], we also exploit a $U(1)$ R-symmetry of the
theory. All matter supermultiplets, those describing quarks and leptons, have R-charge 1, while
all supermultiplets that will develop a VEV, like $h_{u,d}$, $\varphi_{S,T}$, $\xi$, $\tilde\xi$, $\eta$ and $\xi''$, have vanishing R-charges. Driving fields have R-charge 2 and their contribution to the superpotential reads:
$$\begin{aligned}
w_d ={}& M (\varphi_0^T \varphi_T) + g (\varphi_0^T \varphi_T \varphi_T) + g_7\, \xi'' (\varphi_0^T \varphi_T)' + g_8 (\varphi_0^T \eta \eta) + g_1 (\varphi_0^S \varphi_S \varphi_S) + g_2\, \tilde\xi (\varphi_0^S \varphi_S) \\
&+ g_3\, \xi_0 (\varphi_S \varphi_S) + g_4\, \xi_0 \xi^2 + g_5\, \xi_0 \xi \tilde\xi + g_6\, \xi_0 \tilde\xi^2 + M_\eta (\eta \eta_0) + g_9 (\varphi_T \eta \eta_0) \\
&+ M_\xi\, \xi'' \xi'_0 + g_{10}\, \xi'_0 (\varphi_T \varphi_T)'' + \text{h.o.}
\end{aligned} \tag{31}$$
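We start by analyzing the scalar potential in the supersymmetric limit. We look for a supersymmetric vacuum as the solution to the equations:
$$\begin{aligned}
\frac{\partial w}{\partial \varphi^S_{01}} &= g_2 \tilde\xi \varphi_{S1} + \frac{2 g_1}{3}\left(\varphi_{S1}^2 - \varphi_{S2}\varphi_{S3}\right) = 0,\\
\frac{\partial w}{\partial \varphi^S_{02}} &= g_2 \tilde\xi \varphi_{S3} + \frac{2 g_1}{3}\left(\varphi_{S2}^2 - \varphi_{S1}\varphi_{S3}\right) = 0,\\
\frac{\partial w}{\partial \varphi^S_{03}} &= g_2 \tilde\xi \varphi_{S2} + \frac{2 g_1}{3}\left(\varphi_{S3}^2 - \varphi_{S1}\varphi_{S2}\right) = 0,\\
\frac{\partial w}{\partial \xi_0} &= g_4 \xi^2 + g_5 \xi\tilde\xi + g_6 \tilde\xi^2 + g_3\left(\varphi_{S1}^2 + 2\varphi_{S2}\varphi_{S3}\right) = 0,
\end{aligned} \tag{32}$$
$$\begin{aligned}
\frac{\partial w}{\partial \varphi^T_{01}} &= M \varphi_{T1} + \frac{2 g}{3}\left(\varphi_{T1}^2 - \varphi_{T2}\varphi_{T3}\right) + g_7\, \xi'' \varphi_{T2} + i g_8\, \eta_1^2 = 0,\\
\frac{\partial w}{\partial \varphi^T_{02}} &= M \varphi_{T3} + \frac{2 g}{3}\left(\varphi_{T2}^2 - \varphi_{T1}\varphi_{T3}\right) + g_7\, \xi'' \varphi_{T1} + (1 - i) g_8\, \eta_1 \eta_2 = 0,\\
\frac{\partial w}{\partial \varphi^T_{03}} &= M \varphi_{T2} + \frac{2 g}{3}\left(\varphi_{T3}^2 - \varphi_{T1}\varphi_{T2}\right) + g_7\, \xi'' \varphi_{T3} + g_8\, \eta_2^2 = 0,\\
\frac{\partial w}{\partial \eta_{01}} &= -M_\eta \eta_2 + g_9\left((1 - i)\eta_1 \varphi_{T3} - \eta_2 \varphi_{T1}\right) = 0,\\
\frac{\partial w}{\partial \eta_{02}} &= M_\eta \eta_1 - g_9\left((1 + i)\eta_2 \varphi_{T2} + \eta_1 \varphi_{T1}\right) = 0,\\
\frac{\partial w}{\partial \xi'_0} &= M_\xi\, \xi'' + g_{10}\left(\varphi_{T2}^2 + 2\varphi_{T1}\varphi_{T3}\right) = 0.
\end{aligned} \tag{33}$$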
Concerning the first set of equations, (32), there are flat directions in the SUSY limit. We can
enforce $\langle\tilde\xi\rangle = 0$ by adding to the scalar potential a soft SUSY breaking mass term for the scalar
field $\tilde\xi$, with $m^2_{\tilde\xi} > 0$. In this case, in a finite portion of the parameter space, we find the solution
$$\tilde\xi = 0, \qquad \xi = u, \qquad \varphi_S = (v_S, v_S, v_S), \qquad v_S^2 = -\frac{g_4}{3 g_3}\, u^2 \tag{34}$$
with $u$ undetermined. By choosing $m^2_{\varphi_S}, m^2_\xi < 0$, then $u$ slides to a large scale, which we assume
to be eventually stabilized by one-loop radiative corrections in the SUSY broken phase. The
VEVs in (34) break $T'$ down to the subgroup $G_S$.
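Concerning the last six equations, (33), by excluding the trivial solutions where all VEVs vanish, in the SUSY limit we find three classes of solutions. One class preserves the subgroup $G_S$,
as for the set of VEVs given in (34). It is characterized by $\langle\xi''\rangle \neq 0$ and $\langle\eta\rangle = 0$. A representative
VEV configuration in this class is:
$$\xi'' = -\frac{M}{g_7}, \qquad \eta = (0, 0), \qquad \varphi_T = (v_T, v_T, v_T), \qquad v_T^2 = \frac{M M_\xi}{3 g_7 g_{10}}. \tag{35}$$
The second class preserves a subgroup $Z_6$ generated by the elements $T$ and $R$. It is characterized
by $\langle\xi''\rangle = 0$ and $\langle\eta\rangle = 0$:
$$\xi'' = 0, \qquad \eta = (0, 0), \qquad \varphi_T = (v_T, 0, 0), \qquad v_T = -\frac{3M}{2g}. \tag{36}$$
The third class preserves the subgroup $G_T$. It is characterized by $\langle\eta\rangle \neq 0$ and $\langle\xi''\rangle = 0$:
$$\xi'' = 0, \qquad \eta = (v_1, 0), \qquad \varphi_T = (v_T, 0, 0), \qquad v_1 = \pm\frac{1}{g_9\sqrt{3 g_8}}\sqrt{i\left(2 M_\eta^2\, g + 3 M M_\eta\, g_9\right)}, \qquad v_T = \frac{M_\eta}{g_9}. \tag{37}$$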
The three sets of minima (35), (36) and (37) are all degenerate in the SUSY limit and we will
simply choose the one in Eq. (37). We have checked that, by adding soft masses $m^2_{\xi''} > 0$, $m^2_\eta < 0$,
the desired vacuum is selected as the absolute minimum, thus reproducing the result in Eqs. (12),
(13). In summary, we have shown that the VEVs in (12), (13) represent a local minimum of
the scalar potential of the theory in a finite portion of the parameter space, without any ad hoc
relation among the parameters of the theory. As we will see in the next section, these VEVs
will be slightly perturbed by higher order corrections induced by higher dimensional operators
contributing to the driving potential $w_d$. Such corrections will be important to achieve a realistic mass spectrum in the quark sector. Finally, concerning the numerical values of the VEVs,
radiative corrections typically stabilize $u$ and $v_S$ well below the cut-off scale $\Lambda$. Similarly, mass
parameters in the superpotential $w_d$ can be chosen in such a way that $v_1$ and $v_T$ are below $\Lambda$. It
is not unreasonable to assume that all the VEVs are of the same order of magnitude:
$$\text{VEV} \approx \lambda^2 \Lambda. \tag{38}$$
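The three classes of supersymmetric solutions can be verified by direct substitution into the F-term conditions of the $\varphi_T$, $\eta$, $\xi''$ sector. The following sketch does this numerically. Two caveats: the relative signs of the $\eta_0$ equations are fixed here by requiring that Eq. (37) is an exact solution with $v_T = M_\eta/g_9$, which is an assumption about the sign conventions, and all coupling values are arbitrary illustrative numbers. Function and dictionary names are hypothetical.

```python
import cmath

def f_terms(phi, eta, xi2, c):
    """F-term conditions for (phi_T, eta, xi'') with the sign conventions
    stated in the lead-in; xi2 stands for the VEV of xi''."""
    p1, p2, p3 = phi
    e1, e2 = eta
    M, Me, Mx = c["M"], c["M_eta"], c["M_xi"]
    g, g7, g8, g9, g10 = c["g"], c["g7"], c["g8"], c["g9"], c["g10"]
    return [
        M * p1 + 2 * g / 3 * (p1 * p1 - p2 * p3) + g7 * xi2 * p2 + 1j * g8 * e1 * e1,
        M * p3 + 2 * g / 3 * (p2 * p2 - p1 * p3) + g7 * xi2 * p1 + (1 - 1j) * g8 * e1 * e2,
        M * p2 + 2 * g / 3 * (p3 * p3 - p1 * p2) + g7 * xi2 * p3 + g8 * e2 * e2,
        -Me * e2 + g9 * ((1 - 1j) * e1 * p3 - e2 * p1),
        Me * e1 - g9 * ((1 + 1j) * e2 * p2 + e1 * p1),
        Mx * xi2 + g10 * (p2 * p2 + 2 * p1 * p3),
    ]

# Arbitrary illustrative couplings (assumed, not from the text).
c = {"M": 1.0, "M_eta": 0.8, "M_xi": 1.3,
     "g": 0.7, "g7": 1.1, "g8": 0.9, "g9": 0.6, "g10": 1.4}

# Class 1: G_S-preserving configuration.
vT1 = cmath.sqrt(c["M"] * c["M_xi"] / (3 * c["g7"] * c["g10"]))
class1 = f_terms((vT1, vT1, vT1), (0, 0), -c["M"] / c["g7"], c)

# Class 2: Z_6-preserving configuration.
vT2 = -3 * c["M"] / (2 * c["g"])
class2 = f_terms((vT2, 0, 0), (0, 0), 0, c)

# Class 3: G_T-preserving configuration, the one selected in the model.
vT3 = c["M_eta"] / c["g9"]
v1 = cmath.sqrt(1j * (2 * c["M_eta"] ** 2 * c["g"] + 3 * c["M"] * c["M_eta"] * c["g9"])) \
    / (c["g9"] * cmath.sqrt(3 * c["g8"]))
class3 = f_terms((vT3, 0, 0), (v1, 0), 0, c)

residuals = [max(abs(f) for f in cls) for cls in (class1, class2, class3)]
print(residuals)
```

All three residuals vanish to machine precision, and flipping the overall sign of `v1` leaves the third solution intact, which reflects the $\pm$ ambiguity in $v_1$.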
5. Higher order corrections

The inclusion of higher order corrections is essential in our model. First of all, from these
corrections we hope to achieve a realistic mass spectrum in the quark sector. The leading order
result is encouraging, but quarks of the first generation are still massless at this level and there
is no mixing allowing communication between the first generation and the other ones. Moreover we should check that the higher order corrections do not spoil the leading order results. At
the leading order there is a neat separation between the scalar fields giving masses to the neutrino sector and those giving masses to the charged fermion sector. As a result the $T'$ flavour
symmetry is broken down in two different directions in the two sectors: neutrino mass terms
are invariant under the subgroup $G_S$, while the charged fermion mass terms are invariant under
the subgroup $G_T$. It is precisely this misalignment that is the source of the TB lepton mixing. Such a
sharp separation is not expected to survive when higher dimensional operators are included and
this will cause the breaking of the subgroup $G_S$ ($G_T$) in the neutrino (charged fermion) sector.
It is important to check that this further breaking does not modify too much the misalignment
achieved at the leading order and that the TB mixing remains stable.
The corrections are induced by higher dimensional operators, compatible with all the symmetries of our model, that can be included in the superpotential $w$, thus providing the next terms in
a $1/\Lambda$ expansion. It is convenient to discuss separately the higher order contributions to $w_l$, $w_q$
and $w_d$.
5.1. Corrections to wl
The leading operators giving rise to $m_l$ are of order $1/\Lambda$ (see Eq. (15)). At order $(1/\Lambda)^2$ there
are no new structures contributing to $m_l$. Indeed, at this order the new invariant operators are:
c

c

c
c
f lT T hd /2 ,
f hd /2 ,
f T hd /2 ,
f = e c , c , c .
(39)
Either they vanish because $\langle\xi''\rangle = 0$ or they replicate the leading-order pattern. The leading
operators contributing to $m_\nu$ are of order $1/\Lambda^2$ (see Eqs. (15), (18), (19)). At the next order we
have two operators that vanish due to $\langle\xi''\rangle = 0$:
() h2u /3 ,
(S ) h2u /3 ,
(40)
and three new operators, whose contribution to $m_\nu$, after symmetry breaking, cannot be absorbed
by a redefinition of the parameters $x_{a,b}$:
$$(\varphi_T \varphi_S)'\,(ll)''\, h_u h_u/\Lambda^3, \qquad (\varphi_T \varphi_S)''\,(ll)'\, h_u h_u/\Lambda^3, \qquad \xi\,(\varphi_T ll)\, h_u h_u/\Lambda^3. \tag{41}$$
In addition to the above operators, there are also those obtained by replacing $\xi$ with $\tilde\xi$: they do not
contribute at this order due to the vanishing VEV of $\tilde\xi$. The combined effects of these operators
and of the corrections to the vacuum alignment (12), (13) were discussed in Ref. [7]. Lepton
masses and mixing angles are modified by terms of relative order $\lambda^2$. This correction is within
the $1\sigma$ experimental error for $\theta_{12}$ and largely within the current uncertainties of $\theta_{23}$ and $\theta_{13}$.
From the experimental viewpoint, a small non-vanishing value $\theta_{13} \approx \lambda^2$ and a deviation from
$\pi/4$ of order $\lambda^2$ of $\theta_{23}$ are both close to the reach of the next generation of neutrino experiments
and will provide a valuable test of this model.
5.2. Corrections to wq
The higher dimensional operators contributing to $w_q$ fall into two classes: those depending
on some of the fields $\varphi_S$, $\xi$, $\tilde\xi$ and those not depending on any of these fields. Leading operators
in the first class are necessarily cubic in $\varphi_S$, $\xi$, $\tilde\xi$, due to the $Z_3$ symmetry given in Table 2.
Therefore they will induce corrections to the quark mass matrices at least of order $1/\Lambda^3$. In the
down quark sector these corrections are at least of order $\lambda^6$ and are completely negligible. In
the up quark sector, where an additional $\lambda^2$ suppression to the mass matrix elements of the first
two rows is provided by the FN mechanism, the corrections are also completely negligible, with
the exception of a contribution of order $\lambda^8$ to the 11 entry, which will arise anyway also through
other effects.
Operators in the second class depend only on the $T'$ breaking fields $\varphi_T$, $\eta$ and $\xi''$, whose VEV
pattern, Eqs. (12), (13), leaves invariant the subgroup $G_T$. Since the quark mass matrices shown
in Eq. (28) are already the most general mass matrices invariant under this subgroup, any higher
order operator contributing to $w_q$ and leaving $T$ invariant after spontaneous breaking will predict
the same textures of Eq. (28) and its effect can be absorbed in a redefinition of the leading order
operators. Therefore we do not need to consider explicitly the new, next-to-leading, operators
contributing to the quark masses: either their effects are negligible or they can be absorbed in a
redefinition of the existing parameters. In the quark sector all the effects modifying the leading
order results come from the corrections to the VEVs in Eqs. (12), (13).
5.3. Correction to wd and to the vacuum alignment
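Finally we are left with the corrections to the VEVs due to higher dimensional operators contributing to $w_d$. We detail the discussion of this issue in Appendix B. Here we only give the
results. All the leading order operators in $w_d$ are of dimension three. After inclusion of a complete set of operators of dimension four, the leading order VEVs
$$\langle\varphi_S\rangle = (v_S, v_S, v_S), \quad \langle\varphi_T\rangle = (v_T, 0, 0), \quad \langle\xi\rangle = u, \quad \langle\tilde\xi\rangle = 0, \quad \langle\eta\rangle = (v_1, 0), \quad \langle\xi''\rangle = 0 \tag{42}$$
are shifted into
$$\begin{aligned}
\langle\varphi_S\rangle &= (v_S + \delta v_{S1},\ v_S + \delta v_{S2},\ v_S + \delta v_{S3}), & \langle\varphi_T\rangle &= (v_T + \delta v_{T1},\ \delta v_{T2},\ \delta v_{T3}),\\
\langle\xi\rangle &= u, \qquad \langle\tilde\xi\rangle = \delta\tilde u, \qquad \langle\xi''\rangle = \delta u'', & \langle\eta\rangle &= (v_1 + \delta v_1,\ \delta v_2),
\end{aligned} \tag{43}$$
where all the corrections $\delta v_{Ti}$, $\delta v_{Si}$, $\delta\tilde u$, $\delta u''$ and $\delta v_i$ are of order $1/\Lambda$ and, given the large
number of input parameters, they can be considered as mostly independent. Since we typically
have VEV$/\Lambda \approx \lambda^2$ at the leading order, we expect
$$\frac{\delta\text{VEV}}{\Lambda} \approx \lambda^4, \tag{44}$$
in a finite portion of the parameter space.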
6. Quark masses and mixing angles

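As explained in Section 5, the main effect of higher order corrections to quark masses comes
from the modified VEVs, Eq. (43). When we insert these VEVs in $w_q$, Eq. (26), we get new
quark mass matrices:
$$m_u = \begin{pmatrix}
i y_1\, \delta v_{T2}/\Lambda + \dots & (1 - i) y_1\, \delta v_{T3}/2\Lambda + y_2\, \delta u''/\Lambda & -y_4\, \delta v_2/\Lambda \\
(1 - i) y_1\, \delta v_{T3}/2\Lambda - y_2\, \delta u''/\Lambda & y_1 v_T/\Lambda & y_4 v_1/\Lambda \\
-y_3\, \delta v_2/\Lambda & y_3 v_1/\Lambda & y_t
\end{pmatrix} v_u, \tag{45}$$
$$m_d = \begin{pmatrix}
i y_5\, \delta v_{T2}/\Lambda + \dots & (1 - i) y_5\, \delta v_{T3}/2\Lambda + y_6\, \delta u''/\Lambda & -y_8\, \delta v_2/\Lambda \\
(1 - i) y_5\, \delta v_{T3}/2\Lambda - y_6\, \delta u''/\Lambda & y_5 v_T/\Lambda & y_8 v_1/\Lambda \\
-y_7\, \delta v_2/\Lambda & y_7 v_1/\Lambda & y_b
\end{pmatrix} v_d, \tag{46}$$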
where we have redefined $v_T + \delta v_{T1} \to v_T$ and $v_1 + \delta v_1 \to v_1$ and the dots in the 11 entry of
$m_u$ and $m_d$ stand for additional contributions from higher dimensional operators. Not all the
available parameter space is suitable to correctly reproduce the masses and the mixing angles of
the first generation quarks. Indeed, by recalling that, generically, we expect $\delta$VEV/VEV $\approx \lambda^2$,
we see that in this regime we would obtain, up to small corrections
$$\frac{m_u}{m_c} = \frac{m_d}{m_s} = \frac{\delta v_{T2}}{v_T} \approx \lambda^2, \tag{47}$$
which is not correct in the up sector. To overcome this difficulty we assume that the correction
$\delta v_{T2}$ is somewhat smaller than its natural value:
$$\frac{\delta v_{T2}}{v_T} \approx \lambda^4. \tag{48}$$
This brings the up quark mass in the correct range but depletes too much the down quark mass.
To get the appropriate mass for the down quark we assume that the dimensionless coefficient $y_6$
is larger than one by a factor $1/\lambda$:
$$y_6 \approx \frac{1}{\lambda}. \tag{49}$$
We cannot justify the two assumptions (48), (49) within our approach, where, in the absence of a
theory for the higher-order terms, we have allowed for the most general higher-order corrections.
From our effective Lagrangian approach, they should be seen as two moderate tunings that we
need in order to get up and down quark masses. To summarize, in our parameter space we naturally have $y_1 \approx y_2 \approx y_4 \approx \lambda^2$ from the $U(1)_{FN}$ symmetry. All other dimensionless parameters,
with the exception of $y_6$, are of order one. Concerning the VEVs, we can naturally accommodate
VEV$/\Lambda \approx \delta$VEV/VEV $\approx \lambda^2$, with the exception of $\delta v_{T2}$. Within the restricted region of the parameter space where the two relations (48), (49) are approximately valid, the quark mass matrices
have the following structures:
$$m_u = \begin{pmatrix} \lambda^8 & \lambda^6 & \lambda^6 \\ \lambda^6 & \lambda^4 & \lambda^4 \\ \lambda^4 & \lambda^2 & 1 \end{pmatrix} v_u, \tag{50}$$
$$m_d = \begin{pmatrix} \lambda^6 & \lambda^3 & \lambda^4 \\ \lambda^3 & \lambda^2 & \lambda^2 \\ \lambda^4 & \lambda^2 & 1 \end{pmatrix} v_d. \tag{51}$$
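The textures (50) and (51) follow from simple power counting in $\lambda$. The sketch below tracks only exponents of $\lambda$, using VEV$/\Lambda \sim \lambda^2$, generic shifts $\delta$VEV$/\Lambda \sim \lambda^4$, the tuned value $\delta v_{T2}/\Lambda \sim \lambda^6$ implied by Eq. (48), $y_1 \sim y_2 \sim y_4 \sim \lambda^2$ and $y_6 \sim \lambda^{-1}$ from Eq. (49). The function and variable names are hypothetical, and when two terms contribute to an entry the smaller exponent (the larger term) is kept.

```python
# Exponents of lambda for the VEVs, in units of the cut-off.
VEV = 2       # v_T/Lambda, v_1/Lambda ~ lambda^2
D_VEV = 4     # generic shifts: delta v_T3, delta u'', delta v_2 ~ lambda^4
D_VT2 = 6     # tuned shift of Eq. (48): delta v_T2/Lambda ~ lambda^6

# Exponents of lambda for the couplings (0 means order one).
Y = {"y1": 2, "y2": 2, "y3": 0, "y4": 2, "yt": 0,
     "y5": 0, "y6": -1, "y7": 0, "y8": 0, "yb": 0}

def texture(y_phi, y_xi, y_row3, y_col3, y_33):
    """Lambda exponents of a quark mass matrix with the entry pattern of
    Eqs. (45)-(46): y_phi multiplies phi_T, y_xi multiplies xi'',
    y_row3 and y_col3 multiply eta in the third row and third column."""
    e12 = min(Y[y_phi] + D_VEV, Y[y_xi] + D_VEV)
    return [
        [Y[y_phi] + D_VT2, e12, Y[y_col3] + D_VEV],
        [e12, Y[y_phi] + VEV, Y[y_col3] + VEV],
        [Y[y_row3] + D_VEV, Y[y_row3] + VEV, Y[y_33]],
    ]

up_exponents = texture("y1", "y2", "y3", "y4", "yt")
down_exponents = texture("y5", "y6", "y7", "y8", "yb")

for row in up_exponents:
    print(row)
for row in down_exponents:
    print(row)
```

In the down sector the 12 and 21 entries are dominated by the $y_6\, \delta u''$ term, which is why they scale as $\lambda^3$ rather than $\lambda^4$.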
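By diagonalizing the matrices in Eqs. (45), (46) with standard perturbative techniques we obtain:
$$\begin{aligned}
m_u &\approx \left| y_1 v_u \left( i\,\frac{\delta v_{T2}}{\Lambda} - \left(\frac{1 - i}{2}\right)^2 \frac{\delta v_{T3}^2}{v_T \Lambda} + \frac{y_2^2}{y_1^2}\,\frac{\delta u''^2}{v_T \Lambda} + \dots \right) \right|, & m_d &\approx \left| v_d\, \frac{y_6^2}{y_5}\,\frac{\delta u''^2}{v_T \Lambda} \right|,\\
m_c &\approx \left| y_1 v_u \frac{v_T}{\Lambda} \right| + O(\lambda^6), & m_s &\approx \left| y_5 v_d \frac{v_T}{\Lambda} \right| + O(\lambda^4),\\
m_t &\approx |y_t v_u| + O(\lambda^4), & m_b &\approx |y_b v_d| + O(\lambda^4).
\end{aligned} \tag{52}$$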
For the mixing angles, we get:

Vud Vcs 1 + O 2 ,
Vtb 1,

y6
1 i vT 3 y2 u
Vus Vcd
+ O 3 ,
y5 v T
2
vT
y1 v T

y 7 y3
v1
v2
1 i vT 3 y2 u
,
Vub
yb
yt
vT
2
y1

y 7 y3 v 1
Vts
+ O 4 ,
Vcb
yb
yt

y6 y7 y3 v1 u
y7 y3 v2
+
,
Vtd
y5 yb
yt v T
yb
yt
u
(53)
where, when not explicitly indicated, the relations include all terms up to $O(\lambda^4)$. In the previous
expressions, where all the quantities are generically complex, it is possible to remove all phases
except the one carried by the combination $(y_7/y_b - y_3/y_t)\,\delta v_2/\Lambda$, which enters $V_{ub}$ and $V_{td}$ at
the order $\lambda^4$. Notice that in our model $V_{ub}$ is of order $\lambda^4$ whereas $V_{td}$ is of order $\lambda^3$. In the
Wolfenstein parametrization of the mixing matrix, this corresponds to a combination $\rho + i\eta$ of
order $\lambda$, which is phenomenologically viable. Notice that quark masses and mixing angles are all
determined within their correct order of magnitude and enough parameters are present to fit the
data. Moreover, despite the large number of parameters controlling the quark sector, our model
contains a well-known [20] non-trivial relation between masses and mixing angles:
$$\sqrt{\frac{m_d}{m_s}} = |V_{us}| + O(\lambda^2). \tag{54}$$
Due to the approximate unitarity relation $V_{td} + V_{us}^* V_{ts} + V_{ub}^* = 0$ and due to the fact that $V_{ub}$ is
of order $\lambda^4$ in our model, from the relation (54) we also get:
$$\sqrt{\frac{m_d}{m_s}} = \left|\frac{V_{td}}{V_{ts}}\right| + O(\lambda^2). \tag{55}$$
These relations compare well with the data: from [19] we have $\sqrt{m_d/m_s} = 0.213$–$0.243$,
$|V_{us}| = 0.2257 \pm 0.0021$ and $|V_{td}/V_{ts}| = 0.208^{+0.008}_{-0.006}$. Unfortunately, the theoretical errors affecting Eqs. (54) and (55), dominated respectively by the unknown $O(\lambda^2)$ term in $V_{us}$ and by the
unknown $O(\lambda^4)$ term in $V_{td}$, are of order 20%. For this reason, and for the large uncertainty on
the ratio $m_d/m_s$, it is not possible to turn these predictions into precise tests of the model. It is
interesting to compare our predictions to those of early models of quark masses based on $U(2)$
or $T'$ flavour symmetries [21]. They also predict Eq. (55), with a smaller theoretical error of order $\lambda^3$. Moreover, due to the characteristic two zero textures, in their early versions they predict
$\sqrt{m_u/m_c} = |V_{ub}/V_{cb}|$, which is off by approximately a factor two. In our model the mass of
the up quark depends on additional free parameters that modify this wrong relation by a relative
factor of order one.
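The origin of relation (54) can be illustrated with a stripped-down version of the light $2\times 2$ block of $m_d$. The sketch below assumes a simplified texture in which the 11 entry and the symmetric $\delta v_{T3}$ contribution are dropped, so that the block is $\begin{pmatrix} 0 & c \\ -c & d\end{pmatrix}$ with real $c$ and $d$; this is an assumption made for illustration and not the full matrix of Eq. (46). The function name is hypothetical.

```python
import math

def light_block(c, d):
    """For the block [[0, c], [-c, d]] return sqrt(m_d/m_s) together with
    tan and sin of the left-handed 1-2 mixing angle."""
    # Entries of M^T M for this block.
    h11, h12, h22 = c * c, -c * d, c * c + d * d
    theta = 0.5 * math.atan2(2 * abs(h12), h22 - h11)
    tr, det = h11 + h22, c * c            # det = |det M| = c^2
    heavy = math.sqrt(0.5 * (tr + math.sqrt(tr * tr - 4 * det * det)))
    light = det / heavy
    return math.sqrt(light / heavy), math.tan(theta), math.sin(theta)

lam = 0.22
root_ratio, tan_theta, sin_theta = light_block(lam, 1.0)

# Relative deviation between sqrt(m_d/m_s) and |V_us| ~ sin(theta).
relative_gap = abs(root_ratio - sin_theta) / sin_theta

print(root_ratio, tan_theta, sin_theta, relative_gap)
```

For this texture $\sqrt{m_d/m_s}$ equals $\tan\theta$ exactly, so it differs from $\sin\theta$ only at relative order $\lambda^2$. This is the size of the correction quoted in Eq. (54), although in the full model additional $O(\lambda^2)$ terms also enter.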
7. Conclusion
We have built a SUSY model of fermion masses and mixing angles based on the flavour symmetry group $T' \times U(1)_{FN} \times Z_3$. In our model the key role is played by the discrete group $T'$,
the double covering of $A_4$. In the lepton sector our model maintains all the good properties of
the models based on the symmetry group $A_4$ [3,4,6–8]. Indeed, since the group $T'$ possesses the
representations $1$, $1'$, $1''$ and $3$, in our model we can reproduce the construction made in Refs.
[6–8] and we obtain the same results for lepton masses and mixing angles. The main point of
our whole paper is that the symmetry group $T'$ allows us to achieve a realistic description of quark
masses and mixing angles without spoiling the results in the lepton sector. To describe quarks,
we make use of the doublet representations of $T'$, not available in $A_4$. As was done in previous
models for quark masses based on $T'$ and on $U(2)$ [14,15], we accommodate the first two generations of quarks in doublets under $T'$, whereas the third generation is kept invariant. Such an
assignment has several advantages. First of all, it allows us to generate masses for the top and for
the bottom quarks at the renormalizable level, contrary to what happens for the third generation
in the lepton sector. Moreover, this choice of quark representations does not necessarily imply
diagonal quark mass matrices, thus overcoming one of the major difficulties in extending $A_4$ to
the quark sector. At the leading order only quarks of the second and third generations acquire
mass terms and we can consistently describe $m_t$, $m_c$, $m_b$, $m_s$ and $V_{cb}$. To this purpose we need
a set of scalar fields coupled to quarks, transforming non-trivially under $T'$ and developing appropriate VEVs in order to break $T'$ along the desired direction, the one left invariant by the
generator $T$. We have explicitly verified that our model satisfies these requirements. Masses
and mixing angles of the first quark generation are produced via higher-order effects induced by
higher dimensional operators, which are compatible with all the symmetry requirements of the
model, but are depleted by inverse powers of the cutoff scale with respect to the leading order
contributions. Therefore the minimization of the full scalar potential, including all non-leading
effects, is a central aspect of our model, crucial for a correct description of light quarks. As a
result of such a minimization, which we have detailed in Appendix B, we find that in the parameter space of our model, extended by the introduction of higher dimensional operators, there is
enough room to fit all the quark data. At the same time, some constraints remain and we get the
two approximate relations $\sqrt{m_d/m_s} \approx |V_{us}|$ and $\sqrt{m_d/m_s} \approx |V_{td}/V_{ts}|$. These are well verified
experimentally within the predicted $O(\lambda^2)$ uncertainty.
In the lepton sector the model predicts a nearly TB lepton mixing [9], in very good agreement
with the data. As in previous models based on $A_4$, such a mixing pattern is produced by a special
breaking of the $T'$ group. At the leading order and in the charged lepton sector, $T'$ is broken down
to the subgroup generated by the element $T$. This implies a diagonal mass matrix for $e$, $\mu$ and $\tau$,
with a hierarchy induced by the $U(1)_{FN}$ component of the full flavour symmetry group. All
the mixing originates from the neutrino sector, where $T'$ breaks down to the subgroup generated
by the element $TST^2$. The source of the TB lepton mixing is precisely the misalignment in
flavour space between neutrino masses and charged lepton masses [13]. A very important point
of our model is that all sources of flavour symmetry breaking, including those pertaining to the
quark sector, do not spoil the successful leading order result for the neutrino mixing angles.
Indeed, higher order corrections modify such a mixing pattern only by terms of relative order $\lambda^2$,
$\lambda \approx 0.22$ being the Cabibbo angle. Future neutrino oscillation experiments will test the model to
this level of accuracy.
Altogether we have five relations: three for the lepton mixing angles, with relative accuracy
$\lambda^2$, and two for the quark mixing angles with relative accuracy $\lambda$. In the absence of a theory concerning the origin of the higher order corrections, it seems difficult to improve these predictions
from the theory side and the most stringent test of the model is still represented by an accurate
measurement of $\theta_{23}$ and $\theta_{13}$, at the $\lambda^2$ level. Additional tests of the model can be searched for
in the context of rare processes, both in leptonic and in hadronic transitions. Depending on the
type of SUSY breaking, an imprint of the assumed flavour structure might survive in the SUSY
breaking sector and it might give rise to specific signatures on which we hope to come back in a
future work.
Acknowledgements
This work was made possible by the very kind hospitality of Manfred Lindner at the TUM
and at the Max-Planck-Institut für Kernphysik in Heidelberg. We all thank him also for useful
discussions and for participating in the early stage of the project. We also acknowledge useful
discussions with Isabella Masina. We recognize that this work has been partly supported by the
European Commission under contracts MRTN-CT-2004-503369 and MRTN-CT-2006-035505.
Appendix A
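The matrices $S$ and $T$ representing the generators depend on the representations of the group:
$$\begin{aligned}
1 &: \quad S = 1, \quad T = 1, & 1' &: \quad S = 1, \quad T = \omega, & 1'' &: \quad S = 1, \quad T = \omega^2,\\
2 &: \quad S = A_1, \quad T = \omega A_2, & 2' &: \quad S = A_1, \quad T = \omega^2 A_2, & 2'' &: \quad S = A_1, \quad T = A_2,
\end{aligned}$$
$$3: \quad S = \frac{1}{3}\begin{pmatrix} -1 & 2\omega & 2\omega^2 \\ 2\omega^2 & -1 & 2\omega \\ 2\omega & 2\omega^2 & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \omega & 0 \\ 0 & 0 & \omega^2 \end{pmatrix},$$
where we have used the matrices
$$A_1 = -\frac{1}{\sqrt{3}}\begin{pmatrix} i & \sqrt{2}\, e^{i\pi/12} \\ -\sqrt{2}\, e^{-i\pi/12} & -i \end{pmatrix}, \qquad A_2 = \begin{pmatrix} \omega & 0 \\ 0 & 1 \end{pmatrix}.$$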
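The generator matrices can be sanity-checked against the group relations. The sketch below assumes the standard presentation of $T'$, $S^2 = R$, $T^3 = (ST)^3 = 1$, $R^2 = 1$, with $R = 1$ on the triplet and $R = -1$ on the doublets (this presentation is an assumption here, since the defining relations are stated elsewhere in the paper), and uses $\omega = e^{2\pi i/3}$. The signs inside $A_1$ are taken as reconstructed above. The code also checks which generators leave the vacuum directions $(1, 0, 0)$ and $(1, 1, 1)$ invariant. All helper names are hypothetical.

```python
import cmath
import math

w = cmath.exp(2j * math.pi / 3)   # omega, a cube root of unity

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mpow(A, n):
    R = identity(len(A))
    for _ in range(n):
        R = matmul(R, A)
    return R

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A[0])))

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

# Triplet representation.
S3 = [[-1 / 3, 2 * w / 3, 2 * w ** 2 / 3],
      [2 * w ** 2 / 3, -1 / 3, 2 * w / 3],
      [2 * w / 3, 2 * w ** 2 / 3, -1 / 3]]
T3 = [[1, 0, 0], [0, w, 0], [0, 0, w ** 2]]

# Doublet representation 2: S = A1, T = omega * A2 with A2 = diag(omega, 1).
p = cmath.exp(1j * math.pi / 12)
r3 = math.sqrt(3)
S2 = [[-1j / r3, -math.sqrt(2) * p / r3],
      [math.sqrt(2) * p.conjugate() / r3, 1j / r3]]
T2 = [[w * w, 0], [0, w]]

minus_one = [[-1.0, 0.0], [0.0, -1.0]]
triplet_ok = (close(mpow(S3, 2), identity(3)) and close(mpow(T3, 3), identity(3))
              and close(mpow(matmul(S3, T3), 3), identity(3)))
doublet_ok = (close(mpow(S2, 2), minus_one) and close(mpow(T2, 3), identity(2))
              and close(mpow(matmul(S2, T2), 3), identity(2)))

# Vacuum directions: (1,0,0) is T-invariant, (1,1,1) is invariant under T S T^2.
TST2 = matmul(T3, matmul(S3, mpow(T3, 2)))
v_T_image = apply(T3, [1, 0, 0])
v_S_image = apply(TST2, [1, 1, 1])

print(triplet_ok, doublet_ok)
print(v_T_image, v_S_image)
```

The last two checks are the group-theoretic content of the statement that $\langle\varphi_T\rangle \propto (1,0,0)$ preserves the subgroup generated by $T$, while $\langle\varphi_S\rangle \propto (1,1,1)$ preserves the one generated by $TST^2$.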
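We now report the multiplication rules between the various representations. In the following
we use $\alpha_i$ to indicate the elements of the first representation of the product and $\beta_i$ to indicate those
of the second representation. Moreover $a, b = 0, \pm 1$ and we denote $1^0 \equiv 1$, $1^1 \equiv 1'$, $1^{-1} \equiv 1''$
and similarly for the doublet representations. On the right-hand side the sum $a + b$ is modulo 3.
We start with all the multiplication rules which include the 1-dimensional representations:
$$\begin{aligned}
&1 \otimes \text{Rep} = \text{Rep} \otimes 1 = \text{Rep} \quad \text{with Rep whatever representation},\\
&1^a \otimes 1^b = 1^b \otimes 1^a = 1^{a+b} \equiv \alpha\beta,\\
&1^a \otimes 2^b = 2^b \otimes 1^a = 2^{a+b} \equiv \begin{pmatrix} \alpha\beta_1 \\ \alpha\beta_2 \end{pmatrix},\\
&1' \otimes 3 = 3 \equiv \begin{pmatrix} \alpha\beta_3 \\ \alpha\beta_1 \\ \alpha\beta_2 \end{pmatrix}, \qquad
1'' \otimes 3 = 3 \equiv \begin{pmatrix} \alpha\beta_2 \\ \alpha\beta_3 \\ \alpha\beta_1 \end{pmatrix}.
\end{aligned}$$
The multiplication rules with the 2-dimensional representations are the following:
$$2 \otimes 2 = 2' \otimes 2'' = 2'' \otimes 2' = 3 \oplus 1 \quad \text{with} \quad
1 = \alpha_1\beta_2 - \alpha_2\beta_1, \qquad
3 = \begin{pmatrix} \frac{1 - i}{2}(\alpha_1\beta_2 + \alpha_2\beta_1) \\ i\alpha_1\beta_1 \\ \alpha_2\beta_2 \end{pmatrix},$$
$$2 \otimes 2' = 2'' \otimes 2'' = 3 \oplus 1' \quad \text{with} \quad
1' = \alpha_1\beta_2 - \alpha_2\beta_1, \qquad
3 = \begin{pmatrix} \alpha_2\beta_2 \\ \frac{1 - i}{2}(\alpha_1\beta_2 + \alpha_2\beta_1) \\ i\alpha_1\beta_1 \end{pmatrix},$$
$$2 \otimes 2'' = 2' \otimes 2' = 3 \oplus 1'' \quad \text{with} \quad
1'' = \alpha_1\beta_2 - \alpha_2\beta_1, \qquad
3 = \begin{pmatrix} i\alpha_1\beta_1 \\ \alpha_2\beta_2 \\ \frac{1 - i}{2}(\alpha_1\beta_2 + \alpha_2\beta_1) \end{pmatrix},$$
$$2 \otimes 3 = 2 \oplus 2' \oplus 2'' \quad \text{with} \quad
2 = \begin{pmatrix} (1 + i)\alpha_2\beta_2 + \alpha_1\beta_1 \\ (1 - i)\alpha_1\beta_3 - \alpha_2\beta_1 \end{pmatrix}, \quad
2' = \begin{pmatrix} (1 + i)\alpha_2\beta_3 + \alpha_1\beta_2 \\ (1 - i)\alpha_1\beta_1 - \alpha_2\beta_2 \end{pmatrix}, \quad
2'' = \begin{pmatrix} (1 + i)\alpha_2\beta_1 + \alpha_1\beta_3 \\ (1 - i)\alpha_1\beta_2 - \alpha_2\beta_3 \end{pmatrix},$$
$$2' \otimes 3 = 2 \oplus 2' \oplus 2'' \quad \text{with} \quad
2 = \begin{pmatrix} (1 + i)\alpha_2\beta_1 + \alpha_1\beta_3 \\ (1 - i)\alpha_1\beta_2 - \alpha_2\beta_3 \end{pmatrix}, \quad
2' = \begin{pmatrix} (1 + i)\alpha_2\beta_2 + \alpha_1\beta_1 \\ (1 - i)\alpha_1\beta_3 - \alpha_2\beta_1 \end{pmatrix}, \quad
2'' = \begin{pmatrix} (1 + i)\alpha_2\beta_3 + \alpha_1\beta_2 \\ (1 - i)\alpha_1\beta_1 - \alpha_2\beta_2 \end{pmatrix},$$
$$2'' \otimes 3 = 2 \oplus 2' \oplus 2'' \quad \text{with} \quad
2 = \begin{pmatrix} (1 + i)\alpha_2\beta_3 + \alpha_1\beta_2 \\ (1 - i)\alpha_1\beta_1 - \alpha_2\beta_2 \end{pmatrix}, \quad
2' = \begin{pmatrix} (1 + i)\alpha_2\beta_1 + \alpha_1\beta_3 \\ (1 - i)\alpha_1\beta_2 - \alpha_2\beta_3 \end{pmatrix}, \quad
2'' = \begin{pmatrix} (1 + i)\alpha_2\beta_2 + \alpha_1\beta_1 \\ (1 - i)\alpha_1\beta_3 - \alpha_2\beta_1 \end{pmatrix}.$$
The multiplication rule with the 3-dimensional representations is
$$3 \otimes 3 = 3_S \oplus 3_A \oplus 1 \oplus 1' \oplus 1'',$$
where
$$\begin{aligned}
1 &= \alpha_1\beta_1 + \alpha_2\beta_3 + \alpha_3\beta_2, \qquad
1' = \alpha_3\beta_3 + \alpha_1\beta_2 + \alpha_2\beta_1, \qquad
1'' = \alpha_2\beta_2 + \alpha_1\beta_3 + \alpha_3\beta_1,\\
3_S &= \frac{1}{3}\begin{pmatrix} 2\alpha_1\beta_1 - \alpha_2\beta_3 - \alpha_3\beta_2 \\ 2\alpha_3\beta_3 - \alpha_1\beta_2 - \alpha_2\beta_1 \\ 2\alpha_2\beta_2 - \alpha_1\beta_3 - \alpha_3\beta_1 \end{pmatrix}, \qquad
3_A = \frac{1}{2}\begin{pmatrix} \alpha_2\beta_3 - \alpha_3\beta_2 \\ \alpha_1\beta_2 - \alpha_2\beta_1 \\ \alpha_3\beta_1 - \alpha_1\beta_3 \end{pmatrix}.
\end{aligned}$$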
Appendix B
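In this appendix we discuss the subleading terms of the superpotential $w_d$ and how they correct the VEV alignment. We work along the lines of Appendix B of [7].
The VEVs are shifted from the values
$$\langle\varphi_S\rangle = (v_S, v_S, v_S), \quad \langle\xi\rangle = u, \quad \langle\tilde\xi\rangle = 0, \quad \langle\varphi_T\rangle = (v_T, 0, 0), \quad \langle\eta\rangle = (v_1, 0), \quad \langle\xi''\rangle = 0$$
to the values
$$\begin{aligned}
\langle\varphi_S\rangle &= (v_S + \delta v_{S1},\ v_S + \delta v_{S2},\ v_S + \delta v_{S3}), & \langle\varphi_T\rangle &= (v_T + \delta v_{T1},\ \delta v_{T2},\ \delta v_{T3}),\\
\langle\eta\rangle &= (v_1 + \delta v_1,\ \delta v_2), & \langle\xi\rangle &= u, \qquad \langle\tilde\xi\rangle = \delta\tilde u, \qquad \langle\xi''\rangle = \delta u'',
\end{aligned}$$
where the corrections $\delta v_{Ti}$, $\delta v_{Si}$, $\delta v_i$, $\delta\tilde u$ and $\delta u''$ are independent of each other. Note that there
also might be a correction to the VEV $u$, but we do not have to indicate this explicitly by the
addition of a term $\delta u$, since $u$ is undetermined at tree-level anyway.
We change the notation in Eq. (31) a bit by defining
$$g_3 \equiv 3\tilde g_3^2, \qquad g_4 \equiv -\tilde g_4^2 \qquad \text{and} \qquad g_8 \equiv i\,\tilde g_8^2$$
such that the VEVs read
$$v_S = \frac{\tilde g_4}{3\tilde g_3}\, u, \qquad v_T = \frac{M_\eta}{g_9} \qquad \text{and} \qquad v_1 = \frac{1}{\sqrt{3}\,\tilde g_8\, g_9}\sqrt{2 g M_\eta^2 + 3 g_9 M M_\eta},$$
where we have chosen the $+$ sign for the VEV $v_1$. Apart from the subleading terms which are
already presented in [7] we get 17 other invariants which involve at least one of the new fields
$\eta_{1,2}$, $\xi''$, $\eta_{01,02}$ and $\xi'_0$:
$$w_{d2} = \frac{1}{\Lambda}\left( \sum_{i=14}^{18} t_i I_i^T + \sum_{i=13}^{15} s_i I_i^S + x_4 I_4^X + \sum_{i=1}^{4} n_i I_i^N + \sum_{i=1}^{4} y_i I_i^Y \right)$$
with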
with

T
I14
= T0 T = T0 2 T 2 + T0 1 T 3 + T0 3 T 1 ,

T
= (T ) T0 = (1 + i)T 1 2 + T 3 1 (1 i)T0 2 1 T0 3 2
I15

(1 i)T 2 1 T 3 2 (1 + i)T0 1 2 + T0 3 1 ,

T
I16
= (T ) T0 = (1 + i)T 2 2 + T 1 1 (1 i)T0 1 1 T0 2 2

(1 i)T 3 1 T 1 2 (1 + i)T0 3 2 + T0 2 1 ,

T
= T0 = 1 (1 i)T0 2 1 T0 3 2 2 (1 + i)T0 1 2 + T0 3 1 ,
I17

2 2

T
= (T T )S T0 =
T 1 T 2 T 3 T0 2 + T2 3 T 1 T 2 T0 1
I18
3

+ T2 2 T 1 T 3 T0 3 ,

0

S
0
0
= S0 S = S3
S3 + S1
S2 + S2
S1 ,
I13

0
S
0
0
I14
= S0 S = S3
S3 + S1
S2 + S2
S1 ,

S
= S S0 S S
I15

1 0
0
0
= S2 2S1
S1 S2
S3 S3
S2
3
0

0

0
0
0
0
+ S3 2S2
S2 S1
S3 S3
S1 + S1 2S3
S3 S1
S2 S2
S1 ,

2
+ 2S1 S2 0
I4X = (S S ) 0 = S3
0 and 0 :
and furthermore the structures involving the driving fields 1,2

I1N = (T T ) 0 = T2 1 + 2T 2 T 3 10 2 20 1 ,

I2N = (T ) 0 T = (1 + i)T 1 2 + T 3 1 (1 i)10 T 1 20 T 2

(1 i)T 2 1 T 3 2 (1 + i)20 T 3 + 10 T 2 ,

I3N = 0 T = 1 (1 i)10 T 1 20 T 2 2 (1 + i)20 T 3 + 10 T 2 ,

1
I4N = ()3 0 3 = (1 + i)12 10 2 + 20 1 + 23 20 + (1 + i)12 2 10 ,
2

I1Y = (T T ) 0 = T2 1 + 2T 2 T 3 0 ,

I2Y = (T ) 0 = (1 + i)T 2 2 + T 1 1 0 2 (1 i)T 3 1 T 1 2 0 1 ,

2
+ 2S1 S3 0 ,
I3Y = (S S ) 0 = S2

2
+ 2S1 S3 0 .
I4Y = (S S ) 0 = S2
If we perform the analogous calculation as done in [7], i.e. we only take into account terms which
are at most linear in $\delta v$ and no terms of the order $O(\delta v/\Lambda)$, where $\Lambda$ is the cutoff scale, and plug in
the VEVs $v_T$ and $v_1$, the equations take the form:

g 42
g 4 u3
t3 3
t16 2
v1
2gvT
t11 + 2 (t6 + t7 + t8 ) + vT + (1 i) v1 vT 2vT
+M
3g 3
3
v1
3g 3

4gvT
+ M+
(56)
vT 1 = 0,
3

g 42
2gvT
g 4 u3
(57)
t11 + 2 (t6 + t7 + t8 ) + M
vT 2 = 0,
3g 3
3
3g 3

g 42
g 4 u3
v2
2gvT

t11 + 2 (t6 + t7 + t8 ) + g7 vT u + (1 + i)vT
+M
3g 3
3
v1
3g 3

2gvT
+ M
(58)
vT 3 = 0,
3

vT u
9g 3 s10 3g 4 s3
+
+ 2s6
+ 3g2 u + 2g1 (2vS1 vS2 vS3 ) = 0,
(59)
g 4
g 3

3
vT u
3g 4 s4
s6 s8
+ 3g2 u + 2g1 (2vS2 vS1 vS3 ) = 0,
(60)
g 3
2

3
vT u
3g 4 s5
s6 + s8
+ 3g2 u + 2g1 (2vS3 vS1 vS2 ) = 0,
(61)
g 3
2
x2 vT u g5
+ u + 2g 3 (vS1 + vS2 + vS3 ) = 0,
3g 3
g 4
1
vT v2 (1 i)v1 vT 3 = 0,
2
n1
1
(1 + i)n4 v12 + vT2 + g9 vT 1 = 0,

2
2
3
g 4 y3 u
+ M u + 2g10 vT vT 3 = 0.
3g 32
(62)
(63)
(64)
(65)
As one can see, Eqs. (59)–(62) do not get any contributions from the terms of $w_{d2}$, i.e. the shifts
$\delta v_{Si}$ and $\delta\tilde u$ are the same as in the $A_4$ model. Eqs. (56)–(58) are also correlated to the analogous
ones in the $A_4$ model. In order to see this one has to set the couplings appearing in $w_{d2}$ to zero
and take into account that $v_T = -\frac{3M}{2g}$ in the $A_4$ model, such that $2 v_T \left(\frac{2}{3} g v_T + M\right)\frac{\delta v_1}{v_1}$ vanishes
and expressions like $\left(M + \frac{4 g v_T}{3}\right)\delta v_{T1}$ are just $-M \delta v_{T1}$. With this at hand Eqs. (56)–(58) fully
coincide with the ones found in [7]. The last three equations are not present in the case of $A_4$ and
they simply vanish if the couplings and the VEVs of the fields only present in the case of $T'$ and not
$A_4$ are set to zero.
Concerning the tuning we need in the parameter $\delta v_{T2}$, Eq. (48), it is easy to see that it is
compatible with the structure of the above equations and, at the same time, it has no consequences
on the other shifts of the VEVs.
References
[1] C.D. Froggatt, H.B. Nielsen, Nucl. Phys. B 147 (1979) 277.
[2] For reviews, see G. Altarelli, hep-ph/0611117;
G. Altarelli, F. Feruglio, New J. Phys. 6 (2004) 106.
[3] E. Ma, G. Rajasekaran, Phys. Rev. D 64 (2001) 113012, hep-ph/0106291.
[4] E. Ma, Mod. Phys. Lett. A 17 (2002) 627, hep-ph/0203238;
K.S. Babu, E. Ma, J.W.F. Valle, Phys. Lett. B 552 (2003) 207, hep-ph/0206292;
E. Ma, hep-ph/0409075;
E. Ma, New J. Phys. 6 (2004) 104;
S.L. Chen, M. Frigerio, E. Ma, Nucl. Phys. B 724 (2005) 423, hep-ph/0504181;
K.S. Babu, X.G. He, hep-ph/0507217;
A. Zee, Phys. Lett. B 630 (2005) 58, hep-ph/0508278;
E. Ma, hep-ph/0511133;
S.K. Kang, Z.z. Xing, S. Zhou, Phys. Rev. D 73 (2006) 013001, hep-ph/0511157;
X.G. He, Y.Y. Keum, R.R. Volkas, JHEP 0604 (2006) 039, hep-ph/0601001;
B. Adhikary, B. Brahmachari, A. Ghosal, E. Ma, M.K. Parida, Phys. Lett. B 638 (2006) 345, hep-ph/0603059;
E. Ma, hep-ph/0607190;
L. Lavoura, H. Kuhbock, hep-ph/0610050.
[5] For other approaches related to discrete flavour symmetry see, for example: N. Haba, A. Watanabe, K. Yoshioka,
Phys. Rev. Lett. 97 (2006) 041601, hep-ph/0603116;
J.E. Kim, J.C. Park, JHEP 0605 (2006) 017, hep-ph/0512130;
Y. Yamanaka, H. Sugawara, S. Pakvasa, Phys. Rev. D 25 (1982) 1895;
C. Luhn, S. Nasri, P. Ramond, hep-ph/0701188;
F. Plentinger, G. Seidl, W. Winter, hep-ph/0612169;
Z.G. Berezhiani, M.Yu. Khlopov, Sov. J. Nucl. Phys. 51 (1990) 739–746;
A.S. Sakharov, M.Yu. Khlopov, Phys. At. Nucl. 57 (1994) 651–658.
[6] G. Altarelli, F. Feruglio, Nucl. Phys. B 720 (2005) 64, hep-ph/0504165.
[7] G. Altarelli, F. Feruglio, Nucl. Phys. B 741 (2006) 215, hep-ph/0512103.
[8] G. Altarelli, F. Feruglio, Y. Lin, hep-ph/0610165.
[9] P.F. Harrison, D.H. Perkins, W.G. Scott, Phys. Lett. B 530 (2002) 167, hep-ph/0202074;
Z.z. Xing, Phys. Lett. B 533 (2002) 85, hep-ph/0204049;
P.F. Harrison, W.G. Scott, hep-ph/0402006;
P.F. Harrison, W.G. Scott, hep-ph/0403278.
[10] T. Schwetz, Phys. Scr. T 127 (2006) 1, hep-ph/0606060;
M. Maltoni, T. Schwetz, M.A. Tortola, J.W.F. Valle, New J. Phys. 6 (2004) 122, see hep-ph/0405172 v5;
A. Strumia, F. Vissani, Nucl. Phys. B 726 (2005) 294, hep-ph/0503246;
G.L. Fogli, et al., hep-ph/0608060.
[11] T. Schwetz, hep-ph/0510331.
[12] E. Ma, Mod. Phys. Lett. A 17 (2002) 627, hep-ph/0203238.
[13] X.G. He, Y.Y. Keum, R.R. Volkas, JHEP 0604 (2006) 039, hep-ph/0601001.
[14] A. Aranda, C.D. Carone, R.F. Lebed, Phys. Lett. B 474 (2000) 170, hep-ph/9910392;
A. Aranda, C.D. Carone, R.F. Lebed, Phys. Rev. D 62 (2000) 016009, hep-ph/0002044;
P.H. Frampton, T.W. Kephart, Int. J. Mod. Phys. A 10 (1995) 4689.
[15] R. Barbieri, G.R. Dvali, L.J. Hall, Phys. Lett. B 377 (1996) 76, hep-ph/9512388;
R. Barbieri, L.J. Hall, S. Raby, A. Romanino, Nucl. Phys. B 493 (1997) 3, hep-ph/9610449;
R. Barbieri, L.J. Hall, A. Romanino, Phys. Lett. B 401 (1997) 47, hep-ph/9702315.
[16] When we were completing our work, the following paper appeared, also remarking that TB lepton mixing and quark
masses can be reproduced by exploiting the group $T'$: P.D. Carr, P.H. Frampton, hep-ph/0701034.
[17] E. Ma, H. Sawanaka, M. Tanimoto, Phys. Lett. B 641 (2006) 301, hep-ph/0606103;
S.F. King, M. Malinsky, Phys. Lett. B 645 (2007) 351, hep-ph/0610250;
S. Morisi, M. Picariello, E. Torrente-Lujan, hep-ph/0702034.
[18] For an elementary introduction to the group $T'$, see [14]. See also A.D. Thomas, G.V. Wood, Group Tables, Shiva
Publishing Limited.
[19] W.-M. Yao, et al., Review of Particle Physics, J. Phys. G: Nucl. Part. Phys. 33 (1) (2006).
[20] R. Gatto, G. Sartori, M. Tonin, Phys. Lett. B 28 (1968) 128.
[21] R.G. Roberts, A. Romanino, G.G. Ross, L. Velasco-Sevilla, Nucl. Phys. B 615 (2001) 358, hep-ph/0104088;
Z.z. Xing, H. Zhang, J. Phys. G 30 (2004) 129, hep-ph/0309112;
H.D. Kim, S. Raby, L. Schradin, Phys. Rev. D 69 (2004) 092002, hep-ph/0401169;
I. Masina, C.A. Savoy, Nucl. Phys. B 755 (2006) 1, hep-ph/0603101;
G.C. Branco, M.N. Rebelo, J.I. Silva-Marcos, hep-ph/0612252.
Detecting the dark matter annihilation at the ground EAS detectors
X.-J. Bi a,*, Y.-Q. Guo a, H.-B. Hu a, X. Zhang b
a Key Laboratory of Particle Astrophysics, IHEP, Chinese Academy of Sciences, Beijing 100049, PR China
b Theoretical Physics Division, IHEP, Chinese Academy of Sciences, Beijing 100049, PR China
Received 27 November 2006; received in revised form 3 April 2007; accepted 11 April 2007
Abstract
In this paper we study the possibility of detecting gamma rays from dark matter annihilation in the subhalos of the Milky Way with ground-based EAS detectors, within the framework of the minimal supersymmetric Standard Model. Based on Monte Carlo simulation we also study the properties of two specific EAS detectors, ARGO and HAWC, and the sensitivities of these detectors to dark matter annihilation. We find that ground-based EAS detectors have the possibility to observe such signals. Conversely, if no signal is observed, we give constraints on the supersymmetric parameter space, which however depend on the subhalo properties.
1. Introduction
The existence of cosmological dark matter has been established by a multitude of observations. The evidence comes mainly from the gravitational effects of the dark matter component, such as the observations of the rotation curves in spiral galaxies and velocity dispersion in elliptical galaxies, and the X-ray emission and peculiar velocities of galaxies in clusters of galaxies, all indicating much steeper gravitational potentials than those inferred from the luminous matter. The primordial nucleosynthesis and cosmic microwave background measurements constrain the baryonic component to be about 4% of the critical density, while the total amount of the clumpy matter is about 30% of the critical density. Therefore most of the dark matter is of non-baryonic origin.
E-mail address: bixj@mail.ihep.ac.cn (X.-J. Bi).
doi:10.1016/j.nuclphysb.2007.04.013
X.-J. Bi et al. / Nuclear Physics B 775 (2007) 143–161
The nature of the non-baryonic dark matter is one of the most outstanding puzzles in particle physics and cosmology. However, the gravitational effects do not shed light on the solution of this problem. Even so, some hints can still be obtained on the nature of dark matter. The simulation of structure formation requires the existence of dark matter and favors cold dark matter (CDM), that is, dark matter particles that are non-relativistic when they freeze out of the thermal bath in the early universe. The CDM nature rules out the neutrino as the dominant component of dark matter, since neutrinos are hot (relativistic) when they freeze out at a temperature of about 1 MeV. The precise measurement of the abundance of the dark matter component also constrains the nature of dark matter by requiring a natural explanation of the measured density. The three-year WMAP data, combined with recent observational results from other experiments, give the abundance of CDM as $\Omega_{\rm CDM} h^2 = 0.109^{+0.003}_{-0.006}$ [1]. The small error strongly constrains dark matter models.
All the candidates of non-baryonic dark matter require physics beyond the Standard Model of particle physics. Among the large number of candidates, the most attractive scenario involves weakly interacting massive particles (WIMPs). An appealing idea is that the WIMPs form thermal relics of the early universe and naturally give rise to a relic abundance in the range of the observed values when both the interaction strength and the masses are at the weak scale. The WIMPs are also well motivated theoretically by physics beyond the Standard Model that solves the hierarchy problem between the weak scale and the Planck scale. In particular, the minimal supersymmetric extension of the Standard Model (MSSM) provides an excellent WIMP candidate as the lightest supersymmetric particle, usually the neutralino, which is stable due to R-parity conservation [2]. The cosmological constraints on the supersymmetric (SUSY) parameter space have been extensively studied in Ref. [3].
Another appealing aspect of WIMPs is that they can be detected in presently running or proposed experiments, either directly by measuring the recoil energy when a WIMP scatters off the detector nuclei [4] or indirectly by observing the annihilation products of the WIMPs, such as antiprotons, positrons, $\gamma$-rays or neutrinos [5]. The WIMPs may also be produced at the next generation colliders, which is the most direct way to resolve the nature of the dark matter particles. The direct and indirect detection of dark matter particles are viable and complementary to the collider studies in further constraining the nature of dark matter. WIMP annihilation provides a viable explanation for exotic signals observed in cosmic ray experiments, such as the GeV excess of the Galactic diffuse $\gamma$-rays observed by EGRET [6], the bump at about 10 GeV in the positron ratio measured by HEAT [7] and the TeV $\gamma$-ray emission at the Galactic center observed by HESS [8] and CANGAROO II [9].
The rate of WIMP annihilation is proportional to the square of the number density of the dark matter particles. Therefore the searches for annihilation signals should aim at regions with high matter densities, such as the galactic center [10] or the nearby subhalos [11,12]. The existence of a wealth of subhalos throughout galaxy halos is a generic prediction of the CDM paradigm of structure formation in the Universe. High resolution simulations show that in the CDM scenario the large scale structure forms hierarchically by continuous merging of smaller halos and, as the remnants of the merging process, about 10% to 15% of the total mass of the halo is in the form of subhalos [13–20]. At the center of the subhalos there are high mass densities and therefore they provide alternative sites for the search for WIMP annihilation products.
There are several advantages of detecting the $\gamma$-rays from the subhalos over detecting those from the Galactic Center (GC). First, subhalos produce clean annihilation signals, while the annihilation radiation from the GC is heavily contaminated by the baryonic processes associated with the central supermassive black hole (SMBH) and the supernova remnant Sgr A [21]. Furthermore, the dark matter density profile near the GC is complicated by the presence of baryonic matter, which leads to difficulties in making theoretical calculations. For example, the SMBH can either steepen or flatten the slope of the DM profile at the innermost center of the halo, depending on the evolution of the black hole [22]. For subhalos, the profile may simply follow the simulation results. Second, small subhalos form earlier and have larger concentration parameters, which leads to relatively greater annihilation fluxes. Third, the DM profile may not be universal, as shown in the simulations of Refs. [23,24]: smaller subhalos have steeper central cusps. In this case, if the GC is taken to follow the NFW profile and the subhalos the Moore profile, the $\gamma$-ray fluxes from the subhalos may even be greater than that from the GC. Fourth, according to the hierarchical formation of structures in the CDM scenario we expect that subhalos should contain their own smaller sub-subhalos, which can further enhance the annihilation flux. Sub-subhalos have been observed in numerical simulations, such as in Ref. [25]. Finally, the environmental trend seems to make the subhalos more concentrated [26]. However, these effects need further study by more precise simulations.
The possibility of detecting a dark matter annihilation signal from the Galactic Center (GC) has been extensively studied [10]. The high energy $\gamma$-rays from the GC observed by HESS and CANGAROO II have been explained as a possible signal of dark matter annihilation [27]. The subhalos are approximately uniformly distributed in the Milky Way dark halo and provide potential $\gamma$-ray sources which may be observed in the next generation experiments. However, the positions of the subhalos cannot be predicted by numerical simulations, so the search for $\gamma$-rays from the subhalos needs detectors with a large field of view. Unless the position of a nearby subhalo is already known, Čerenkov detectors cannot perform a blind search because of their narrow field of view ($\sim 5^\circ$), despite their high sensitivities. The satellite-based experiments, such as GLAST [28] or AMS [29], usually have a large field of view. The possibility of detecting $\gamma$-rays from the subhalos with GLAST has been studied [12,30]. However, the satellite-based instruments have small effective areas of the order of $\sim 1$ m$^2$, which limits their ability to detect low $\gamma$-ray fluxes. Therefore the ground-based experiments, which can have very large effective areas, are complementary to the satellite-based experiments.
In this paper, we explore the possibility of detecting $\gamma$-ray signals from subhalos with ground-based EAS detectors, such as ARGO [31] and the next generation All-Sky VHE Gamma-Ray Telescope HAWC [32]. These detectors have a large field of view and a large effective area. In the case that the neutralino is heavy, so that it annihilates into $\gamma$-rays with high energy and low intensity, the ground-based EAS detectors can even be superior to the satellite-based experiments.
In the next section we first describe our model for the subhalo distribution and the particle nature of dark matter. The fluxes of gamma rays from the subhalos are then calculated. In Section 3, we discuss how the ground-based EAS detectors can constrain the SUSY models, the properties of two EAS detectors, and their sensitivity to dark matter detection. We finally give a summary and conclusion in Section 4.
2. Model description
The flux of gamma rays from the neutralino annihilation in the subhalos is given by
$$\Phi(E) = \phi(E)\,\frac{\langle\sigma v\rangle}{2m^2}\int dV\,\frac{\rho^2}{4\pi d^2} = \frac{\phi(E)}{4\pi}\,\frac{\langle\sigma v\rangle}{2m^2}\times\frac{1}{d^2}\int_0^{\hat r} 4\pi r^2 \rho^2(r)\,dr, \tag{1}$$
where $\phi(E)$ is the differential flux at energy $E$ from a single annihilation in units of 1 particle GeV$^{-1}$, $m$ is the mass of the dark matter particle, $d$ is the distance from the detector to the source, and $\hat r = \min(R_{\rm sub}, r_{\Delta\theta})$ is the smaller of the subhalo radius $R_{\rm sub}$ and the radius $r_{\Delta\theta}$ subtended at the distance $d$ by the angular resolution $\Delta\theta$ of the detector. We notice that the integration in Eq. (1) depends only on the distribution of the dark matter $\rho(r)$, taken in a spherically averaged form, which is determined by numerical simulation or by observations and has no relation to the particle nature of the dark matter. We call this factor the cosmological factor, and the first part of Eq. (1) the particle factor, which is exclusively determined by the particle nature of the dark matter, such as its mass, strength of interaction and so on.
The cosmological factor in Eq. (1) is determined by the position, mass and interior profile of
the subhalo. We adopt the N-body simulation results to calculate the cosmological factor.
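As an illustration of the cosmological factor in Eq. (1), the following minimal sketch evaluates $\frac{1}{d^2}\int_0^{\hat r} 4\pi r^2\rho^2(r)\,dr$ numerically for an NFW-type density and compares it with the closed form $\frac{4\pi\rho_s^2 r_s^3}{3d^2}\left[1-(1+\hat r/r_s)^{-3}\right]$ that follows from that profile. The function names, the arbitrary units, the log-spaced midpoint rule and the small inner cutoff are assumptions made for this example only, not the procedure used in the paper.

```python
import math

def nfw_density(r, rho_s, r_s):
    """NFW-type density rho_s / [(r/r_s) (1 + r/r_s)^2] (arbitrary units)."""
    x = r / r_s
    return rho_s / (x * (1.0 + x) ** 2)

def cosmological_factor(density, r_hat, d, n_steps=20000, r_min_frac=1e-6):
    """(1/d^2) * int_0^r_hat 4 pi r^2 rho(r)^2 dr via a midpoint rule in ln r.

    The integral starts at r_hat * r_min_frac (an illustrative cutoff); for
    the NFW cusp the neglected inner part is negligible.
    """
    lo, hi = math.log(r_hat * r_min_frac), math.log(r_hat)
    h = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        r = math.exp(lo + (i + 0.5) * h)
        # dr = r d(ln r), hence the extra factor r
        total += 4.0 * math.pi * r**2 * density(r) ** 2 * r * h
    return total / d**2

def nfw_cosmological_factor_exact(rho_s, r_s, r_hat, d):
    """Closed form of the same integral for the NFW-type density."""
    x = r_hat / r_s
    return 4.0 * math.pi * rho_s**2 * r_s**3 * (1.0 - (1.0 + x) ** -3) / (3.0 * d**2)

def annihilation_flux(phi_E, sigma_v, m, cosmo_factor):
    """Eq. (1): particle factor times cosmological factor."""
    return phi_E / (4.0 * math.pi) * sigma_v / (2.0 * m**2) * cosmo_factor
```

Because the two factors multiply, the cosmological factor can be computed once per subhalo and reused for every particle model.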
2.1. Distribution of the subhalos
N-body simulation is widely adopted to investigate the spatial distribution and mass function of substructures in the host halo. The results show that the radial distribution of substructures is generally shallower than the density profile of the smooth background, due to the tidal disruption of substructures, which is most effective near the galactic center [33]. The relative number density of subhalos can be approximately given by an isothermal profile with a core [33]
$$n(r) = 2 n_H \left[1 + (r/r_H)^2\right]^{-1}, \tag{2}$$
where $n_H$ is the relative number density at the scale radius $r_H$, with $r_H$ being about 0.14 times the halo virial radius, $r_H = 0.14\, r_{\rm vir}$. The result given above agrees well with that of another recent simulation by Gao et al. [34].
Simulations show that the differential mass function of substructures has an approximate power law distribution, $dn/dm \propto m^{-\alpha}$. In Ref. [33] both the cluster and galaxy substructure cumulative mass functions are found to be an $m^{-1}$ power law, $n_{\rm sub}(m_{\rm sub} > m) \propto m^{-1}$, with no dependence on the mass of the parent halo. A slight difference is found in a recent simulation by Gao et al. [34], in which the cluster substructure is more abundant than galaxy substructure, since the cluster forms later and more substructures have survived the tidal disruption. The mass functions for both scales are well fitted by $dn/dm \propto m^{-1.9}$. Taking the power index of the differential mass function smaller than 2 makes the fraction of the total mass enclosed in subhalos insensitive to the mass of the minimal subhalo we take. The mass fraction of subhalos estimated in the literature is between 5 and 20 percent [16,17,35]. In this work we will always take the differential index to be 1.9 and the mass fraction of substructures to be 10 percent.
We then get the number density of substructures with mass $m$ at the position $r$ from the galactic center
$$n(m, r) = n_0 \left(\frac{m}{M_{\rm vir}}\right)^{-1.9} \left[1 + (r/r_H)^2\right]^{-1}, \tag{3}$$
where $M_{\rm vir}$ is the virial mass of the MW and $n_0$ is the normalization factor determined by requiring that the total mass of substructures be 10 percent of the virial mass of the MW. A population of substructures within the virial radius of the MW is then realized statistically according to the probability distribution of Eq. (3). The masses of the substructures are taken randomly between $M_{\min} = 10^6 M_\odot$, which is the lowest substructure mass the present simulations can resolve [36], and the maximal mass $M_{\max}$. The maximal mass of substructures is taken to be $0.01 M_{\rm vir}$, since the MW halo does not show recent mergers of satellites with masses larger than $\sim 2\times 10^{10} M_\odot$. The $\gamma$-ray flux is quite insensitive to the minimum subhalo mass since the flux from a single subhalo scales with its mass [12,37].
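A statistical realization of the mass part of Eq. (3) can be sketched with inverse-transform sampling of $dn/dm \propto m^{-1.9}$ between $M_{\min}$ and $M_{\max}$. The helper names below are hypothetical and the sampling method is an assumption made for illustration; the paper does not specify how the random masses are drawn. The second helper evaluates the relative mass contained in a mass interval, which shows how weakly the total subhalo mass depends on the lower cutoff when the index is below 2.

```python
import random

def sample_subhalo_mass(m_min, m_max, index=1.9, rng=random):
    """Draw one mass from dn/dm ∝ m^(-index) on [m_min, m_max] (index != 1)."""
    a = 1.0 - index                      # exponent of the cumulative integral
    u = rng.random()                     # uniform in [0, 1)
    return (m_min**a + u * (m_max**a - m_min**a)) ** (1.0 / a)

def relative_mass_in_range(m_lo, m_hi, index=1.9):
    """Integral of m * m^(-index) dm from m_lo to m_hi (unnormalized, index != 2)."""
    b = 2.0 - index
    return (m_hi**b - m_lo**b) / b

if __name__ == "__main__":
    rng = random.Random(0)
    masses = [sample_subhalo_mass(1e6, 1e10, rng=rng) for _ in range(5)]
    print(masses)
    # Lowering the cutoff by six orders of magnitude changes the total mass
    # in subhalos by a factor of about 1.5 only.
    print(relative_mass_in_range(1.0, 1e10) / relative_mass_in_range(1e6, 1e10))
```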
However, due to the finite resolution of the N-body simulations the distribution in Eq. (3) is an extrapolation of the subhalo distribution at large radius. The formula underestimates the tidal effect, which destroys most substructures near the GC. We take the tidal effects into account under the tidal approximation, which assumes that all mass beyond the tidal radius is lost in a single orbit while the density profile inside the tidal radius remains intact.
The tidal radius of the substructure is defined as the radius at which the tidal force of the host exceeds the self gravity of the substructure. Assuming that both the host and the substructure gravitational potentials are given by point masses and considering the centrifugal force experienced by the substructure, the tidal radius at the Jacobi limit is given by [38]
$$r_{\rm tid} = r_c \left(\frac{m}{3M(<r_c)}\right)^{1/3}, \tag{4}$$
where $r_c$ is the distance of the substructure from the GC and $M(<r_c)$ refers to the mass within $r_c$.
The substructures with $r_{\rm tid} \lesssim r_s$ will be disrupted. The mass of a substructure is also recalculated by subtracting the mass beyond the tidal radius. After taking the tidal effects into account we find that the substructures near the GC are disrupted completely. Substructures with the NFW profile can survive closer to the GC than those with the Moore profile. This is because the NFW profile has a smaller $r_s$.
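The disruption criterion built on Eq. (4) can be written as a short sketch. The function names and the strict comparison used for survival are illustrative assumptions; units only need to be consistent between the two radii and between the two masses.

```python
def tidal_radius(r_c, m_sub, m_host_enclosed):
    """Eq. (4): r_tid = r_c * (m / (3 M(<r_c)))^(1/3)."""
    return r_c * (m_sub / (3.0 * m_host_enclosed)) ** (1.0 / 3.0)

def survives_tides(r_c, m_sub, m_host_enclosed, r_s):
    """A substructure is kept only if its tidal radius exceeds its scale radius."""
    return tidal_radius(r_c, m_sub, m_host_enclosed) > r_s

if __name__ == "__main__":
    # Example: subhalo of 3e8 (solar masses) at 10 (kpc) with 1e11 enclosed
    r_tid = tidal_radius(10.0, 3e8, 1e11)
    print(r_tid, survives_tides(10.0, 3e8, 1e11, r_s=0.5))
```

For a fixed subhalo mass, a profile with a smaller $r_s$ passes this test at smaller $r_c$, which is the reason given above for NFW substructures surviving closer to the GC.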
2.2. Concentration parameter
We will adopt both the NFW and Moore profiles of the dark matter distribution in our study. The NFW profile was first proposed by Navarro, Frenk, and White [39] and is supported by recent studies [40], which find that the DM profile of isolated and relaxed halos can be described by a universal form
$$\rho_{\rm DM}(r) = \frac{\rho_s}{(r/r_s)(1 + r/r_s)^2}, \tag{5}$$
where $\rho_s$ and $r_s$ are the scale density and scale radius respectively. The two free parameters of the profile can be determined by the measurements of the virial mass of the halo and the concentration parameter determined by simulations. The concentration parameter is defined as
$$c = \frac{r_{\rm vir}}{r_{-2}}, \tag{6}$$
where $r_{\rm vir}$ is the virial radius of the halo and $r_{-2}$ is the radius at which the effective logarithmic slope of the profile is $-2$, i.e., $\frac{d}{dr}\left(r^2\rho(r)\right)\big|_{r=r_{-2}} = 0$. For the NFW profile we have $r_s = r_{-2}$. The concentration parameter reflects how the DM is concentrated at the center.
Moore et al. gave another form of the DM profile [41] to fit their numerical simulation,
$$\rho_{\rm DM}(r) = \frac{\rho_s}{(r/r_s)^{1.5}\left(1 + (r/r_s)^{1.5}\right)}, \tag{7}$$
which has the same behavior at large radius as the NFW profile, while it has a steeper central cusp, $\rho(r) \propto r^{-1.5}$ for small $r$, than the NFW profile. The index of the central cusp at about 1.5 is also favored by subsequent higher resolution simulations [42]. For the Moore profile we have $r_s = r_{-2}/0.63$.
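The relations $r_s = r_{-2}$ for the NFW profile and $r_s = r_{-2}/0.63$ for the Moore profile can be checked numerically from Eqs. (5) and (7). The sketch below works with the dimensionless radius $x = r/r_s$ and sets $\rho_s = 1$; the function names and the finite-difference step are choices made for this example.

```python
import math

def nfw(x):
    """Eq. (5) in units of rho_s, with x = r / r_s."""
    return 1.0 / (x * (1.0 + x) ** 2)

def moore(x):
    """Eq. (7) in units of rho_s, with x = r / r_s."""
    return 1.0 / (x**1.5 * (1.0 + x**1.5))

def log_slope(profile, x, eps=1e-5):
    """d ln(rho) / d ln(r) by a central difference in ln r."""
    lo, hi = x * math.exp(-eps), x * math.exp(eps)
    return (math.log(profile(hi)) - math.log(profile(lo))) / (2.0 * eps)

if __name__ == "__main__":
    x_moore = 0.5 ** (2.0 / 3.0)          # analytic solution of slope = -2
    print(log_slope(nfw, 1.0))            # slope at r = r_s for NFW
    print(x_moore, log_slope(moore, x_moore))
    print(log_slope(nfw, 1e-6), log_slope(moore, 1e-6))   # inner cusps
    print(log_slope(nfw, 1e6), log_slope(moore, 1e6))     # outer slopes
```

Both profiles fall as $r^{-3}$ at large radius, while the inner slopes are $-1$ and $-1.5$ respectively, in line with the statements above.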
Fig. 1. Concentration parameter as a function of the virial mass calculated according to the Bullock model [26]. The model parameters are taken as $F = 0.015$ and $K = 4.4$. The cosmology parameters are taken as $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$, $\Omega_B h^2 = 0.02$, $h = 0.7$, $\sigma_8 = 0.9$ with three generations of massless neutrinos and a standard scale invariant primordial spectrum. Both the median and the $\pm 1\sigma$ values of the concentration parameters are plotted.
The concentration parameter is obtained from simulations. In a semi-analytic model based on their simulation results, Bullock et al. [26] found that the concentration of a halo is strongly correlated with the formation epoch of the halo. At an epoch of redshift $z_c$ a typical collapsing mass $M_*(z_c)$ is defined by $\sigma[M_*(z)] = \delta_{sc}(z)$, where $\sigma[M_*(z)]$ is the linear rms density fluctuation on the comoving scale encompassing a mass $M_*$, and $\delta_{sc}$ is the critical overdensity for collapse in the spherical collapse model. The model assumes that the typical collapsing mass is related to a fixed fraction of the virial mass of a halo, $M_*(z_c) = F M_{\rm vir}$. The concentration parameter of a halo with virial mass $M_{\rm vir}$ at redshift $z$ is then determined as $c_{\rm vir}(M_{\rm vir}, z) = K\frac{1+z_c}{1+z}$. Both $F$ and $K$ are constants fitted to the numerical simulations. A smaller $M_{\rm vir}$ corresponds to a smaller collapsing mass and an earlier collapse epoch, when the Universe is denser, and therefore to a larger concentration parameter. Fig. 1 plots the concentration parameter at $z = 0$ as a function of the virial mass of a halo according to the Bullock model [26].
From Fig. 1 we can see that for masses between $10^6 M_\odot$ and $10^{10} M_\odot$ an empirical power-law relation between $c_{\rm vir}$ and $M_{\rm vir}$ reflects the simulation result accurately. We expect this power-law relation to be followed very well, since small halos form early, at the epoch when the Universe is dominated by matter with an approximately power-law power spectrum of fluctuations [26]. However, when we fit the formula to other recent simulation results in the literature we find quite large differences, especially between subhalos and distinct small halos. We adopt these results to calculate the density profile of the substructure and furthermore the $\gamma$-ray flux from the substructure, which have large uncertainties, see Fig. 2. We find that the concentration parameter is the most sensitive parameter in determining the $\gamma$-ray flux.
By realizing one hundred MW-sized halos populated with subhalos according to Eq. (3) we calculate the average gamma ray intensities from the MW subhalos [43]. Fig. 2 gives the cumulative number of subhalos emitting $\gamma$-rays with the integrated flux above 100 GeV greater than a value $\Phi$. In the calculation we keep the particle factor fixed so that the gamma ray flux from the Galactic Center is just below the experimental limit, i.e., $\Phi = 10^{-9}$ cm$^{-2}$ s$^{-1}$ for the Moore profile. In the upper panel we plot the results for the Moore profile while the lower panel is for the NFW profile. The curves are obtained by calculating the density profile of subhalos according to different authors' N-body simulation results, where "subhalo" denotes the simulation results given in Ref. [26] for real subhalos in a large smooth dark halo (the dense matter environment); "Reed et al." refers to the simulation results given by Reed et al. [23]; "Bullock et al." uses the median $c_{\rm vir}$–$M_{\rm vir}$ relation for distinct halos of the Bullock model given in Ref. [26]; "ENS" refers to the result by Eke, Navarro and Steinmetz [44] for the $\Lambda$CDM model with $\sigma_8 = 0.9$. The latter three models actually do not describe real subhalos. Instead they describe distinct halos with small masses. We therefore expect the studies below that refer to real subhalos to give somewhat more realistic results. From Fig. 2 we can easily read off the number of expected detectable subhalos if the sensitivity of a detector is given for the same threshold energy and angular resolution adopted here.
Fig. 2. The cumulative number of subhalos as a function of the integrated $\gamma$-ray flux, $n(>\Phi)$, for the Moore profile (upper panel) and the NFW profile (lower panel). The results are given within the zenith angle of $60^\circ$. The curves represent the results according to different simulations as explained in the text. These curves give the number of subhalos which emit $\gamma$-rays with the integrated flux above $\Phi$.
2.3. SUSY parameter
We now turn to the particle factor in Eq. (1). We will work in the framework of the MSSM, the low energy effective description of the fundamental theory at the electroweak scale. By doing a random scan we find how the parameter space is constrained by the ground-based EAS detectors.
For the R-parity conserving MSSM, the lightest supersymmetric particle (LSP), generally the lightest neutralino, is stable and an ideal candidate for dark matter.
However, there are more than one hundred free SUSY breaking parameters even for the R-parity conserving MSSM. A general practice is to assume some relations between the parameters and so greatly reduce the number of free parameters. For the processes related to dark matter production and annihilation, only seven parameters are relevant under some simplifying assumptions, i.e., the Higgsino mass parameter $\mu$, the wino mass parameter $M_2$, the mass of the CP-odd Higgs boson $m_A$, the ratio of the Higgs vacuum expectation values $\tan\beta$, the scalar fermion mass parameter $m_{\tilde f}$, and the trilinear soft breaking parameters $A_t$ and $A_b$. To determine the low energy spectrum of the SUSY particles and coupling vertices, the following assumptions have been made: all the sfermions have a common soft-breaking mass parameter $m_{\tilde f}$; all trilinear parameters are zero except those of the third family; the bino and wino masses obey the relation $M_1 = \frac{5}{3}\tan^2\theta_W\, M_2$, coming from the unification of the gaugino masses at the grand unification scale.
We perform a numerical random scan of the 7-dimensional supersymmetric parameter space using the package DarkSUSY [45]. The ranges of the parameters are as follows: $50\ {\rm GeV} < |\mu|, M_2, M_A, m_{\tilde f} < 10\ {\rm TeV}$, $1.1 < \tan\beta < 55$, $-3 m_{\tilde q} < A_t, A_b < 3 m_{\tilde q}$, ${\rm sign}(\mu) = \pm 1$. The parameter space is constrained by theoretical consistency requirements, such as the correct vacuum breaking pattern, the neutralino being the LSP and so on. The accelerator data constrain the parameters further through the spectrum requirement, the invisible Z-boson width, the branching ratio of $b \to s\gamma$ and the muon magnetic moment.
The constraint from cosmology is also taken into account by requiring the relic abundance of the neutralino to satisfy $0 < \Omega_\chi h^2 < 0.124$, where the upper limit corresponds to the $5\sigma$ upper bound from the cosmological observations. When the relic abundance of the neutralino is smaller than a minimal value we can assume two different cases. One is that the neutralino relic is produced thermally and represents a subdominant dark matter component. In this case we rescale the galactic neutralino density as $\rho(r) \to \xi\rho(r)$ with $\xi = \Omega_\chi h^2/(\Omega h^2)_{\min}$. We take $(\Omega h^2)_{\min} = 0.079$, the $5\sigma$ lower bound of the CDM abundance [1]. The effect of coannihilation between the fermions is taken into account when calculating the relic density numerically. The other case assumes that the neutralino relic is determined by a nonthermal mechanism [46]. In this case the dark matter is made up entirely of neutralinos.
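The two treatments of an under-abundant neutralino can be summarized in a small sketch. Since the annihilation rate is proportional to the square of the density, a density rescaling by $\xi$ changes the flux by $\xi^2$. The function names are hypothetical, and capping $\xi$ at 1 for abundances above the minimal value is an assumption of this example, consistent with the rescaling being applied only below that value.

```python
OMEGA_H2_MIN = 0.079   # 5 sigma lower bound of the CDM abundance quoted in the text

def density_rescaling(omega_h2, thermal=True, omega_h2_min=OMEGA_H2_MIN):
    """Factor xi applied to rho(r).

    Thermal case: xi = Omega h^2 / (Omega h^2)_min when the relic abundance
    is below the minimal value, otherwise 1. Nonthermal case: always 1.
    """
    if not thermal or omega_h2 >= omega_h2_min:
        return 1.0
    return omega_h2 / omega_h2_min

def flux_rescaling(omega_h2, thermal=True, omega_h2_min=OMEGA_H2_MIN):
    """Annihilation flux scales with the density squared, hence xi^2."""
    return density_rescaling(omega_h2, thermal, omega_h2_min) ** 2

if __name__ == "__main__":
    print(flux_rescaling(0.0395))                 # thermal, half the minimal abundance
    print(flux_rescaling(0.0395, thermal=False))  # nonthermal production
```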
The $\gamma$-rays from the neutralino annihilation arise mainly in the decay of the neutral pions produced in the fragmentation processes initiated by tree level final states: quarks, leptons and gauge bosons. The fragmentation and decay processes are simulated with the Pythia package [47] incorporated in DarkSUSY. We focus our calculation on the continuum $\gamma$-rays from the pion decays.
Fig. 2 shows the gamma ray emission from subhalos for a fixed particle factor. In the present work we scan the SUSY parameter space and study the variation of the $\gamma$-ray flux from the subhalos. We then explore how the SUSY parameter space is constrained by the ground-based EAS detectors when observing the $\gamma$-ray emission from subhalos.
3. Observation of neutralino annihilation from subhalos
In this section, we study the observation of $\gamma$-rays from neutralino annihilation in subhalos by the ground-based EAS detectors. We first show how the SUSY models can be constrained by EAS detectors in an experiment-independent way by assuming a sensitivity. Then we discuss two specific examples of such detectors, the ARGO and HAWC experiments, and their capability for dark matter detection.
At present two different techniques are adopted by the ground-based gamma ray detectors: the air Čerenkov telescopes (ACTs) and the extensive air shower (EAS) detectors. There has been great progress in improving the sensitivity of the ACTs in recent years. However, they have a narrow field of view and can only view a small region of the sky at any one time. The ACTs can also only operate on clear moonless nights, which limits their observation efficiency. On the other hand, the EAS detectors, such as the Tibet Array [48] and the Milagro observatory [49], can view the entire overhead sky and operate continuously. To search for unknown $\gamma$-ray sources, such as unknown AGNs or subhalos of the MW, EAS detectors with improved sensitivities are appropriate instruments.
The detectability of a signal is defined by the ratio of the signal events to the fluctuation of the background. Since the background follows Poisson statistics, its fluctuation has an amplitude proportional to $\sqrt{N_B}$. The significance of the detection is quantified by $\sigma = n_\gamma/\sqrt{N_B}$.
The signal events are given by
$$n_\gamma = \epsilon_{\Delta\Omega}\int_{E_{\rm th},\,\Delta\Omega}^{m} A_{\rm eff}(E,\theta)\,\Phi(E)\,dE\,d\Omega\,dT, \tag{8}$$
where $\epsilon_{\Delta\Omega} = 0.68$ is the fraction of signal events within the angular resolution of the instrument, and the integration runs over the energies from the threshold energy $E_{\rm th}$ to the cutoff of the spectrum at the neutralino mass $m$, over the solid angle $\Delta\Omega$ within the angular resolution of the instrument, and over the observation time. $\Phi(E)$ is the flux of $\gamma$-rays from DM annihilation. The effective area $A_{\rm eff}$ is a function of energy and zenith angle $\theta$.
The corresponding expression for the background is similar to Eq. (8), with the signal spectrum replaced by the background spectrum and the effective area replaced by that for the cosmic ray background. The background includes contributions from the hadronic and electronic cosmic rays and the Galactic and extragalactic $\gamma$-ray emissions, which are given in [43]. Since the nuclei background dominates the other contributions, we only consider the nuclei background in this work.
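The significance defined above lends itself to a short worked sketch. With constant signal and background rates both counts grow linearly with the observation time $T$, so the significance grows as $\sqrt{T}$; the helper names and the constant-rate assumption are introduced for this example only.

```python
import math

def significance(n_signal, n_background):
    """sigma = n_gamma / sqrt(N_B) for a Poisson-dominated background."""
    return n_signal / math.sqrt(n_background)

def min_signal_events(n_background, n_sigma=5.0):
    """Signal counts needed to reach n_sigma above the background fluctuation."""
    return n_sigma * math.sqrt(n_background)

def time_for_significance(signal_rate, background_rate, n_sigma=5.0):
    """Observation time at which constant rates reach n_sigma.

    sigma(T) = signal_rate * T / sqrt(background_rate * T), so
    T = n_sigma^2 * background_rate / signal_rate^2.
    """
    return n_sigma**2 * background_rate / signal_rate**2

if __name__ == "__main__":
    print(significance(500.0, 1.0e4))
    print(min_signal_events(1.0e4))
    print(time_for_significance(1.0, 100.0))
```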
3.1. Constraining the SUSY model by EAS detectors: a general discussion
In this subsection we give a general discussion of how the EAS detectors can constrain the SUSY parameter space. From Fig. 2 we know the cumulative numbers of $\gamma$-ray sources generated by dark matter annihilation in subhalos for different subhalo models. The result is given for a specific SUSY model. Once the sensitivity of an EAS detector is known we can predict how many $\gamma$-ray sources are expected to be detected by the detector and how the number of $\gamma$-ray sources varies with the SUSY model, thanks to the fact, see Eq. (1), that the particle factor and the cosmological factor are separated. Conversely, if no such gamma ray sources are found we can constrain the strength of the $\gamma$-ray sources and consequently constrain the parameter space of SUSY.
Generally the ground-based detectors have a threshold energy of the order of 100 GeV. We assume below that all the EAS detectors have such a threshold energy. We further assume that these detectors have large enough effective areas and sensitivities of $I_{\rm th} = 10^{-11}$, $10^{-12}$, $10^{-13}$, $10^{-14}$ photons cm$^{-2}$ s$^{-1}$ respectively, which are comparable to the sensitivities of present ACTs. The sensitivity $I_{\rm th}$ is defined as the minimal integrated $\gamma$-ray flux above the threshold energy that can be observed by the detector with high significance, for example $5\sigma$, in a finite observation time, such as 1–10 years.
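Because the particle factor and the cosmological factor in Eq. (1) are separated, the subhalo fluxes computed for one reference SUSY model can be rescaled to any other model by the ratio of particle factors and then compared with $I_{\rm th}$. The sketch below illustrates this counting; the function name and the example flux values are hypothetical and are not taken from the paper.

```python
def count_detectable(reference_fluxes, particle_factor_ratio, i_th):
    """Number of subhalos whose rescaled integrated flux exceeds I_th.

    reference_fluxes: integrated fluxes (photons cm^-2 s^-1) for a reference model.
    particle_factor_ratio: particle factor of the tested model divided by
    that of the reference model.
    """
    return sum(1 for f in reference_fluxes if f * particle_factor_ratio > i_th)

if __name__ == "__main__":
    fluxes = [1e-10, 5e-12, 2e-13]   # made-up example values
    for ratio in (0.1, 1.0, 10.0):
        print(ratio, count_detectable(fluxes, ratio, i_th=1e-12))
```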
In Fig. 3 we show how the SUSY parameter space is constrained by EAS detectors with different sensitivities. The points in the figure represent different SUSY models obtained when scanning the SUSY parameter space. Models above the curves predict that more than one such $\gamma$-ray source should be detected at the significance level used in defining $I_{\rm th}$. For the subhalo model we adopt the analytic model by Bullock et al. [26]. In the upper panel we assume that the neutralino is produced thermally, while the lower panel assumes nonthermal production. Both the NFW and Moore profiles are shown in the same figure. We can see that the constraints assuming the Moore profile are similar to those of the NFW case with a sensitivity one order of magnitude higher. Similarly, assuming nonthermal production also leads to stricter constraints.
In Fig. 4 we show how the constraints depend on the subhalo models. We assume a detector with a sensitivity of $I_{\rm th} = 10^{-12}$ photons cm$^{-2}$ s$^{-1}$, and dark matter that is produced nonthermally and follows the Moore profile. We notice that for real subhalos the constraint is more severe than in the models for distinct small halos.
From Figs. 3 and 4 we see that the EAS detectors can indeed constrain the SUSY models via the observation of $\gamma$-rays from dark matter annihilation in the subhalos, by virtue of their large field of view and high duty cycle. Considering that GLAST is superior to the EAS detectors only at low energies, we expect the ground EAS detectors and GLAST to be complementary to each other in the detection of dark matter annihilation in subhalos.
3.2. The ARGO and HAWC experiments
The ARGO-YBJ experiment, located at YangBaJing (90.522° east, 30.102° north, 4300 m a.s.l.) in Tibet, China, is a ground-based telescope optimized for the detection of small-size air showers. The energy threshold of the detector is designed to be about 100 GeV. The detector consists of a single layer of RPCs floored in a carpet structure covering an area of $10^4$ m$^2$. The detector is under construction and the central carpet has been completed. ARGO will begin stable data taking soon after.
Fig. 3. Constraints on the SUSY parameter space by EAS detectors with different sensitivities $I_{\rm th}$. The upper panel gives the constraints assuming that the dark matter is produced thermally while the lower is for nonthermal production. Both NFW and Moore profiles are adopted for the subhalos.
Fig. 4. Constraints on the SUSY parameter space by EAS detectors with sensitivity $I_{\rm th} = 10^{-12}$ photons cm$^{-2}$ s$^{-1}$ for different subhalo models. Nonthermal production and Moore profile are adopted for the subhalos.
Fig. 5. The effective area of the ARGO array to primary gamma rays and nuclei background as a function of energy. $N_{\rm pad} \ge 20$ was adopted in the simulation.
The effective area of the detector characterizes the power of the detector in recording the number of events. The effective area of the ARGO array is determined by a full Monte Carlo simulation. We simulated $N$ showers uniformly distributed over a large sampling area $A_s$ including the detector and selected those which satisfy the trigger conditions. The effective area is defined as

$$ A_{\rm eff} = \frac{n}{N}\,A_s, \tag{9} $$

where $N$ is the total number of sampled events and $n$ is the number of events satisfying the trigger conditions. In our simulation, the software package CORSIKA [50] is used to simulate the shower development of the gamma-ray signals and the nuclei background in the atmosphere, and ARGOG, based on GEANT3 [51], is used for the response of the detector to the EAS events. To get better reconstructed events, we require the number of fired pads $N_{\rm pad} \ge 20$ and the zenith angle to be smaller than 45°. A sampling area of $A_s = 350\ {\rm m} \times 350\ {\rm m}$, which is large enough for the ARGO array of $111.26\ {\rm m} \times 99.04\ {\rm m}$ under the trigger condition $N_{\rm pad} \ge 20$, was used to enclose the ARGO array at its center. In Fig. 5 we give a zenith-angle-averaged effective area, as if the source were spread uniformly over the sky between zenith angles of 0 and 45 degrees. We notice that above the threshold energy the effective area increases rapidly and reaches about 5000 m$^2$ for TeV gamma rays. At the same time, the simulation also shows that nuclei have a lower trigger efficiency than photons, leading to a suppression of the background.
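The counting definition of Eq. (9) can be illustrated with a minimal Python sketch. The function names and the toy trigger used here (a shower triggers whenever its core falls inside the carpet rectangle) are illustrative assumptions; they are a stand-in for, not a reproduction of, the CORSIKA/ARGOG simulation chain and the $N_{\rm pad}$ trigger condition.

```python
import random


def effective_area(n_triggered, n_thrown, sampling_area_m2):
    """Eq. (9): A_eff = (n / N) * A_s."""
    if n_thrown <= 0:
        raise ValueError("n_thrown must be positive")
    return n_triggered / n_thrown * sampling_area_m2


def toy_effective_area(n_thrown, side_m=350.0, carpet_m=(111.26, 99.04), seed=1):
    """Throw shower cores uniformly over a side_m x side_m square centred on
    the array and count those passing a toy trigger (core inside the carpet).

    The trigger model is a hypothetical placeholder for a full detector
    simulation; only the bookkeeping of Eq. (9) is meant to be illustrated.
    """
    rng = random.Random(seed)
    half_x, half_y = carpet_m[0] / 2.0, carpet_m[1] / 2.0
    n_triggered = 0
    for _ in range(n_thrown):
        x = rng.uniform(-side_m / 2.0, side_m / 2.0)
        y = rng.uniform(-side_m / 2.0, side_m / 2.0)
        if abs(x) <= half_x and abs(y) <= half_y:
            n_triggered += 1
    return effective_area(n_triggered, n_thrown, side_m * side_m)


if __name__ == "__main__":
    # With this toy trigger the estimate fluctuates around the geometric
    # carpet area, 111.26 m x 99.04 m.
    print(toy_effective_area(200_000))
```

In a realistic simulation the trigger probability depends on the primary energy and zenith angle, so the same counting would be repeated in energy bins and averaged over the zenith range.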
While the Tibet group demonstrated the importance of a high-altitude site for the EAS array, the Milagro observatory has pioneered the use of a large-area water Cerenkov detector for EAS detection and proven the efficacy of the technique and its ability to reject the nuclei background [52]. Combining these advantages, a next-generation high altitude water Cerenkov (HAWC) detector for VHE gamma-ray detection has been proposed recently [32]. The HAWC detector, with all-sky and high-duty-factor capabilities but with a substantially lower energy threshold and a greatly increased sensitivity, would dramatically improve our knowledge of the VHE gamma universe.
With an altitude above 4500 m and a large detection area of 40 000 m$^2$, the energy threshold of HAWC is as low as 50 GeV and its angular resolution is 0.25 degrees at the median energy. The average effective area of HAWC between zenith angles of 0 and 45 degrees for primary photons is given in [32] by Monte Carlo simulation. We fit their result for the effective area as a function of energy by $A_{\rm eff} = 2.85 \times 10^5\ {\rm m}^2\ E^3/(E + 195)^{3.11}$, with the energy $E$ in units of GeV. We estimate the effective area of HAWC to the nuclei background by assuming the same suppression factor relative to that of primary photons as for ARGO, i.e., $A_{\rm eff}^{\gamma}/A_{\rm eff}^{p} \simeq 2$.
The ability to eliminate the background further by shower shape analysis in HAWC has been simulated [32]. A quality factor of 1.6, which is the relative improvement in sensitivity of the detector, is obtained independently of the angular resolution. If combined with the angular resolution, HAWC can greatly improve the ability of background rejection.
Fig. 6. Constraints of ARGO on the SUSY parameter space if no gamma sources are found at the $2\sigma$ level. The upper panel gives the constraints assuming that the dark matter is produced thermally while the lower is for nonthermal production. NFW profile is adopted for the subhalos.
Fig. 7. Similar to Fig. 6 except that the Moore profile is adopted for the subhalos.
Fig. 8. Constraints of HAWC on the SUSY parameter space if no gamma sources are found at the $2\sigma$ level. The upper panel gives the constraints assuming that the dark matter is produced thermally while the lower is for nonthermal production. NFW profile is adopted for the subhalos.
3.3. Sensitivities of ARGO and HAWC

Inserting the effective areas of ARGO and HAWC into Eq. (8) and adopting the spectrum of dark matter annihilation, we get the number of signal events for a given observation time as a function of the neutralino mass. We get the number of background events similarly, and hence the sensitivity of ARGO and HAWC in looking for dark matter signals. We can then study how the SUSY models are constrained by these two detectors, similarly to the discussion given in Section 3.1.
In Fig. 6 we give the constraints of ARGO on the SUSY parameter space if no gamma sources are found at the $2\sigma$ level for 10 years of observation. The upper panel gives the constraints assuming that the dark matter is produced thermally while the lower is for nonthermal production. We assume an NFW profile for the subhalos. Fig. 7 gives similar results assuming that the dark matter profile is the Moore profile.
In Figs. 8 and 9 we show the constraints on the SUSY parameter space from HAWC for 5 years of observation. We can see that the sensitivity is greatly improved compared with ARGO. If the neutralino is produced nonthermally, most of the SUSY parameter space will be constrained by HAWC in the case of the subhalo model.
Fig. 9. Similar to Fig. 8 except that the Moore profile is adopted for the subhalos.
4. Summary and conclusion

In this work we discuss the possibility of searching for gamma rays from dark matter annihilation in the subhalos of the MW. The absolute flux from the subhalos may be smaller than that from the galactic center; however, the subhalos are also less influenced by the baryonic matter. For heavy dark matter particles, which may produce gamma rays of high energy and low flux, ground-based detectors with a wide field of view are complementary to satellite-based detectors, such as GLAST [12].
Based on the N-body simulation results and a scan of the SUSY parameter space, we calculate the flux of gamma rays from the subhalos. We then discuss the possibility of detecting these fluxes at ground EAS detectors, especially the ARGO and HAWC experiments. The properties of ARGO and HAWC are studied by Monte Carlo simulations. According to our results, the SUSY parameter space is constrained if no gamma signal from the subhalos is detected. In conclusion, ground-based detectors have the capability to search for the dark matter signal and constrain the SUSY parameter space, complementary to GLAST.
Acknowledgements
This work is supported by the NSF of China under the grant No. 10575111 and supported in
part by the Chinese Academy of Sciences under the grant No. KJCX3-SYW-N2.
References
[1] D.N. Spergel, et al., Astrophys. J. Suppl. 148 (2003) 175;
D.N. Spergel, et al., astro-ph/0603449.
[2] G. Jungman, M. Kamionkowski, K. Griest, Phys. Rep. 267 (1996) 195.
[3] G. Belanger, F. Boudjema, S. Kraml, A. Pukhov, A. Semenov, Phys. Rev. D 73 (2006) 115007;
A. Djouadi, M. Drees, J.-L. Kneur, JHEP 0603 (2006) 033;
H. Baer, T. Krupovnickas, S. Profumo, P. Ullio, JHEP 0510 (2005) 020;
H. Baer, A. Mustafayev, S. Profumo, A. Belyaev, X. Tata, JHEP 0507 (2005) 065;
G. Belanger, S. Kraml, A. Pukhov, Phys. Rev. D 72 (2005) 015003;
B.C. Allanach, G. Belanger, F. Boudjema, A. Pukhov, JHEP 0412 (2004) 020;
J. Ellis, K.A. Olive, Y. Santoso, V.C. Spanos, Phys. Lett. B 565 (2003) 176.
[4] C. Munoz, Int. J. Mod. Phys. A 19 (2004) 3093.
[5] J.L. Feng, K.T. Matchev, F. Wilczek, Phys. Rev. D 63 (2001) 045024;
G. Bertone, D. Hooper, J. Silk, Phys. Rep. 405 (2005) 279.
[6] S.D. Hunter, et al., Astrophys. J. 481 (1997) 205;
H.A. Mayer-Hasselwander, et al., Astron. Astrophys. 335 (1998) 161;
R.C. Hartman, et al., Astrophys. J. Suppl. Ser. 123 (1999) 79.
[7] S.W. Barwick, et al., HEAT Collaboration, Astrophys. J. 482 (1997) L191.
[8] F. Aharonian, et al., H.E.S.S. Collaboration, Astron. Astrophys. 425 (2004) L13.
[9] K. Tsuchiya, et al., CANGAROO-II Collaboration, Astrophys. J. 606 (2004) L115.
[10] L. Bergstrom, P. Ullio, J. Buckley, Astropart. Phys. 9 (1998) 137;
D. Horns, Phys. Lett. B 607 (2005) 225.
[11] L. Bergström, J. Edsjö, P. Gondolo, P. Ullio, Phys. Rev. D 59 (1999) 043506;
C. Calcaneo-Roldan, B. Moore, Phys. Rev. D 62 (2002) 123005;
A. Tasitsiomi, A.V. Olinto, Phys. Rev. D 66 (2002) 083006;
R. Aloisio, P. Blasi, A.V. Olinto, Astrophys. J. 601 (2004) 47;
L. Pieri, E. Branchini, Phys. Rev. D 69 (2004) 043512;
N.W. Evans, F. Ferrer, S. Sarkar, Phys. Rev. D 69 (2004) 123501.
[12] S.M. Koushiappas, A.R. Zentner, T.P. Walker, Phys. Rev. D 69 (2004) 043501.
[13] G. Tormen, A. Diaferio, D. Syer, Mon. Not. R. Astron. Soc. 299 (1998) 728.
[14] A. Klypin, S. Gottlöber, A.V. Kravtsov, A.M. Khokhlov, Astrophys. J. 516 (1999) 530;
A. Klypin, et al., Astrophys. J. 522 (1999) 82.
[15] B. Moore, S. Ghigna, F. Governato, G. Lake, T. Quinn, J. Stadel, P. Tozzi, Astrophys. J. 524 (1999) L19.
[16] S. Ghigna, B. Moore, F. Governato, G. Lake, T. Quinn, J. Stadel, Astrophys. J. 544 (2000) 616.
[17] V. Springel, S.D.M. White, G. Tormen, G. Kauffmann, Mon. Not. R. Astron. Soc. 328 (2001) 726.
[18] A.R. Zentner, J.S. Bullock, Astrophys. J. 598 (2003) 49.
[19] G. De Lucia, G. Kauffmann, V. Springel, S.D.M. White, B. Lanzoni, F. Stoehr, G. Tormen, N. Yoshida, Mon. Not.
R. Astron. Soc. 348 (2004) 333.
[20] A.V. Kravtsov, O.Y. Gnedin, A.A. Klypin, Astrophys. J. 609 (2004) 482.
[21] G. Zaharijas, D. Hooper, Phys. Rev. D 73 (2006) 103501.
[22] P. Ullio, H.S. Zhao, M. Kamionkowski, Phys. Rev. D 64 (2001) 043504.
[23] D. Reed, F. Governato, L. Verde, J. Gardner, T. Quinn, J. Stadel, D. Merritt, G. Lake, Mon. Not. R. Astron. Soc. 357
(2005) 82, astro-ph/0312544.
[24] Y.P. Jing, Y. Suto, Astrophys. J. 529 (2000) L69;
Y.P. Jing, Y. Suto, Astrophys. J. 574 (2002) 538.
[25] A.R. Zentner, A.A. Berlind, J.S. Bullock, A.V. Kravtsov, R.H. Wechsler, Astrophys. J. 624 (2005) 505.
[26] J.S. Bullock, T.S. Kolatt, Y. Sigad, R.S. Somerville, A.V. Kravtsov, A.A. Klypin, J.R. Primack, A. Dekel, Mon. Not.
R. Astron. Soc. 321 (2001) 559.
[27] D. Hooper, I. de la Calle Perez, J. Silk, F. Ferrer, S. Sarkar, JCAP 0409 (2004) 002;
D. Horns, Phys. Lett. B 607 (2005) 225;
S. Profumo, Phys. Rev. D 72 (2005) 103521;
F. Ferrer, astro-ph/0505414.
[28] A. Morselli, et al., Proceedings of the 32nd Rencontres de Moriond, 1997.
[29] See the home page of AMS02 at, http://ams.cern.ch/AMS/ams_homepage.html.
[30] S. Peirani, R. Mohayaee, J.A. de Freitas Pacheco, Phys. Rev. D 70 (2004) 043503.
[31] Z. Cao, Talk at 29th ICRC, India, August 3, 2005.
[32] G. Sinnis, A. Smith, J.E. McEnery, astro-ph/0403096.
[33] J. Diemand, B. Moore, J. Stadel, Mon. Not. R. Astron. Soc. 352 (2004) 535.
[34] L. Gao, S.D.M. White, A. Jenkins, F. Stoehr, V. Springel, Mon. Not. R. Astron. Soc. 355 (2004) 819.
[35] F. Stoehr, S.D.M. White, V. Springel, G. Tormen, N. Yoshida, Mon. Not. R. Astron. Soc. 345 (2003) 1313.
[36] The physical cutoff of the minimal substructure is given in A.M. Green, S. Hofmann, D.J. Schwarz, JCAP 0508
(2005) 003;
A.M. Green, S. Hofmann, D.J. Schwarz, Mon. Not. R. Astron. Soc. 353 (2004) L23;
S. Hofmann, D.J. Schwarz, H. Stoecker, Phys. Rev. D 64 (2001) 083507.
[37] R. Aloisio, P. Blasi, A.V. Olinto, Astrophys. J. 601 (2004) 47.
[38] E. Hayashi, J.F. Navarro, J.E. Taylor, J. Stadel, T. Quinn, Astrophys. J. 584 (2003) 541.
[39] J.F. Navarro, C.S. Frenk, S.D.M. White, Mon. Not. R. Astron. Soc. 275 (1995) 56;
J.F. Navarro, C.S. Frenk, S.D.M. White, Astrophys. J. 462 (1996) 563;
J.F. Navarro, C.S. Frenk, S.D.M. White, Astrophys. J. 490 (1997) 493.
[40] A. Huss, B. Jain, M. Steinmetz, Astrophys. J. 517 (1999) 64;
J.E. Taylor, J.F. Navarro, Astrophys. J. 563 (2001) 483;
A. Dekel, J. Devor, G. Hetzroni, Mon. Not. R. Astron. Soc. 341 (2003) 326;
A. Dekel, I. Arad, J. Devor, Y. Birnboim, Astrophys. J. 588 (2003) 680;
C. Power, J.F. Navarro, A. Jenkins, C.S. Frenk, S.D.M. White, V. Springel, J. Stadel, T. Quinn, Mon. Not. R. Astron.
Soc. 338 (2003) 14.
[41] B. Moore, F. Governato, T. Quinn, J. Stadel, G. Lake, Astrophys. J. 499 (1998) 5;
B. Moore, T. Quinn, F. Governato, J. Stadel, G. Lake, Mon. Not. R. Astron. Soc. 310 (1999) 1147.
[42] S. Ghigna, B. Moore, F. Governato, G. Lake, T. Quinn, J. Stadel, Astrophys. J. 544 (2000) 616;
F. Governato, S. Ghigna, B. Moore, astro-ph/0105443;
T. Fukushige, J. Makino, Astrophys. J. 557 (2001) 533;
T. Fukushige, J. Makino, Astrophys. J. 588 (2003) 674.
[43] X.J. Bi, Nucl. Phys. B 741 (2006) 83.
[44] V.R. Eke, J.F. Navarro, M. Steinmetz, Astrophys. J. 554 (2001) 114.
[45] P. Gondolo, J. Edsjo, P. Ullio, L. Bergstrom, M. Schelke, E.A. Baltz, JCAP 0407 (2004) 008, astro-ph/0406204.
[46] R. Jeannerot, X. Zhang, R. Brandenberger, JHEP 9912 (1999) 003;
W.B. Lin, D.H. Huang, X. Zhang, R. Brandenberger, Phys. Rev. Lett. 86 (2001) 954;
M. Endo, F. Takahashi, Phys. Rev. D 74 (2006) 063502;
G. Gelmini, P. Gondolo, A. Soldatenko, C.E. Yaguna, hep-ph/0610379.
[47] T. Sjöstrand, et al., Comput. Phys. Commun. 135 (2001) 238.
[48] M. Amenomori, et al., Astrophys. J. 525 (1999) L93.
[49] R.W. Atkins, et al., Nucl. Instrum. Methods A 449 (2000) 478.
[50] http://www-ik.fzk.de/~heck/corsika/.
[51] http://www.fisica.unile.it/~argo/analysis/argog/index.html.
[52] R.A. Atkins, et al., Astrophys. J. 595 (2003) 803.
Modelling non-perturbative corrections to bottom-quark fragmentation
Ugo Aglietti a,b, Gennaro Corcella a,*, Giancarlo Ferrera a,b
a Dipartimento di Fisica, Università di Roma "La Sapienza", P.le A. Moro 2, I-00185 Roma, Italy
b INFN, Sezione di Roma, P.le A. Moro 2, I-00185 Roma, Italy
Received 9 October 2006; received in revised form 17 January 2007; accepted 11 April 2007
Abstract
We describe B-hadron production in $e^+e^-$ annihilation at the $Z^0$ pole by means of a model including
non-perturbative corrections to b-quark fragmentation as originating, via multiple soft emissions, from an
effective QCD coupling constant, which does not exhibit the Landau pole any longer and includes absorptive effects due to parton branching. We work in the framework of perturbative fragmentation functions at
NLO, with NLL DGLAP evolution and NNLL large-x resummation in both coefficient function and initial
condition of the perturbative fragmentation function. We include hadronization corrections via the effective
coupling constant in the NNLO approximation and do not add any further non-perturbative fragmentation
function. As part of our model, we perform the Mellin transforms of our resummed expressions exactly. We
present results on the energy distribution of b-flavored hadrons, which we compare with LEP and SLD data,
in both x- and N-spaces. We find that, within the theoretical uncertainties on our calculation, our model is
able to reasonably reproduce the data at $x \le 0.92$ and the first five moments of the B cross section.
1. Introduction
We study b-quark fragmentation and B-hadron production in $e^+e^-$ annihilation using a model which implements power corrections via an analytic effective coupling constant. Such a model will allow us to make hadron-level predictions, directly comparable with the accurate experimental data from the SLD [1] and LEP [2–4] Collaborations.
E-mail address: gennaro.corcella@roma1.infn.it (G. Corcella).

doi:10.1016/j.nuclphysb.2007.04.014
U. Aglietti et al. / Nuclear Physics B 775 (2007) 162–201
Let us describe in physical terms the process we are interested in:

$$ e^+ e^- \to Z^0 \to B + X_b, \tag{1.1} $$

where $X_b$ is a generic hadronic final state with a b quark. The corresponding parton-level process, assuming that an arbitrary number of gluons is radiated, reads:

$$ Z^0 \to b\bar b \to b\bar b\, g_1 \ldots g_n. \tag{1.2} $$

In the $Z^0$ rest frame, the $b$ and the $\bar b$ are initially emitted back-to-back with a large virtuality, of the order of the hard scale $Q = m_Z$. Then, they reduce their energy by emitting gluons, mainly of low energy or at small angle (soft or collinear radiation). As long as the virtuality of the b quarks is large enough, we can neglect the $b$ and $\bar b$ masses. Considering, for simplicity, the case of one single emission,

$$ e^+ e^- \to Z^0(q) \to b(p_b) + \bar b(p_{\bar b}) + g(p_g), \tag{1.3} $$

and defining the energy fractions

$$ x_i \equiv \frac{2 p_i \cdot q}{m_Z^2} = \frac{2 E_i}{m_Z}, \tag{1.4} $$

where $i = b, \bar b, g$ and $x_b + x_{\bar b} + x_g = 2$, a straightforward kinematical computation yields:

$$ 1 - x_{\bar b} = \frac{x_b x_g}{2}\,(1 - \cos\theta_{bg}), \tag{1.5} $$

with $\theta_{bg}$ being the angle between the gluon and the $b$. From Eq. (1.5), we see that collinear emission from the $b$ ($\theta_{bg} = 0$) implies $x_{\bar b} = 1$. Likewise, by crossing $b \leftrightarrow \bar b$, one can show that collinear radiation off the $\bar b$, i.e. $\theta_{\bar b g} = 0$, corresponds to $x_b = 1$. Soft-gluon radiation ($x_g = 0$) yields $x_b = x_{\bar b} = 1$. We can identify two main different mechanisms of energy loss of the $b$: a direct loss, related to soft or collinear emissions off the $b$ itself, and an indirect loss, when the gluons are radiated off the $\bar b$ (hard and large-angle emissions are not enhanced). The asymmetry between the $b$ and the $\bar b$ jets is not dynamical, but is just related to the fact that, e.g., we decided to measure the energy of the $b$, and not that of the $\bar b$.
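The kinematical relation (1.5) can be checked numerically with a short Python sketch. It assumes massless partons in the $Z^0$ rest frame and works directly with the energy fractions of Eq. (1.4); the function name and the sample points are illustrative choices, not taken from the paper.

```python
def check_eq_1_5(x_b, x_g):
    """Return (cos_theta_bg, lhs, rhs) of Eq. (1.5) for massless b, bbar, g.

    Energies are measured in units of m_Z / 2, so that E_i = x_i, and
    x_b + x_bbar + x_g = 2 fixes the bbar energy fraction.
    """
    x_bbar = 2.0 - x_b - x_g
    # Momentum conservation in the Z rest frame gives p_bbar = -(p_b + p_g),
    # hence E_bbar^2 = E_b^2 + E_g^2 + 2 E_b E_g cos(theta_bg) for massless partons.
    cos_bg = (x_bbar**2 - x_b**2 - x_g**2) / (2.0 * x_b * x_g)
    lhs = 1.0 - x_bbar
    rhs = 0.5 * x_b * x_g * (1.0 - cos_bg)
    return cos_bg, lhs, rhs


if __name__ == "__main__":
    for x_b, x_g in [(0.9, 0.3), (0.7, 0.5), (0.6, 0.4)]:
        cos_bg, lhs, rhs = check_eq_1_5(x_b, x_g)
        print(f"x_b={x_b}, x_g={x_g}: cos(theta_bg)={cos_bg:.4f}, "
              f"1-x_bbar={lhs:.4f}, rhs={rhs:.4f}")
```

The last sample point has $x_{\bar b} = 1$ and returns $\cos\theta_{bg} = 1$ up to rounding, i.e. the gluon is collinear to the $b$, in line with the discussion above.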

When the virtuality of the $b$ has reduced to a value of the order of its mass, mass effects become substantial, and when it becomes comparable with the hadronic scale, O(1 GeV), the $b$ approaches the non-perturbative phase of its evolution. The hadronization, for example, into a B meson can be described as due to the radiation of a light $q\bar q$ pair, and the subsequent combination of the $b$ and the $\bar q$ to form a $B = (b\bar q)$ state:

$$ b \to B + q. \tag{1.6} $$

In the process (1.6), the color of the heavy $b$ is coherently transferred to the light $q$; we shall assume that no more radiation is emitted after the colorless hadron has been formed. The hadronization transition (1.6) is sensitive to non-perturbative power corrections $\sim \Lambda/m_b$, with $\Lambda$ being the QCD scale, e.g., in the $\overline{\rm MS}$ scheme. Such power corrections, for the time being, cannot be calculated from first principles, but are usually described by means of phenomenological hadronization models, such as the Kartvelishvili [5] or the Peterson [6] models, containing parameters which are to be fitted to experimental data. In Ref. [7], furthermore, non-perturbative effects in bottom fragmentation at large $x$ were described as a shape function of $m_b(1 - x)$ and implemented using dressed gluon exponentiation [8].
In this paper, we shall follow a different approach to model power corrections to b-quark fragmentation. We shall assume that the transition (1.6) can be described in terms of an effective analytic coupling constant $\tilde\alpha_S$, which incorporates non-perturbative power corrections and, unlike the models mentioned above, does not contain any free parameter to be fitted to data. Within our model, whenever we use $\tilde\alpha_S$ instead of the standard $\alpha_S$, the energy of the b quark will be identified with that of the observed B meson, i.e. $E_b \simeq E_B$. The non-perturbative model which we shall use hereafter was already used in [9] in the context of B-meson decays and interesting results were found. The comparison with the data on the photon spectrum in radiative decays and on the hadron-mass distribution in semileptonic decays showed good agreement, while discrepancies were found for the electron energy distribution.
An analogous model can be introduced for the fragmentation of a $b$ into a baryon, such as, for example, a $\Lambda_b$ hyperon. The b quark emits two light pairs, $u\bar u$ and $d\bar d$, and combines with the $u$ and the $d$ to form the colorless $\Lambda_b = (bud)$:

$$ b \to \Lambda_b + \bar u + \bar d, \tag{1.7} $$

where the $\bar u \bar d$ system is emitted with the same color state as the initial $b$. The process (1.7) will also be described by means of an effective coupling constant: however, there is no reason a priori to assume that the effective coupling constant for (1.7) is equal to that for (1.6).
Turning back to the parton-level process (1.2), the differential cross section has a perturbative expansion containing large logarithms of the form:

$$ \alpha_S^n \ln^k \frac{m_Z^2}{m_b^2} \qquad (n = 1, 2, 3, \ldots, \infty;\ k = 1, 2, \ldots, n), \tag{1.8} $$

which are related to collinear emissions of partons with transverse momenta ranging between scales fixed by the particle masses:

$$ \alpha_S \ln \frac{m_Z^2}{m_b^2} = \alpha_S \int_{m_b^2}^{m_Z^2} \frac{dk_\perp^2}{k_\perp^2}. \tag{1.9} $$
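At higher orders, the leading contributions, $k = n$ in Eq. (1.8), are associated with phase-space regions ordered in transverse momentum:

$$ k_{\perp 1}^2 > k_{\perp 2}^2 > \cdots > k_{\perp n}^2, \tag{1.10} $$

leading to integrals of the form

$$ \alpha_S^n \int_{m_b^2}^{m_Z^2} \frac{dk_{\perp 1}^2}{k_{\perp 1}^2} \int_{m_b^2}^{k_{\perp 1}^2} \frac{dk_{\perp 2}^2}{k_{\perp 2}^2} \cdots \int_{m_b^2}^{k_{\perp n-1}^2} \frac{dk_{\perp n}^2}{k_{\perp n}^2} = \frac{1}{n!}\, \alpha_S^n \ln^n \frac{m_Z^2}{m_b^2}. \tag{1.11} $$

Hence the collinear logarithm appearing in the first-order cross section exponentiates.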
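The $1/n!$ factor of the ordered integral (1.11), and the resulting exponentiation, can be made explicit with a small Python sketch. It uses the variable $u = \ln(k_\perp^2/m_b^2)$ and exact rational arithmetic; the function names and the numerical values of the coupling and of the logarithm used in the example are illustrative assumptions only.

```python
import math
from fractions import Fraction


def ordered_log_integral(n):
    """Polynomial coefficients c[j] of I_n(L) = sum_j c[j] * L**j, where
    I_0 = 1 and I_k(L) = integral_0^L I_{k-1}(u) du, u = ln(k_perp^2 / m_b^2).

    I_n is the n-fold transverse-momentum-ordered integral of Eq. (1.11)
    without the overall alpha_S^n, with L = ln(m_Z^2 / m_b^2).
    """
    coeffs = [Fraction(1)]
    for _ in range(n):
        # Integrate term by term: u**j -> u**(j + 1) / (j + 1).
        coeffs = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]
    return coeffs


def resummed_partial_sum(alpha_s, log_ratio, n_max):
    """Sum over n <= n_max of alpha_s^n * I_n(L); tends to exp(alpha_s * L)."""
    total = 0.0
    for n in range(n_max + 1):
        total += float(ordered_log_integral(n)[n]) * (alpha_s * log_ratio) ** n
    return total


if __name__ == "__main__":
    for n in range(1, 6):
        # Only the L**n coefficient is non-zero and it equals 1/n!.
        assert ordered_log_integral(n)[n] == Fraction(1, math.factorial(n))
    # Illustrative numbers only: alpha_s = 0.12 and L = 6.
    print(resummed_partial_sum(0.12, 6.0, 30), math.exp(0.12 * 6.0))
```

The sketch only tracks the leading logarithms with a fixed coupling; running-coupling effects and subleading terms are outside its scope.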

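The b-quark spectrum is also affected by large logarithms

$$ \alpha_S^n \left[ \frac{\ln^k (1 - x)}{1 - x} \right]_+ \qquad (n = 1, 2, \ldots, \infty;\ k = 0, 1, 2, \ldots, 2n - 1), \tag{1.12} $$

also called threshold logarithms, which are enhanced for $x \to 1$, i.e. for soft or collinear emission. The plus distributions in Eq. (1.12) are defined as:

$$ \left[ \frac{\ln^k (1 - x)}{1 - x} \right]_+ = \lim_{\epsilon \to 0^+} \left[ \theta(1 - x - \epsilon)\, \frac{\ln^k (1 - x)}{1 - x} - \delta(1 - x - \epsilon) \int_0^{1 - \epsilon} \frac{\ln^k (1 - x')}{1 - x'}\, dx' \right]. \tag{1.13} $$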
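The contributions in (1.12) originate from the following double-logarithmic integrals:

$$ - \left[ \frac{\ln(1 - x)}{1 - x} \right]_+ = \int_0^1 \frac{d\omega}{\omega} \int_0^1 \frac{dt}{t}\, \delta(1 - x - \omega t), \tag{1.14} $$

where $\omega \equiv 2E_g/m_Z$ is the normalized energy of the radiated gluon and $t \equiv (1 - \cos\theta)/2 \simeq \theta^2/4$, with $\theta$ being the emission angle. The $\delta$-function contains a typical kinematical constraint and the plus regularization comes after including the virtual diagrams. To factorize the kinematical constraint for multiple gluon emissions, a transformation to $N$-space is usually made:

$$ \int_0^1 dx\, x^{N-1} \int_0^1 \frac{d\omega}{\omega} \int_0^1 \frac{dt}{t}\, \delta(1 - x - \omega t) = \int_0^1 \frac{d\omega}{\omega} \int_0^1 \frac{dt}{t} \left[ (1 - \omega t)^{N-1} - 1 \right] \simeq -\frac{1}{2} \ln^2 N. \tag{1.15} $$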
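The double-logarithmic behaviour in Eq. (1.15) can be checked numerically. The Python sketch below evaluates the two-dimensional integral after reducing it to one dimension and compares it with a closed form in terms of harmonic sums, valid for integer $N$. Both the reduction and the closed form are standard results that are not quoted in the text, and the function names, integration cut-off and step number are illustrative choices.

```python
import math


def double_log_moment(n_mellin, y_max=60.0, steps=200_000):
    """Trapezoidal estimate of
        I(N) = int_0^1 dw/w int_0^1 dt/t [(1 - w t)^(N-1) - 1].

    With u = w t the remaining integration gives a factor ln(1/u); the
    substitution u = exp(-y) then yields
        I(N) = int_0^inf dy  y * [(1 - exp(-y))^(N-1) - 1],
    which is truncated at y_max (the integrand decays like y * exp(-y)).
    """
    def integrand(y):
        return y * ((-math.expm1(-y)) ** (n_mellin - 1) - 1.0)

    h = y_max / steps
    total = 0.5 * (integrand(0.0) + integrand(y_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


def double_log_exact(n_mellin):
    """Closed form for integer N >= 1: -(H_{N-1}^2 + H^{(2)}_{N-1}) / 2,
    with H and H^{(2)} the harmonic sums of order one and two."""
    h1 = sum(1.0 / j for j in range(1, n_mellin))
    h2 = sum(1.0 / j**2 for j in range(1, n_mellin))
    return -0.5 * (h1 * h1 + h2)


if __name__ == "__main__":
    for n in (2, 10, 1000):
        print(n, double_log_moment(n), double_log_exact(n),
              -0.5 * math.log(n) ** 2)
```

Since $H_{N-1} \simeq \ln N + \gamma_E$ at large $N$, the term $-\frac{1}{2}\ln^2 N$ is only the leading behaviour; at moderate $N$ the single-logarithmic and constant terms are numerically sizeable.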
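At higher orders, the large-$x$ logarithms are associated with multiple soft emissions ordered in angle:

$$ \theta_1 > \theta_2 > \cdots > \theta_n. \tag{1.16} $$

Color coherence [10], in fact, dictates that multiple soft radiation interferes destructively outside the angular-ordered region (1.16) of the phase space. That produces integrals in moment space of the form

$$ \alpha_S^n \int_0^1 \frac{d\omega_1}{\omega_1} \int_0^1 \frac{dt_1}{t_1} \int_0^1 \frac{d\omega_2}{\omega_2} \int_0^1 \frac{dt_2}{t_2} \cdots \int_0^1 \frac{d\omega_n}{\omega_n} \int_0^1 \frac{dt_n}{t_n}\; \theta(t_1 > t_2 > \cdots > t_n) \left[ (1 - \omega_1 t_1)^{N-1} - 1 \right] \left[ (1 - \omega_2 t_2)^{N-1} - 1 \right] \cdots \left[ (1 - \omega_n t_n)^{N-1} - 1 \right] = \frac{\alpha_S^n}{n!} \left\{ \int_0^1 \frac{d\omega}{\omega} \int_0^1 \frac{dt}{t} \left[ (1 - \omega t)^{N-1} - 1 \right] \right\}^n, \tag{1.17} $$

with $\omega_i \equiv 2E_i/m_Z$ and $t_i \equiv (1 - \cos\theta_i)/2 \simeq \theta_i^2/4$.$^1$ Eq. (1.17) implies that also the threshold logarithms have an exponential structure. Schematically, the inclusion of higher orders amounts to the replacement:

$$ 1 + \alpha_S \ln^2 N \to e^{\alpha_S \ln^2 N}. \tag{1.19} $$

1 The integral on the l.h.s. of Eq. (1.17) is completely factorized into single-particle integrals except for the angular-ordering $\theta$-function. This constraint can be eliminated by symmetrizing over the $t_i$'s:

$$ \theta(t_1 > t_2 > \cdots > t_n) \to \frac{1}{n!} \sum_{\rm perm} \theta(t_{i_1} > t_{i_2} > \cdots > t_{i_n}) = \frac{1}{n!}. \tag{1.18} $$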
A consistent method to accomplish the resummation of mass (1.8) and threshold (1.12) logarithms is the formalism of perturbative fragmentation functions [11]. The basic idea is that a heavy quark is first produced at large transverse momentum, $k_\perp \gg m$, and can be treated in a massless fashion; afterwards it slows down and fragments into a massive parton. This leads to writing the energy distribution of the b quark, up to power corrections, as the following convolution:

$$ \frac{1}{\sigma} \frac{d\sigma}{dx}\left(x; m_Z^2, m_b^2\right) = C\left(x; m_Z^2, \mu_F^2\right) \otimes D\left(x; \mu_F^2, m_b^2\right), \tag{1.20} $$

where:
(1) $C(x; m_Z^2, \mu_F^2)$ is a coefficient function, obtained from a massless computation in a given factorization scheme, describing the emission off a light parton. The coefficient function contains large-$x$ logarithms, as in Eq. (1.12), which are process-dependent;
(2) $D(x; \mu_F^2, m_b^2)$ is the perturbative fragmentation function, associated with the transition of a massless parton into the heavy $b$.
The perturbative fragmentation function follows the Dokshitzer–Gribov–Lipatov–Altarelli–Parisi (DGLAP) evolution equations [12,13], which can be solved once an initial condition is provided. Solving the DGLAP equations one resums the large logarithms (1.8) which appear in the massive b-quark spectrum. The function $D(x; \mu_F^2, m_b^2)$ can therefore be factorized as:

$$ D\left(x; \mu_F^2, m_b^2\right) = E\left(x; \mu_F^2, \mu_{0F}^2\right) \otimes D^{\rm ini}\left(x; \mu_{0F}^2, m_b^2\right), \tag{1.21} $$

where:
(1) $E(x; \mu_F^2, \mu_{0F}^2)$ is an evolution operator from the scale $\mu_F$, typically of the order of the hard scale $m_Z$, down to $\mu_{0F}$, of the order of the bottom-quark mass $m_b$. In $E(x; \mu_F^2, \mu_{0F}^2)$, the mass logarithms (1.8) are resummed;
(2) $D^{\rm ini}(x; \mu_{0F}^2, m_b^2)$ is the initial condition of the perturbative fragmentation function at the scale $\mu_{0F} \simeq m_b$, first calculated in [11], and later proved to be process-independent [14]. It also contains threshold logarithms (1.12), which are also process-independent, and were resummed in [14] in the NLL approximation.
The approach of perturbative fragmentation functions, with the inclusion of NLL large-$x$ resummation, has been applied to investigate b-quark production in $e^+e^-$ annihilation [14], top ($t \to bW$) [15,16] and Higgs decays ($H \to b\bar b$) [17]. In this paper we shall go beyond the NLL approximation and, as far as the perturbative calculation is concerned, we shall also include next-to-next-to-leading logarithmic (NNLL) threshold contributions to both the $e^+e^-$ coefficient function and the initial condition of the perturbative fragmentation function. Also, the Mellin transforms of our resummed expressions will be performed exactly, in order to resum constants and power-suppressed terms as well.
Most previous analyses convoluted the parton-level spectrum with a non-perturbative fragmentation function, which contains a few parameters which are to be fitted to experimental data. Afterwards, such models are used to predict the B spectrum in other processes, as long as one consistently describes perturbative b production (see, e.g., Ref. [18]). An alternative method [19] consists in using data in Mellin moment space, such as the DELPHI ones [4], and fitting directly the moments of the non-perturbative fragmentation function, without assuming any functional form for the hadronization model in $x$-space.
In this paper we reconsider the inclusion of non-perturbative corrections to bottom-quark fragmentation. As discussed before, instead of fitting the parameters of a hadronization model, or the moments of the non-perturbative fragmentation function, we model non-perturbative effects by the use of an analytic effective coupling constant [20–22], which does not contain any free parameter, extending the analysis carried out in [23] and in [9] in the framework of heavy-flavor decays. Our modified coupling constant does not exhibit the Landau pole any longer, and includes all-order absorptive effects due to gluon branching. We shall then be able to compare our predictions directly with data, without using any extra hadronization model.
The plan of our paper is the following. In Section 2 we review the main features of the $\overline{\rm MS}$ $e^+e^-$ coefficient function. In Section 3 we discuss the massive computation, the perturbative fragmentation approach and the resummation of the large mass logarithms. In Sections 4 and 5 we discuss NNLL large-$x$ resummation in the coefficient function and initial condition of the perturbative fragmentation function, respectively, pointing out the differences of our analysis with respect to previous ones. In Section 6 we construct the analytic coupling constant without the Landau pole. In Section 7 we present our model, based on an effective coupling constant as the only source of non-perturbative corrections to b-quark fragmentation. In Section 8 we present our results on the B-hadron spectrum in $e^+e^-$ annihilation in $x$-space, and investigate how they fare against SLD, ALEPH and OPAL data. In Section 9 we perform a similar analysis in $N$-space and compare with the DELPHI moments. In Section 10 we summarize the main results of our work and discuss the lines of development of our study. Appendix A describes an algorithm to compute the Mellin transforms numerically with the fast Fourier transform.
2. Massless-quark production and NLO coefficient function
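In this section we consider massless-quark production and discuss the main features of the NLO $e^+e^-$ coefficient function, which will be used later on in the framework of perturbative fragmentation functions. The NLO computation exhibits many properties of the higher orders, so that the general structure of the perturbative corrections can be understood by looking in detail into this case.
We study the production of a light-quark pair at the $Z^0$ pole, in the NLO approximation:

$$ e^+ e^- \to Z^0(q) \to q(p_q) + \bar q(p_{\bar q}) + \big(g(p_g)\big), \tag{2.1} $$

where $(g(p_g))$ denotes a real (virtual) gluon, and define the light-quark energy fraction as:

$$ x \equiv \frac{2 p_q \cdot q}{m_Z^2}. \tag{2.2} $$

The differential cross section, in dimensional regularization, can be read from the formulas in [24]:

$$ \frac{1}{\sigma} \frac{d\sigma_q}{dx} = \delta(1 - x) + \frac{\alpha_S(\mu_R^2) C_F}{\pi} \left\{ \frac{1}{2} \left[ \frac{1 + x^2}{1 - x} \right]_+ \left( \ln \frac{m_Z^2}{\mu_F^2} - \frac{1}{\hat\epsilon} \right) + A(x) \right\} + O\left(\alpha_S^2\right), \tag{2.3} $$

where

$$ \frac{1}{\hat\epsilon} = \frac{1}{\epsilon} - \gamma_E + \ln(4\pi), \tag{2.4} $$

$\epsilon = (4 - D)/2$, $D$ is the number of spacetime dimensions, $\mu_R$ and $\mu_F$ are the renormalization and factorization scales, $\gamma_E = 0.577216\ldots$ is the Euler constant, $\alpha_S$ is the dimensionless strong coupling constant, $C_F = 4/3$ and $A(x)$ is the following function, independent of $\epsilon$ and $\mu_F$:

$$ A(x) = \left[ \frac{\ln(1 - x)}{1 - x} \right]_+ - \frac{3}{4}\, \frac{1}{[1 - x]_+} + \left( \frac{\pi^2}{3} - 3 \right) \delta(1 - x) - \frac{1 + x}{2} \ln(1 - x) + \frac{1 + x^2}{1 - x} \ln x + \frac{5 - 3x}{4}, \tag{2.5} $$

and

$$ \sigma = \sigma_0 \left[ 1 + \frac{3}{4}\, \frac{\alpha_S(\mu_R^2) C_F}{\pi} + O\left(\alpha_S^2\right) \right] \tag{2.6} $$

is the NLO total cross section, with $\sigma_0$ being the Born one. Eq. (2.3) presents a pole, $\sim 1/\hat\epsilon$, which is a remnant of the collinear singularity, and disappears in the total cross section. Subtracting the collinear pole, one obtains what, in the perturbative fragmentation formalism, is called the $\overline{\rm MS}$ coefficient function [11]$^2$:

$$ \left( \frac{1}{\sigma} \frac{d\hat\sigma_q}{dx} \right)^{\overline{\rm MS}} = \delta(1 - x) + \frac{\alpha_S(\mu_R^2) C_F}{\pi} \left\{ \frac{1}{2} \left[ \frac{1 + x^2}{1 - x} \right]_+ \ln \frac{m_Z^2}{\mu_F^2} + A(x) \right\} + O\left(\alpha_S^2\right). \tag{2.7} $$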
The coefficient function presents:
(1) a term of collinear origin,

$$ \frac{\alpha_S}{2\pi} P_{qq}^{(0)}(x) \ln \frac{m_Z^2}{\mu_F^2} = \frac{\alpha_S C_F}{2\pi} \left[ \frac{1 + x^2}{1 - x} \right]_+ \ln \frac{m_Z^2}{\mu_F^2}, \tag{2.8} $$

where $P_{qq}^{(0)}(x)$ is the leading-order Altarelli–Parisi splitting function, containing contributions enhanced in the threshold region $x \to 1$;
(2) two terms which are enhanced at large $x$ and are independent of $\mu_F$:

$$ \frac{\alpha_S C_F}{\pi} \left[ \frac{\ln(1 - x)}{1 - x} \right]_+ \quad {\rm and} \quad -\frac{3}{4}\, \frac{\alpha_S C_F}{\pi}\, \frac{1}{[1 - x]_+}; \tag{2.9} $$

(3) a term proportional to $\delta(1 - x)$, i.e. a spike in the elastic point $x = 1$:

$$ \frac{\alpha_S C_F}{\pi} \left( \frac{\pi^2}{3} - 3 \right) \delta(1 - x); \tag{2.10} $$

(4) terms dependent on $x$ and divergent at most logarithmically for $x \to 1$:

$$ \frac{\alpha_S C_F}{\pi} \left[ -\frac{1 + x}{2} \ln(1 - x) + \frac{1 + x^2}{1 - x} \ln x + \frac{5 - 3x}{4} \right]. \tag{2.11} $$

We observe that the latter contribution also contains a term, $\sim \ln x$, which is enhanced at small $x$.
2 Unlike the usual convention, where the coefficient function is expressed in terms of $1/\sigma_0\,(d\sigma/dx)$, we have divided by the NLO cross section $\sigma$, so that the total integral is 1. In the following, we shall compare with data, also normalized to 1.
We wish to rearrange the coefficient function and put it into a form which will become more
useful in the following sections, when dealing with large-x resummation. To this goal, we rearrange the AltarelliParisi splitting function by using the identity:
1 + x2
1x
2
3
(1 + x) + (1 x).
[1 x]+
2
(2.12)
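The identity (2.12) can be checked moment by moment. The sketch below is only an illustration: it assumes the usual plus-prescription, integer Mellin moments $N$, and a plain midpoint rule; the function names are hypothetical.

```python
def lhs_moment(n, steps=20000):
    """Midpoint-rule moment of [(1+x^2)/(1-x)]_+, i.e. the integral of
    (x^(n-1) - 1) (1+x^2)/(1-x) over 0 < x < 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (x ** (n - 1) - 1.0) * (1.0 + x * x) / (1.0 - x)
    return total * h


def rhs_moment(n):
    """Term-by-term moments of 2/[1-x]_+ - (1+x) + (3/2) delta(1-x)."""
    s1 = sum(1.0 / k for k in range(1, n))  # harmonic sum S_1(n-1)
    return -2.0 * s1 - (1.0 / n + 1.0 / (n + 1)) + 1.5
```

For $N = 1$ both sides vanish, as expected for a plus distribution.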
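Moreover, we factorize the constants which multiply the term $\delta(1-x)$, and finally write the $\overline{\rm MS}$ coefficient function in the form:
$$\left[\frac{1}{\sigma}\frac{d\hat\sigma_q}{dx}\right]^{\overline{\rm MS}} = \left[1 + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left(\frac{\pi^2}{3} - 3 + \frac{3}{4}\ln\frac{m_Z^2}{\mu_F^2}\right)\right]\delta(1-x) + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{\left[\frac{1}{1-x}\ln\frac{m_Z^2(1-x)}{\mu_F^2}\right]_+ - \frac{3}{4(1-x)_+}\right\}$$
$$\qquad + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{-\frac{1+x}{2}\left[\ln(1-x) + \ln\frac{m_Z^2}{\mu_F^2}\right] + \frac{1+x^2}{1-x}\ln x + \frac{5-3x}{4}\right\} + O(\alpha_S^2), \tag{2.13}$$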
which is equal to (2.7) at $O(\alpha_S)$. Eq. (2.13) exhibits the logarithm $\ln[m_Z^2(1-x)/\mu_F^2]$, which originates from the integration over the variable $k^2 = (p_q + p_g)^2(1-x)$:
$$\frac{\alpha_S}{\pi} C_F \ln\frac{m_Z^2(1-x)}{\mu_F^2} = \int_{\mu_F^2}^{m_Z^2(1-x)}\frac{dk^2}{k^2}\,\frac{\alpha_S}{\pi} C_F, \tag{2.14}$$
where the constant term $C_F\alpha_S/\pi$ has been brought inside the integral for reasons which will become clear later. For soft and collinear radiation, $k^2 \simeq E_g^2\theta_{qg}^2 = k_\perp^2$, the gluon transverse momentum with respect to $q$. In principle, the lower value of $k^2$ would be zero, as massless quarks can emit soft gluons at arbitrarily small angles. In dimensional regularization, using the $\overline{\rm MS}$ factorization scheme, the minimum $k^2$ is set by the factorization scale: $k_{\min}^2 = \mu_F^2$. The upper limit in the integral (2.14) can be obtained observing that $k^2 = (p_q + p_g)^2(1-x) = (q - p_{\bar q})^2(1-x) \leq m_Z^2(1-x)$.
3. Heavy-quark production and NLO perturbative fragmentation function
Let us now consider the production of massive bottom quarks in the NLO approximation:
$$e^+e^- \to Z^0(q) \to b(p_b) + \bar b(p_{\bar b}) + g(p_g). \tag{3.1}$$
The differential cross section for the production of a $b$ quark of energy fraction $x$ reads [11]:
$$\frac{1}{\sigma}\frac{d\sigma_b}{dx} = \delta(1-x) + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{\frac{1}{2}\left[\frac{1+x^2}{1-x}\right]_+\ln\frac{m_Z^2}{m_b^2} - \left[\frac{\ln(1-x)}{1-x}\right]_+ - \frac{7}{4}\frac{1}{(1-x)_+} + \left(\frac{\pi^2}{3} - 2\right)\delta(1-x)\right.$$
$$\qquad\left. + \frac{1+x}{2}\ln(1-x) + \frac{1+x^2}{1-x}\ln x + \frac{7-x}{4}\right\} + O(\alpha_S^2) + O\left(\left(\frac{m_b}{m_Z}\right)^p\right), \tag{3.2}$$
where $p \geq 1$. The massive spectrum, unlike the massless one, is free from collinear singularities because the quark mass acts as a regulator. However, Eq. (3.2) presents a large mass logarithm, $\sim \alpha_S\ln(m_Z^2/m_b^2)$, which needs to be resummed to improve the perturbative prediction.
The resummation of $\ln(m_Z^2/m_b^2)$ can be achieved by the use of the approach of heavy-quark perturbative fragmentation functions [11], which factorizes the spectrum of a massive quark as the following convolution:

$$\frac{1}{\sigma}\frac{d\sigma_b}{dx}\left(x; m_Z^2, m_b^2\right) = \sum_i\int_x^1\frac{dz}{z}\left[\frac{1}{\sigma}\frac{d\hat\sigma_i}{dz}\left(z, m_Z^2, \mu_F^2\right)\right]^{\overline{\rm MS}} D_i^{\overline{\rm MS}}\left(\frac{x}{z}; \mu_F^2, m_b^2\right) + O\left(\left(\frac{m_b}{m_Z}\right)^p\right). \tag{3.3}$$
In Eq. (3.3), $(1/\sigma)(d\hat\sigma_i/dx)$ is the coefficient function, corresponding to the production of a massless parton $i$, $D_i(x, m_b^2, \mu_F^2)$ is the heavy-quark perturbative fragmentation function, associated with the transition of the massless parton $i$ into a heavy $b$, and $\mu_F$ is the factorization scale. In the following, we shall neglect $b$ production via gluon splitting $g \to b\bar b$, which is negligible at the $Z^0$ peak and suppressed at large $x$. Hence, in Eq. (3.3), $i = b$ and $(1/\sigma)(d\hat\sigma_b/dz)$ is the quark coefficient function, presented in Eq. (2.7) in the $\overline{\rm MS}$ factorization scheme. The perturbative fragmentation function $D_b^{\overline{\rm MS}}$ expresses the fragmentation of a massless $b$ into a massive $b$.
Requiring the massive cross section to be independent of $\mu_F$, one obtains that the perturbative fragmentation function follows the DGLAP evolution equations [12,13], which can be solved once an initial condition is given. The initial condition of the perturbative fragmentation function $D_b^{\rm ini}(x_b, \mu_{0F}^2, m_b^2)$ was given in [11], and can be obtained inserting in Eq. (3.3) the massive spectrum (3.2) and the $\overline{\rm MS}$ coefficient function (2.7). It reads:

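$$D_b^{\rm ini}\left(x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right) = \delta(1-x) + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left[\frac{1+x^2}{1-x}\left(\frac{1}{2}\ln\frac{\mu_{0F}^2}{m_b^2} - \ln(1-x) - \frac{1}{2}\right)\right]_+ + O(\alpha_S^2). \tag{3.4}$$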
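In [14], the process-independence of the initial condition (3.4) was established on general grounds. The solution of the DGLAP equations is typically obtained in Mellin moment space. At NLO, and for an evolution from $\mu_{0F}$ to $\mu_F$, it is given by:
$$D_{b,N}\left(\mu_F^2, m_b^2\right) = E_N\left(\alpha_S(\mu_{0F}^2), \alpha_S(\mu_F^2)\right) D_{b,N}^{\rm ini}\left(\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right), \tag{3.5}$$
where $D_{b,N}^{\rm ini}$ is the Mellin transform of Eq. (3.4) and $E_N$ is the DGLAP evolution operator [12,13]:
$$E_N\left(\alpha_S(\mu_{0F}^2), \alpha_S(\mu_F^2)\right) = \exp\left\{\frac{P_N^{(0)}}{2\pi\beta_0}\ln\frac{\alpha_S(\mu_{0F}^2)}{\alpha_S(\mu_F^2)} + \frac{\alpha_S(\mu_{0F}^2) - \alpha_S(\mu_F^2)}{4\pi^2\beta_0}\left[P_N^{(1)} - \frac{2\pi\beta_1}{\beta_0}P_N^{(0)}\right]\right\}. \tag{3.6}$$
In Eq. (3.6), $P_N^{(0)}$ and $P_N^{(1)}$ are the Mellin transforms of the LO and NLO splitting functions; $\beta_0$ and $\beta_1$ are the first two coefficients of the QCD $\beta$-function:
$$\beta_0 = \frac{33 - 2n_f}{12\pi}, \qquad \beta_1 = \frac{153 - 19n_f}{24\pi^2}, \tag{3.7}$$
which enter in the NLO expression of the strong coupling constant:
$$\alpha_S(Q^2) = \frac{1}{\beta_0\ln(Q^2/\Lambda^2)}\left\{1 - \frac{\beta_1\ln[\ln(Q^2/\Lambda^2)]}{\beta_0^2\ln(Q^2/\Lambda^2)}\right\}. \tag{3.8}$$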
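As a numerical illustration of Eqs. (3.7) and (3.8), the following sketch evaluates the NLO coupling. The value $\Lambda = 0.226$ GeV for $n_f = 5$ is an assumption made here purely for illustration and is not taken from the text; the function names are hypothetical.

```python
import math


def beta0(nf):
    # First coefficient of the QCD beta function, Eq. (3.7)
    return (33.0 - 2.0 * nf) / (12.0 * math.pi)


def beta1(nf):
    # Second coefficient of the QCD beta function, Eq. (3.7)
    return (153.0 - 19.0 * nf) / (24.0 * math.pi ** 2)


def alpha_s_nlo(q2, lam=0.226, nf=5):
    """NLO coupling of Eq. (3.8); q2 in GeV^2, lam = Lambda in GeV (assumed value)."""
    log_q = math.log(q2 / lam ** 2)
    if log_q <= 0.0:
        raise ValueError("q2 must lie above the Landau pole at Lambda^2")
    b0, b1 = beta0(nf), beta1(nf)
    return (1.0 / (b0 * log_q)) * (1.0 - b1 * math.log(log_q) / (b0 ** 2 * log_q))
```

With these inputs the coupling at the $Z^0$ mass comes out close to 0.118, and it grows toward lower scales, as expected.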
As discussed in [11], Eq. (3.5) resums leading (LL) $\alpha_S^n(\mu_F^2)\ln^n(\mu_F^2/\mu_{0F}^2)$ and next-to-leading (NLL) $\alpha_S^n(\mu_F^2)\ln^{n-1}(\mu_F^2/\mu_{0F}^2)$ logarithms of the ratio of the two factorization scales (collinear resummation). For $\mu_F \simeq m_Z$ and $\mu_{0F} \simeq m_b$, as we shall assume hereafter, one resums the large logarithm $\ln(m_Z^2/m_b^2)$, which appears in the NLO massive spectrum (3.2), with NLL accuracy.
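A minimal sketch of the leading-order part of the evolution operator (3.6) is given below. It keeps only the $P_N^{(0)}$ term; the NLO term requires $P_N^{(1)}$, which is not reproduced here. The moment formula used for $P_N^{(0)}$ is the standard Mellin transform of $C_F[(1+x^2)/(1-x)]_+$ for integer $N$, and the function names are hypothetical.

```python
import math

CF = 4.0 / 3.0


def harmonic_s1(n):
    # S_1(n) for a positive integer n
    return sum(1.0 / k for k in range(1, n + 1))


def p0_moment(n):
    # Mellin moment of P_qq^(0)(x) = C_F [(1+x^2)/(1-x)]_+
    return CF * (1.5 + 1.0 / (n * (n + 1)) - 2.0 * harmonic_s1(n))


def evolution_lo(n, alpha_mu0, alpha_mu, nf=5):
    """LO truncation of Eq. (3.6): exp{ P_N^(0)/(2 pi beta0) ln[alpha(mu0)/alpha(mu)] }."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    return math.exp(p0_moment(n) / (2.0 * math.pi * b0) * math.log(alpha_mu0 / alpha_mu))
```

The first moment is not evolved ($P_1^{(0)} = 0$), while higher moments are suppressed when evolving from $\mu_{0F} \simeq m_b$ up to $\mu_F \simeq m_Z$.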
As we did for the coefficient function, we rearrange the initial condition of the perturbative fragmentation function into a form which will be convenient when discussing soft-gluon resummation. We use the identity (2.12) and the relation
$$\left[\frac{1+x^2}{1-x}\ln(1-x)\right]_+ = 2\left[\frac{\ln(1-x)}{1-x}\right]_+ - (1+x)\ln(1-x) - \frac{7}{4}\,\delta(1-x). \tag{3.9}$$
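The coefficient $-7/4$ of $\delta(1-x)$ in (3.9) is fixed by requiring that the plus distribution on the left-hand side integrates to zero; with the usual plus-prescription it equals $\int_0^1 dx\,(1+x)\ln(1-x)$. The sketch below, with a hypothetical function name and a simple midpoint rule, checks this value numerically.

```python
import math


def delta_coefficient(steps=200000):
    """Midpoint-rule estimate of the integral of (1+x) ln(1-x) over 0 < x < 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (1.0 + x) * math.log(1.0 - x)
    return total * h
```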
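Furthermore, we factorize the coefficient of the term $\delta(1-x)$ and write $D_b^{\rm ini}$ in the following form, which is equivalent to (3.4), up to terms of $O(\alpha_S^2)$:
$$D_b^{\rm ini}\left(x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right) = \left[1 + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left(1 + \frac{3}{4}\ln\frac{\mu_{0F}^2}{m_b^2}\right)\right]\delta(1-x) + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left\{\left[\frac{1}{1-x}\ln\frac{\mu_{0F}^2}{m_b^2(1-x)^2}\right]_+ - \frac{1}{(1-x)_+}\right\}$$
$$\qquad + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}(1+x)\left[\frac{1}{2} - \frac{1}{2}\ln\frac{\mu_{0F}^2}{m_b^2} + \ln(1-x)\right] + O(\alpha_S^2). \tag{3.10}$$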
The logarithm $\ln[\mu_{0F}^2/(m_b^2(1-x)^2)]$ comes again from an integral over $k^2 = (p_b + p_g)^2(1-x)$:
$$\frac{\alpha_S}{\pi} C_F\ln\frac{\mu_{0F}^2}{m_b^2(1-x)^2} = \int_{m_b^2(1-x)^2}^{\mu_{0F}^2}\frac{dk^2}{k^2}\,\frac{\alpha_S}{\pi} C_F. \tag{3.11}$$
As in (2.14), for soft and small-angle radiation $k^2 \simeq k_\perp^2$, the transverse momentum of the gluon relative to the $b$. Following [14], the lower limit of the $k^2$-integration is easily found by considering the dead-cone effect [10]. In fact, unlike the coefficient function, where quarks are treated as massless, soft radiation off a massive $b$ quark is suppressed at angles lower than (see footnote 3):
$$\theta_{\min} \simeq \frac{m_b}{E_b}. \tag{3.12}$$
$^3$ The bremsstrahlung spectrum off a heavy quark reads: $d\sigma \sim \alpha_S\,(d\omega/\omega)\,[d\theta^2/(\theta^2 + \theta_{\min}^2)]$, where $\omega$ is the energy of the radiated soft gluon.
To the logarithmic accuracy we are interested in, we can neglect emission inside the dead cone and use Eq. (3.12) to obtain the lower limit on the emitted-gluon transverse momentum:
$$k_{\min}^2 \simeq k_{\perp\min}^2 \simeq E_g^2\theta_{\min}^2 \simeq m_b^2(1-x)^2. \tag{3.13}$$
Comparing Eq. (3.11) with (2.14), we observe that, in the $\overline{\rm MS}$ scheme, the factorization scale squared $\mu_F^2$ is the lower limit on the gluon transverse momentum in the coefficient function; $\mu_{0F}^2$ is instead the upper limit on $k^2$ in the initial condition. The operator $E_N(\alpha_S(\mu_{0F}^2), \alpha_S(\mu_F^2))$, given in Eq. (3.6), describes the evolution between these two scales.
Before closing this section, we point out that, although we shall use the NLO coefficient function and initial condition, and evolve the perturbative fragmentation function with NLL accuracy, in principle we could go beyond such approximations. In fact, the NNLO $e^+e^-$ coefficient function was calculated in [25,26] and NNLO corrections to the initial condition of the perturbative fragmentation function in [27,28]. Furthermore, NNLO contributions to the non-singlet time-like splitting functions were computed in [29], while in the singlet sector they are still missing. In any case, we shall delay the inclusion of such NNLO corrections to future work.
4. Large-x resummation in the coefficient function
In this section we perform threshold resummation in the $e^+e^-$ coefficient function to next-to-next-to-leading logarithmic accuracy, and combine the resummed result with the first-order one presented in Section 2.
4.1. Resummed coefficient function in Mellin space
According to Eq. (2.7), the NLO $e^+e^-$ coefficient function contains terms of the form
$$\alpha_S\left[\frac{\ln(1-x)}{1-x}\right]_+ \quad\text{and}\quad \alpha_S\frac{1}{[1-x]_+}, \tag{4.1}$$
which become large in the limit $x \to 1$, corresponding to soft or collinear gluon radiation. All-order resummation in the coefficient function can be performed following the general lines of Refs. [30,31], and was implemented in [14] in the NLL approximation.
Threshold resummation is typically performed in $N$-space [14,31], where kinematical constraints factorize and the $x \to 1$ limit corresponds to $N \to \infty$. The Mellin transform of the NLO coefficient function, which we denote by $C_N$, can be found in [11]. At large $N$, it exhibits single and double logarithms of the Mellin variable:
$$C_N\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = 1 + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{\frac{1}{2}\ln^2 N + \left[\frac{3}{4} + \gamma_E - \ln\frac{m_Z^2}{\mu_F^2}\right]\ln N + Q\left(\mu_F^2, m_Z^2\right) + O\left(\frac{1}{N}\right)\right\}, \tag{4.2}$$
with the constant terms given by:
$$Q\left(\mu_F^2, m_Z^2\right) = \left(\frac{3}{4} - \gamma_E\right)\ln\frac{m_Z^2}{\mu_F^2} + \frac{5}{12}\pi^2 - 3 + \frac{1}{2}\gamma_E^2 + \frac{3}{4}\gamma_E. \tag{4.3}$$
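Besides the contributions (4.1), the NLO coefficient function (2.7) also contains a contribution $\sim \alpha_S\ln(1-x)$. However, its Mellin transform behaves like $\sim \alpha_S\ln N/N$ and is therefore $O(1/N)$ at large $N$. The other terms in Eq. (2.7) are suppressed at large $N$.
The resummed coefficient function has the following generalized exponential structure:
$$\Delta_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = \exp\left\{G_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right]\right\}, \tag{4.4}$$
where [14]
$$G_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = \int_0^1 dz\,\frac{z^{N-1} - 1}{1-z}\left\{\int_{\mu_F^2}^{m_Z^2(1-z)}\frac{dk^2}{k^2}A\left[\alpha_S(k^2)\right] + B\left[\alpha_S\left(m_Z^2(1-z)\right)\right]\right\}. \tag{4.5}$$
The exponent $G_N^{(C)}[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2]$ resums the large logarithms appearing in Eq. (4.2):
$$\text{LL: } \alpha_S^n\ln^{n+1}N; \qquad \text{NLL: } \alpha_S^n\ln^n N; \qquad \text{NNLL: } \alpha_S^n\ln^{n-1}N. \tag{4.6}$$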
As in [31], the integration variables are $z = 1 - x_g$, $x_g$ being the gluon energy fraction, and $k^2 = (p_b + p_g)^2(1-z)$. In soft approximation, $z \simeq x$; for small-angle radiation $k^2 \simeq k_\perp^2$, the gluon transverse momentum with respect to the $b$.
The functions $A(\alpha_S)$ and $B(\alpha_S)$ can be expanded as a series in $\alpha_S$ as:
$$A(\alpha_S) = \sum_{n=1}^{\infty}\left(\frac{\alpha_S}{\pi}\right)^n A^{(n)}, \tag{4.7}$$
$$B(\alpha_S) = \sum_{n=1}^{\infty}\left(\frac{\alpha_S}{\pi}\right)^n B^{(n)}. \tag{4.8}$$
In the NLL approximation, one needs to include the first two coefficients of $A(\alpha_S)$ and the first of $B(\alpha_S)$; to NNLL accuracy, $A^{(3)}$ and $B^{(2)}$ are also needed. Their expressions read:
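$$A^{(1)} = C_F, \tag{4.9}$$
$$A^{(2)} = \frac{1}{2}C_F\left[C_A\left(\frac{67}{18} - \zeta(2)\right) - \frac{5}{9}n_f\right], \tag{4.10}$$
$$A^{(3)} = C_F\left[C_A^2\left(\frac{245}{96} + \frac{11}{24}\zeta(3) - \frac{67}{36}\zeta(2) + \frac{11}{8}\zeta(4)\right) - C_A n_f\left(\frac{209}{432} + \frac{7}{12}\zeta(3) - \frac{5}{18}\zeta(2)\right) - C_F n_f\left(\frac{55}{96} - \frac{\zeta(3)}{2}\right) - \frac{n_f^2}{108}\right], \tag{4.11}$$
$$B^{(1)} = -\frac{3}{4}C_F, \tag{4.12}$$
$$B^{(2)} = C_F\left[C_A\left(-\frac{3155}{864} + \frac{11}{12}\zeta(2) + \frac{5}{2}\zeta(3)\right) - C_F\left(\frac{3}{32} + \frac{3}{2}\zeta(3) - \frac{3}{4}\zeta(2)\right) + n_f\left(\frac{247}{432} - \frac{\zeta(2)}{6}\right)\right], \tag{4.13}$$
where $C_A = 3$, $n_f$ is the number of active flavors and $\zeta(x) \equiv \sum_{n=1}^{\infty} 1/n^x$ is the Riemann zeta function. The first two coefficients of function $A(\alpha_S)$ have been known for a long time [31]; more recent is the calculation of $A^{(3)}$ [32]. Function $B(\alpha_S)$ is associated with the radiation off the unobserved massless parton, e.g. the $\bar b$ if one detects the $b$: the coefficient $B^{(1)}$ was given in [31], $B^{(2)}$ was computed in [37].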
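For orientation, the coefficients (4.9)-(4.13) can be evaluated numerically. The sketch below assumes the usual sign conventions ($A^{(1)} > 0$, $B^{(1)} < 0$) and takes $n_f = 5$ as default; the function name and the dictionary layout are hypothetical.

```python
import math

CF, CA = 4.0 / 3.0, 3.0
Z2 = math.pi ** 2 / 6.0        # zeta(2)
Z3 = 1.2020569031595943        # zeta(3)
Z4 = math.pi ** 4 / 90.0       # zeta(4)


def sudakov_coefficients(nf=5):
    """Numerical values of A^(1), A^(2), A^(3), B^(1), B^(2) in the (alpha_S/pi)^n normalization."""
    a1 = CF
    a2 = 0.5 * CF * (CA * (67.0 / 18.0 - Z2) - 5.0 / 9.0 * nf)
    a3 = CF * (
        CA ** 2 * (245.0 / 96.0 + 11.0 / 24.0 * Z3 - 67.0 / 36.0 * Z2 + 11.0 / 8.0 * Z4)
        - CA * nf * (209.0 / 432.0 + 7.0 / 12.0 * Z3 - 5.0 / 18.0 * Z2)
        - CF * nf * (55.0 / 96.0 - Z3 / 2.0)
        - nf ** 2 / 108.0
    )
    b1 = -0.75 * CF
    b2 = CF * (
        CA * (-3155.0 / 864.0 + 11.0 / 12.0 * Z2 + 2.5 * Z3)
        - CF * (3.0 / 32.0 + 1.5 * Z3 - 0.75 * Z2)
        + nf * (247.0 / 432.0 - Z2 / 6.0)
    )
    return {"A1": a1, "A2": a2, "A3": a3, "B1": b1, "B2": b2}
```

Since $n_f$ changes across quark-mass thresholds, a function of $n_f$ rather than fixed constants is the more convenient form for a numerical $k^2$-integration.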
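We can already observe that the integral over the transverse momentum in Eq. (4.5) is a generalization of the NLO integral in Eq. (2.14), where
$$C_F\frac{\alpha_S}{\pi} = A^{(1)}\frac{\alpha_S}{\pi} \;\to\; A^{(1)}\frac{\alpha_S(k^2)}{\pi} \;\to\; A\left[\alpha_S(k^2)\right]. \tag{4.14}$$
Analogously, function $B(\alpha_S)$ generalizes the coefficient of the $1/(1-x)_+$ term in (2.13):
$$-\frac{3}{4}C_F\frac{\alpha_S}{\pi} = B^{(1)}\frac{\alpha_S}{\pi} \;\to\; B\left[\alpha_S\left(m_Z^2(1-x)\right)\right]. \tag{4.15}$$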
A delicate point of our approach concerns the integral over $z$ in Eq. (4.5). Calculations which use the standard coupling constant $\alpha_S(k^2)$, such as [14], typically perform the $z$-integration approximating the term $(z^{N-1} - 1)$ in such a way that only logarithmically-enhanced contributions $\sim \ln^k N$ are kept in the exponent. To NLL accuracy, such an approximation reads [31]:
$$z^{N-1} - 1 \;\to\; -\Theta\left(1 - z - \frac{e^{-\gamma_E}}{N}\right), \tag{4.16}$$
where $\Theta(x)$ is the Heaviside function. Beyond NLL, the prescription (4.16) can be generalized as discussed in [33]. In fact, the integrations in $z$ and $k^2$ in (4.5) involve the infrared region $z \to 1$ for any value of $N$. As observed in [34], performing such integrations exactly, when using truncated expressions for $A(\alpha_S)$ and $B(\alpha_S)$, will lead to a factorial divergence, corresponding to a power correction which, in our case, would be $\sim \Lambda/m_Z$. Ref. [35], however, remarkably pointed out that this contribution is spurious as it is actually related to the fact that one employed truncated expressions for $A(\alpha_S)$ and $B(\alpha_S)$. It was shown in [35] that the higher-order coefficients of such functions also present factorial singularities leading to contributions $\sim \Lambda/m_Z$, which cancel the analogous terms coming from the exact integration, so that one is just left with a power correction $\sim \Lambda^2/m_Z^2$. It was hence argued on general grounds in [33,36] that, when considering truncated expressions for functions $A(\alpha_S)$ and $B(\alpha_S)$, employing the step function (4.16) or its extensions beyond NLL [33] is a better approximation than performing the $z$-integration exactly, since the transformation (4.16) and its generalizations do not lead to any spurious factorial growth of the cross section.
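To see what the prescription (4.16) discards in the simplest case, one can compare the exact $z$-integral of $(z^{N-1} - 1)/(1-z)$ with its step-function counterpart. The sketch below is an illustration only: it keeps the coupling fixed (so the factorial-divergence issue discussed above does not arise), assumes integer $N$, and uses hypothetical function names.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exact_moment(n):
    """Exact integral of (z^(n-1) - 1)/(1 - z) over 0 < z < 1, equal to -S_1(n-1)."""
    return -sum(1.0 / k for k in range(1, n))


def step_moment(n):
    """Same integral after z^(n-1) - 1 -> -Theta(1 - z - exp(-gamma_E)/n), Eq. (4.16)."""
    return -(math.log(n) + EULER_GAMMA)
```

The two results differ by $1/(2N) + O(1/N^2)$: the step function reproduces the logarithm and the constant $\gamma_E$ of this particular integral, and drops power-suppressed terms.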
However, all such results are valid as long as one uses the standard coupling constant in the resummed formulas. As we shall detail in Sections 6 and 7, we will model non-perturbative corrections by means of an effective QCD coupling constant, based on an extension of the work in [20-22], which does not present the Landau pole any longer and includes power-suppressed contributions. As a result, some power corrections will be unavoidably transferred to the resummed coefficient function when employing the effective coupling in expressions like Eq. (4.5), independently of how one performs the longitudinal-momentum integration. To our knowledge, there is currently no analysis, such as the ones in [35,36], on factorial divergences and power corrections in the coefficient function (4.5), when using an effective analytic coupling constant along with truncated $A(\alpha_S)$ and $B(\alpha_S)$. We cannot therefore draw any firm conclusion on whether, within our model, it is better to perform the $z$-integration in an exact or approximated way. Given the phenomenological aim of the present paper, we prefer to postpone such an investigation. In the following, as already done in [9] when the effective coupling was used in the context of heavy-flavor decays, we shall present results obtained performing the Mellin transform (4.5) exactly. In other words, doing the $z$-integration in (4.5) exactly should be seen, for the time being, as part of the non-perturbative model which we shall propose.
Due to its complexity, we cannot express the result of the exact longitudinal-momentum integration (4.5) in a closed analytic form, but we shall perform it numerically. The integration over $k^2$ could in principle be made analytically, and the result expressed in terms of polylogarithms. For simplicity's sake, however, we also perform the $k^2$-integration numerically.
For $0 \leq z \leq 1$, the argument $k^2$ of $\alpha_S(k^2)$ in Eq. (4.5) varies from zero to the hard scale squared $m_Z^2$; this implies that the number of active quark flavors does change in the $k^2$-integration. In the following, we shall include correctly the variation of $n_f$ at the quark-mass thresholds when doing the numerical integration.
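Let us now see with an explicit analytic computation the difference between an exact Mellin transform and an approximation using the step function. The expansion of the resummed coefficient function reads, to $O(\alpha_S)$:
$$\left[\Delta_N^{(C)}\left(\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\right]_{\alpha_S} = 1 + \frac{\alpha_S(\mu_R^2)}{\pi}\int_0^1 dz\,z^{N-1}\left\{A^{(1)}\left[\frac{1}{1-z}\ln\frac{m_Z^2(1-z)}{\mu_F^2}\right]_+ + B^{(1)}\left[\frac{1}{1-z}\right]_+\right\} \tag{4.17}$$
$$\qquad = 1 + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{\frac{1}{2}\left[S_1^2(N-1) + S_2(N-1)\right] + \left[\frac{3}{4} - \ln\frac{m_Z^2}{\mu_F^2}\right]S_1(N-1)\right\}. \tag{4.18}$$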
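The step from (4.17) to (4.18) can be verified numerically for integer $N$. The sketch below is a hedged illustration: it works in units of $\alpha_S C_F/\pi$, uses $A^{(1)} = C_F$ and $B^{(1)} = -3C_F/4$, implements the plus prescription as a subtraction at $z = 1$, and relies on a plain midpoint rule. All function names are hypothetical.

```python
import math


def s1(n):
    return sum(1.0 / k for k in range(1, n + 1))


def s2(n):
    return sum(1.0 / k ** 2 for k in range(1, n + 1))


def bracket_418(n, log_ratio):
    """Curly bracket of Eq. (4.18); log_ratio = ln(m_Z^2 / mu_F^2)."""
    return 0.5 * (s1(n - 1) ** 2 + s2(n - 1)) + (0.75 - log_ratio) * s1(n - 1)


def bracket_417(n, log_ratio, steps=100000):
    """Midpoint-rule evaluation of the integral in Eq. (4.17), divided by C_F."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        weight = (z ** (n - 1) - 1.0) / (1.0 - z)  # plus prescription acting on z^(n-1)
        total += weight * (math.log(1.0 - z) + log_ratio - 0.75)
    return total * h
```

For $N = 2$ and $\mu_F = m_Z$ both functions give $7/4$.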
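In Eq. (4.18) we have defined the harmonic sums $S_1(N)$ and $S_2(N)$, which are given by
$$S_1(N) \equiv \psi_0(N+1) - \psi_0(1), \tag{4.19}$$
$$S_2(N) \equiv -\psi_1(N+1) + \psi_1(1), \tag{4.20}$$
where
$$\psi_k(x) \equiv \frac{d^{k+1}\ln\Gamma(x)}{dx^{k+1}} \tag{4.21}$$
are the polygamma functions and $\Gamma(x)$ is the Euler gamma function. Using the large-$N$ expansions:
$$S_1(N) = \ln N + \gamma_E + O\left(\frac{1}{N}\right), \tag{4.22}$$
$$S_2(N) = \frac{\pi^2}{6} + O\left(\frac{1}{N}\right), \tag{4.23}$$
we can write for large $N$:
$$\left[\Delta_N^{(C)}\left(\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\right]_{\alpha_S} = 1 + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{\frac{1}{2}\ln^2 N + \left[\frac{3}{4} + \gamma_E - \ln\frac{m_Z^2}{\mu_F^2}\right]\ln N + Q'\left(\mu_F^2, m_Z^2\right) + O\left(\frac{1}{N}\right)\right\}, \tag{4.24}$$
where $Q'(\mu_F^2, m_Z^2)$ collects the constant terms
$$Q'\left(\mu_F^2, m_Z^2\right) = \frac{\pi^2}{12} + \frac{\gamma_E^2}{2} + \frac{3}{4}\gamma_E - \gamma_E\ln\frac{m_Z^2}{\mu_F^2}. \tag{4.25}$$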
By comparing Eq. (4.18) with Eq. (4.24), we explicitly see that performing the Mellin transform exactly, rather than with the step-function approximation, amounts to including also constants and contributions power-suppressed at large $N$ in the exponent $G_N^{(C)}$. If one used the $\Theta$-function approximation, as in [14], the $O(\alpha_S)$ expansion of the resummed coefficient function would contain only the logarithmically-enhanced contributions $\sim \alpha_S\ln N$ and $\sim \alpha_S\ln^2 N$. The resummed coefficient function in the original $x$-space is finally obtained by an inverse Mellin transform:
$$\Delta^{(C)}\left[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = \int_{c-i\infty}^{c+i\infty}\frac{dN}{2\pi i}\,x^{-N}\,\Delta_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right], \tag{4.26}$$
where the (real) constant $c$ is chosen so that all the singularities of $\Delta_N^{(C)}$ lie to the left of the integration contour. The inverse transform (4.26) is also performed exactly, in a numerical way.
4.2. Matching of resummed and NLO coefficient function
We wish to implement the matching of the resummed coefficient function with the exact $O(\alpha_S)$ one, in order to obtain a good approximation in the whole kinematical range. We can perform the matching in $N$-space and then invert the final result to $x$-space; alternatively, we can invert the resummed coefficient function as in (4.26), and then match it to the NLO $x$-space result. Given the high accuracy of our approach and the delicate issue of the exact Mellin transform, we have matched the resummed and NLO coefficient functions in both $N$- and $x$-space, and checked the consistency of our results. We discuss the matching in both spaces.
4.2.1. N -space
We would like to write the NNLL-resummed coefficient function as:
$$C_N^{\rm res}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = K^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right]\Delta_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] + d_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right], \tag{4.27}$$
where:
(1) $K^{(C)}[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2]$ is a hard factor, introduced for the sake of including subleading terms, which corresponds to the difference between the constant terms which are present in the exact NLO coefficient function and the ones contained in the $O(\alpha_S)$ expansion of the resummed result;
(2) $\Delta_N^{(C)}[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2]$ is the resummed coefficient function, presented in Eqs. (4.4) and (4.5);
(3) $d_N^{(C)}[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2]$ is a remainder function, collecting the left-over NLO contributions, which are suppressed at large $N$.
The hard factor reads:
$$K^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = 1 + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\,\tilde Q\left(\mu_F^2, m_Z^2\right), \tag{4.28}$$
with
$$\tilde Q\left(\mu_F^2, m_Z^2\right) \equiv Q\left(\mu_F^2, m_Z^2\right) - Q'\left(\mu_F^2, m_Z^2\right) = \frac{3}{4}\ln\frac{m_Z^2}{\mu_F^2} + \frac{\pi^2}{3} - 3, \tag{4.29}$$
where $Q(\mu_F^2, m_Z^2)$ and $Q'(\mu_F^2, m_Z^2)$ have been defined in Eqs. (4.3) and (4.25), respectively. If we had used the step-function approximation, no constants would have been resummed, and the hard factor would have contained all the constant terms present in the NLO coefficient function, i.e. $Q(\mu_F^2, m_Z^2)$.
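In $N$-space, the remainder function $d_N^{(C)}$ is obtained subtracting from the exact NLO coefficient function in Mellin space [11] the $O(\alpha_S)$ expansion of $K^{(C)}\Delta^{(C)}$. We obtain:
$$d_N^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = C_N\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] - \left[K^{(C)}\left(\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\Delta_N^{(C)}\left(\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\right]_{\alpha_S}$$
$$\qquad = \frac{\alpha_S(\mu_R^2)}{\pi}C_F\left\{-\frac{1}{2}\ln\frac{m_Z^2}{\mu_F^2}\left[\frac{1}{N+1} + \frac{1}{N}\right] + \frac{1}{2}\psi_0(N)\left[\frac{1}{N+1} + \frac{1}{N}\right] + \frac{\gamma_E}{2}\left[\frac{1}{N} + \frac{1}{N+1}\right] - 2\psi_1(N) + \frac{7}{4N} - \frac{5}{4(N+1)} + \frac{3}{2}\left[\frac{1}{N^2} + \frac{1}{(N+1)^2}\right]\right\}. \tag{4.30}$$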
4.2.2. x-space
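Likewise, we wish to write the NNLL-resummed coefficient function in $x$-space in a form analogous to Eq. (4.27):
$$C^{\rm res}\left[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = K^{(C)}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right]\Delta^{(C)}\left[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] + d^{(C)}\left[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right]. \tag{4.31}$$
To get the remainder function and the constants, the factorized expression (2.13) of the coefficient function turns out to be particularly useful. First, we need the $O(\alpha_S)$ expansion of the resummed result in $x$-space, which can be read from the integrand function in the Mellin transform (4.17):
$$\left[\Delta^{(C)}\left(x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\right]_{\alpha_S} = \delta(1-x) + \frac{\alpha_S(\mu_R^2) C_F}{\pi}\left\{\left[\frac{1}{1-x}\ln\frac{m_Z^2(1-x)}{\mu_F^2}\right]_+ - \frac{3}{4[1-x]_+}\right\}. \tag{4.32}$$
Then, by comparing Eq. (4.32) with Eq. (2.13), we obtain the constants $K^{(C)}$, i.e. the coefficient of the term $\sim \alpha_S\,\delta(1-x)$ in Eq. (2.13), obviously equal to the ones in Eq. (4.28), and the remainder function:
$$d^{(C)}\left[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] = C\left[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] - \left[K^{(C)}\left(\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\Delta^{(C)}\left(x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right)\right]_{\alpha_S}$$
$$\qquad = \frac{\alpha_S(\mu_R^2)}{\pi}C_F\left\{\frac{1}{4}(5-3x) - \frac{1}{2}(1+x)\left[\ln(1-x) + \ln\frac{m_Z^2}{\mu_F^2}\right] + \frac{1+x^2}{1-x}\ln x\right\}. \tag{4.33}$$
In (4.33), we denoted by $C[x; \alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2]$ the NLO $\overline{\rm MS}$ coefficient function (2.7). Of course, the Mellin transform of Eq. (4.33) yields Eq. (4.30).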
5. Large-x resummation in the initial condition

In this section we resum in the NNLL approximation the threshold logarithms which appear
in the initial condition of the perturbative fragmentation function, and we match the resummed
expression with the exact NLO one. As discussed in [14], large-x resummation in the initial
condition is process-independent.
5.1. Resummed initial condition in Mellin space
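The initial condition (3.4) presents terms $\sim [\ln(1-x)/(1-x)]_+$ and $\sim 1/(1-x)_+$ which are to be resummed to all orders. In $N$-space, the Mellin transform of the NLO initial condition in Eq. (3.4) exhibits single and double logarithms of $N$:
$$D_N^{\rm ini}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = 1 + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left\{-\ln^2 N + \left[\ln\frac{m_b^2}{\mu_{0F}^2} - 2\gamma_E + 1\right]\ln N + Y\left(\mu_{0F}^2, m_b^2\right) + O\left(\frac{1}{N}\right)\right\}, \tag{5.1}$$
where the constant terms are given by:
$$Y\left(\mu_{0F}^2, m_b^2\right) = 1 - \frac{\pi^2}{6} + \gamma_E - \gamma_E^2 + \left(\gamma_E - \frac{3}{4}\right)\ln\frac{m_b^2}{\mu_{0F}^2}. \tag{5.2}$$
The resummed initial condition has a generalized exponential structure [14],
$$\Delta_N^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = \exp\left\{G_N^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]\right\}, \tag{5.3}$$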
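where
$$G_N^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = \int_0^1 dz\,\frac{z^{N-1} - 1}{1-z}\left\{\int_{m_b^2(1-z)^2}^{\mu_{0F}^2}\frac{dk^2}{k^2}A\left[\alpha_S(k^2)\right] + D\left[\alpha_S\left(m_b^2(1-z)^2\right)\right]\right\}, \tag{5.4}$$
with $k^2$ and $z$ defined as in (4.5). As in the coefficient function, the LLs in the exponent $G_N^{(D)}$ are $\sim \alpha_S^n\ln^{n+1}N$, the NLLs $\sim \alpha_S^n\ln^n N$, and so forth. To NNLL accuracy, we need $A^{(1)}$, $A^{(2)}$ and $A^{(3)}$, given in the previous section, and the first two coefficients of
$$D(\alpha_S) = \sum_{n=1}^{\infty}\left(\frac{\alpha_S}{\pi}\right)^n D^{(n)}, \tag{5.5}$$
namely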
D (1) = CF ,

nf
9
(2)
55
(2)
D = C F CA
(3) +
+
.
108 4
2
54
(5.6)
(5.7)
Function $D(\alpha_S)$, called $H(\alpha_S)$ in [14], is characteristic of the fragmentation of heavy quarks and resums soft-gluon radiation which is not collinear enhanced. Its $O(\alpha_S)$ coefficient can be found in [14], while $D^{(2)}$ can be read from the formulas in [27]. Moreover, as discussed in [38], $D(\alpha_S)$ coincides with the function which resums large-angle soft radiation in heavy-flavor decays [23,39,40]. It is also equal to function $S(\alpha_S)$, which plays the same role in top-quark decay [16] and massive deep inelastic scattering [41].
The coefficient $D^{(2)}$ depends on the renormalization condition on the $b$ mass and Eq. (5.7) gives its value in the on-shell scheme. If $\bar m_b$ is the $b$-quark mass in another scheme, related to the pole mass $m_b$ via
$$\bar m_b = \left(1 + \frac{\alpha_S}{\pi}k^{(1)}\right)m_b, \tag{5.8}$$
the following relation has to be fulfilled, with the coefficients $A^{(i)}$ clearly unchanged:
$$\int_{m_b^2(1-z)^2}^{\mu_{0F}^2}\frac{dk^2}{k^2}A\left[\alpha_S(k^2)\right] + D\left[\alpha_S\left(m_b^2(1-z)^2\right)\right] = \int_{\bar m_b^2(1-z)^2}^{\mu_{0F}^2}\frac{dk^2}{k^2}A\left[\alpha_S(k^2)\right] + \bar D\left[\alpha_S\left(\bar m_b^2(1-z)^2\right)\right]. \tag{5.9}$$
By solving the above equation order by order, one obtains the coefficients $\bar D^{(i)}$ (see footnote 4):
$$\bar D^{(1)} = D^{(1)}, \tag{5.10}$$
$$\bar D^{(2)} = D^{(2)} + 2k^{(1)}A^{(1)}. \tag{5.11}$$
As in the case of the coefficient function, we identify the integral $\int dk^2\,A[\alpha_S(k^2)]/k^2$ in Eq. (5.4) as a generalization of the NLO integral in Eq. (3.11). Function $D(\alpha_S)$ generalizes the single logarithmic term, $\sim 1/(1-x)_+$, that we found in the NLO computation (3.10):
$$-C_F\frac{\alpha_S}{\pi} = D^{(1)}\frac{\alpha_S}{\pi} \;\to\; D\left[\alpha_S\left(m_b^2(1-x)^2\right)\right]. \tag{5.12}$$
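Following the arguments discussed in Section 4.1, the Mellin transform in Eq. (5.4) will be again performed exactly and the term $(z^{N-1} - 1)$ will not be approximated. The subleading terms which are resummed in this way can be obtained after expanding Eq. (5.3) to $O(\alpha_S)$:
$$\left[\Delta_N^{(D)}\left(\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\right]_{\alpha_S} = 1 + \frac{\alpha_S(\mu_{0R}^2)}{\pi}\int_0^1 dz\,z^{N-1}\left\{A^{(1)}\left[\frac{1}{1-z}\ln\frac{\mu_{0F}^2}{m_b^2(1-z)^2}\right]_+ + \frac{D^{(1)}}{(1-z)_+}\right\} \tag{5.13}$$
$$\qquad = 1 + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left\{\left[1 - \ln\frac{\mu_{0F}^2}{m_b^2}\right]S_1(N-1) - S_1^2(N-1) - S_2(N-1)\right\}, \tag{5.14}$$
and taking its large-$N$ limit:
$$\left[\Delta_N^{(D)}\left(\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\right]_{\alpha_S} = 1 + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left\{-\ln^2 N + \left[1 - 2\gamma_E - \ln\frac{\mu_{0F}^2}{m_b^2}\right]\ln N + Y'\left(\mu_{0F}^2, m_b^2\right) + O\left(\frac{1}{N}\right)\right\}, \tag{5.15}$$
with
$$Y'\left(\mu_{0F}^2, m_b^2\right) = \left[1 - \ln\frac{\mu_{0F}^2}{m_b^2}\right]\gamma_E - \gamma_E^2 - \frac{\pi^2}{6}. \tag{5.16}$$
$^4$ Since $\bar m_b$ depends in general on a renormalization scale $\mu_m$, one could estimate its contribution to the theoretical error on the cross section by varying, in addition to the usual factorization and renormalization scales, also $\mu_m$ around $m_b$ inside a conventional range.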
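The size of the terms dropped in going from (5.14) to (5.15) can be estimated numerically. The sketch below works in units of $\alpha_S C_F/\pi$, assumes integer $N$, and uses hypothetical function names; `log_ratio` stands for $\ln(\mu_{0F}^2/m_b^2)$.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exact_514(n, log_ratio):
    """Curly bracket of Eq. (5.14), built from exact harmonic sums."""
    s1 = sum(1.0 / k for k in range(1, n))
    s2 = sum(1.0 / k ** 2 for k in range(1, n))
    return (1.0 - log_ratio) * s1 - s1 ** 2 - s2


def large_n_515(n, log_ratio):
    """Curly bracket of Eq. (5.15) without the O(1/N) remainder."""
    ln_n = math.log(n)
    y_prime = (1.0 - log_ratio) * EULER_GAMMA - EULER_GAMMA ** 2 - math.pi ** 2 / 6.0
    return -ln_n ** 2 + (1.0 - 2.0 * EULER_GAMMA - log_ratio) * ln_n + y_prime
```

The difference between the two functions shrinks as $N$ grows, which is the power-suppressed part that the exact Mellin transform retains in the exponent.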
5.2. Matching of resummed and NLO initial condition

We follow a matching procedure analogous to the one for the coefficient function in order to
obtain a reliable result throughout the full N (x) range.
5.2.1. N -space
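We would like to write the NNLL-resummed initial condition of the perturbative fragmentation function as:
$$D_N^{\rm ini,res}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = K^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]\Delta_N^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] + d_N^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]. \tag{5.17}$$
The multiplying factor is obtained by subtracting from the constant terms that are present in the NLO result the ones which have been resummed:
$$K^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = 1 + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\,\tilde Y\left(\mu_{0F}^2, m_b^2\right), \tag{5.18}$$
where
$$\tilde Y\left(\mu_{0F}^2, m_b^2\right) \equiv Y\left(\mu_{0F}^2, m_b^2\right) - Y'\left(\mu_{0F}^2, m_b^2\right) = 1 + \frac{3}{4}\ln\frac{\mu_{0F}^2}{m_b^2}, \tag{5.19}$$
with $Y(\mu_{0F}^2, m_b^2)$ and $Y'(\mu_{0F}^2, m_b^2)$ given in Eqs. (5.2) and (5.16).
The remainder function for the initial condition reads:
$$d_N^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = D_N^{\rm ini}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] - \left[K^{(D)}\left(\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\Delta_N^{(D)}\left(\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\right]_{\alpha_S}$$
$$\qquad = \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left\{-\frac{1}{2}\ln\frac{\mu_{0F}^2}{m_b^2}\left[\frac{1}{N+1} + \frac{1}{N}\right] - \psi_0(N)\left[\frac{1}{N+1} + \frac{1}{N}\right] - \gamma_E\left[\frac{1}{N} + \frac{1}{N+1}\right] - \frac{1}{2N} + \frac{3}{2(N+1)} - \frac{1}{N^2} - \frac{1}{(N+1)^2}\right\}. \tag{5.20}$$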
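Multiplying the NNLL-resummed coefficient function by the DGLAP evolution operator and the NNLL initial condition, we obtain the moments of the $b$-quark energy distribution in $e^+e^- \to b\bar b$:
$$\sigma_N^b\left[\alpha_S(\mu_{0R}^2), \alpha_S(\mu_R^2), \mu_{0R}^2, \mu_R^2, \mu_{0F}^2, \mu_F^2, m_b^2, m_Z^2\right] = C_N^{\rm res}\left[\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right]E_N\left[\alpha_S(\mu_{0F}^2), \alpha_S(\mu_F^2)\right]D_N^{\rm ini,res}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]. \tag{5.21}$$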
5.2.2. x-space
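The $O(\alpha_S)$ expansion of the resummed initial condition in $x$-space can be read from the integrand function of Eq. (5.13):
$$\left[\Delta^{(D)}\left(x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\right]_{\alpha_S} = \delta(1-x) + \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}\left\{\left[\frac{1}{1-x}\ln\frac{\mu_{0F}^2}{m_b^2(1-x)^2}\right]_+ - \frac{1}{[1-x]_+}\right\}. \tag{5.22}$$
The constant $K^{(D)}$ is the coefficient of the $\alpha_S\,\delta(1-x)$ term in (3.10), obviously equal to (5.18). The remainder function can be obtained from Eq. (3.10) and reads:
$$d^{(D)}\left[x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = D^{\rm ini}\left[x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] - \left[K^{(D)}\left(\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\Delta^{(D)}\left(x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right)\right]_{\alpha_S}$$
$$\qquad = \frac{\alpha_S(\mu_{0R}^2) C_F}{\pi}(1+x)\left[\ln(1-x) + \frac{1}{2} - \frac{1}{2}\ln\frac{\mu_{0F}^2}{m_b^2}\right]. \tag{5.23}$$
One can check that the Mellin transform of Eq. (5.23) agrees with Eq. (5.20). The NNLL-resummed initial condition in $x$-space is finally given by:
$$D^{\rm ini,res}\left[x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] = K^{(D)}\left[\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]\Delta^{(D)}\left[x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right] + d^{(D)}\left[x; \alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]. \tag{5.24}$$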
6. Effective coupling constant
In this section we introduce an effective QCD coupling constant (i) having no Landau pole
and (ii) resumming absorptive effects related to parton branching to all orders.
6.1. Space-like coupling constant
If we denote by $q$ the typical 4-momentum entering in the renormalization conditions for the QCD coupling constant (see footnote 5), the standard LO coupling constant reads:
$$\alpha_{S,\rm LO}\left(-q^2\right) = \frac{1}{\beta_0\ln[(-q^2 - i\epsilon)/\Lambda^2]} = \frac{1}{\beta_0\left[\ln(|q^2|/\Lambda^2) - i\pi\Theta(q^2)\right]}. \tag{6.1}$$
$^5$ One can consider, for example, the symmetric point $p_q^2 = p_{\bar q}^2 = p_g^2 = q^2$ in the $q\bar q g$ correlation function.
There is a minus sign in front of the momentum squared $q^2$, because of the opening of decay channels for the gluon ($g \to q\bar q$, $g \to gg$) in the time-like region $q^2 > 0$. In order to have a renormalized real $\alpha_S$, one usually considers a space-like configuration of the reference momenta: $q^2 < 0$. To avoid explicit minus signs, the LO expression of $\alpha_S$ is usually written as:
$$\alpha_{S,\rm LO}\left(Q^2\right) = \frac{1}{\beta_0\ln(Q^2/\Lambda^2)}, \tag{6.2}$$
where $Q^2 \equiv -q^2 > 0$ in the space-like region. The specific properties of the QCD coupling constant involved in soft-gluon resummation in $e^+e^-$ annihilation are related to:
(1) an integration up to small momentum scales of the coupling constant, because of multiple parton radiation, giving rise to sub-jets with arbitrarily small masses. In the resummed expressions, (4.5) and (5.4), one can indeed observe that the scale of $\alpha_S$ approaches zero once $x \to 1$;
(2) the kinematical configurations are always time-like, as we are considering multiple emissions in the final state of $e^+e^-$ processes.
Let us deal with the above issues by discussing first the analyticity properties of the standard coupling constant (6.2). $\alpha_S(Q^2)$ exhibits a cut for $Q^2 < 0$, associated with the branching of a time-like gluon ($q^2 > 0$) into physical states, and the Landau pole for $Q^2 = \Lambda^2$. While the former singularity has a clear meaning, the Landau pole is not physical and it just reflects the unreliability of the perturbative expansion for $Q^2 \sim \Lambda^2$.
It was therefore suggested that one can replace the usual expression (6.2) with an analytic coupling $\bar\alpha_S(Q^2)$, which has the same discontinuity as $\alpha_S(Q^2)$ along the cut $Q^2 \leq 0$, but is analytic elsewhere in the complex plane. As in [20-22], we write the analytic coupling constant $\bar\alpha_S(Q^2)$ using the following dispersion relation:
$$\bar\alpha_S\left(Q^2\right) = \frac{1}{2\pi i}\int_0^\infty\frac{ds}{s + Q^2}\,{\rm Disc}_s\,\alpha_S(-s), \tag{6.3}$$
where the discontinuity is defined as:
$${\rm Disc}_s F(s) = \lim_{\epsilon\to 0^+}\left[F(s + i\epsilon) - F(s - i\epsilon)\right]. \tag{6.4}$$
Eq. (6.3) holds for $Q^2 > 0$, i.e. in the space-like region $q^2 < 0$: as in [21], we shall refer to it as our space-like analytic coupling constant. For $Q^2 < 0$, the integrand function in Eq. (6.3) presents a pole in the domain of integration. However, we can still give sense to Eq. (6.3) for negative values of $Q^2$, introducing a small imaginary part: $Q^2 \to Q^2 + i\epsilon$.
Inserting in (6.3) the LO expression (6.2), we get the LO space-like coupling constant:
$$\bar\alpha_S\left(Q^2\right) = \frac{1}{\beta_0}\left[\frac{1}{\ln(Q^2/\Lambda^2)} - \frac{\Lambda^2}{Q^2 - \Lambda^2}\right]. \tag{6.5}$$
In (6.5) the Landau pole has been subtracted by a power-suppressed term, relevant at small $Q^2$ and negligible at large $Q^2 \gg \Lambda^2$, where $\bar\alpha_S(Q^2)$ still exhibits the same behavior as $\alpha_S(Q^2)$. Likewise, including in the integrand function of (6.3) the higher-order expressions of $\alpha_S(-s)$, one can get the space-like $\bar\alpha_S(Q^2)$ to higher accuracy.
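The cancellation of the Landau pole in Eq. (6.5) can be made explicit with a few lines of code. This is a sketch under stated assumptions: $n_f = 5$ is held fixed, the function name is hypothetical, and the values at $Q^2 = \Lambda^2$ and $Q^2 = 0$ are inserted as the analytic limits $1/(2\beta_0)$ and $1/\beta_0$ of the LO formula.

```python
import math


def alpha_bar_lo(q2, lam2, nf=5):
    """LO space-like analytic coupling of Eq. (6.5); q2 and lam2 in the same units."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    t = q2 / lam2
    if t == 0.0:
        return 1.0 / b0          # infrared limit of Eq. (6.5)
    if abs(t - 1.0) < 1e-6:
        return 0.5 / b0          # finite limit at the would-be Landau pole
    return (1.0 / b0) * (1.0 / math.log(t) - 1.0 / (t - 1.0))
```

At large $Q^2/\Lambda^2$ the function approaches the standard LO coupling (6.2), while it stays finite for all $Q^2 \geq 0$.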
6.2. Time-like coupling constant including absorptive effects

Turning back to our calculation, we have resummed soft and/or collinear multiple radiation in the final state of $e^+e^-$ annihilation, i.e. a time-like parton cascade. Also, in our resummed expressions, (4.5) and (5.4), the coupling constant is evaluated at a scale $k^2$, which is roughly the transverse momentum of the emitted parton with respect to the radiating one. In Ref. [42], it was in fact shown that, in the framework of resummed calculations, the momentum-independent coupling constant is to be replaced by the following integral over the discontinuity of the gluon propagator:
$$\alpha_S \;\to\; \frac{i}{2\pi}\int_0^{k^2} ds\,{\rm Disc}_s\,\frac{\alpha_S(-s)}{s}. \tag{6.6}$$
The integral (6.6) is typically performed neglecting the imaginary part, $i\pi$, in the denominator of $\alpha_S(-s)$ (see for example Eq. (6.1)), i.e. assuming
$$\ln\frac{|s|}{\Lambda^2} \gg \pi \tag{6.7}$$
in the integrand function of (6.6). As a result, the integral (6.6) turns out to be approximately equal to $\alpha_S$ evaluated at the upper integration limit:
$$\frac{i}{2\pi}\int_0^{k^2} ds\,{\rm Disc}_s\,\frac{\alpha_S(-s)}{s} \simeq \alpha_S\left(k^2\right). \tag{6.8}$$
In our analysis, we wish to go beyond the assumption (6.7) and account for the terms $i\pi$ in the denominator of $\alpha_S$; this way, as pointed out in [23], one includes absorptive effects due to gluon branching, that are important especially in the infrared region. The new feature of our model is the fact that we avoid the Landau pole in the integral by using in the integrand function the space-like analytic coupling constant $\bar\alpha_S(-s)$, just defined in Eq. (6.3).
As a result, our model consists in using the following time-like effective coupling constant:

i
S k 2 =
2
k 2
ds Discs
S (s)
.
s
(6.9)
In fact, Eq. (6.9) makes sense only for $k^2 > 0$: the above integral would be zero for negative values of $k^2$. We also remark on a difference in our notation: the effective coupling $\tilde\alpha_S(k^2)$ is a function of the square of a four-momentum, $k^2$; the standard $\alpha_S(-q^2)$ in Eq. (6.2) is instead a function of minus a squared four-momentum. At large $k^2$, $\tilde\alpha_S(k^2)$ will be roughly equivalent to the standard $\alpha_S(k^2)$; at small $k^2$ it will include non-perturbative power-suppressed effects. The goal of this paper is precisely to investigate whether including non-perturbative corrections by using Eq. (6.9) everywhere in our calculation is sufficient to reproduce the experimental data on B-hadron production, without adding any further hadronization model.
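Inserting Eq. (6.5) into the integrand function of (6.9), we obtain the LO time-like analytic coupling constant:
\[
\tilde\alpha_{S,\mathrm{LO}}(k^2) = \frac{1}{2\pi i\beta_0}\left[\ln\left(\ln\frac{k^2}{\Lambda^2} + i\pi\right) - \ln\left(\ln\frac{k^2}{\Lambda^2} - i\pi\right)\right]. \tag{6.10}
\]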
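Likewise, starting from the NLO $\alpha_S(Q^2)$ in Eq. (3.8), we obtain the NLO time-like coupling constant:
\[
\tilde\alpha_{S,\mathrm{NLO}}(k^2) = \tilde\alpha_{S,\mathrm{LO}}(k^2) + \frac{\beta_1}{\beta_0^3}\,\frac{1}{2\pi i}\left[\frac{\ln[\ln(k^2/\Lambda^2) + i\pi] + 1}{\ln(k^2/\Lambda^2) + i\pi} - \frac{\ln[\ln(k^2/\Lambda^2) - i\pi] + 1}{\ln(k^2/\Lambda^2) - i\pi}\right]. \tag{6.11}
\]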
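At NNLO, the standard coupling constant reads:
\[
\alpha_S(k^2) = \frac{1}{\beta_0\ln(k^2/\Lambda^2)}\left\{1 - \frac{\beta_1}{\beta_0^2}\,\frac{\ln[\ln(k^2/\Lambda^2)]}{\ln(k^2/\Lambda^2)} + \frac{\beta_1^2}{\beta_0^4}\,\frac{\ln^2[\ln(k^2/\Lambda^2)] - \ln[\ln(k^2/\Lambda^2)] - 1}{\ln^2(k^2/\Lambda^2)} + \frac{\beta_2}{\beta_0^3}\,\frac{1}{\ln^2(k^2/\Lambda^2)}\right\}, \tag{6.12}
\]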
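and its time-like analytic counterpart:
\[
\tilde\alpha_{S,\mathrm{NNLO}}(k^2) = \tilde\alpha_{S,\mathrm{NLO}}(k^2) - \frac{\beta_1^2}{4\pi i\beta_0^5}\left\{\frac{\ln^2[\ln(k^2/\Lambda^2) + i\pi]}{[\ln(k^2/\Lambda^2) + i\pi]^2} - \frac{\ln^2[\ln(k^2/\Lambda^2) - i\pi]}{[\ln(k^2/\Lambda^2) - i\pi]^2}\right\} + \frac{\beta_1^2 - \beta_0\beta_2}{4\pi i\beta_0^5}\left\{\frac{1}{[\ln(k^2/\Lambda^2) + i\pi]^2} - \frac{1}{[\ln(k^2/\Lambda^2) - i\pi]^2}\right\}. \tag{6.13}
\]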
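In (6.13) we have also included the third coefficient of the $\beta$-function:
\[
\beta_2 = \frac{1}{64\pi^3}\left[\frac{2857}{54}C_A^3 - \left(\frac{1415}{54}C_A^2 + \frac{205}{18}C_A C_F - C_F^2\right)n_f + \left(\frac{79}{54}C_A + \frac{11}{9}C_F\right)n_f^2\right]. \tag{6.14}
\]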
In Eqs. (6.10), (6.11) and (6.13) we have used a complex notation, which yields quite compact expressions for the time-like coupling constant. However, we can rearrange the above equations and express $\tilde\alpha_S(k^2)$ as a real function of $k^2$, as done in [39].
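As an illustration of how the complex notation collapses to a real number, the following minimal sketch evaluates Eqs. (6.10), (6.11) and (6.13) numerically. It is not the code used in this analysis: the function names are hypothetical, $\beta_0 = (33 - 2n_f)/(12\pi)$ and $\beta_1 = (153 - 19n_f)/(24\pi^2)$ are assumed to be the normalizations of Eq. (3.7), and $\Lambda = 0.2$ GeV with fixed $n_f = 5$ is an arbitrary input chosen only for the demonstration.

```python
import cmath
import math

CA, CF = 3.0, 4.0 / 3.0  # SU(3) colour factors


def beta_coeffs(nf):
    """beta_0, beta_1 (assumed normalization of Eq. (3.7)) and beta_2 of Eq. (6.14)."""
    b0 = (33.0 - 2.0 * nf) / (12.0 * math.pi)
    b1 = (153.0 - 19.0 * nf) / (24.0 * math.pi ** 2)
    b2 = (2857.0 / 54.0 * CA ** 3
          - (1415.0 / 54.0 * CA ** 2 + 205.0 / 18.0 * CA * CF - CF ** 2) * nf
          + (79.0 / 54.0 * CA + 11.0 / 9.0 * CF) * nf ** 2) / (64.0 * math.pi ** 3)
    return b0, b1, b2


def alpha_tilde(k2, lam2, nf, order="NNLO"):
    """Time-like coupling written as [F(L + i pi) - F(L - i pi)] / (2 pi i), L = ln(k2/lam2)."""
    b0, b1, b2 = beta_coeffs(nf)
    L = math.log(k2 / lam2)

    def F(z):
        lz = cmath.log(z)            # principal branch; Im z = +-pi never hits the cut
        val = lz / b0                # LO term, Eq. (6.10)
        if order in ("NLO", "NNLO"):
            val += b1 / b0 ** 3 * (lz + 1.0) / z          # NLO term, Eq. (6.11)
        if order == "NNLO":
            val += (b1 ** 2 - b0 * b2 - b1 ** 2 * lz ** 2) / (2.0 * b0 ** 5 * z ** 2)  # Eq. (6.13)
        return val

    diff = F(complex(L, math.pi)) - F(complex(L, -math.pi))
    return (diff / (2j * math.pi)).real  # the imaginary parts cancel


if __name__ == "__main__":
    lam2 = 0.2 ** 2  # hypothetical Lambda^2 in GeV^2
    for k in (1.0, 5.0, 91.19):
        print(k, [alpha_tilde(k * k, lam2, 5, o) for o in ("LO", "NLO", "NNLO")])
```

At LO the result coincides with the real form $[1/2 - \arctan(L/\pi)/\pi]/\beta_0$, with $L = \ln(k^2/\Lambda^2)$, which stays finite at $k^2 = \Lambda^2$ where the standard coupling diverges.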
Before closing this section, we point out a few more issues. Expanding $\tilde\alpha_S(k^2)$ for $\ln(k^2/\Lambda^2) \gg \pi$, it is possible to relate the standard and analytic time-like coupling constants:
\[
\tilde\alpha_S(k^2) = \alpha_S(k^2) - \frac{(\pi\beta_0)^2}{3}\,\alpha_S^3(k^2) + O(\alpha_S^4). \tag{6.15}
\]
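A short numerical sketch of Eq. (6.15), truncated at $O(\alpha_S^3)$, is given below. The helper names are hypothetical, $\beta_0 = (33 - 2n_f)/(12\pi)$ is assumed, and the inputs are the three values of $\alpha_S(m_Z^2)$ considered in Section 8 with $n_f = 5$.

```python
import math


def beta0(nf):
    # Assumed normalization: beta_0 = (33 - 2 nf) / (12 pi)
    return (33.0 - 2.0 * nf) / (12.0 * math.pi)


def timelike_from_standard(alpha_s, nf):
    """Eq. (6.15) without the O(alpha_S^4) remainder."""
    shift = (math.pi * beta0(nf)) ** 2 / 3.0
    return alpha_s - shift * alpha_s ** 3


if __name__ == "__main__":
    for a in (0.117, 0.119, 0.121):
        print(a, round(timelike_from_standard(a, 5), 3))
```

With these inputs the truncated relation gives 0.115, 0.117 and 0.119, respectively.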
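We also need to modify the matching condition for the strong coupling constant when running from $n_f$ to $n_f - 1$ active flavors. In terms of the standard $\alpha_S(k^2)$, it reads [43]:
\[
\alpha_S^{(n_f)}(\bar m_q^2) = \alpha_S^{(n_f-1)}(\bar m_q^2) - \frac{11}{72\pi^2}\left[\alpha_S^{(n_f-1)}(\bar m_q^2)\right]^3 + O(\alpha_S^4), \tag{6.16}
\]
where $\bar m_q$ is the running quark mass in the $\overline{\mathrm{MS}}$ renormalization scheme, i.e. $\bar m_q = m_q^{\overline{\mathrm{MS}}}(\bar m_q)$.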
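When using the analytic effective coupling $\tilde\alpha_S$, we shall have to modify Eq. (6.16) according to Eq. (6.15), observing that $\beta_0$, given in Eq. (3.7), depends on $n_f$. We obtain:
\[
\tilde\alpha_S^{(n_f)}(\bar m_q^2) = \tilde\alpha_S^{(n_f-1)}(\bar m_q^2) + \left(\frac{17 - n_f}{54} - \frac{11}{72\pi^2}\right)\left[\tilde\alpha_S^{(n_f-1)}(\bar m_q^2)\right]^3 + O(\tilde\alpha_S^4). \tag{6.17}
\]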
Of course, Eqs. (6.16) and (6.17) can also be expressed in terms of the pole quark masses.
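The $n_f$-dependent term in the cubic coefficient of Eq. (6.17) follows from the difference of the $(\pi\beta_0)^2/3$ shifts of Eq. (6.15) taken at $n_f - 1$ and $n_f$ flavors. The sketch below checks this with exact rational arithmetic; it assumes $\pi\beta_0 = (33 - 2n_f)/12$, and the function names are hypothetical.

```python
from fractions import Fraction
import math


def pi_beta0(nf):
    # pi * beta_0 for nf flavors, assuming beta_0 = (33 - 2 nf) / (12 pi)
    return Fraction(33 - 2 * nf, 12)


def matching_shift(nf):
    """Difference of the (pi beta_0)^2 / 3 terms of Eq. (6.15) at nf - 1 and nf flavors."""
    return (pi_beta0(nf - 1) ** 2 - pi_beta0(nf) ** 2) / 3


def matching_coefficient(nf):
    """Full coefficient of the cubic term in Eq. (6.17)."""
    return float(matching_shift(nf)) - 11.0 / (72.0 * math.pi ** 2)


if __name__ == "__main__":
    for nf in (4, 5):
        print(nf, matching_shift(nf), matching_coefficient(nf))
```

For any $n_f$ the shift reduces to $(17 - n_f)/54$, e.g. $2/9$ when crossing the bottom threshold ($n_f = 5$).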
7. Modelling non-perturbative corrections

In this section we describe our model for non-perturbative effects in bottom-quark fragmentation, based on the effective QCD coupling constant considered above. Some properties of the
model have already been anticipated in the previous section; here we present a systematic discussion.
We shall use the effective coupling constant $\tilde\alpha_S(k^2)$ defined through Eq. (6.9) in place of the standard one, in order to include power corrections. The dominant non-perturbative effects in b-fragmentation occur for
\[
1 - x \approx \frac{\bar\Lambda}{m_b} \ll 1, \tag{7.1}
\]
where $\bar\Lambda$ is of the order of the QCD scale, i.e. $\bar\Lambda = O(\Lambda)$. Within our model, such contributions are associated with soft interactions of the b quark in the fragmentation into a B hadron and are analogous to the well-known Fermi motion of a decaying b quark inside a B. In the perturbative fragmentation approach, such effects become relevant when we evaluate at large $x$ the quantities
\[
\mu_C = m_Z\sqrt{1 - x} \quad\text{and}\quad \mu_S = m_b(1 - x), \tag{7.2}
\]
whose squares are the limits of the $dk^2/k^2$ integration in the resummed coefficient function (4.5) and initial condition (5.4), respectively, as well as the scales of functions $B(\alpha_S)$ and $D(\alpha_S)$. It is interesting to evaluate $\mu_C$, $\mu_S$ and the corresponding values of $\tilde\alpha_S$ for $x = 0.8$, since, as we shall show in the following section, the B-hadron spectrum in $e^+e^-$ annihilation is peaked about this value of $x$. We find (footnote 6):
\[
\mu_C \simeq 40\ \mathrm{GeV},\quad \tilde\alpha_S(\mu_C^2) \simeq 0.13; \qquad \mu_S \simeq 1\ \mathrm{GeV},\quad \tilde\alpha_S(\mu_S^2) \simeq 0.33. \tag{7.3}
\]
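The two scales of Eq. (7.2) are simple to evaluate. The sketch below uses $m_Z = 91.19$ GeV and the default $m_b = 5$ GeV adopted in Section 8; the function names are hypothetical, and the coupling values themselves are not recomputed here.

```python
import math


def scale_coefficient(x, m_z=91.19):
    """Scale of the coefficient function, m_Z * sqrt(1 - x), in GeV."""
    return m_z * math.sqrt(1.0 - x)


def scale_initial_condition(x, m_b=5.0):
    """Scale of the initial condition, m_b * (1 - x), in GeV."""
    return m_b * (1.0 - x)


if __name__ == "__main__":
    for x in (0.8, 0.9):
        print(x, round(scale_coefficient(x), 1), round(scale_initial_condition(x), 2))
```

For $x = 0.8$ this gives about 40.8 GeV and 1 GeV; for $x = 0.9$, about 28.8 GeV and 0.5 GeV, so that only the initial-condition scale approaches the non-perturbative region.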
It is therefore clear that non-perturbative effects are more relevant in the initial condition (5.4), where $\mu_S$ plays a role, than in the coefficient function (4.5), which depends on $\mu_C$. In order to deal with such effects, we shall insert $\tilde\alpha_S$ in place of the standard $\alpha_S$ in the resummed initial condition (5.4) and, as part of our model, we shall perform the Mellin and inverse Mellin transforms exactly. As discussed in Section 4.1, the issue of the integration over $z$ when using the effective coupling is delicate and deserves a deeper investigation in the near future. However, it was pointed out in [44] that, in order to include the power-suppressed corrections originated by the effective coupling, $O(1/N)$ or $O(1 - x)$, performing the $z$-integration exactly is necessary. In fact, an approximate Mellin transform, using the step function (4.16) or the formulas beyond NLL in [33], would suppress most of such effects [44]. The Mellin transforms were also performed exactly in Ref. [9], where the effective coupling was used to describe non-perturbative effects in B-meson decays.
6 For $x = 0.9$, the corresponding numbers are: $\mu_C \simeq 30$ GeV, $\tilde\alpha_S(\mu_C^2) \simeq 0.14$, $\mu_S \simeq 0.5$ GeV, $\tilde\alpha_S(\mu_S^2) \simeq 0.44$.
In the coefficient function, the use of $\tilde\alpha_S(k^2)$ and the exactness of the Mellin transform are less crucial than in the initial condition of the perturbative fragmentation function. In any case, for practical convenience, we shall still use the effective coupling constant and perform the $z$-integration exactly. Besides, $\tilde\alpha_S(k^2)$ will be employed everywhere in our calculation, including the constant terms and the remainder functions presented in Eqs. (4.30), (4.33), (5.20) and (5.23).
We also remark that there is not a unique way to construct a model based on the analytic coupling constant. In fact, we have to make two choices:
(1) we can include the absorptive effects, which are always present in time-like kinematics, and use $\tilde\alpha_S(k^2)$; alternatively, we do not include such effects and employ the space-like $\bar\alpha_S(k^2)$;
(2) we can perform a power expansion of the higher orders or not.
The latter point deserves some further comment. By power expansion we mean that higher orders are multiplied by a power of the effective coupling constant:
\[
\tilde\alpha_S^n(k^2) = \left[\frac{i}{2\pi}\int_0^{k^2} ds\,\mathrm{Disc}_s\,\frac{\bar\alpha_S(-s)}{s}\right]^n. \tag{7.4}
\]
The $n$th power is taken after the discontinuity of $\bar\alpha_S(-s)$ is computed and the integral over the gluon virtuality $s$ is performed. Adopting the non-power expansion choice, as originally proposed in [21], consists instead in evaluating first the discontinuity of $\bar\alpha_S^n(-s)$, and then integrating over $s$. Formally, $\tilde\alpha_S^n(k^2)$ will have to be replaced according to:
\[
\tilde\alpha_S^n(k^2) \to \widetilde{\alpha_S^n}(k^2) \equiv \frac{i}{2\pi}\int_0^{k^2} ds\,\mathrm{Disc}_s\,\frac{\bar\alpha_S^n(-s)}{s}. \tag{7.5}
\]
Unlike (7.4), Eq. (7.5) is linear in the discontinuity.

We have followed an empirical criterion, leaving to the future the task of a theoretical justification: we select the model which gives spectra closer to the data, without adding any further non-perturbative fragmentation function. We can anticipate that we have found that the power expansion of the coupling constant which includes the absorptive effects, i.e. $\tilde\alpha_S(k^2)$, leads to the best description of the experimental data. To be more general, we stress that our choices in modelling non-perturbative effects are partly related to the accuracy of the perturbative resummed calculation that we are using. Considering, for example, the function $A[\tilde\alpha_S(k^2)]$, we have used its expansion to the third order and the NNLO $\tilde\alpha_S(k^2)$. However, if we knew the function $A$ to a different level of approximation or, in principle, even to all orders, a different effective coupling constant might still yield the same $A[\tilde\alpha_S(k^2)]$ and the same B-hadron spectrum.
Since Eq. (6.15) is basically a change of scheme for the coupling constant, the coefficients of the terms of $O(\alpha_S^3)$ in functions $A(\alpha_S)$, $B(\alpha_S)$ and $D(\alpha_S)$, appearing in the resummed expressions (4.5) and (5.4), are to be modified. In our NNLL approximation, we need to replace $A^{(3)}$ according to:
\[
A^{(3)} \to \tilde A^{(3)} = A^{(3)} + \frac{(\pi\beta_0)^2}{3}\,A^{(1)}, \tag{7.6}
\]
where we have made use of Eq. (6.15). The other coefficients entering functions $A(\alpha_S)$, $B(\alpha_S)$ and $D(\alpha_S)$ at NNLL are instead left unchanged when replacing $\alpha_S(k^2)$ with $\tilde\alpha_S(k^2)$.
Let us now discuss another delicate point of our model. In standard resummations, which use the standard coupling constant and resum only the logarithms of the Mellin variable $N$, there is a clear counting of the terms in $\alpha_S(k^2)$ which are to be kept or dropped. For example, in the NLL approximation:
\[
A[\alpha_S(k^2)] \simeq A^{(1)}\,\frac{\alpha_{S,\mathrm{NLO}}(k^2)}{\pi} + A^{(2)}\,\frac{\alpha_{S,\mathrm{LO}}^2(k^2)}{\pi^2}. \tag{7.7}
\]
That is because
\[
\alpha_{S,\mathrm{LO}}^2(k^2) \propto \frac{1}{\ln^2(k^2/\Lambda^2)} \tag{7.8}
\]
and therefore it has, for $k^2 \to \infty$, the same asymptotic behavior as $\alpha_{S,\mathrm{NLO}}(k^2)$, which contains a correction term behaving like $\ln(\ln(k^2/\Lambda^2))/\ln^2(k^2/\Lambda^2) \approx 1/\ln^2(k^2/\Lambda^2)$. In the standard resummation scheme, which is a kind of minimal scheme, only logarithms are resummed and the asymptotic expansion of the coupling constant is used to organize the series of the infrared logarithms. On the other hand, the large-$k^2$ expansion of our time-like $\tilde\alpha_S(k^2)$ is much more complicated; even the lowest order $\tilde\alpha_{S,\mathrm{LO}}(k^2)$, proportional to $1/\beta_0$, presents terms of any order for $k^2 \to \infty$:
\[
\tilde\alpha_{S,\mathrm{LO}}(k^2) \approx \frac{c_1}{\ln(k^2/\Lambda^2)} + \frac{c_3}{\ln^3(k^2/\Lambda^2)} + \frac{c_5}{\ln^5(k^2/\Lambda^2)} + \cdots. \tag{7.9}
\]
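For orientation, the first coefficients of the expansion (7.9) can be read off from Eq. (6.10). The short derivation below is a sketch that assumes $L \equiv \ln(k^2/\Lambda^2) > \pi$, so that the arctangent series converges; the shorthand $L$ is introduced here only for compactness. Since $\ln(L + i\pi) - \ln(L - i\pi) = 2i\arctan(\pi/L)$ for $L > 0$,
\[
\tilde\alpha_{S,\mathrm{LO}}(k^2) = \frac{1}{\pi\beta_0}\arctan\frac{\pi}{L} = \frac{1}{\beta_0}\left[\frac{1}{L} - \frac{\pi^2}{3L^3} + \frac{\pi^4}{5L^5} - \cdots\right],
\]
so that $c_1 = 1/\beta_0$, $c_3 = -\pi^2/(3\beta_0)$ and $c_5 = \pi^4/(5\beta_0)$. Only odd powers of $1/L$ appear, and the $c_3$ term reproduces the $O(\alpha_S^3)$ shift of Eq. (6.15) once $1/(\beta_0 L)$ is identified with the LO standard coupling.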
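In fact, we observe that: (i) in our model we would like to include as many contributions as possible; (ii) there is no real reason to neglect higher-order corrections to $\tilde\alpha_S(k^2)$, proportional to $\beta_1$, $\beta_2$ and so on. Therefore, we find it safe to use $\tilde\alpha_{S,\mathrm{NNLO}}(k^2)$ everywhere in the resummed expressions. For example:
\[
A[\tilde\alpha_S(k^2)] \simeq \frac{A^{(1)}}{\pi}\,\tilde\alpha_{S,\mathrm{NNLO}}(k^2) + \frac{A^{(2)}}{\pi^2}\,\tilde\alpha_{S,\mathrm{NNLO}}^2(k^2) + \frac{\tilde A^{(3)}}{\pi^3}\,\tilde\alpha_{S,\mathrm{NNLO}}^3(k^2). \tag{7.10}
\]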
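In this way we include all the logarithmic terms of the standard NNLL resummation, plus some subleading contributions. Another possibility might have been:
\[
A[\tilde\alpha_S(k^2)] \simeq \frac{A^{(1)}}{\pi}\,\tilde\alpha_{S,\mathrm{NNLO}}(k^2) + \frac{A^{(2)}}{\pi^2}\,\tilde\alpha_{S,\mathrm{NLO}}^2(k^2) + \frac{\tilde A^{(3)}}{\pi^3}\,\tilde\alpha_{S,\mathrm{LO}}^3(k^2). \tag{7.11}
\]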
However, at small $k^2$, where we are mostly sensitive to non-perturbative effects, higher-order corrections to the time-like coupling constant are negative and sizable. Comparing, in fact, Eqs. (6.10), (6.11) and (6.13) for low values of $k^2$, we find:
\[
\tilde\alpha_{S,\mathrm{NNLO}}(k^2) < \tilde\alpha_{S,\mathrm{NLO}}(k^2) < \tilde\alpha_{S,\mathrm{LO}}(k^2). \tag{7.12}
\]
Hence, if we used Eq. (7.11) rather than Eq. (7.10), the term proportional to $\tilde A^{(3)}$ would be enhanced at large $x$. We can anticipate that this would worsen the comparison with the experimental data.
8. Phenomenology: x-space
In this section we present results on the B-hadron energy spectrum using the resummed partonic calculation based on the perturbative fragmentation formalism, and modelling non-perturbative effects by means of the time-like effective coupling constant. The basic assumption of our model is that, whenever we use $\tilde\alpha_S(k^2)$ instead of $\alpha_S(k^2)$, the b-quark energy fraction $x$ will have to be replaced by its hadron-level counterpart $x_B$:
\[
x_B \equiv \frac{2 p_B\cdot q}{m_Z^2}, \tag{8.1}
\]
with $p_B$ being the momentum of a b-flavored hadron produced in $e^+e^-$ annihilation.
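The B-hadron spectrum in moment space can be obtained from Eq. (5.21), after replacing the standard coupling constant with the analytic time-like one:
\[
\sigma_N^{(B)}\left(\mu_R^2, \mu_{0R}^2, \mu_{0F}^2, \mu_F^2, m_b^2, m_Z^2\right) = C_N^{\mathrm{res}}\left[\tilde\alpha_S(\mu_R^2), \mu_R^2, \mu_F^2, m_Z^2\right] E_N\left[\tilde\alpha_S(\mu_{0F}^2), \tilde\alpha_S(\mu_F^2)\right] D_N^{\mathrm{ini,res}}\left[\tilde\alpha_S(\mu_{0R}^2), \mu_{0R}^2, \mu_{0F}^2, m_b^2\right]. \tag{8.2}
\]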
In Eq. (8.2) we assume that the coefficient function and the initial condition are resummed in the NNLL approximation, and that the NNLO analytic coupling constant (6.13) is used. In order to recover the $x$-space results, we perform the inverse Mellin transform of Eq. (8.2):
\[
\sigma^{(B)}\left(x_B; \mu_R^2, \mu_{0R}^2, \mu_{0F}^2, \mu_F^2, m_b^2, m_Z^2\right) = \int_{c - i\infty}^{c + i\infty}\frac{dN}{2\pi i}\,x_B^{-N}\,\sigma_N^{(B)}\left(\mu_R^2, \mu_{0R}^2, \mu_{0F}^2, \mu_F^2, m_b^2, m_Z^2\right), \tag{8.3}
\]
where $c$ is a positive constant. We point out that the inverse transform (8.3) is computed in the standard mathematical way: no prescription to deal with the Landau pole, such as the minimal prescription [36], is needed. That is because our effective coupling $\tilde\alpha_S(k^2)$ no longer exhibits the Landau pole. Since the inverse transform is performed numerically, we have checked that the results are stable with respect to changes of the integration contour, i.e. of $c$.
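The contour-stability check can be illustrated on a toy example. The sketch below is not the inversion code used in this analysis: it inverts the moments of a hypothetical spectrum $f(x) = x^3(1 - x)$, whose Mellin moments $1/[(N + 3)(N + 4)]$ are known in closed form, along vertical contours with different $c$, using a plain trapezoidal rule with an arbitrary cut-off `t_max`.

```python
import math


def toy_moment(N, a=3.0):
    """Mellin moment int_0^1 dx x^(N-1) f(x) of the toy spectrum f(x) = x^a (1 - x)."""
    return 1.0 / ((N + a) * (N + a + 1.0))


def inverse_mellin(moment, x, c, t_max=400.0, steps=40000):
    """(1 / (2 pi i)) * integral over N = c + i t of x^(-N) * moment(N), trapezoidal rule.

    The integrand at c - i t is the complex conjugate of the one at c + i t,
    so only t >= 0 is sampled and the real part is doubled.
    """
    h = t_max / steps
    total = 0.0
    for j in range(steps + 1):
        N = complex(c, j * h)
        term = (x ** (-N) * moment(N)).real
        total += 0.5 * term if j in (0, steps) else term
    return total * h / math.pi


if __name__ == "__main__":
    x = 0.8
    exact = x ** 3 * (1.0 - x)
    for c in (1.0, 2.0, 4.0):
        print(c, inverse_mellin(toy_moment, x, c), exact)
```

Any contour to the right of the singularities of the moments (here at $N = -3$ and $N = -4$) reproduces the same $x$-space value within the numerical accuracy, which is the kind of independence of $c$ referred to above.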
8.1. B-hadron spectrum and comparison with experimental data
Following [18], we consider data from the SLD [1], ALEPH [2] and OPAL [3] Collaborations at the $Z^0$ pole. ALEPH reconstructed only B mesons, while the SLD and OPAL samples also contain a small fraction of b-flavored baryons (footnote 7). In principle, one should consider such data separately, as the hadron content is different; also, from the theoretical viewpoint, as pointed out in the introduction, there is no real reason why the same model should describe both meson and baryon production. However, the SLD and OPAL samples are inclusive and the baryon fraction is in any case very small. In fact, Ref. [18] found that it is possible to describe all data points by fitting the Kartvelishvili non-perturbative fragmentation model [5] or the cluster and string models implemented in HERWIG [45] and PYTHIA [46]. It is therefore reasonable to use the same hadronization model, in our case the effective coupling constant, for the comparison with all three experiments, and to investigate how it fares against all data and against each experimental sample. As in the previous analyses, when doing the comparison, we neglect the correlations between the data points and sum the experimental systematic and statistical errors in quadrature.
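The comparison criterion just described can be sketched as follows. This is a minimal illustration with hypothetical function and argument names, not the actual analysis code: correlations are neglected and the statistical and systematic errors of each bin are added in quadrature.

```python
def chi_squared(theory, data, stat_err, syst_err):
    """Uncorrelated chi^2 with statistical and systematic errors summed in quadrature."""
    total = 0.0
    for t, d, s_stat, s_syst in zip(theory, data, stat_err, syst_err):
        sigma2 = s_stat ** 2 + s_syst ** 2   # quadrature sum of the two error sources
        total += (t - d) ** 2 / sigma2
    return total


if __name__ == "__main__":
    # Hypothetical two-bin example, not real data
    print(chi_squared([1.0, 2.0], [1.5, 2.0], [0.3, 0.1], [0.4, 0.1]))
```

Since no parameter is fitted, the number of degrees of freedom is simply the number of data points retained in the comparison.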
7 A naive estimate based on the $1/N_c$ expansion, where $N_c$ is the number of colors, would predict a fraction of about $1/N_c^2 \approx 10\%$ of b-flavored hyperons compared to B mesons.
Unlike Refs. [15–18], we shall not perform a fit to the data, since we do not have any tunable parameter in our model, apart of course from the ones contained in our perturbative calculation. Rather, we shall investigate the theoretical uncertainty on our prediction, by varying the parameters in our computation, such as scales and quark masses. The default values of our parameters will be:
\[
\mu_R = \mu_F = m_Z; \qquad \mu_{0R} = \mu_{0F} = m_b, \tag{8.4}
\]
where $\mu_R$ and $\mu_F$ are the renormalization and factorization scales in the coefficient function (4.27), respectively, and $\mu_{0R}$ and $\mu_{0F}$ those in the initial condition (5.17). Consistently with the Particle Data Group [47], we set $m_Z = 91.19$ GeV and $\alpha_S(m_Z^2) = 0.119$ (footnote 8). For the $\overline{\mathrm{MS}}$ quark masses, entering the matching condition (6.16), we choose $\bar m_b = 4.2$ GeV, $\bar m_c = 1.25$ GeV and $\bar m_s = 0.1$ GeV, while the up- and down-quark masses will be neglected. The choice of $m_b$, the b-quark mass entering the initial condition of the perturbative fragmentation function, deserves some extra comment. In fact, using the pole or the $\overline{\mathrm{MS}}$ b mass in the initial condition is irrelevant at NLO and when resumming threshold logarithms up to NLL accuracy. Beyond such approximations, the choice of the renormalization scheme makes a difference. Refs. [27,28], which calculate the initial condition to NNLO, use the pole mass in their computation. Although in the matching we are using the NLO initial condition, we are still relying on Ref. [27] for the coefficient $D^{(2)}$ in the NNLL resummation (5.4), and therefore we should use the pole mass as well. However, since we are aiming at predicting the B-hadron energy distribution, it is not uniquely determined whether, after using the analytic coupling constant to include power-suppressed effects, $m_b$ should be the b-quark or the B-hadron mass. We believe that it is safe to adopt a rather conservative choice, i.e. 4.7 GeV < $m_b$ < 5.3 GeV, which includes the present estimates of the b pole mass as well as b-flavored hadron masses [47]. Our default value will be $m_b = 5$ GeV.
Before presenting our results, we point out that this kind of systematic analysis of the theoretical uncertainty was not performed, e.g., in Refs. [16–18]. In fact, when using a hadronization model, the fitting procedure would possibly adjust the free parameters to reproduce the data, even when varying the inputs of the parton-level computation. As long as a reliable hadronization model is used, changing the perturbative quantities will just lead to different best-fit parameters in the non-perturbative fragmentation function.
8 This choice corresponds to $\tilde\alpha_S(m_Z^2) = 0.117$, as can be obtained from Eq. (6.15).
Fig. 1. B-hadron spectrum yielded by our resummed calculation using the analytic time-like coupling constant $\tilde\alpha_S(k^2)$ to model non-perturbative effects. We investigate the dependence on the factorization scales. Solid lines: $\mu_{0F} = m_b/2$, $m_b$ and $2m_b$; dashed lines: $\mu_F = m_Z/2$, $m_Z$ and $2m_Z$. The other parameters are kept at their default values.
Fig. 2. Dependence on the renormalization scales $\mu_R$ and $\mu_{0R}$. Solid lines: $\mu_R$ varied between $m_Z/2$ and $2m_Z$; dashed lines: $\mu_{0R}$ between $m_b/2$ and $2m_b$.
In Fig. 1 we show the data points and our prediction, investigating its dependence on the factorization scales $\mu_F$ and $\mu_{0F}$. In Fig. 2 we look instead at the dependence on the renormalization scales $\mu_R$ and $\mu_{0R}$. For $\mu_R$ and $\mu_F$ we choose the values $m_Z/2$, $m_Z$ and $2m_Z$; for $\mu_{0R}$ and $\mu_{0F}$, $m_b/2$, $m_b$ and $2m_b$. We vary each scale separately, keeping all other quantities at their default values, in order to avoid an excessive number of runs. In Fig. 1 we notice that the dependence of our prediction on $\mu_{0F}$, the factorization scale entering the initial condition of the perturbative fragmentation function, is rather large, especially at small $x_B$ and around the peak. At small $x_B$, the spectrum varies by up to a factor of 2 if $\mu_{0F}$ changes from $m_b/2$ to $2m_b$; around the peak the impact of the choice of $\mu_{0F}$ is about 20%. The effect of the value chosen for $\mu_F$ is instead quite small, and well within the band associated with the variation of $\mu_{0F}$. A possible explanation of the fairly large dependence on $\mu_{0F}$ could be the fact that, although we are working in the NNLL approximation, we are still matching the resummation to the NLO exact result, and not to the NNLO one. This generates a mismatch between the NNLL terms in the resummed expressions ($\sim\alpha_S^2\ln N$, $\alpha_S^3\ln^2 N$, ...) and the remainder functions, given in Eqs. (4.30) and (5.20), which have been included only to NLO. This mismatch is more evident in the initial condition, where $\mu_{0F}$ plays a role, since the coupling constant is larger, being evaluated at scales that are smaller than in the coefficient function. We should expect a milder dependence on such a factorization scale if we matched the resummed initial condition to the exact NNLO result [27]. Moreover, the scale $\mu_{0F}$ also enters the DGLAP evolution operator (3.6), which we have implemented to NLL accuracy, using NLO splitting functions. Ref. [26] has recently calculated the NNLO corrections to the time-like splitting functions, which include a contribution analogous to the one $\sim A^{(3)}$ entering the resummed expressions (4.5) and (5.4). The fact that we have not included such effects may be a further source of mismatch, which may be fixed by implementing the splitting functions to NNLO accuracy. In any case, a peculiar feature of our model is that, since we are not using any hadronization model, our theoretical error includes, at the same time, uncertainties of both perturbative and non-perturbative nature. The partonic calculations in Refs. [14,16,17], which are NLL/NLO and use the standard coupling constant, yield indeed a very mild dependence on the quantities which enter the perturbative calculation. However, in order to predict hadron-level observables, these perturbative computations need to be convoluted with a non-perturbative fragmentation function, whose parameters, after being fitted to SLD and LEP data, exhibit errors which are typically of the order of 10%, in such a way that the hadron spectra still present fairly large uncertainties.
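The one-at-a-time scale variation described above can be sketched as follows. The dictionary keys and the function name are hypothetical; the default values are those of Eq. (8.4) with $m_Z = 91.19$ GeV and $m_b = 5$ GeV.

```python
def scale_variations(defaults, factors=(0.5, 2.0)):
    """Default run plus runs where a single scale is rescaled and all others stay at default."""
    runs = [dict(defaults)]
    for name in defaults:
        for factor in factors:
            run = dict(defaults)
            run[name] = factor * defaults[name]
            runs.append(run)
    return runs


if __name__ == "__main__":
    defaults = {"mu_R": 91.19, "mu_F": 91.19, "mu_0R": 5.0, "mu_0F": 5.0}  # GeV
    for run in scale_variations(defaults):
        print(run)
```

Varying one scale at a time requires 9 runs, whereas scanning the full grid of three values for each of the four scales would require $3^4 = 81$.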
Turning back to Fig. 1, let us observe that our distributions become negative and oscillate at very small and at very large $x_B$. In fact, the coefficient function (2.7) presents a term of the form
\[
\frac{\alpha_S(\mu_R^2)\,C_F}{\pi}\,\ln x, \tag{8.5}
\]
which is enhanced at small $x$ and has not been resummed in our analysis. In any case, that does not much affect the comparison with the data, since the latter are at $x > 0.12$. The coefficient function (2.7) and the initial condition (3.4) also contain terms of the type
\[
-\frac{\alpha_S(\mu_R^2)\,C_F}{\pi}\,\ln(1 - x) \quad\text{and}\quad 2\,\frac{\alpha_S(\mu_{0R}^2)\,C_F}{\pi}\,\ln(1 - x), \tag{8.6}
\]
which become large for $x \to 1$ and have not been resummed. It is at present not known how to accomplish this task. The contributions in Eq. (8.6) are responsible for the oscillating behavior at large $x_B$. It is conceivable that the inclusion of NNLO corrections to the coefficient function and the initial condition may partly stabilize the distribution at the endpoints. In any case, we are aware that our simple model, based on an extrapolation of the perturbative behavior to small energy scales, is expected to fail for very large $x$. It is therefore safe to discard a few data points at large $x_B$ when comparing with the data and, e.g., to limit ourselves to $x_B \le 0.92$. We find that our default parameters give a good description of the ALEPH data ($\chi^2$/dof = 21.4/16), but reproduce rather badly the OPAL ($\chi^2$/dof = 162.7/18) and SLD ($\chi^2$/dof = 109.1/20) ones. The overall $\chi^2$/dof, computed as if all measurements were coming from one experiment, is also quite large: $\chi^2$/dof = 293.2/54. A much better description of SLD, OPAL and the overall sample is obtained by setting $\mu_{0F} = m_b/2$. We obtain: $\chi^2$/dof = 24.8/16 (ALEPH), 30.4/18 (OPAL), 47.8/20 (SLD) and 103.0/54 (overall). Such values of $\chi^2$/dof are perfectly acceptable, since we are comparing data from different experiments and, as already discussed, SLD and OPAL, unlike ALEPH, also reconstructed a small fraction of b-flavored baryons.
The theoretical uncertainties due to the values chosen for $\alpha_S(m_Z^2)$ and $m_b$ are explored in Fig. 3: we consider $\alpha_S(m_Z^2) = 0.117$, 0.119 and 0.121 (footnote 9), and $m_b = 4.7$, 5.0 and 5.3 GeV. The impact of the choice of these quantities is comparable and clearly visible throughout the whole $x_B$ range, though smaller than the one due to the variation of $\mu_{0F}$. The effect is about 10% at average values of $x_B$ and grows up to 35% at large $x$. While changing $\alpha_S(m_Z^2)$ and $m_b$ does not improve much the comparison with OPAL and SLD, an excellent description of the ALEPH data ($\chi^2$/dof = 11.9/16) is obtained for $m_b = 5.3$ GeV, a value compatible with B-meson masses,
9 The corresponding range for the analytic time-like coupling constant is $0.115 < \tilde\alpha_S(m_Z^2) < 0.119$.
Fig. 3. Dependence of the B-energy distribution on $\alpha_S(m_Z^2)$ and on $m_b$. Solid lines: standard coupling $\alpha_S(m_Z^2)$ varied from 0.117 to 0.121; dashed lines: b pole mass varied from 4.7 to 5.3 GeV.
Table 1
Results of the comparison of our model with LEP and SLD data, for the most significant values of $\mu_{0F}$ and $m_b$. We have set $\alpha_S(m_Z^2) = 0.119$, while the other parameters have very little impact

$\mu_{0F}$   $m_b$      $\chi^2$/dof (ALEPH)   $\chi^2$/dof (OPAL)   $\chi^2$/dof (SLD)   $\chi^2$/dof (overall)
$m_b$        5 GeV      21.4/16                162.7/18              109.1/20             293.2/54
$m_b/2$      5 GeV      24.8/16                30.4/18               47.8/20              103.0/54
$m_b$        5.3 GeV    11.9/16                116.1/18              84.2/20              212.2/54
and still $\alpha_S(m_Z^2) = 0.119$. Nevertheless, the comparison with the other experiments gets worse for this value of the bottom-quark mass, as we get $\chi^2$/dof = 116.1/18 for OPAL, 84.2/20 for SLD and 212.2/54 overall. As for the $\overline{\mathrm{MS}}$ masses $\bar m_b$, $\bar m_c$ and $\bar m_s$, the effect of their variation on the $x_B$-spectrum is very small: we do not show such plots for the sake of brevity. The most relevant values of $\chi^2$/dof when we vary the perturbative parameters are collected in Table 1.
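Since the overall $\chi^2$/dof is computed as if all measurements came from one experiment, it is just the sum of the individual $\chi^2$ values over the sum of the degrees of freedom. The sketch below checks this for the three parameter choices, using the $\chi^2$ values quoted in the text; the labels and the function name are hypothetical.

```python
# (chi^2, dof) per experiment, as quoted in the text for each parameter choice
RESULTS = {
    "mu_0F = m_b, m_b = 5 GeV": {"ALEPH": (21.4, 16), "OPAL": (162.7, 18), "SLD": (109.1, 20)},
    "mu_0F = m_b/2, m_b = 5 GeV": {"ALEPH": (24.8, 16), "OPAL": (30.4, 18), "SLD": (47.8, 20)},
    "mu_0F = m_b, m_b = 5.3 GeV": {"ALEPH": (11.9, 16), "OPAL": (116.1, 18), "SLD": (84.2, 20)},
}


def overall(per_experiment):
    """Sum chi^2 and degrees of freedom over the experiments."""
    chi2 = sum(c for c, _ in per_experiment.values())
    dof = sum(d for _, d in per_experiment.values())
    return round(chi2, 1), dof


if __name__ == "__main__":
    for label, row in RESULTS.items():
        print(label, overall(row))
```

The sums reproduce the overall values 293.2/54, 103.0/54 and 212.2/54.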
8.2. Relevance of NNLL effects
Before turning to the analysis in moment space, we wish to estimate the impact on our prediction of the NNLL threshold-resummation corrections.
In Fig. 4 we present the experimental points and the spectra yielded by our model using the parameters which give the overall best fit to the data ($\mu_{0F} = m_b/2$ and $m_b = 5$ GeV), but within two approximation schemes: NNLL soft resummation along with the NNLO effective coupling constant (solid line), and NLL soft resummation with the NLO coupling constant (dotted line). In the NLO/NLL prediction, we still perform the Mellin transforms of the resummed expressions exactly, i.e. without the step-function approximation (4.16). By dropping the NNLL terms, i.e. the contributions proportional to $\tilde A^{(3)}$, $B^{(2)}$ and $D^{(2)}$ in the resummed expressions (4.5) and (5.4), we see that the spectrum gets shifted to larger $x_B$, and the agreement with the experimental data gets worse (footnote 10). For $x_B \le 0.92$ we obtain: $\chi^2$/dof = 11.18 (SLD), 7.01 (ALEPH), 16.27 (OPAL) and 12.05 (overall). The inclusion of the NNLL threshold contributions to the coefficient function and to the initial condition of the perturbative fragmentation function, together with the NNLO corrections to $\tilde\alpha_S(k^2)$, is therefore crucial to get reasonable agreement with the data and the $\chi^2$/dof quoted in Table 1. In fact, when using the time-like coupling constant, the $O(\alpha_S^3)$ coefficient $A^{(3)}$ in the NNLL-resummed coefficient function and in the initial condition gets enhanced according to Eq. (7.6). The change $A^{(3)} \to \tilde A^{(3)}$ shifts the peak of the spectrum towards a lower value of $x_B$. If we had used the space-like coupling constant $\bar\alpha_S(k^2)$, given in Eq. (6.3), instead of $\tilde\alpha_S(k^2)$, we would have obtained the same $A^{(3)}$ and a rather poor description of the data.

Fig. 4. The solid line is our best prediction for B-hadron production ($j = B$), obtained using NNLL threshold resummation and the NNLO time-like coupling, superimposed on the data. The dotted curve is obtained by dropping the NNLL terms, i.e. working within NLL accuracy. The dashed line is the parton-level spectrum of Ref. [14] ($j = b$), which uses the standard $\alpha_S(k^2)$ and NLL large-$x$ resummation.
It is also interesting to gauge the overall impact of the inclusion of power corrections by means of our model. In Fig. 4 we also show the NLL-resummed parton-level spectrum of Ref. [14], where the standard NLO coupling constant is used, the Mellin transforms are performed with the step-function approximation, and the inversion to $x$-space is done using the minimal prescription [36] to avoid the Landau pole. We see that the distribution of Ref. [14] is peaked at very large $x$ and is very far from the data for $x_B > 0.4$. In fact, the b-quark spectrum of [14] is expected to be convoluted with a non-perturbative fragmentation function, whose parameters need to be fitted to reproduce the data. Our spectrum is also broader than the one of [14]: one can actually show that this is due to the fact that we have performed the Mellin transforms exactly.
10 Similar conclusions are reached by comparing thrust, heavy-jet mass and C-parameter distributions, computed with the effective time-like coupling within NLL accuracy, with LEP1 data [48].
9. Phenomenology: N-space
We would like to present the results yielded by our approach in Mellin moment space, and compare them with the measurements of the DELPHI experiment [4]. It was argued in [19] that working in $N$-space is theoretically preferable, since one does not need to assume any functional form for the non-perturbative fragmentation function and can fit its moments directly. Moreover, as shown in [7,18], the moments obtained by fitting a non-perturbative model while discarding data points at small and large $x_B$ are quite inaccurate, since the tails of the distributions play a crucial role in obtaining the right $N$-space results. Our case is clearly different, since we are not tuning a non-perturbative fragmentation function, and therefore we do not have the problems related to the fits. However, our distributions are still negative and unreliable at very small and large $x_B$, and we were forced to discard a few data points to obtain acceptable values of $\chi^2$/dof. It is therefore still interesting to present the moments obtained using the analytic coupling constant and explore the theoretical uncertainties, by changing the parameters which enter the calculation, along the lines of our $x$-space analysis. We calculate the B spectrum in Mellin space directly from Eq. (8.2), and choose the same default values for scales, quark masses and $\alpha_S(m_Z^2)$ as in $x$-space.
In Table 2 we present the results of our study: we quote the experimental moments measured by DELPHI, the central values yielded by our calculation, denoted by $[\sigma_N^B]_{\mathrm{th}}$, and the errors due to all the quantities which we vary. The data correspond to $N = 2$, 3, 4 and 5; the first moment is $\sigma^B_{N=1} = 1$, since both data and our spectra are normalized to unity. We estimate the overall theoretical error by summing in quadrature the errors due to the variation of each parameter.
Table 2
Moments $\sigma_N^B$ from DELPHI [4] and moments $[\sigma_N^B]_{\mathrm{th}}$ yielded by our calculation. We quote the uncertainties due to the parameters which enter in the perturbative calculations, varied as discussed in the text. The theoretical total error is estimated as the sum in quadrature of the partial errors. We also present the moments $\sigma_N^b$ according to Ref. [14]

                                      $\langle x\rangle$   $\langle x^2\rangle$   $\langle x^3\rangle$   $\langle x^4\rangle$
$e^+e^-$ data $\sigma_N^B$            0.7153 ± 0.0052      0.5401 ± 0.0064        0.4236 ± 0.0065        0.3406 ± 0.0064
$[\sigma_N^B]_{\mathrm{th}}$          0.6867 ± 0.0403      0.5019 ± 0.0472        0.3815 ± 0.0465        0.2976 ± 0.0462
$\delta\sigma_N^B(\mu_R)$             0.0014               0.0011                 0.0009                 0.0007
$\delta\sigma_N^B(\mu_F)$             0.0066               0.0067                 0.0059                 0.0051
$\delta\sigma_N^B(\mu_{0R})$          0.0022               0.0028                 0.0031                 0.0033
$\delta\sigma_N^B(\mu_{0F})$          0.0364               0.0414                 0.0398                 0.0364
$\delta\sigma_N^B(m_b)$               0.0111               0.0145                 0.0153                 0.0150
$\delta\sigma_N^B(\bar m_b)$          0.0004               0.0005                 0.0006                 0.0006
$\delta\sigma_N^B(\bar m_c)$          0.0003               0.0005                 0.0006                 0.0006
$\delta\sigma_N^B(\bar m_s)$          0.0004               0.0007                 0.0008                 0.0008
$\delta\sigma_N^B(\alpha_S(m_Z^2))$   0.0113               0.0158                 0.0173                 0.0176
$\sigma_N^b$                          0.7734 ± 0.0232      0.6333 ± 0.0311        0.5354 ± 0.0345        0.4617 ± 0.0346
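The quadrature sum behind the total theoretical error, and the size of the deviation from the DELPHI moments, can be sketched as follows. The partial errors are those of the $\langle x\rangle$ column of Table 2; the dictionary keys and function names are hypothetical, and the "pull" (difference divided by the combined error) is an illustrative measure introduced here, not a quantity quoted in the paper.

```python
import math

# Partial errors on <x> (N = 2) from Table 2
PARTIAL_X = {"mu_R": 0.0014, "mu_F": 0.0066, "mu_0R": 0.0022, "mu_0F": 0.0364,
             "m_b": 0.0111, "mbar_b": 0.0004, "mbar_c": 0.0003, "mbar_s": 0.0004,
             "alpha_S": 0.0113}

# (data, data error, theory, total theory error) for <x>, <x^2>, <x^3>, <x^4>
MOMENTS = [(0.7153, 0.0052, 0.6867, 0.0403),
           (0.5401, 0.0064, 0.5019, 0.0472),
           (0.4236, 0.0065, 0.3815, 0.0465),
           (0.3406, 0.0064, 0.2976, 0.0462)]


def quadrature(errors):
    """Square root of the sum of squared errors."""
    return math.sqrt(sum(e * e for e in errors))


def pull(data, data_err, theory, theory_err):
    """(data - theory) in units of the combined experimental and theoretical error."""
    return (data - theory) / math.hypot(data_err, theory_err)


if __name__ == "__main__":
    print(round(quadrature(PARTIAL_X.values()), 4))
    print([round(pull(*m), 2) for m in MOMENTS])
```

For $\langle x\rangle$ the quadrature sum of the partial errors reproduces the quoted total of 0.0403, dominated by the $\mu_{0F}$ variation, and all four central values lie within one combined standard deviation of the data.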
We observe that the experimental moments exhibit very small errors and that our central values are smaller than the DELPHI ones by about 5–10%. This result confirms what we had found in $x_B$-space: our $x_B$-spectra go to zero more rapidly than the experimental data for large $x_B$, and become negative for $x_B > 0.96$; for low and intermediate $x_B$, our curves lie above the data. It is therefore reasonable that our moments are smaller than the measured ones. However, we find that, within the uncertainties due to masses and scales, our calculation is in fair agreement with the data. As observed for the $x_B$-spectrum, the uncertainty due to the choice of $\mu_{0F}$ is quite large, and the ones due to $\alpha_S(m_Z^2)$ and $m_b$ are smaller but visible. The other scales and the $\overline{\mathrm{MS}}$ masses, entering the matching of $\tilde\alpha_S$ at different flavor numbers, have very little impact on the moments of the B cross section. As far as the uncertainties are concerned, we observe that the DELPHI data are for $N \le 5$, hence the terms $\sim\ln^k N$ in the resummed exponents are not so dominant with respect to other contributions, such as the constants. Therefore, even in $N$-space, we expect that the further inclusion of NNLO contributions will have a significant impact on the prediction and decrease the theoretical error.
We also present in Table 2 the moments yielded by the NLL-resummed calculation of [14], which uses the NLO standard $\alpha_S(k^2)$, whose $x$-space results have already been displayed in Fig. 4. We estimate the error on the moments of Ref. [14] as we did for the one on our calculation, varying the perturbative parameters in the same range. We find that the $N$-space results of [14], denoted by $\sigma_N^b$ to stress that they are a parton-level result, are very far from the data and much larger than the moments obtained using the analytic coupling constant. This result is in agreement with the plots in Fig. 4, and confirms the remarkable impact of non-perturbative effects in $N$-space as well. Moreover, we observe that the errors on the moments of [14] are smaller than the ones yielded by our calculation. In fact, Ref. [14] employs NLL threshold resummation and DGLAP evolution, and matches the resummation to the NLO results. As already discussed in the $x$-space analysis, using NNLL large-$x$ resummation but still NLL DGLAP evolution and NLO remainder functions, as we did, may lead to a mismatch which produces larger uncertainties. We expect that the inclusion of NNLO corrections to the initial condition and to the splitting functions will lead to a weaker dependence on the input parameters of our analysis. In any case, the perturbative moments of [14] need to be multiplied by the moments of a non-perturbative fragmentation function extracted from the data, which will lead to larger uncertainties on the hadron-level moments.
10. Conclusions
We studied B-hadron energy distributions in $e^+e^-$ annihilation at the $Z^0$ peak by means of a model having as the source of non-perturbative corrections an effective QCD coupling constant. The physical idea behind our model is that the main non-perturbative effects, related to soft interactions in hadron bound states, are not very strong coupling phenomena, but can be described by an effective coupling constant of intermediate strength, typically $\tilde\alpha_S \approx 0.3$–$0.5$. That implies that, within our model, these non-perturbative corrections (the Fermi motion in B decays being the best-known example) can be obtained from perturbation theory by means of an extrapolation.
We modelled power corrections using an NNLO effective $\alpha_S(k^2)$, constructed by first removing
the Landau pole, according to an analyticity requirement, and then resumming to all orders the
absorptive effects related to time-like gluon branching. The resulting effective $\alpha_S(k^2)$ differs significantly
from the standard $\alpha_S(k^2)$ at small scales. We described b-quark production within the framework
of perturbative fragmentation functions, resumming threshold logarithms in the coefficient function and in the initial condition of the perturbative fragmentation function to NNLL accuracy, and
matching the resummed results to the NLO ones. The perturbative fragmentation function was
evolved using NLL DGLAP equations. When implementing threshold resummation in N-space,
as part of our model, we decided to perform the Mellin transforms exactly, which leads to the
further inclusion in the Sudakov exponent of terms $\sim \ln^k N$ of higher order (N$^3$LL, N$^4$LL, ...),
along with some constants and power-suppressed terms $O(1/N)$.
We presented results on the B-hadron energy distribution, relying on this model to include
power corrections, without introducing any non-perturbative fragmentation function with tunable
parameters. When studying the B spectra, we investigated the dependence of our prediction on
the quantities which enter in the perturbative calculation, such as quark masses, renormalization
and factorization scales, and $\alpha_S(m_Z^2)$. We observed that the $x_B$-distributions become negative
and oscillating at very small and large $x_B$, which is due to terms $\sim \ln x$ and $\sim \ln(1-x)$ that
are present in the NLO remainder functions and have not been resummed. We have therefore
discarded a few data points at large $x_B$, where our calculation is anyway unreliable, and limited
ourselves to $x_B \le 0.92$. We compared our results with OPAL, ALEPH and SLD data on b-flavored hadron production and found that, within our theoretical uncertainties, our calculation
is able to reproduce quite well the ALEPH and OPAL data, while it is marginally consistent
with SLD. As for the dependence on the perturbative parameters, the effect of the choice of
the factorization scale $\mu_{0F}$, appearing in the initial condition of the perturbative fragmentation
function, is quite large and the data are better described for low values of $\mu_{0F}$. The dependence
on $\alpha_S(m_Z^2)$ and on $m_b$ is also quite visible: for example, a value of $m_b$ consistent with B-meson
masses, which is reasonable since our model assumes $E_B \simeq E_b$, yields an excellent description
of the ALEPH data. The other parameters have instead very little impact on the $x_B$-spectrum.
We also compared our results in Mellin space with the moments measured by the DELPHI
Collaboration. We found that the central values of our N -space results are smaller than the experimental ones, but nonetheless, within the theoretical uncertainties, our moments are consistent
with the data. In particular, a fairly large uncertainty is still due to the choice of $\mu_{0F}$, as was
observed in the x-space analysis.
In summary, we find it remarkable that, within our theoretical uncertainties, we have been able
to describe the LEP data in both x- and N -spaces, by modelling non-perturbative corrections via
the analytic time-like coupling constant and without tuning any parameter to such data. It is
also quite interesting that a model which succeeded in reproducing the photon-energy and hadron-mass distributions in B-meson decays [9] has led to a reasonable fit of B-production data,
although these are quite different processes, characterized by different energy scales.
Our model looks therefore quite promising and we plan to further apply it to other processes
and observables. A straightforward extension of the study here presented is the investigation of B
production in top and Higgs decays, using the perturbative calculations in [15-17], and modelling
power corrections as in the present paper. We can also use our model along with Monte Carlo
event generators: in Ref. [18], the hadronization models of HERWIG and PYTHIA were tuned to
the same data as the ones here considered, and then used to predict B production in top and Higgs
decays. We may thus think of using the parton shower algorithms of HERWIG and PYTHIA to
describe perturbative b-production, with our analytic coupling constant in place of the cluster
and string models. At the end of the cascade, we can assume $E_b \simeq E_B$ and compare the results
with the data. The spectra yielded by Monte Carlo event generators are positive definite and do
not exhibit the problem of becoming negative. However, parton shower algorithms are equivalent
to an LL/LO resummation, with the further inclusion of some NLLs [49]. Since we have learned
from this analysis that the inclusion of corrections of higher orders is crucial to reproduce the
data, we may have to add to the Monte Carlo Sudakov form factor contributions analogous to the
NLLs and NNLLs that are missing, in particular the one $\propto A^{(3)}$.
Besides, it will be really worthwhile to use the recent calculations of the NNLO initial condition of the perturbative fragmentation function [27,28] and of the NNLO splitting functions
in the non-singlet sector [29] to fully promote our formalism to NNLO/NNLL accuracy. This
way we could explore whether the uncertainties on our predictions (especially the one due to
the scale $\mu_{0F}$) are finally reduced. If the theoretical uncertainties are substantially reduced, one
may even think of extracting $\alpha_S(m_Z^2)$ from b-fragmentation data, as already done from B-decay
spectra [9].
Our study could also be improved by including NNNLL contributions in the resummation
exponents. Such corrections involve the coefficients $A^{(4)}$, $B^{(3)}$ and $D^{(3)}$: at present only $B^{(3)}$ is
exactly known. In particular, we expect a relevant effect due to the possible inclusion of $D^{(3)}$,
the $O(\alpha_S^3)$ coefficient of the function $D(\alpha_S)$, which resums large-angle soft radiation in the initial
condition. In fact, we have observed that the scale of $\alpha_S$ in the resummed initial condition is
smaller than in the coefficient function, and therefore power corrections are more relevant. Also,
our model enhances the coefficients of $\alpha_S$ from the third order on. We can therefore employ
our model to implement non-perturbative corrections to Drell-Yan or deep inelastic scattering
processes, whose coefficient function was recently calculated to NNNLO accuracy [50]. In $e^+e^-$
annihilation, threshold contributions to the NNNLO coefficient function were computed in [51];
the full $O(\alpha_S^3)$ corrections have not been calculated yet.
From the theoretical viewpoint, we plan to investigate in more detail the issue of the Mellin
transform of the resummed cross section and the power corrections which are inherited by the B
spectrum when one uses the effective $\alpha_S(k^2)$ and performs the longitudinal-momentum integration in
an exact or approximate way. Within our model, we chose to perform it exactly, driven by the
results in [9,44], but nonetheless we believe that a thorough study of this point, along the lines of
[35,36], is necessary.
Other issues which we plan to investigate in detail are the treatment of the higher orders of the
effective $\alpha_S$ and the comparison between time- and space-like coupling constants. We have obtained reasonable agreement with the data by using the time-like effective coupling constant and
taking the powers $\alpha_S^n(k^2)$ after performing the integral of the dispersion relation. Nonetheless,
this conclusion is somewhat related to the level of approximation of the perturbative calculation
which we have employed, and may not hold if we used a different accuracy. Therefore, a more
solid understanding of these aspects of our model is mandatory.
Furthermore, our model has some built-in universal features implying that, to subject it to a
stringent check, one should consider several observables from different processes. To this end,
however, one might need both NNLL resummation on the theoretical side and accurate data on
the experimental one. Shape-variable data at the $Z^0$ peak, for example, are rather accurate, but
for these quantities NNLL resummation is still in progress, hence preventing an analysis within
our model. In [9] this model was applied to semi-inclusive B decays, where the situation is
somewhat complementary with respect to shape variables in $e^+e^-$ annihilation: NNLL threshold
resummation is well established, but data are not very accurate yet, because of large backgrounds.
Finally, we could also push our model to the low-energy direction, to try to describe, for
example, charm production at the $Z^0$ peak or below it, where accurate experimental data are
available. Due to the large value of $\alpha_S(m_c^2) \simeq 0.35$ and to a soft scale of order $m_c(1-x)$, smaller by
a factor of three than in b-production, we expect a full NNLL/NNLO analysis to be necessary.
The comparison with D-hadron data will be crucial to investigate possible deviations from our
model. In fact, an extension of the formulation here presented may consist in adding to $\alpha_S(k^2)$ a
correcting term: $\alpha_S'(k^2) = \alpha_S(k^2) + \delta\alpha_S(k^2)$. This way, $\alpha_S(k^2)$ will still be the effective coupling
constant here discussed, while $\delta\alpha_S(k^2)$ will depend on the hadronic state which we wish to
describe, and vary, e.g., from mesons to baryons or from B's to D's. Investigating the charm
sector may therefore help in shedding light on our model and understanding whether such a
correcting term is necessary or not. This study is in progress as well.
Acknowledgements
We are indebted to S. Catani for discussions on soft-gluon resummation. We also acknowledge
M. Cacciari for discussions on the perturbative fragmentation approach and for providing us with
the computing code to obtain the results of Ref. [14].
Appendix A. Numerical evaluation of Mellin transforms
In this appendix we present a method for computing numerically the Mellin transform,
through the fast Fourier transform (FFT), of a function of the form
$$ f(y) = \left[\frac{\varphi(y)}{y}\right]_+ , \tag{A.1} $$
where $y = 1 - z$ and $\varphi(y)$ is a regular function of $y$ or a function having at most a logarithmic
singularity for $y \to 0$.
To calculate the integer moments of the cross section $\sigma_N$ (typically the first few moments
with $N = 2, 3, 4, 5, \ldots$) the integral defining the Mellin transform does not present any convergence problem and can be done directly. In order to obtain the cross section in x-space, it
is however necessary to perform an inverse Mellin transform by integrating $\sigma_N$ along a vertical
line in N-space. The numerical computation of $\sigma_N$ in the complex N-plane is non-trivial for
$\mathrm{Im}\,N \gg 1$ because the kernel $z^{N-1}$ develops fast oscillations with $z$, which affect the convergence of the integral.
In detail, we have to compute the integral
$$ g_N = \int_0^1 dy\,(1-y)^{N-1} f(y) = \int_0^1 \frac{dy}{y}\left[(1-y)^{N-1} - 1\right]\varphi(y) \tag{A.2} $$
for $N$ lying on a vertical line in the complex plane,
$$ N = c + i\nu, \tag{A.3} $$
with $c > 0$ and $\nu$ real. It is convenient to treat analytically the infrared cancellation between real
and virtual contributions related to the first and the second term in square brackets on the r.h.s.
of Eq. (A.2). We then take a derivative with respect to $\nu$:
$$ \gamma_c(\nu) \equiv \frac{dg_{c+i\nu}}{d\nu} = i \int_0^1 \frac{dy}{y}\,\ln(1-y)\,(1-y)^{c-1+i\nu}\,\varphi(y). \tag{A.4} $$
Note that the infrared singularity $1/y$ is now regulated by the $\ln(1-y)$ factor coming from the
differentiation.
In order to express $\gamma_c(\nu)$ as the Fourier transform (FT) of some function, we change variables
to
$$ t \equiv -\ln(1-y), \tag{A.5} $$
and express Eq. (A.4) as an integral over $t$:
$$ \gamma_c(\nu) = \int_{-\infty}^{+\infty} dt\, e^{-i\nu t}\, \Gamma_c(t), \tag{A.6} $$
with
$$ \Gamma_c(t) \equiv -i\, \frac{t}{1-e^{-t}}\, e^{-ct}\, \varphi\!\left(1-e^{-t}\right). \tag{A.7} $$
$\gamma_c(\nu)$ is therefore the Fourier transform of the function $\Gamma_c(t)$, defined above for $t \ge 0$ and vanishing for $t < 0$.
For a fast numerical evaluation, it is convenient to transform the FT above into a fast Fourier
transform (FFT). We cut the improper integral at a large but finite $t_{\max}$:
$$ \gamma_c(\nu) \simeq \int_0^{t_{\max}} dt\, e^{-i\nu t}\, \Gamma_c(t). \tag{A.8} $$
In practice, because of the exponential dependence $1 - y = \exp[-t]$, we found that $t_{\max} = 20$-$30$
already gives a good accuracy. The above integral can be easily approximated by means of a
constant sampling:
$$ \Delta t \equiv \frac{t_{\max}}{n}, \tag{A.9} $$
where $n$ is the number of points at which the function $\Gamma_c(t)$ is evaluated. We have found that an
accuracy $O(10^{-3}$-$10^{-4})$ is reached already with $n = 10^4$-$10^5$ points. We then have:
$$ \gamma_c(\nu) = e^{-i\nu \Delta t/2}\, \Delta t \sum_{k=0}^{n-1} e^{-i\nu k \Delta t}\, \Gamma_c\!\left[\left(k + \tfrac{1}{2}\right)\Delta t\right]. \tag{A.10} $$
The Mellin transform is obtained by a numerical integration of $\gamma_c(\nu)$:
$$ g_{c+i\nu} = g_c + \int_0^{\nu} d\nu'\, \gamma_c(\nu'). \tag{A.11} $$
$g_c$ is the Mellin transform at the fixed point $c$ on the positive real $N$ axis; it is a constant which is
computed directly just once. Eq. (A.11) is our final result for the numerical computation of the
Mellin transform.
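As an illustration of Eqs. (A.7), (A.10) and (A.11), a minimal Python sketch is given below. It is not the implementation used for the results of this paper: the function names `fft`, `gamma_c_grid` and `mellin_on_line` are hypothetical, the notation $\Gamma_c(t) = -i\,t\,e^{-ct}\varphi(1-e^{-t})/(1-e^{-t})$ for the sampled function and $\gamma_c(\nu)$ for its Fourier transform is assumed, and the frequency grid $\nu_j = 2\pi j/t_{\max}$ together with the trapezoidal rule for Eq. (A.11) are choices made here so that the midpoint sum becomes a single discrete Fourier transform.

```python
import cmath
import math


def fft(a):
    """Recursive radix-2 FFT: X[j] = sum_k a[k] * exp(-2*pi*i*j*k/n); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for j in range(n // 2):
        w = cmath.exp(-2j * math.pi * j / n) * odd[j]
        out[j] = even[j] + w
        out[j + n // 2] = even[j] - w
    return out


def gamma_c_grid(phi, c, t_max=40.0, n=2 ** 14):
    """Midpoint sum of Eq. (A.10) on the grid nu_j = 2*pi*j/t_max; returns (d_nu, [gamma_c(nu_j)])."""
    dt = t_max / n
    samples = []
    for k in range(n):
        t = (k + 0.5) * dt                       # midpoint of the k-th cell
        y = -math.expm1(-t)                      # y = 1 - exp(-t)
        samples.append(-1j * t / y * math.exp(-c * t) * phi(y))   # Gamma_c(t), Eq. (A.7)
    d_nu = 2.0 * math.pi / t_max
    spectrum = fft(samples)                      # sum_k exp(-i nu_j k dt) Gamma_c((k+1/2) dt)
    gamma = [cmath.exp(-0.5j * j * d_nu * dt) * dt * s for j, s in enumerate(spectrum)]
    return d_nu, gamma


def mellin_on_line(g_c, d_nu, gamma, j_max):
    """Eq. (A.11): g_{c + i nu_j} = g_c + int_0^{nu_j} gamma_c, here with the trapezoidal rule."""
    g = [complex(g_c)]
    for j in range(1, j_max + 1):
        g.append(g[-1] + 0.5 * d_nu * (gamma[j - 1] + gamma[j]))
    return g
```

A convenient check is $\varphi(y) = y$, for which Eq. (A.2) gives $g_N = 1/N - 1$ in closed form. Only the low-frequency part of the grid, where $\nu_j \Delta t \ll 1$, should be trusted.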
Let us now discuss the numerical computation of the inverse Mellin transform with a similar
method:
$$ f(y) = \int_{c-i\infty}^{c+i\infty} \frac{dN}{2\pi i}\,(1-y)^{-N} g_N = (1-y)^{-c} \int_{-\infty}^{+\infty} \frac{d\nu}{2\pi}\,(1-y)^{-i\nu}\, g_{c+i\nu}. \tag{A.12} $$
Since $f(y)$ is real, one immediately obtains the following property of its Mellin transform:
$$ (g_N)^* = g_{N^*}. \tag{A.13} $$
The inverse Mellin transform can therefore also be written as:
$$ f(y) = \frac{1}{\pi\,(1-y)^{c}}\,\mathrm{Re} \int_0^{\infty} d\nu\, e^{-i\nu \ln(1-y)}\, g_{c+i\nu}. \tag{A.14} $$
Apart from constant factors, Eq. (A.14) expresses the inverse Mellin transform as the inverse
Fourier transform $\nu \to -\ln(1-y)$ of $g_{c+i\nu}$. The transformation to the inverse fast Fourier transform can be made as in the direct case.
We have implemented the above algorithm within the Mathematica system with a gain of CPU
time by over an order of magnitude with respect to the direct numerical evaluation. A typical run
on a standard PC takes O(1) minute. We have also checked our numerical results with direct
integration in Fortran.
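The inversion formula (A.14), read as $f(y) = (1-y)^{-c}\,\pi^{-1}\,\mathrm{Re}\int_0^\infty d\nu\, e^{-i\nu\ln(1-y)}\, g_{c+i\nu}$, can be sketched in a few lines of Python. The sketch below is only illustrative and is not the Mathematica or Fortran code mentioned above: the name `inverse_mellin`, the cut-off `nu_max` and the number of points `n` are assumptions, and the integral is evaluated by a plain midpoint rule at a single value of $y$ rather than by an inverse FFT.

```python
import cmath
import math


def inverse_mellin(g, y, c=2.0, nu_max=400.0, n=40000):
    """Eq. (A.14) truncated at nu_max and evaluated with the midpoint rule.

    g is a callable returning the Mellin transform g_N for complex N.
    """
    log_z = math.log1p(-y)                  # ln(1 - y)
    d_nu = nu_max / n
    total = 0.0
    for k in range(n):
        nu = (k + 0.5) * d_nu
        total += (cmath.exp(-1j * nu * log_z) * g(complex(c, nu))).real
    return math.exp(-c * log_z) * total * d_nu / math.pi   # prefactor (1 - y)^(-c) / pi


# Example: g_N = 2/N^3 is the Mellin transform of ln^2(1 - y).
print(inverse_mellin(lambda N: 2.0 / N ** 3, 0.5), math.log(0.5) ** 2)
```

The truncation at `nu_max` is acceptable here because the test function decays like $1/\nu^3$; a more slowly decaying $g_N$ would require a larger cut-off.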
References
[1] SLD Collaboration, K. Abe, et al., Phys. Rev. Lett. 84 (2000) 4300.
[2] ALEPH Collaboration, A. Heister, et al., Phys. Lett. B 512 (2001) 30.
[3] OPAL Collaboration, G. Abbiendi, et al., Eur. Phys. J. C 29 (2003) 463.
[4] DELPHI Collaboration, G. Barker, et al., DELPHI 2002-069, CONF 603.
[5] V.G. Kartvelishvili, A.K. Likhoded, V.A. Petrov, Phys. Lett. B 78 (1978) 615.
[6] C. Peterson, D. Schlatter, I. Schmitt, P.M. Zerwas, Phys. Rev. D 27 (1983) 105.
[7] M. Cacciari, E. Gardi, Nucl. Phys. B 664 (2003) 299.
[8] E. Gardi, Nucl. Phys. B 622 (2002) 365.
[9] U. Aglietti, G. Ferrera, G. Ricciardi, Nucl. Phys. B 768 (2007) 85.
[10] See, for example: Yu.L. Dokshitzer, V.A. Khoze, A.H. Mueller, S.I. Troian, Basics of Perturbative QCD, Frontières, Gif-sur-Yvette, France, 1991, 274 p.;
     R.K. Ellis, W.J. Stirling, B.R. Webber, QCD and Collider Physics, Cambridge Monographs on Particle Physics, Nuclear Physics and Cosmology 8 (1996) 1-435.
[11] B. Mele, P. Nason, Nucl. Phys. B 361 (1991) 626.
[12] L.N. Lipatov, Sov. J. Nucl. Phys. 20 (1975) 95;
     V.N. Gribov, L.N. Lipatov, Sov. J. Nucl. Phys. 15 (1972) 438;
     Yu.L. Dokshitzer, Sov. Phys. JETP 46 (1977) 641.
[13] G. Altarelli, G. Parisi, Nucl. Phys. B 126 (1977) 298.
[14] M. Cacciari, S. Catani, Nucl. Phys. B 617 (2001) 253.
[15] G. Corcella, A.D. Mitov, Nucl. Phys. B 623 (2002) 247.
[16] M. Cacciari, G. Corcella, A.D. Mitov, JHEP 0212 (2002) 015.
[17] G. Corcella, Nucl. Phys. B 705 (2005) 363;
     G. Corcella, Nucl. Phys. B 715 (2005) 609, Erratum.
[18] G. Corcella, V. Drollinger, Nucl. Phys. B 730 (2005) 82.
[19] M. Cacciari, P. Nason, Phys. Rev. Lett. 89 (2002) 122003.
[20] B. Ermolaev, M. Greco, S. Troyan, Phys. Lett. B 522 (2001) 57.
[21] D. Shirkov, Nucl. Phys. B (Proc. Suppl.) 152 (2006) 51.
[22] N.G. Stefanis, W. Schroers, H.C. Kim, Phys. Lett. B 449 (1999) 299;
     N.G. Stefanis, W. Schroers, H.C. Kim, Eur. Phys. J. C 18 (2000) 137.
[23] U. Aglietti, G. Ricciardi, Phys. Rev. D 70 (2004) 114008.
[24] G. Altarelli, R.K. Ellis, G. Martinelli, S.-Y. Pi, Nucl. Phys. B 160 (1979) 301.
[25] P.J. Rijken, W.L. van Neerven, Nucl. Phys. B 487 (1997) 233.
[26] A.D. Mitov, S. Moch, Nucl. Phys. B 751 (2006) 18.
[27] K. Melnikov, A.D. Mitov, Phys. Rev. D 70 (2004) 034027.
[28] A.D. Mitov, Phys. Rev. D 71 (2005) 054021.
[29] A.D. Mitov, S. Moch, A. Vogt, Phys. Lett. B 638 (2006) 61.
[30] G. Sterman, Nucl. Phys. B 281 (1987) 310.
[31] S. Catani, L. Trentadue, Nucl. Phys. B 327 (1989) 323.
[32] S. Moch, J.A.M. Vermaseren, A. Vogt, Nucl. Phys. B 688 (2004) 101.
[33]
[34]
[35]
[36]
[37]
[38]
[39]
[40]
[41]
[42]
[43]
[44]
[45]
[46]
[47]
[48]
[49]
[50]
[51]
S. Catani, D. de Florian, M. Grazzini, P. Nason, JHEP 0307 (2003) 028.

G.P. Korchemsky, G. Sterman, Nucl. Phys. B 437 (1995) 415.
M. Beneke, V.M. Braun, Nucl. Phys. B 454 (1995) 253.
S. Catani, M.L. Mangano, P. Nason, L. Trentadue, Nucl. Phys. B 478 (1996) 273.
E. Gardi, JHEP 0502 (2005) 053.
U. Aglietti, G. Ferrera, G. Ricciardi, Phys. Rev. D 74 (2006) 034004.
G. Corcella, A.D. Mitov, Nucl. Phys. B 676 (2004) 346.
D. Amati, A. Bassetto, M. Ciafaloni, G. Marchesini, G. Veneziano, Nucl. Phys. B 173 (1980) 429.
W. Bernreuther, Ann. Phys. 151 (1983) 127;
W. Bernreuther, W. Wetzel, Nucl. Phys. B 197 (1982) 228;
W. Bernreuther, W. Wetzel, Nucl. Phys. B 513 (1998) 758, Erratum.
G. Corcella, I.G. Knowles, G. Marchesini, S. Moretti, K. Odagiri, P. Richardson, M.H. Seymour, B.R. Webber,
JHEP 0101 (2001) 010.
T. Sjostrand, S. Mrenna, P. Skands, JHEP 0605 (2006) 026.
S. Eidelman, et al., Phys. Lett. B 592 (2004) 1.
U. Aglietti, G. Corbò, unpublished.
S. Catani, G. Marchesini, B.R. Webber, Nucl. Phys. B 349 (1991) 635.
J. Blümlein, V. Ravindran, Phys. Lett. B 640 (2006) 40.
Alternative approach to the regularization of odd-dimensional AdS gravity
Pablo Mora
Instituto de Física, Facultad de Ciencias, Iguá 4225, Montevideo, Uruguay
Received 30 August 2006; received in revised form 6 March 2007; accepted 13 April 2007
Abstract
In this paper I present an action principle for odd-dimensional AdS gravity which consists of introducing another manifold with the same boundary and a very specific boundary term. This new action allows
an alternative approach to the regularization of the theory, yielding a finite Euclidean action and finite
conserved charges. The choice of the boundary term is justified on the grounds that an enhanced almost
off-shell local AdS/Conformal symmetry arises for that very special choice. One may say that the boundary
term is dictated by a guiding symmetry principle. Two sets of boundary conditions are considered, which
yield regularization procedures analogous to (but different from) the standard background subtraction and
counterterms regularization methods. The Noether charges are constructed in general. As an application
it is shown that for Schwarzschild-AdS black holes the charge associated to the time-like Killing vector is
finite and is indeed the mass. The Euclidean action for Schwarzschild-AdS black holes is computed, and it
turns out to be finite, and to yield the right thermodynamics. The previous sentence may be interpreted in
the sense that the boundary term dictated by the symmetry principle is the one that correctly regularizes the
action.
PACS: 04.50.+h; 11.10.Kk; 04.70.-s; 04.60.-m
E-mail address: pablmora@gmail.com.

doi:10.1016/j.nuclphysb.2007.04.010
P. Mora / Nuclear Physics B 775 (2007) 202-223
1. Introduction
The AdS/CFT correspondence [1-3]¹ has generated great interest in AdS gravity for asymptotically AdS space-times. Semi-classical saddle point calculations on the gravity side would
correspond to strong coupling properties of the dual conformal theory.
As the gravitational action diverges, suitable regularization methods are required in order to
obtain sensible results.
There are essentially two main approaches to the regularization for the case of the standard
Dirichlet boundary conditions, the background subtraction method [3,5,6] and the counterterms method [7-12].
In this paper we present an alternative formalism for the regularization of General Relativity
(GR) with a cosmological constant (also called AdS gravity) in odd-dimensional space-times.
To that end I consider an action functional for odd-dimensional AdS gravity which consists of
introducing another manifold with the same boundary and a very specific boundary term. Two
possible sets of boundary conditions will be considered, different from the standard Dirichlet
boundary conditions, which lead to action principles for which the action is regularized in a way
analogous to the counterterms and to the background subtraction methods respectively, but
differing from those methods in several aspects.
On the counterterms side it is shown that the action and formalism of Ref. [13] can be
recovered as a particular case. The approach of Ref. [13] has been studied and developed further
in Refs. [14,15], where among other things the relationship with the above mentioned standard
counterterms approach is discussed. To avoid confusion between the approach of Refs. [13-15]
and the counterterms of Refs. [7-12], R. Olea used the word "kounterterms" to describe the
former, a convention that I will follow here.
The construction proposed in this paper also makes possible an approach to the regularization in the spirit of the background subtraction methods, which however is not the same as the
Hawking-Page approach, as will be shown below.
It is worthwhile to mention that the problem of regularizing AdS gravity by introducing an
action principle with suitable boundary terms and non-Dirichlet boundary conditions has been
solved in the even-dimensional case in Refs. [16,17].
The idea of the alternative kounterterms approach to the regularization of odd-dimensional
AdS gravity introduced in Refs. [13-15] was to borrow the boundary term used to regularize
Chern-Simons gravities² in Ref. [19], with a very specific relative coefficient between the standard AdS gravity bulk term and the borrowed boundary term, and using the same boundary
conditions. Perhaps surprisingly this approach did work, yet one can hardly avoid wondering whether
there is some profound reason for that to happen.
An important clue is that the boundary term in Ref. [19] can be understood as coming from
the extension from Chern-Simons forms to transgression forms,³ as mentioned in Ref. [19] and
discussed in detail in Refs. [23,24]. Transgression forms involve two gauge potentials $A$ and $\bar A$,
and Chern-Simons forms are just transgression forms with $\bar A = 0$.
The extension of Chern-Simons to transgression forms as a device to regularize the theory
was done in 2 + 1 dimensions in Ref. [25] (where the second field $\bar A$ was understood as a fixed
reference background), and in the context of actions for extended objects in Refs. [26,27] (here
with both fields taken as dynamical). Afterwards, in the above mentioned Refs. [19,23,24], it
was shown how to use transgressions to properly regularize Chern-Simons gravity theories in
arbitrary dimension. Other works using transgressions as actions are Refs. [28-31].
¹ A recent overview of the status of the AdS-CFT correspondence is given in Ref. [4].
² For a review of Chern-Simons gravity see Ref. [18].
³ For mathematical background on transgression forms see Refs. [20-22].
The key point in the transition from Chern-Simons to transgression forms is gauge invariance.
While Chern-Simons forms are quasi-invariant, changing by a closed form under gauge transformations, transgression forms are truly gauge invariant. For instance the motivation for that
transition in Refs. [26,27] was to have a truly gauge invariant action even in the case of branes
with boundaries. Thus the results of Refs. [19,23,24] can be construed as follows: the boundary
terms dictated by the gauge principle turn out to be the ones that properly regularize the action.
The question that naturally arises is then: is there a symmetry principle which somehow explains why the boundary term of Ref. [13] works? The answer is affirmative, and that is one of
the main results of this paper.
By analogy with the case of the transgressions for the AdS group [24] we introduce an action
for odd-dimensional AdS gravity with two sets of fields; that is, in addition to the vielbein
$e^a$ and the spin connection $\omega^{ab}$ with support in a manifold $M$, we have a vielbein $\bar e^a$ and a
spin connection $\bar\omega^{ab}$ with support in a manifold $\bar M$ with a common boundary with $M$ (that is,
$\partial M \equiv \partial\bar M$).
The complication of having an additional set of fields is compensated by:
(i) The arising of an enhanced almost off-shell local AdS/Conformal symmetry for a very
special choice of the boundary term.
(ii) The fact that both the background subtraction and the kounterterms approach can be regarded as particular cases of this framework. We show that the action principle of Ref. [13] arises
for a specific choice of the second field (regarded as a reference configuration or vacuum).
The background subtraction regularization corresponds to a different choice of the second field
or reference configuration. It is however very important to emphasize again that in the kounterterms case there is no extra input of information required besides the original A configuration,
as $\bar A$ is constructed from it in a direct way.
The Noether charges are constructed in general for the action given, and the charge of Ref. [13]
is a particular case of the general formula.
As computations on the kounterterms side of the present framework were done in Refs.
[13-15], here I do some background subtraction computations as an application, and to show
that the proposed method does indeed work. It is shown that for Schwarzschild-AdS black holes
the charge associated to the time-like Killing vector is finite and is indeed the mass. The Euclidean action for Schwarzschild-AdS black holes with different horizon topologies is computed,
with suitable backgrounds for each case, and it turns out to be finite, and to yield the right thermodynamics.
The important point of the relationship between the standard counterterms approach and our
kounterterms approach has been discussed in Refs. [14,15]; therefore I refer the reader to those
papers in that regard.
2. The action
2.1. General setting
The reader should be warned that the presentation of this section is somewhat indirect. I first write
the AdS transgression form derived in Ref. [24], and the transgression action discussed there.
Afterwards I consider an action for AdS gravity with a boundary term borrowed from the AdS
transgression with a coefficient to be determined, plus a doubling of the fields analogous to the
one in the transgression. We will show that with the proper coefficient the resulting action has an
enhanced symmetry and is properly regularized.
2.1.1. Review of AdS transgressions
We briefly review in this sub-subsection some results from Ref. [24] that we will use in what
follows.
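The AdS transgression in dimension $d = 2n + 1$ is⁴ [24]
$$ T_{2n+1} = \kappa \int_0^1 dt\, \epsilon \left(R + t^2 e^2\right)^n e - \kappa \int_0^1 dt\, \epsilon \left(\bar R + t^2 \bar e^2\right)^n \bar e + d\alpha_{2n} \tag{1} $$
where
$$ \alpha_{2n} = -\kappa n \int_0^1 dt \int_0^1 ds\, \epsilon\, \theta\, e_t \left\{ tR + (1-t)\bar R - t(1-t)\theta^2 + s^2 e_t^2 \right\}^{n-1}. \tag{2} $$
Here $e^a$ and $\bar e^a$ are the two vielbeins and $\omega^{ab}$ and $\bar\omega^{ab}$ the two spin connections, $R = d\omega + \omega^2$
and $\bar R = d\bar\omega + \bar\omega^2$ are the corresponding curvatures, $\theta = \omega - \bar\omega$ and $e_t = te + (1-t)\bar e$.
Written in a more compact way
$$ \alpha_{2n} = -\kappa n \int_0^1 dt \int_0^1 ds\, \epsilon\, \theta\, e_t\, R_{st}^{\,n-1} \tag{3} $$
where
$$ R_{st} = tR + (1-t)\bar R - t(1-t)\theta^2 + s^2 e_t^2 . $$
The action for transgressions for the AdS group is taken to be [24]
$$ I_{Trans} = \kappa \int_M \int_0^1 dt\, \epsilon \left(R + t^2 e^2\right)^n e - \kappa \int_{\bar M} \int_0^1 dt\, \epsilon \left(\bar R + t^2 \bar e^2\right)^n \bar e + \int_{\partial M} \alpha_{2n} \tag{4} $$
where $M$ and $\bar M$ are two manifolds with a common boundary, that is, the boundaries of $M$
and $\bar M$ coincide, $\partial M \equiv \partial \bar M$. Notice that this is a generalization of the simpler case where
$M \equiv \bar M$, which is physically motivated by the fact that both sheets may not even have the same
topology (for instance if $M$ is a black hole space-time and $\bar M$ is the AdS space-time).
We write the transgression action as
$$ I_{Trans} = \int_M L_{LLCS} - \int_{\bar M} \bar L_{LLCS} + \int_{\partial M} \alpha_{2n} \tag{5} $$
where the Lanczos-Lovelock-Chern-Simons Lagrangian $L_{LLCS}$ is
$$ L_{LLCS} = \kappa \int_0^1 dt\, \epsilon \left(R + t^2 e^2\right)^n e. \tag{6} $$
⁴ I will use a compact notation where $\epsilon$ stands for the Levi-Civita symbol $\epsilon_{a_1 \ldots a_d}$ and wedge products of differential
forms are understood. For instance:
$$ \epsilon R e^{d-2} \equiv \epsilon_{a_1 a_2 \ldots a_d} R^{a_1 a_2} e^{a_3} \cdots e^{a_d} . $$
This clarification should be enough to follow the rest of the paper, but if there is any doubt on notation check [13,19,24].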
The transgression form is invariant under gauge transformations for the AdS gauge group
SO(d 2, 2). These are the Lorentz transformations
ab = dab ca cb cb ac ,
ea = ab eb
(7)
and gauge translations

ab = eb a ea b ,
ea = da ba b .
Writing the AdS gauge connection
(8)
as5
ab
Jab + ea Pa
2
where Jab and Pa are the generators of the AdS group (for Lorentz transformations and translations respectively) and the gauge parameter
A=
ab
Jab + a Pa ,
2
the AdS gauge transformations take the compact form
=
A = D = d A + A.
Invariance under Lorentz transformations is immediate, as all the ingredients of the action are
Lorentz covariant and are contracted with the Levi-Civita invariant tensor ($\theta = \omega - \bar\omega$ is Lorentz
covariant too, unlike $\omega$ or $\bar\omega$), but invariance under translations is very non-trivial, yet true.
The transgression action is also invariant under the AdS gauge transformations, but it is only
necessary that the transformations on $M$ and $\bar M$ agree on $\partial M$. The variation of the boundary
term cancels the variations of both bulk terms.
2.1.2. Extended action for general relativity
For general relativity we consider the action

1
1 d
1 d2 1 d
d2
IGR =
+ e
+ e + n
2n
Re
R e
d 2
d
d 2
d
M
(9)
with the same 2n . We will see in several ways that the proper value for the constant n is
1
1
2
n1 . The action I
n = n(d2)f
GR is explicitly invariant
(n1) with f (n 1) = 0 dt (t 1)
5 Actually as a gauge connection has dimensions of (length)1 we should write
A=
ea
ab
Jab + Pa
2
l
where l is the AdS radius. We choose l = 1 throughout the present paper, as it is straightforward to reintroduce l
everywhere using dimensional analysis.
under Lorentz transformations, just like ITrans . We write IGR as

IGR = LEH L EH + n
2n
(10)
with the EinsteinHilbert Lagrangian LEH given by

1
1 d
d2
+ e .
Re
LEH =
d 2
d
(11)
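The integral $f(n-1)$ entering the coefficient of the boundary term can be evaluated in closed form. The following Python sketch is only illustrative and assumes the definition $f(n-1) = \int_0^1 dt\,(t^2-1)^{n-1}$; the helper name `f` simply mirrors that notation and is not taken from any library. The integrand is expanded with the binomial theorem and integrated term by term, using exact rational arithmetic.

```python
from fractions import Fraction
from math import comb


def f(m):
    """f(m) = int_0^1 dt (t^2 - 1)^m = sum_k C(m, k) (-1)^(m-k) / (2k + 1)."""
    return sum(Fraction(comb(m, k) * (-1) ** (m - k), 2 * k + 1) for k in range(m + 1))


# Values for the lowest odd dimensions d = 2n + 1.
for n in range(1, 5):
    print(f"d = {2 * n + 1}: f(n-1) = {f(n - 1)}")
```

Under this assumption the sign of $f(n-1)$ alternates with $n$, so the sign of any coefficient proportional to $1/f(n-1)$ depends on the dimension.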
Notice the doubling of the fields analogous to the transgression. In a saddle point evaluation of
the Euclidean action, or in the evaluation of the Noether charges, the configuration $A$ will be the
one of interest while the configuration $\bar A$ will be a reference vacuum, giving the background
subtraction or the counterterms approach, depending on the nature of that vacuum.
It is important however to emphasize that there is no little flag labeling one of the configurations as the dynamical one and the other one as a background, and I believe in general both
must be treated on the same footing. In particular one could consider saddle point configurations
with the roles of $A$ and $\bar A$, namely the configuration of interest and the vacuum, interchanged,
considering the Euclidean action with the opposite sign.⁶
2.2. Variation of the action and field equations
The variation of the transgression action yields

n

ITrans =
R e + n R n1 T
R e + n R n1 T
1

2 2 n1
dt R + t e
1
e + n

2 2 n1
dt R + t e

e
+ 2n
(12)
2
2
where R = R + e and R = R + e . From this variation we can read the transgression field
equations [24]
R n = 0,
R n = 0,
R n1 T = 0,
R n1 T = 0.
(13)
For general relativity the variation of the action is

d3

d3
d3
Re e + T e
R e e + T ed3
IGR =
M

+
M
ed2
ed2
+
+ 2n

d 2
d 2

(14)
⁶ The idea is that the time variables of observers in each configuration must be regarded as having opposite signs, so
that each of them would see its piece of the action as the positive one, and none would see undesirable effects such as an
exponentially enhanced probability of black hole nucleation from its past to its future (I am grateful to an anonymous
referee for raising this question).
giving the standard field equations for general relativity with a cosmological constant
d3 = 0,
Re
R ed3 = 0,
T ed3 = 0,
T ed3 = 0.
(15)
If the vielbein is invertible the equation of motion $\epsilon T e^{d-3} = 0$ implies $T = 0$ (and $\epsilon \bar T \bar e^{d-3} = 0$
implies $\bar T = 0$).
It is of course clear that the equations of both theories are quite different and that solutions for
one theory will not in general be solutions for the other.
2.3. Boundary conditions

When we obtained the field equations in the previous section, we should have supplemented
the action with suitable boundary conditions that make the boundary contribution to the variation
of the action vanish, so that the action is truly an extremum when the field equations hold. Here
we present two such conditions, though there may be others.
The first one, considered in Refs. [13,19], which we called the background independent configuration, corresponds to the reference configuration $\tilde A$ chosen as
$$\tilde e = 0,\qquad \tilde\omega^{ij} = \omega^{ij}\qquad\text{and}\qquad \tilde\omega^{1j} = 0, \qquad (16)$$
where 1 corresponds to the direction normal to the boundary (the normal being $e^1$) and $i, j$ are different from 1. Then $\theta^{ij} = 0$ and $\theta^{1i} = \omega^{1i}$, $\tilde T = 0$ and $\tilde R = R - \theta^2$ for the components with support in the boundary. This configuration is the most natural and economical because given $e$ and $\omega$ no further information is required, and the bulk term in $\tilde M$ vanishes, which can be interpreted in the sense that the second manifold is not even necessary. It is straightforward to check that for this configuration the general relativity action of Eq. (9) reduces to the one discussed in Ref. [13], where it was shown that it gives a well defined action principle for asymptotically locally AdS (ALAdS) space-times.
The second one, discussed for transgressions in Ref. [24], corresponds to
$$\Delta A \equiv A - \tilde A \to 0$$
with a fast enough fall-off to kill the boundary term when the coordinate along the direction normal to the boundary approaches the boundary. Looking at Eq. (14) we see that in this case $\int_{\partial M}\delta\alpha_{2n} \to 0$ just as in Ref. [24], while the remaining part of the boundary contribution to the variation,
$$\int_{\partial M}\left(-\kappa\,\epsilon\,\frac{e^{d-2}}{d-2}\,\delta\omega + \kappa\,\epsilon\,\frac{\tilde e^{d-2}}{d-2}\,\delta\tilde\omega\right),$$
will clearly vanish if $e - \tilde e \to 0$ and $\theta \to 0$ fast enough towards the boundary, as assumed. The fulfillment of this condition was explicitly checked for the configurations considered in the concrete examples below.
I find it tempting to name the second condition the "boundary without boundary" condition, as one may regard the manifolds $M$ and $\tilde M$ with opposite orientation (let's call it $\tilde M^{-}$) joined at $\partial M$ as a single topological manifold $M \cup \tilde M^{-}$, which however would not be a smooth manifold in general. If the boundary were at a finite distance, this boundary condition would mean that there is no discontinuity across the boundary. In fact for a boundary at a finite distance the fields would have no discontinuity and the boundary term would be zero, meaning the boundary is a fictitious one. I find the situation somewhat reminiscent of the method of images in electrostatics, where a physical situation with a boundary is replaced by a configuration without a boundary in a wider region.
2.4. Enhanced symmetry: Local AdS symmetry for general relativity
As mentioned before both $I_{Trans}$ and $I_{GR}$ are explicitly invariant under Lorentz transformations. The transgression action is in addition invariant off-shell under gauge translations and hence under the whole AdS group. We will exploit that off-shell invariance to show that the GR action with the boundary term given and for a particular value of the coefficient $\alpha_n$ is invariant under the AdS group when certain conditions are fulfilled.
We only need to consider gauge translations generated by the gauge parameter $\lambda^a$. The variation of $I_{Trans}$ is in this case
$$\delta I_{Trans} = \int_{\partial M}\Big[-\kappa\,\epsilon\lambda\,\bar R^{\,n} + \kappa\,\epsilon\lambda\,\bar{\tilde R}^{\,n} + 2n\kappa\int_0^1 dt\,\epsilon\lambda\left(R + t^2e^2\right)^{n-1}e^2$$
$$\qquad - 2n\kappa\int_0^1 dt\,\epsilon\lambda\left(\tilde R + t^2\tilde e^2\right)^{n-1}\tilde e^2 + \delta\alpha_{2n}\Big]. \qquad (17)$$
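But $\delta I_{Trans} = 0$ owing to the off-shell gauge invariance of the transgression action, for any field configuration. Then
$$\int_{\partial M}\delta\alpha_{2n} = \int_{\partial M}\Big[\kappa\,\epsilon\lambda\,\bar R^{\,n} - \kappa\,\epsilon\lambda\,\bar{\tilde R}^{\,n} - 2n\kappa\int_0^1 dt\,\epsilon\lambda\left(R + t^2e^2\right)^{n-1}e^2$$
$$\qquad + 2n\kappa\int_0^1 dt\,\epsilon\lambda\left(\tilde R + t^2\tilde e^2\right)^{n-1}\tilde e^2\Big]. \qquad (18)$$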
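On the other hand the variation of the GR action under gauge translations is
$$\delta I_{GR} = \kappa\int_M (d-3)\,\epsilon\lambda\,\bar R\,T\,e^{d-4} - \kappa\int_{\tilde M} (d-3)\,\epsilon\lambda\,\bar{\tilde R}\,\tilde T\,\tilde e^{d-4}$$
$$\qquad + \int_{\partial M}\Big[2\kappa\,\epsilon\lambda\,\frac{e^{d-1}}{d-2} - 2\kappa\,\epsilon\lambda\,\frac{\tilde e^{d-1}}{d-2} - \kappa\,\epsilon\lambda\,\bar R\,e^{d-3} + \kappa\,\epsilon\lambda\,\bar{\tilde R}\,\tilde e^{d-3} + \alpha_n\,\delta\alpha_{2n}\Big] \qquad (19)$$
where $\delta\alpha_{2n}$ is given above.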

We will ask for the following conditions:
(i) Vanishing of the bulk terms
$$(d-3)\,\epsilon\lambda\,\bar R\,T\,e^{d-4} = 0, \qquad (20)$$
$$(d-3)\,\epsilon\lambda\,\bar{\tilde R}\,\tilde T\,\tilde e^{d-4} = 0. \qquad (21)$$
These conditions are certainly fulfilled if the torsions vanish, $T = \tilde T = 0$, which seems like a natural and rather weak condition, but there may be other interesting configurations for which the torsion does not vanish but the bulk terms are still zero, for instance some generalization to AdS gravity of the configurations studied by Chanda and Zanelli [32], having $\bar R = \bar{\tilde R} = 0$ but non-zero torsions.
(ii) Asymptotic solution to GR equations of motion. We require that
$$\int_{\partial M}\epsilon\lambda\left(\bar R\,e^{d-3} - \bar{\tilde R}\,\tilde e^{d-3}\right) = 0 \qquad (22)$$
which is satisfied if the GR equations $\epsilon\,\bar R\,e^{d-3} = 0$ and $\epsilon\,\bar{\tilde R}\,\tilde e^{d-3} = 0$ are satisfied asymptotically (not necessarily in all the bulk) with a fast enough fall-off when approaching the boundary, but may also be satisfied with weaker conditions owing to cancellation between both terms.
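(iii) Asymptotically AdS configurations. We require that the configurations are asymptotically AdS, $\bar R = 0$ and $\bar{\tilde R} = 0$, or $R = -e^2$ and $\tilde R = -\tilde e^2$, with a fast enough fall-off when approaching the boundary to make
$$\int_{\partial M}\epsilon\lambda\left(\bar R^{\,n} - \bar{\tilde R}^{\,n}\right) = 0$$
and to make
$$-2n\kappa\int_{\partial M}\int_0^1 dt\,\epsilon\lambda\left(R + t^2e^2\right)^{n-1}e^2 + 2n\kappa\int_{\partial M}\int_0^1 dt\,\epsilon\lambda\left(\tilde R + t^2\tilde e^2\right)^{n-1}\tilde e^2$$
$$= -2n\kappa\int_{\partial M}\int_0^1 dt\,\left(t^2-1\right)^{n-1}\epsilon\lambda\,e^{d-1} + 2n\kappa\int_{\partial M}\int_0^1 dt\,\left(t^2-1\right)^{n-1}\epsilon\lambda\,\tilde e^{d-1}$$
$$= \int_{\partial M}\Big[-2n\kappa f(n-1)\,\epsilon\lambda\,e^{d-1} + 2n\kappa f(n-1)\,\epsilon\lambda\,\tilde e^{d-1}\Big]. \qquad (23)$$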
It is important to remark that condition (iii) does not imply condition (ii), even though the AdS
configurations are solutions of the GR equations, because an asymptotic fall off fast enough to
make zero the terms required in (iii) may not be fast enough to kill the terms required in (ii).
It turns out that if (i), (ii) and (iii) are satisfied and $\alpha_n = \frac{1}{n(d-2)f(n-1)}$ then the GR action is invariant under gauge translations,
$$\delta I_{GR} = 0,$$
which together with its Lorentz invariance implies that under these conditions $I_{GR}$ is invariant under the full AdS group (which is the conformal group in $d-1$ dimensions).
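Notice that the term
$$\int_{\partial M}\left(2\kappa\,\epsilon\lambda\,\frac{e^{d-1}}{d-2} - 2\kappa\,\epsilon\lambda\,\frac{\tilde e^{d-1}}{d-2}\right)$$
in $\delta I_{GR}$ is not automatically zero, so there is a non-trivial cancellation of this term with $\alpha_n\int_{\partial M}\delta\alpha_{2n}$. It is this very non-trivial cancellation that fixes the relative coefficient $\alpha_n$ to the unique value found, and there lies the heart of the invariance presented in this subsection.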
A slightly different case is the above mentioned background independent configuration, considered in Refs. [13,19], where $\tilde e = 0$, $\tilde\omega^{ij} = \omega^{ij}$ and $\tilde\omega^{1j} = 0$, where 1 corresponds to the direction normal to the boundary (the normal being $e^1$) and $i, j$ are different from 1. Then $\theta^{ij} = 0$ and $\theta^{1i} = \omega^{1i}$, $\tilde T = 0$ and $\tilde R = R - \theta^2$ for the components with support in the boundary.
For the background independent configuration conditions (i) and (ii) are the same, being even more easily fulfilled for $\tilde e$ and $\tilde\omega$. The condition (iii) is modified for $\tilde e$ and $\tilde\omega$ because $\bar{\tilde R} = \tilde R$; we require for those fields
$$\int_{\partial M}\epsilon\lambda\,\tilde R^{\,n} = 0.$$
That is enough to make
$$\delta I_{GR} = 0,$$
which implies that under these conditions $I_{GR}$ is invariant under the full AdS group. It must be emphasized however that under generic AdS gauge transformations the gauge condition $\tilde e = 0$ is not preserved.
The previous considerations are valid for the Euclidean GR action with periodic boundary
conditions in the Euclidean time, but also for the Lorentzian GR action provided that the gauge
parameters in the initial and final time space-like hypersurfaces vanish. The vanishing of the
gauge parameters for the initial and final hypersurfaces in the Lorentzian case is required because
otherwise the boundary terms of the variation coming from those hypersurfaces would spoil the
AdS invariance of the GR action.
In order to be more explicit about what a fast enough fall-off means in conditions (ii) and (iii) we can be more specific about the gauge parameter $\lambda$ and afterwards look at the topological black hole solutions considered in the next sections [33–35] as examples to show that the class of configurations satisfying both conditions is not only not empty but rather quite wide.
We may consider a gauge parameter $\lambda$ which grows at most as the radial coordinate $r$ as $r \to \infty$, that is $\lambda = O(r)$. For instance a parameter that is covariantly constant for AdS, $D\lambda = 0$, does satisfy that condition.7 In fact a dependence of $\lambda$ on a higher power of $r$ will generate gauge transformations that are singular at the boundary.
In that case condition (ii) is satisfied if $\epsilon\,\bar R\,e^{d-3}$ and $\epsilon\,\bar{\tilde R}\,\tilde e^{d-3}$ fall off asymptotically faster than $1/r$. The behaviour in the bulk is clearly not constrained by condition (ii).
Concerning condition (iii), to make8
$$\int_{\partial M}\epsilon\lambda\left(\bar R^{\,n} - \bar{\tilde R}^{\,n}\right) = 0$$
it will be necessary that $\epsilon\,\bar R^{\,n}$ and $\epsilon\,\bar{\tilde R}^{\,n}$ fall off faster than $1/r$. That is a quite weak condition for asymptotically locally AdS space-times, easily met by the black hole solutions considered in the next sections, for which the components of $\bar R$ and $\bar{\tilde R}$ with support at the boundary scale with $r$ as $O(1/r^{d-3})$ (everything else in the integral being just angular factors).
7 In that case, with the coordinates and notation of next section we should have 1 = 0m = 0, 0 = 01 = C (1) r,
n
m (m) , where the C (a) s are arbitrary constants.
m = 0m = C (m) r and m
n e = n C
8 Remember that in these expressions $n = \frac{d-1}{2}$.
Concerning the validity of Eq. (23) what we need is that asymptotically $\bar R = 0$ and $\bar{\tilde R} = 0$ with a fast enough fall-off to allow us to replace $R$ by $-e^2$ and $\tilde R$ by $-\tilde e^2$ in the integrals. The leading order of the components with support at the boundary of $e^2$ and $\tilde e^2$ goes as $r^2$ for AdS asymptotics, and we have $n - 1 = \frac{d-3}{2}$ factors $(R + t^2e^2)$ and a factor $e^2$ (or $(\tilde R + t^2\tilde e^2)$ and a factor $\tilde e^2$). We can write $R = \bar R - e^2$ ($\tilde R = \bar{\tilde R} - \tilde e^2$); then if we suppose $\lambda = O(r)$ as before, the term with one $\bar R$ and $d-3$ $e$'s vanishes because of (ii) above (and the same holds for the tilde fields for all the paragraph), and the next order corresponds to two $\bar R$ and $d-5$ $e$'s, to kill which we must require that the components of $\bar R$ and $\bar{\tilde R}$ with support at the boundary fall off with $r$ faster than $1/r^{\frac{d-4}{2}}$. As with condition (ii), condition (iii) does not impose any restriction on the behaviour in the bulk.
If we consider as an example the black hole solutions of the next sections, there is no problem with condition (ii), as those configurations actually satisfy the field equations $\epsilon\,\bar R\,e^{d-3} = 0$ and $\epsilon\,\bar{\tilde R}\,\tilde e^{d-3} = 0$, that is far more than what we need to require.
The first part of condition (iii) is also verified, as it was just mentioned. The second part of condition (iii) is also verified because, as we already said, the components of $\bar R$ and $\bar{\tilde R}$ with support at the boundary scale with $r$ as $O(1/r^{d-3})$.
For the background independent configuration conditions (i) and (ii) lead to the same requirements again. The condition (iii), which is modified for $\tilde e$ and $\tilde\omega$ to
$$\int_{\partial M}\epsilon\lambda\,\tilde R^{\,n} = 0,$$
leads to a required fall-off faster than $1/r$ for $\tilde R^{\,n}$ if $\lambda = O(r)$, which is satisfied for the above mentioned black hole configurations.
3. Euclidean action and thermodynamics for Schwarzschild black holes
In this section I will evaluate the Euclidean action for Schwarzschild black holes with different asymptotic topologies, with suitable reference backgrounds. To that end I will first evaluate the Euclidean action with $A$ and $\tilde A$ taken to be two black hole like configurations with the same asymptotic topology, and eventually choose the right $\tilde A$ so that for a given black hole configuration $A$ the whole Euclidean geometry is non-singular.
I will consider the action of Eq. (9) and the black hole solutions of Refs. [33–35]. These solutions have line element
$$ds^2 = -\Delta^2(r)\,dt^2 + \frac{dr^2}{\Delta^2(r)} + r^2\,d\Sigma^2_{d-2} \qquad (24)$$
with
$$\Delta^2 = \gamma - \frac{2GM}{r^{d-3}} + r^2 \qquad (25)$$
where $d\Sigma^2_{d-2}$ is the line element of the $(d-2)$-dimensional manifold of constant curvature proportional to $\gamma = -1, 0, 1$. The event horizon $r_+$ is given by $\Delta(r_+) = 0$.
3.1. Evaluation of the Euclidean action

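In order to evaluate the Euclidean action for two black hole configurations with masses $M$ and $\tilde M$ respectively the relevant non-vanishing ingredients, written in terms of the vielbein $\hat e^m$ and spin connection $\hat\omega^{mn}$ of the constant curvature manifold, are
$$e^0 = \Delta\,dt,\qquad e^1 = \frac{1}{\Delta}\,dr,\qquad e^m = r\,\hat e^m,$$
$$\omega^{01} = \frac{(\Delta^2)'}{2}\,dt,\qquad \omega^{1m} = -\Delta\,\hat e^m,\qquad \omega^{mn} = \hat\omega^{mn},$$
$$R^{01} = -\frac{(\Delta^2)''}{2}\,dt\,dr,\qquad R^{0m} = -\frac{(\Delta^2)'}{2}\,\Delta\,dt\,\hat e^m,$$
$$R^{1m} = -\frac{1}{\Delta}\frac{(\Delta^2)'}{2}\,dr\,\hat e^m,\qquad R^{mn} = \left(\gamma - \Delta^2\right)\hat e^m\hat e^n \qquad (26)$$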
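with
$$\Delta^2 = \gamma - \frac{2GM}{r^{d-3}} + r^2$$
(hence $\frac{(\Delta^2)'}{2} = r + (d-3)\frac{GM}{r^{d-2}}$), for $\gamma = -1, 0, 1$. Similar expressions hold with $e \to \tilde e$, $\omega \to \tilde\omega$, $R \to \tilde R$, $\Delta \to \tilde\Delta$ and $M \to \tilde M$. We then have
$$\theta^{1m} = -(\Delta - \tilde\Delta)\,\hat e^m,\qquad (\theta^2)^{mn} = -(\Delta - \tilde\Delta)^2\,\hat e^m\hat e^n,$$
$$\theta^{01} = \left[\frac{(\Delta^2)'}{2} - \frac{(\tilde\Delta^2)'}{2}\right]dt,\qquad (\theta^2)^{0m} = -\left[\frac{(\Delta^2)'}{2} - \frac{(\tilde\Delta^2)'}{2}\right](\Delta - \tilde\Delta)\,dt\,\hat e^m,$$
$$e_t^0 = \left[t\Delta + (1-t)\tilde\Delta\right]dt,\qquad e_t^m = r\,\hat e^m,$$
$$(e_t^2)^{0m} = r\left[t\Delta + (1-t)\tilde\Delta\right]dt\,\hat e^m,\qquad (e_t^2)^{mn} = r^2\,\hat e^m\hat e^n. \qquad (27)$$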
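We will need the components of $R_{st} = tR + (1-t)\tilde R - t(1-t)\theta^2 + s^2e_t^2$ with group indices $mn$ and $0m$. Those are
$$(R_{st})^{mn} = \left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}\hat e^m\hat e^n,$$
$$(R_{st})^{0m} = \left\{-t\,\frac{(\Delta^2)'}{2} - (1-t)\,\frac{(\tilde\Delta^2)'}{2} + r s^2\right\}\left[t\Delta + (1-t)\tilde\Delta\right]dt\,\hat e^m. \qquad (28)$$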
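The compact forms of the components of $R_{st}$ rest on two elementary algebraic identities. As a quick sanity check, the following Python sketch compares the term-by-term combination $tR + (1-t)\tilde R - t(1-t)\theta^2 + s^2e_t^2$ with the factorized expressions, treating the values of $\Delta$, $\tilde\Delta$, $(\Delta^2)'/2$ and $(\tilde\Delta^2)'/2$ as independent numbers. The function names are hypothetical, and the component coefficients used as input are an assumption: they follow the expressions for $R$, $\theta^2$ and $e_t^2$ as read off from the ingredients listed above.

```python
import math
import random


def rst_mn_termwise(t, s, r, gamma, delta, delta_bg):
    """Coefficient of e^m e^n, summing the four pieces of R_st one by one."""
    curvature = t * (gamma - delta ** 2) + (1 - t) * (gamma - delta_bg ** 2)
    theta_sq = -(delta - delta_bg) ** 2          # (theta^2)^{mn} coefficient
    return curvature - t * (1 - t) * theta_sq + s ** 2 * r ** 2


def rst_mn_compact(t, s, r, gamma, delta, delta_bg):
    """Factorized form: gamma - [t Delta + (1-t) Delta~]^2 + s^2 r^2."""
    return gamma - (t * delta + (1 - t) * delta_bg) ** 2 + s ** 2 * r ** 2


def rst_0m_termwise(t, s, r, delta, delta_bg, half_d, half_d_bg):
    """Coefficient of dt e^m; half_d stands for (Delta^2)'/2."""
    curvature = -t * half_d * delta - (1 - t) * half_d_bg * delta_bg
    theta_sq = -(half_d - half_d_bg) * (delta - delta_bg)   # (theta^2)^{0m}
    vielbein_sq = r * (t * delta + (1 - t) * delta_bg)       # (e_t^2)^{0m}
    return curvature - t * (1 - t) * theta_sq + s ** 2 * vielbein_sq


def rst_0m_compact(t, s, r, delta, delta_bg, half_d, half_d_bg):
    """Factorized form of the same component."""
    bracket = -t * half_d - (1 - t) * half_d_bg + r * s ** 2
    return bracket * (t * delta + (1 - t) * delta_bg)


def identities_hold(samples=200, seed=0):
    """Compare both forms on random inputs; True if they always agree."""
    rng = random.Random(seed)
    for _ in range(samples):
        t, s = rng.random(), rng.random()
        r, delta, delta_bg, a, b = (rng.uniform(0.1, 5.0) for _ in range(5))
        gamma = rng.choice([-1, 0, 1])
        if not math.isclose(rst_mn_termwise(t, s, r, gamma, delta, delta_bg),
                            rst_mn_compact(t, s, r, gamma, delta, delta_bg),
                            rel_tol=1e-9, abs_tol=1e-9):
            return False
        if not math.isclose(rst_0m_termwise(t, s, r, delta, delta_bg, a, b),
                            rst_0m_compact(t, s, r, delta, delta_bg, a, b),
                            rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True
```

Calling `identities_hold()` exercises both identities on random inputs; since they are purely algebraic, they do not depend on the specific form of $\Delta(r)$.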
The bulk contribution to the Euclidean action can be evaluated using the equations of motion $\epsilon\,(R + e^2)\,e^{d-3} = 0$ and it is
$$I_E^{bulk} = 2\kappa(d-3)!\,\Sigma_{d-2}\,\beta\,r_+^{2n} - 2\kappa(d-3)!\,\Sigma_{d-2}\,\beta\,\tilde r_+^{2n} \qquad (29)$$
where $\Sigma_{d-2}$ is the volume of the constant curvature manifold corresponding to the sections of fixed unit radius and fixed Euclidean time, which in the spherically symmetric case we will also call $\Omega_{d-2}$, coming from integration over the angular variables.
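The boundary term is
$$\alpha_{2n} = -\kappa n\int_0^1 dt\int_0^1 ds\,\Big[2\,\epsilon_{1m_1 0 m_2\cdots m_{2n-1}}\,\theta^{1m_1}\,e_t^0\,R_{st}^{m_2m_3}\cdots R_{st}^{m_{2n-2}m_{2n-1}}$$
$$\qquad + 4(n-1)\,\epsilon_{1m_1 m_2 0 m_3\cdots m_{2n-1}}\,\theta^{1m_1}\,e_t^{m_2}\,R_{st}^{0m_3}\,R_{st}^{m_4m_5}\cdots R_{st}^{m_{2n-2}m_{2n-1}}$$
$$\qquad + 2\,\epsilon_{01m_1 m_2\cdots m_{2n-1}}\,\theta^{01}\,e_t^{m_1}\,R_{st}^{m_2m_3}\cdots R_{st}^{m_{2n-2}m_{2n-1}}\Big]. \qquad (30)$$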
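Inserting the expressions for the terms of this equation and taking into account signs coming from bringing the $\epsilon$ to its standard order, commuting differentials and an additional sign coming from the orientation of the boundary (or equivalently bringing the differential $dr$ to the front from the canonical order, $dt\,dr\,\hat e^{m_1}\cdots\hat e^{m_{2n-1}} = -dr\,dt\,\hat e^{m_1}\cdots\hat e^{m_{2n-1}}$) we get
$$\int_{\partial M}\alpha_{2n} = \kappa n\,2(d-2)!\,\Sigma_{d-2}\,\beta\int_0^1 dt\int_0^1 ds\,\Big\{(\Delta - \tilde\Delta)\left[t\Delta + (1-t)\tilde\Delta\right]\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-1}$$
$$\qquad + 2(n-1)\,r\,(\Delta - \tilde\Delta)\left[-t\,\frac{(\Delta^2)'}{2} - (1-t)\,\frac{(\tilde\Delta^2)'}{2} + r s^2\right]\left[t\Delta + (1-t)\tilde\Delta\right]\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-2}$$
$$\qquad + \left[\frac{(\Delta^2)'}{2} - \frac{(\tilde\Delta^2)'}{2}\right] r\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-1}\Big\}. \qquad (31)$$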
We will drop terms that give a vanishing contribution when $r \to \infty$ and keep only divergent or finite contributions in that limit. To that end we notice that
$$\frac{(\Delta^2)'}{2} = r + (d-3)\frac{GM}{r^{d-2}}$$
and that for $r \to \infty$ we get
$$\Delta \approx r + \frac{\gamma}{2r} - \frac{GM}{r^{d-2}}$$
and hence
2 2

G(M M)
G(M M)

= (d 3)
,

,
d2
d2
2
2
r
r

2 + s 2 r 2 s 2 1 r 2 + O(r),
t + (1 t)
2 2

O r 3(d2) ,
( )
2
2
2

r 2 ( )
+ O r 3(d2) ,
r( )
2
2

+ O r 3(d2) .

r 2 ( )
r( )
2
(32)
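The boundary term is then
$$\int_{\partial M}\alpha_{2n} = \kappa n\,2(d-2)!\,\Sigma_{d-2}\,\beta\int_0^1 ds\int_0^1 dt\,\Big\{(\Delta - \tilde\Delta)\left[t\Delta + (1-t)\tilde\Delta\right]\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-1}$$
$$\qquad + 2(n-1)\,r^2\left(s^2-1\right)(\Delta - \tilde\Delta)\left[t\Delta + (1-t)\tilde\Delta\right]\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-2}\Big\}$$
$$\qquad + \kappa n\,2(d-2)!\,\Sigma_{d-2}\,\beta\int_0^1 ds\,(d-3)\,G(M - \tilde M)\left(s^2-1\right)^{n-1}. \qquad (33)$$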
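The integral in the parameter $t$ can be done through the substitution
$$u = \gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2$$
and the result is
$$\int_{\partial M}\alpha_{2n} = -\kappa(d-2)!\,\Sigma_{d-2}\,\beta\int_0^1 ds\,\Big[u^n + 2n\,r^2\left(s^2-1\right)u^{n-1}\Big]_{\gamma - \tilde\Delta^2 + s^2r^2}^{\gamma - \Delta^2 + s^2r^2}$$
$$\qquad + \kappa n\,2(d-2)!\,\Sigma_{d-2}\,\beta\int_0^1 ds\,(d-3)\,G(M - \tilde M)\left(s^2-1\right)^{n-1}. \qquad (34)$$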
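Notice that $\gamma - \Delta^2 + s^2r^2 = (s^2-1)r^2 + \frac{2GM}{r^{d-3}}$ and $\gamma - \tilde\Delta^2 + s^2r^2 = (s^2-1)r^2 + \frac{2G\tilde M}{r^{d-3}}$, and that $\int_{\partial M}\alpha_{2n}$ is evaluated at the boundary where $r \to \infty$. Keeping only terms that will give a divergent or finite contribution we can expand
$$\left[(s^2-1)r^2 + \frac{2GM}{r^{d-3}}\right]^{n} = \left(s^2-1\right)^n r^{2n} + n\,2GM\left(s^2-1\right)^{n-1} + \cdots,$$
$$\left[(s^2-1)r^2 + \frac{2GM}{r^{d-3}}\right]^{n-1} = \left(s^2-1\right)^{n-1} r^{2n-2} + (n-1)\frac{2GM}{r^2}\left(s^2-1\right)^{n-2} + \cdots. \qquad (35)$$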
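The divergent contributions cancel between the upper and lower limits of the integrals. The resulting $\int_{\partial M}\alpha_{2n}$ is then
$$\int_{\partial M}\alpha_{2n} = -\kappa(d-2)!\,\Sigma_{d-2}\,\beta\,f(n-1)\,n\,2G(M - \tilde M) \qquad (36)$$
where
$$f(n-1) = \int_0^1 ds\,\left(s^2-1\right)^{n-1} = (-1)^{n-1}\frac{(n-1)!\,2^{n-1}}{(2n-1)!!}.$$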
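The closed form of $f(n-1)$ and the cancellation of the divergent pieces can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: it works in units of $\kappa(d-2)!\,\Sigma_{d-2}\,\beta$, it takes the $s$-integrated bracket $u^n + 2n r^2(s^2-1)u^{n-1}$ with $u = (s^2-1)r^2 + 2GM/r^{d-3}$ evaluated between the two masses with an overall minus sign, adds the $(d-3)$ term, and compares with $-2n f(n-1) G(M - \tilde M)$ at large $r$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def f(m):
    """Exact value of int_0^1 (s^2 - 1)^m ds from the binomial expansion."""
    return sum(Fraction(comb(m, k) * (-1) ** (m - k), 2 * k + 1)
               for k in range(m + 1))


def f_closed(m):
    """Closed form (-1)^m m! 2^m / (2m+1)!!, i.e. f(n-1) for m = n-1."""
    odd_double_factorial = 1
    for k in range(1, 2 * m + 2, 2):
        odd_double_factorial *= k
    return Fraction((-1) ** m * factorial(m) * 2 ** m, odd_double_factorial)


def s_integrated_bracket(n, two_gm, r):
    """int_0^1 ds [u^n + 2 n r^2 (s^2-1) u^(n-1)], u = (s^2-1) r^2 + 2GM/r^(d-3)."""
    d = 2 * n + 1
    x = Fraction(two_gm) / Fraction(r) ** (d - 3)
    r2 = Fraction(r) ** 2
    first = sum(comb(n, k) * f(n - k) * r2 ** (n - k) * x ** k
                for k in range(n + 1))
    second = sum(comb(n - 1, k) * f(n - k) * r2 ** (n - 1 - k) * x ** k
                 for k in range(n))
    return first + 2 * n * r2 * second


def boundary_term(n, g, mass, mass_bg, r):
    """Boundary term at radius r, in units of kappa (d-2)! Sigma beta."""
    d = 2 * n + 1
    bracket = (s_integrated_bracket(n, 2 * g * mass, r)
               - s_integrated_bracket(n, 2 * g * mass_bg, r))
    return -bracket + 2 * n * (d - 3) * g * (mass - mass_bg) * f(n - 1)


def boundary_term_limit(n, g, mass, mass_bg):
    """Expected r -> infinity value in the same units."""
    return -2 * n * f(n - 1) * g * (mass - mass_bg)
```

In this form the $r^{2n}$ pieces cancel identically between the two masses, and the remaining difference from `boundary_term_limit` is suppressed by inverse powers of $r$.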
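With the choice $\alpha_n = \frac{1}{n f(n-1)(d-2)}$ the total action reads
$$I_E^{Total} = 2\kappa(d-3)!\,\Sigma_{d-2}\,\beta\left[r_+^{2n} - \tilde r_+^{2n} - G(M - \tilde M)\right]. \qquad (37)$$
We can replace $\kappa = \frac{1}{2G(d-2)!\,\Omega_{d-2}}$ to get
$$I_E^{Total} = \frac{\Sigma_{d-2}}{\Omega_{d-2}}\frac{1}{(d-2)G}\,\beta\left[r_+^{d-1} - \tilde r_+^{d-1}\right] - \frac{\Sigma_{d-2}}{\Omega_{d-2}}\frac{\beta\,(M - \tilde M)}{(d-2)}. \qquad (38)$$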
3.2. Determination of $\beta$ and a suitable reference background

The Euclidean time period $\beta$ is determined by requiring that the Euclidean solution be non-singular. For generic black hole metrics of the form
$$ds^2 = -\Delta^2(r)\,dt^2 + \frac{dr^2}{\Delta^2(r)} + r^2\,d\Sigma^2_{d-2}$$
with an event horizon $r_+$ given by $\Delta(r_+) = 0$ it turns out that
$$\beta = \frac{4\pi}{(\Delta^2)'(r_+)}.$$
For a general relativity black hole of mass $M$ it is
$$\beta = \frac{4\pi}{(d-1)\,r_+ + \frac{(d-3)\gamma}{r_+}}.$$
This implies
$$M = \frac{r_+^{d-3}}{2G}\left(\gamma + r_+^2\right).$$
In order to have a sensible thermodynamics the background configuration must be chosen as having an arbitrary $\tilde\beta$. For $\gamma = 0$ and $\gamma = 1$ the proper configurations are the zero mass black holes, with $\tilde M = 0$ and $\tilde r_+ = 0$, which correspond to AdS for $\gamma = 1$. These configurations have $\tilde\beta = \infty$, which amounts to an ill-defined or arbitrary $\tilde\beta$, as it can be checked that these Euclidean configurations are non-singular for any $\tilde\beta$. In particular, for the evaluation of the Euclidean action of the previous section to make sense we must take $\tilde\beta = \beta$.
In the case $\gamma = -1$ the situation is a little more complicated. Requiring an ill-defined $\tilde\beta = \infty$ yields in this case
$$\tilde r_+ = \pm\sqrt{\frac{d-3}{d-1}}$$
and as $r$ is positive we must pick the positive root. This value of $\tilde r_+$ gives
$$\tilde M = -\frac{1}{(d-1)G}\left(\frac{d-3}{d-1}\right)^{\frac{d-3}{2}} \equiv M_0.$$
Notice that $M_0$ is negative. Again for the evaluation of the Euclidean action to make sense we need to take $\tilde\beta = \beta$.
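The relations of this subsection are easy to cross-check numerically. The following Python sketch is an illustration, not part of the original calculation: it assumes $\Delta^2 = \gamma - 2GM/r^{d-3} + r^2$, uses the mass and inverse period written as functions of the horizon radius, and verifies that for $\gamma = -1$ the radius $\tilde r_+ = \sqrt{(d-3)/(d-1)}$ gives $1/\tilde\beta = 0$ together with the negative mass $M_0$. Function names and the default $G = 1$ are arbitrary choices.

```python
import math


def delta_squared(r, mass, d, gamma, g=1.0):
    """Metric function Delta^2(r) = gamma - 2 G M / r^(d-3) + r^2."""
    return gamma - 2.0 * g * mass / r ** (d - 3) + r ** 2


def mass_from_horizon(r_plus, d, gamma, g=1.0):
    """Mass obtained by solving Delta^2(r_+) = 0 for M."""
    return r_plus ** (d - 3) * (gamma + r_plus ** 2) / (2.0 * g)


def inverse_beta(r_plus, d, gamma):
    """1/beta = [(d-1) r_+ + (d-3) gamma / r_+] / (4 pi)."""
    return ((d - 1) * r_plus + (d - 3) * gamma / r_plus) / (4.0 * math.pi)


def hyperbolic_background(d, g=1.0):
    """Horizon radius and mass M_0 of the gamma = -1 background with 1/beta = 0."""
    ratio = (d - 3) / (d - 1)
    r_bg = math.sqrt(ratio)
    m0 = -(ratio ** ((d - 3) / 2)) / ((d - 1) * g)
    return r_bg, m0
```

For instance, `hyperbolic_background(5)` returns $\tilde r_+ = 1/\sqrt{2}$ and $M_0 = -1/8$ in units with $G = 1$, and `mass_from_horizon` evaluated at that radius with $\gamma = -1$ reproduces the same $M_0$.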
3.3. Black hole thermodynamics

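We then get
$$I_E^{Total} = \frac{\Sigma_{d-2}}{\Omega_{d-2}}\frac{1}{(d-2)G}\,\beta\,r_+^{d-1} - \frac{\Sigma_{d-2}}{\Omega_{d-2}}\frac{\beta M}{(d-2)} + \frac{\Sigma_{d-2}}{\Omega_{d-2}}\,\beta\,M_0\,\delta_{-1,\gamma}, \qquad (39)$$
which coincides with the result obtained in [13], except for a different $M$ independent term (the one that yields the vacuum energy) in that case.
To discuss the black hole thermodynamics we use that the Euclidean action $I$ is related with the free energy $F$ as $I = -\beta F$, while the free energy is related to the energy $E$ and the entropy $S$ as $F = E - TS = E - S/\beta$. Equivalently
$$I = -\beta E + S. \qquad (40)$$
Hence
$$E = -\frac{\partial I}{\partial\beta} = -\frac{\partial I}{\partial r_+}\frac{\partial r_+}{\partial\beta}. \qquad (41)$$
The result of this calculation is
$$E = \frac{\Sigma_{d-2}}{\Omega_{d-2}}\left(M - M_0\,\delta_{-1,\gamma}\right). \qquad (42)$$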
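The entropy can be calculated from
$$S = I + \beta E \qquad (43)$$
and it is
$$S = \frac{\Sigma_{d-2}}{\Omega_{d-2}}\frac{2\pi\,r_+^{2n-1}}{(2n-1)G} = \frac{\Sigma_{d-2}\,r_+^{2n-1}}{4G_N} = \frac{A}{4G_N} \qquad (44)$$
in agreement with the Bekenstein–Hawking entropy. We used that the standard Newton constant $G_N$ and the constant $G$ are related as [35]
$$G = \frac{8\pi}{(d-2)\,\Omega_{d-2}}\,G_N.$$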
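The consistency of the action, energy and entropy can be verified numerically. The sketch below is a minimal check under simplifying assumptions that are mine and not the text's: it sets $\Sigma_{d-2}/\Omega_{d-2} = 1$, drops the $M_0$ term (so it applies to $\gamma = 0, 1$; for $\gamma = -1$ the extra term shifts the action and the energy consistently), and uses hypothetical function names. It checks $I = -\beta E + S$ with $E = M$ and recovers the energy from a finite-difference version of $E = -\partial I/\partial\beta$.

```python
import math


def thermodynamics(r_plus, d, gamma=1.0, g=1.0):
    """Return (beta, mass, action, entropy) as functions of the horizon radius."""
    beta = 4.0 * math.pi / ((d - 1) * r_plus + (d - 3) * gamma / r_plus)
    mass = r_plus ** (d - 3) * (gamma + r_plus ** 2) / (2.0 * g)
    action = beta * r_plus ** (d - 1) / ((d - 2) * g) - beta * mass / (d - 2)
    entropy = 2.0 * math.pi * r_plus ** (d - 2) / ((d - 2) * g)
    return beta, mass, action, entropy


def energy_from_action(r_plus, d, gamma=1.0, g=1.0, step=1e-6):
    """E = -(dI/dr_+) / (d beta/dr_+), with central finite differences."""
    beta_up, _, action_up, _ = thermodynamics(r_plus + step, d, gamma, g)
    beta_dn, _, action_dn, _ = thermodynamics(r_plus - step, d, gamma, g)
    return -(action_up - action_dn) / (beta_up - beta_dn)
```

The finite-difference energy is ill-conditioned where $\partial\beta/\partial r_+ = 0$ (the minimum temperature for $\gamma = 1$), so the sample radii should be chosen away from that point.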
3.4. Discussion: Boundary terms versus the Hawking–Page approach

As it was already pointed out in Ref. [24] the background subtraction procedure used here is not the same as the one proposed by Gibbons and Hawking [5], or by Hawking and Page [6]. In those papers, the actions for two different configurations (for instance, for a black hole and Minkowski or AdS space) are subtracted, with the additional condition that the metrics match at a very large finite radius $r_0$ (eventually taken to infinity). In that case two different Euclidean time intervals $\beta$ and $\tilde\beta$ are involved, because the condition
$$ds^2\big|_{r_0} = d\tilde s^2\big|_{r_0}$$
implies
$$\beta(r_0)\,\Delta(r_0) = \tilde\beta(r_0)\,\tilde\Delta(r_0);$$
then, even though $\tilde\beta \to \beta$ when $r_0 \to \infty$, there is an extra contribution to the total bulk action (the difference of the bulk actions for the configuration of interest and the background) coming from the difference of the $\beta$'s [3,6].
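To see how the two periods approach each other, one can evaluate the matching condition directly. The Python sketch below is an illustration under assumptions: it takes the matching condition in the form $\beta\,\Delta(r_0) = \tilde\beta\,\tilde\Delta(r_0)$ with $\Delta^2 = \gamma - 2GM/r^{d-3} + r^2$, and the function names are hypothetical. It computes $\tilde\beta - \beta$ in a form that avoids subtracting nearly equal square roots.

```python
import math


def delta(r, mass, d, gamma=1.0, g=1.0):
    """Delta(r) for Delta^2 = gamma - 2 G M / r^(d-3) + r^2."""
    return math.sqrt(gamma - 2.0 * g * mass / r ** (d - 3) + r ** 2)


def period_shift(beta, r0, mass, mass_bg, d, gamma=1.0, g=1.0):
    """beta~ - beta from beta * Delta(r0) = beta~ * Delta~(r0)."""
    d_bh = delta(r0, mass, d, gamma, g)
    d_bg = delta(r0, mass_bg, d, gamma, g)
    # Delta^2 - Delta~^2 is known in closed form, which keeps the result stable.
    diff_sq = -2.0 * g * (mass - mass_bg) / r0 ** (d - 3)
    return beta * diff_sq / (d_bg * (d_bh + d_bg))
```

With these assumptions the shift behaves as $\tilde\beta - \beta \approx -\beta\,G(M - \tilde M)/r_0^{d-1}$ at large $r_0$. This is the same power with which the volume inside radius $r_0$ grows, which is presumably how a vanishing difference of periods can still leave a finite contribution to the subtracted bulk action.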
In our approach there is always only one $\beta$, as it must be in order to integrate the boundary term $B_{2n}$, where both sets of vielbein and spin connections appear entangled, but we do have an extra contribution coming from that boundary term.
It is worthwhile to emphasize that boundary term contributions are absent in the Hawking–Page approach to asymptotically AdS space-times, as the Gibbons–Hawking term is zero in that case.
It is instructive to compare both methods for the concrete example of the Schwarzschild–AdS black hole with spherical symmetry, as the extra contribution coming in the Hawking–Page method from the differing $\beta$'s comes in our approach from the boundary term, and both methods agree in that case.
When it comes to the conserved charges, discussed in the next section, we will see for the concrete example of the Schwarzschild black hole that the boundary term contribution is necessary to obtain a result for the mass in agreement with the one obtained from the thermodynamics. See in particular Section 4.2 below and the comment in the last paragraph of that section. It is hard to see where this contribution could come from in the Hawking–Page approach, as the Noether charges are defined as integrals in the boundary of spatial sections, therefore no integral in the Euclidean time is done and $\beta$ or $\tilde\beta$ are not involved in the result.
4. Conserved charges from Noether's theorem
4.1. Noether's charges
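The action is
$$I = \int_M \kappa\,\epsilon\left(\frac{1}{d-2}\,R\,e^{d-2} + \frac{1}{d}\,e^{d}\right) - \int_{\tilde M} \kappa\,\epsilon\left(\frac{1}{d-2}\,\tilde R\,\tilde e^{d-2} + \frac{1}{d}\,\tilde e^{d}\right) + \alpha_n\int_{\partial M}\alpha_{2n} \qquad (45)$$
where
$$\alpha_{2n} = -\kappa n\int_0^1 dt\int_0^1 ds\,\epsilon\,\theta\,e_t\left\{tR + (1-t)\tilde R - t(1-t)\theta^2 + s^2e_t^2\right\}^{n-1}, \qquad (46)$$
$\alpha_n$ is a constant relative factor and the boundaries coincide, $\partial M \equiv \partial\tilde M$. Applying Noether's theorem to this action we get the conserved current associated to the invariance under diffeomorphisms generated by the vector field $\xi$,
$$\star j = dQ_\xi \qquad (47)$$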
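with9
$$Q_\xi = \kappa\,\frac{\epsilon\,e^{d-2}\,I_\xi\omega - \epsilon\,\tilde e^{d-2}\,I_\xi\tilde\omega}{(d-2)} + \alpha_n\,I_\xi\alpha_{2n} \qquad (48)$$
which is to be integrated at the spatial boundary $\partial S$, which for instance for topological black holes is $\Sigma_{d-2}$. Here
$$I_\xi\alpha_{2n} = -\kappa n\int_0^1 dt\int_0^1 ds\,\epsilon\,I_\xi\theta\,e_t\,R_{st}^{\,n-1} + \kappa n\int_0^1 dt\int_0^1 ds\,\epsilon\,\theta\,I_\xi e_t\,R_{st}^{\,n-1}$$
$$\qquad - \kappa n(n-1)\int_0^1 dt\int_0^1 ds\,\epsilon\,\theta\,e_t\,I_\xi R_{st}\,R_{st}^{\,n-2} \qquad (49)$$
where $R_{st} = tR + (1-t)\tilde R - t(1-t)\theta^2 + s^2e_t^2$.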
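The three terms of $I_\xi\alpha_{2n}$ follow from the graded Leibniz rule for the contraction operator, $I_\xi(\alpha_p\beta_q) = I_\xi\alpha_p\,\beta_q + (-1)^p\alpha_p\,I_\xi\beta_q$. A minimal Python sketch of that rule is given below, with differential forms stored as dictionaries from sorted index tuples to coefficients. The representation and all function names are choices made for this illustration only; nothing here is specific to AdS gravity.

```python
def sort_sign(indices):
    """Return (sign, sorted tuple) for a product dx^i1 ... dx^ip; sign 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) != len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):                 # bubble sort, counting transpositions
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


def add_term(form, key, coeff):
    """Accumulate coeff * dx^key into form, dropping terms that cancel."""
    if coeff != 0:
        total = form.get(key, 0) + coeff
        if total == 0:
            form.pop(key, None)
        else:
            form[key] = total


def wedge(a, b):
    """Wedge product of two forms."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            sign, key = sort_sign(ia + ib)
            if sign:
                add_term(out, key, sign * ca * cb)
    return out


def contract(xi, form):
    """Contraction I_xi of a form with the vector xi = {index: component}."""
    out = {}
    for idx, coeff in form.items():
        for k, i in enumerate(idx):
            add_term(out, idx[:k] + idx[k + 1:], (-1) ** k * xi.get(i, 0) * coeff)
    return out


def add(a, b, scale=1):
    """Return a + scale * b."""
    out = dict(a)
    for key, coeff in b.items():
        add_term(out, key, scale * coeff)
    return out
```

For a one-form `a` and a two-form `b`, the rule reads `contract(xi, wedge(a, b)) == add(wedge(contract(xi, a), b), wedge(a, contract(xi, b)), scale=-1)`.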

4.2. Black hole mass
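We will evaluate the charge corresponding to $\xi = \frac{\partial}{\partial t}$ for two black hole configurations with masses $M$ and $\tilde M$ respectively; the relevant non-vanishing ingredients are the same used in the evaluation of the Euclidean action for black holes. We have
$$\int_{\partial S}\kappa\,\frac{\epsilon\,e^{d-2}\,I_\xi\omega - \epsilon\,\tilde e^{d-2}\,I_\xi\tilde\omega}{(d-2)} = \frac{\kappa}{(d-2)}\,\Sigma_{d-2}\,(d-2)!\;2\,r^{d-2}\left[\frac{(\Delta^2)'}{2} - \frac{(\tilde\Delta^2)'}{2}\right]. \qquad (50)$$
At the boundary $r \to \infty$ we get
$$\int_{\partial S}\kappa\,\frac{\epsilon\,e^{d-2}\,I_\xi\omega - \epsilon\,\tilde e^{d-2}\,I_\xi\tilde\omega}{(d-2)} = \frac{(d-3)}{(d-2)}\,\kappa\,\Sigma_{d-2}\,(d-2)!\,2G(M - \tilde M). \qquad (51)$$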
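Furthermore we have the following non-vanishing contribution to $I_\xi\alpha_{2n}$,
$$I_\xi\alpha_{2n} = -\kappa n\int_0^1 dt\int_0^1 ds\,\Big[2\,\epsilon_{01m_1\cdots m_{2n-1}}\,I_\xi\theta^{01}\,e_t^{m_1}\,R_{st}^{m_2m_3}\cdots R_{st}^{m_{2n-2}m_{2n-1}}$$
$$\qquad - 2\,\epsilon_{1m_1 0 m_2 m_3\cdots m_{2n-1}}\,\theta^{1m_1}\,I_\xi e_t^{0}\,R_{st}^{m_2m_3}\cdots R_{st}^{m_{2n-2}m_{2n-1}}$$
$$\qquad + 4(n-1)\,\epsilon_{1m_1 m_2 0 m_3\cdots m_{2n-1}}\,\theta^{1m_1}\,e_t^{m_2}\,I_\xi R_{st}^{0m_3}\,R_{st}^{m_4m_5}\cdots R_{st}^{m_{2n-2}m_{2n-1}}\Big]. \qquad (52)$$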
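9 The contraction operator $I_\xi$ is defined by acting on a $p$-form $\alpha_p$ as
$$I_\xi\alpha_p = \frac{1}{(p-1)!}\,\xi^\nu\,\alpha_{\nu\mu_1\cdots\mu_{p-1}}\,dx^{\mu_1}\cdots dx^{\mu_{p-1}}$$
and being an antiderivation in the sense that acting on the wedge product of differential forms $\alpha_p$ and $\beta_q$ of order $p$ and $q$ respectively gives $I_\xi(\alpha_p\beta_q) = I_\xi\alpha_p\,\beta_q + (-1)^p\,\alpha_p\,I_\xi\beta_q$.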
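Inserting the expressions for the terms of this equation and taking into account signs coming from bringing the $\epsilon$ to its standard order we get
$$\int_{\partial S}I_\xi\alpha_{2n} = -2\kappa n(d-2)!\,\Sigma_{d-2}\int_0^1 dt\int_0^1 ds\,\Big\{\left[\frac{(\Delta^2)'}{2} - \frac{(\tilde\Delta^2)'}{2}\right] r\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-1}$$
$$\qquad + (\Delta - \tilde\Delta)\left[t\Delta + (1-t)\tilde\Delta\right]\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-1}$$
$$\qquad + 2(n-1)\,r\,(\Delta - \tilde\Delta)\left[-t\,\frac{(\Delta^2)'}{2} - (1-t)\,\frac{(\tilde\Delta^2)'}{2} + r s^2\right]\left[t\Delta + (1-t)\tilde\Delta\right]\left\{\gamma - \left[t\Delta + (1-t)\tilde\Delta\right]^2 + s^2r^2\right\}^{n-2}\Big\}. \qquad (53)$$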
Notice that the integrals in $s$ and $t$ are just the same we did in the evaluation of the Euclidean action, so we can directly write down the result
$$\int_{\partial S}I_\xi\alpha_{2n} = \kappa(d-2)!\,\Sigma_{d-2}\,f(n-1)\,n\,2G(M - \tilde M) \qquad (54)$$
where $f(n-1) = \int_0^1 ds\,(s^2-1)^{n-1} = (-1)^{n-1}\frac{(n-1)!\,2^{n-1}}{(2n-1)!!}$. With the choice $\alpha_n = \frac{1}{n f(n-1)(d-2)}$ the total charge reads
$$\int_{\partial S}Q_\xi = \kappa\,\Sigma_{d-2}\,(d-2)!\,2G(M - \tilde M). \qquad (55)$$
We can replace $\kappa = \frac{1}{2G(d-2)!\,\Omega_{d-2}}$ to get
$$\int_{\partial S}Q_\xi = \frac{\Sigma_{d-2}}{\Omega_{d-2}}\,(M - \tilde M) \qquad (56)$$
which is the expected result. Notice that without the $I_\xi\alpha_{2n}$ contribution the charge would be
$$\int_{\partial S}Q_\xi^{bulk} = \frac{(d-3)}{(d-2)}\,\frac{\Sigma_{d-2}}{\Omega_{d-2}}\,(M - \tilde M). \qquad (57)$$
It is worthwhile to emphasize the significance of this result: the contribution to the Noether conserved charge coming from the boundary term is required to get a value of the mass in agreement with the one coming from the thermodynamics.
5. Discussion and conclusions
The results of this paper show that the boundary term suitable for properly regularizing odd-dimensional AdS gravity may be regarded as dictated or suggested by a symmetry principle, which could be seen as the reason for that particular term to work. The symmetry principle invoked is invariance under local transformations of the AdS group, and it holds almost off-shell.
The calculations of the thermodynamics and Noether charges of AdS–Schwarzschild black holes with different topologies are not new in their results, but are quite different in the methods used, as the contribution of the boundary terms is crucial in our approach, as discussed in Section 3.4 and the last paragraph of Section 4.2. Our method provides a regularization procedure which allows those calculations to be made in a uniform and systematic way for any solution.
The present work raises several questions and could be extended in several directions:
The fact that the action considered, with the precise boundary term chosen, has that extra symmetry suggests that it may be relevant in the study of the AdS–CFT correspondence, as the AdS group in dimension $d$ is the conformal group in dimension $d-1$.10 It is clear however that the possible relevance of the setup considered here to the AdS–CFT correspondence, beyond the remarks of Ref. [13], is just an interesting open question, which would require addressing several issues, such as studying the action presented here for Dirichlet boundary conditions. Concrete calculations that could shed light on this matter could be done concerning the issue of the conformal anomaly of the boundary CFT, in the spirit of Refs. [7,10–12]. One could consider the situation in which $\tilde A$ for AdS gravity corresponds to the configuration used in our kounterterms approach, and read the conformal anomaly from the variation of the action under radial diffeomorphisms (which induce Weyl transformations in the boundary for the boundary conditions of Ref. [19]) with $\tilde A$ kept fixed. This would correspond to the picture of the anomaly as arising from the non-invariance of the regulator (or subtraction procedure), as $\tilde A$ in our kounterterms approach could be seen as a regulator.
One direction that seems both interesting and accessible has to do with the extension to the supersymmetric case. It seems reasonable to guess that the boundary terms that would result from the transgression forms for suitable supersymmetric extensions of the AdS groups would regulate, with proper coefficients relative to the bulk, different versions of standard supergravities which are extensions of AdS gravities. In this respect it is worthwhile to remark that the issue of picking the right boundary terms in standard supergravities related to M-theory is an important one, even with possible implications for phenomenology, as can be seen in the recent papers by Moss [36] and references therein. The procedure followed in Ref. [36] was to compute the boundary terms order by order in an expansion in powers of some parameter, which is not guaranteed to end.
It is also possible and worth exploring that the transgression action of Eq. (4) would be even better suited for eventual application to the AdS–CFT correspondence, as its invariance under gauge transformations for the AdS/conformal group is exact (and of course off-shell) without further requirements or conditions. That would not be surprising if one believes that the effective field theory description of M-theory with corrections of higher order in the curvature included could in fact be a Chern–Simons supergravity, which was originally proposed in Ref. [37] and also explored in Refs. [38–40].
Finally, the action has an obvious symmetry under the interchange of $A$ and $\tilde A$ and change of sign of the action, which may have non-trivial consequences worth exploring. It is intriguing that Linde [41,42] studied a model of gravity coupled with scalar fields where the field content (including gravity) is duplicated (as it is for us) and a similar symmetry, which Linde calls antipodal symmetry, is the key to a way to solve the cosmological constant problem.11
10 If that were the case, it would however be puzzling that the conformal symmetry that would be induced if one chooses boundary conditions at infinity that do not break the symmetry and integrates out the bulk degrees of freedom would be a local symmetry with the conformal group as gauge group, while the CFT side of the AdS–CFT correspondence involves a globally symmetric conformal field theory. It may be that integrating out the bulk degrees of freedom corresponding to $A$ while keeping the degrees of freedom associated to the configuration $\tilde A$ of Refs. [13,19] as boundary degrees of freedom of the effective theory would reduce the symmetry from a local gauge redundancy to a global symmetry.
Acknowledgements
I am very grateful for discussions with R. Olea, R. Troncoso and J. Zanelli on the subject of this paper, as well as for the warm hospitality of the members of the Centro de Estudios Científicos, C.E.C.S., of Valdivia, Chile during several visits while this work was being done. I acknowledge funding for my research from the Program Fondo Nacional de Investigadores (FNI, DINACYT, Uruguay).
References
[1] J. Maldacena, JHEP 9807 (1998) 023, hep-th/9806087.
[2] J. Maldacena, Adv. Theor. Math. Phys. 2 (1998) 231, hep-th/9711200;
J. Maldacena, Int. J. Theor. Phys. 38 (1999) 1113, hep-th/0309246.
[3] E. Witten, Adv. Theor. Math. Phys. 2 (1998) 253, hep-th/9802150.
[4] G. Horowitz, J. Polchinski, Gauge/gravity duality, gr-qc/0602037.
[5] G.W. Gibbons, S.W. Hawking, Phys. Rev. D 15 (1977) 2753.
[6] S.W. Hawking, D.N. Page, Commun. Math. Phys. 87 (1983) 577.
[7] M. Henningson, K. Skenderis, JHEP 9807 (1998) 023, hep-th/9806087.
[8] R. Emparan, C.V. Johnson, R.C. Myers, Phys. Rev. D 60 (1999) 104001, hep-th/9903238.
[9] V. Balasubramanian, P. Krauss, Commun. Math. Phys. 238 (1999) 413, hep-th/9902121.
[10] S. de Haro, S.N. Solodukhin, K. Skenderis, Commun. Math. Phys. 217 (2001) 595, hep-th/0002230.
[11] K. Skenderis, J. Mod. Phys. A 16 (2001) 740, hep-th/0010138.
[12] I. Papadimitriou, K. Skenderis, JHEP 0508 (2005) 004, hep-th/0505190.
[13] P. Mora, R. Olea, R. Troncoso, J. Zanelli, Vacuum energy in odd-dimensional AdS gravity, hep-th/0412046.
[14] O. Miskovic, R. Olea, Phys. Lett. B 640 (2006) 101, hep-th/0603092.
[15] R. Olea, Regularization of odd-dimensional AdS gravity: Kounterterms, hep-th/0610230.
[16] R. Aros, M. Contreras, R. Olea, R. Troncoso, J. Zanelli, Phys. Rev. Lett. 84 (2000) 1647;
R. Aros, M. Contreras, R. Olea, R. Troncoso, J. Zanelli, Phys. Rev. D 62 (2000) 044002.
[17] R. Aros, C. Martínez, R. Troncoso, J. Zanelli, JHEP 0205 (2002) 020, hep-th/0204029.
[18] J. Zanelli, Lecture notes on Chern–Simons (super-)gravities, hep-th/0502193.
[19] P. Mora, R. Olea, R. Troncoso, J. Zanelli, Finite action principle for Chern–Simons AdS gravity, JHEP 0406 (2004) 036, hep-th/0405267.
[20] S.S. Chern, Complex Manifolds without Potential Theory, second ed., Springer, Berlin, 1979.
[21] T. Eguchi, P. Gilkey, A. Hanson, Phys. Rep. 66 (1980) 213.
[22] M. Nakahara, Geometry, Topology and Physics, Adam Hilger, 1991.
[23] P. Mora, Transgression forms as unifying principle in field theory, PhD thesis, Universidad de la República, Uruguay, 2003, hep-th/0512255.
[24] P. Mora, R. Olea, R. Troncoso, J. Zanelli, Transgression forms and extensions of Chern–Simons gauge theories, JHEP 0602 (2006) 067, hep-th/0601081.
[25] R. Aros, M. Contreras, R. Olea, R. Troncoso, J. Zanelli, Charges in 2 + 1 Dimensional Gravity and Supergravity, presented at the Strings '99 Conference, Potsdam, Germany, July 1999.
[26] P. Mora, H. Nishino, Phys. Lett. B 482 (2000) 222, hep-th/0002077.
[27] P. Mora, Nucl. Phys. B 594 (2001) 229, hep-th/0008180.
[28] A. Borowiec, M. Ferraris, M. Francaviglia, J. Phys. A 36 (2003) 2589.
11 In Refs. [41,42] both fields actually see each other in the bulk indirectly through the common volume element in the action, while in the models discussed here they only see each other in the boundary. However in the presence of branes with boundaries both fields interact at the branes' boundaries [26,27], so these could mediate a bulk to bulk interaction. It is worthwhile to remark that while antipodal symmetry in Linde's model is postulated ad hoc, in the models discussed here it is a natural byproduct of the construction of an action with enhanced symmetry.
[29] A. Borowiec, L. Fatibene, M. Ferraris, M. Francaviglia, Covariant Lagrangian formulation of Chern–Simons and BF theories, hep-th/0511060.
[30] F. Izaurieta, E. Rodriguez, P. Salgado, On transgression forms and Chern–Simons (super)gravity, hep-th/0512014.
[31] R. Aros, Charges and the boundary in Chern–Simons gravity, gr-qc/0601120.
[32] O. Chanda, J. Zanelli, Phys. Rev. D 55 (1997) 7580, hep-th/9702025.
[33] R.B. Mann, Class. Quantum Grav. 14 (1997) L109, gr-qc/9607071.
[34] R. Emparan, Phys. Lett. B 432 (1998) 74, hep-th/9804031.
[35] J. Crisóstomo, R. Troncoso, J. Zanelli, Phys. Rev. D 62 (2000) 084013.
[36] I. Moss, Nucl. Phys. B 729 (2005) 179, hep-th/0403106.
[37] R. Troncoso, J. Zanelli, Int. J. Theor. Phys. 38 (1999) 1181, hep-th/9807029.
[38] P. Horava, Phys. Rev. D 59 (1999) 046004, hep-th/9712130.
[39] M. Banados, Phys. Rev. Lett. 88 (2002) 031301, hep-th/0107214.
[40] H. Nastase, Towards a Chern–Simons M theory of OSp(1|32) × OSp(1|32), hep-th/0306269.
[41] A. Linde, Phys. Lett. B 200 (1988) 272.
[42] A. Linde, Inflation and Quantum Cosmology, Academic Press, 1990, see pp. 142–146.
Nuclear Physics B 775 [FS] (2007) 225–282
Analytic theory of the eight-vertex model

Vladimir V. Bazhanov, Vladimir V. Mangazeev
Department of Theoretical Physics, Research School of Physical Sciences and Engineering,
Australian National University, Canberra, ACT 0200, Australia
Received 27 September 2006; accepted 20 December 2006
Available online 13 January 2007
Abstract
We observe that the exactly solved eight-vertex solid-on-solid model contains a hitherto unnoticed arbitrary field parameter, similar to the horizontal field in the six-vertex model. The parameter is required to
describe a continuous spectrum of the unrestricted solid-on-solid model, which has an infinite-dimensional
space of states even for a finite lattice. The introduction of the continuous field parameter allows us to
completely review the theory of functional relations in the eight-vertex/SOS-model from a uniform analytic point of view. We also present a number of analytic and numerical techniques for the analysis of the
Bethe ansatz equations. It turns out that different solutions of these equations can be obtained from each
other by analytic continuation. In particular, for small lattices we explicitly demonstrate that the largest and
smallest eigenvalues of the transfer matrix of the eight-vertex model are just different branches of the same
multivalued function of the field parameter.
PACS: 02.30.Ik; 05.30.-d; 11.25.Hf
Keywords: Eight-vertex model; Bethe ansatz; Q-operator; Functional relations
1. Introduction
The powerful analytic and algebraic techniques discovered by Baxter in his pioneering papers
[1–4] on the exact solution of the eight-vertex lattice model laid the foundation for many important applications in the theory of integrable systems of statistical mechanics and quantum field
theory.
E-mail addresses: vladimir.bazhanov@anu.edu.au (V.V. Bazhanov), vladimir@maths.anu.edu.au (V.V. Mangazeev).

doi:10.1016/j.nuclphysb.2006.12.021
This paper concerns one of these techniques: the method of functional relations. Over the last three decades, since Baxter's original works [1–4], this method has been substantially developed and applied to a large number of various solvable models. However, the status of this method in the eight-vertex model itself, with an account of all subsequent developments, has not been recently reviewed. This paper is intended to (partially) fill this gap. Here we will adopt an analytic approach exploiting the existence of a (hitherto unnoticed) continuous field parameter in the solvable eight-vertex solid-on-solid model of Ref. [3].
For the purpose of the following discussion it will be useful to first summarize the key results of [1–4]. Here we will use essentially the same notations as those in [1]. Consider the homogeneous eight-vertex (8V) model on a square lattice of $N$ columns, with periodic boundary conditions. The model contains three arbitrary parameters $u$, $\eta$ and $q = e^{i\pi\tau}$, ${\rm Im}\,\tau > 0$, which enter the parametrization of the Boltzmann weights (the parameter $q$ enters as the nome for the elliptic theta-functions). The parameters $\eta$ and $q$ are considered as constants and the spectral parameter $u$ as a complex variable. We assume that the parameter $\eta$ is real and positive, $0 < \eta < \pi/2$, which corresponds to the disordered regime [5] of the model.
The row-to-row transfer matrix of the model, $\mathbf{T}(u)$, possesses remarkable analytic properties. Any of its eigenvalues, $T(u)$, (i) is an entire function of the variable $u$, and (ii) satisfies Baxter's famous functional equation,
$$T(u)\,Q(u) = f(u-\eta)\,Q(u+2\eta) + f(u+\eta)\,Q(u-2\eta), \tag{1.1}$$
where$^1$
$$f(u) = \bigl(\vartheta_4(u\,|\,q)\bigr)^N \tag{1.2}$$
and $Q(u)$ is an entire quasi-periodic function of $u$, such that
$$Q(u+\pi) = \pm(-1)^{N/2}\, Q(u), \qquad Q(u+2\pi\tau) = q^{-2N} e^{-2iuN}\, Q(u). \tag{1.3}$$
These analytic properties completely determine all eigenvalues of the transfer matrix $\mathbf{T}(u)$. Indeed, Eq. (1.1) implies that the zeroes $u_1, u_2, \ldots, u_n$ of $Q(u)$ satisfy the Bethe ansatz equations,
$$\frac{f(u_k+\eta)}{f(u_k-\eta)} = -\frac{Q(u_k+2\eta)}{Q(u_k-2\eta)}, \qquad Q(u_k) = 0, \quad k = 1, \ldots, n. \tag{1.4}$$
These equations, together with the periodicity relations (1.3), define the entire function $Q(u)$ (there will be many solutions corresponding to different eigenvectors). Once $Q(u)$ is known the eigenvalue $T(u)$ is evaluated from (1.1).
The entire functions $Q(u)$ appearing in (1.1) are, in fact, eigenvalues of another matrix, $\mathbf{Q}(u)$, called the Q-matrix. Originally it was constructed [1] in terms of some special transfer matrices. A different, but related, construction of the Q-matrix was given in [2] and later on used in the book [5]. An alternative approach to the 8V-model was developed in [3,4], where Baxter invented the eight-vertex solid-on-solid (SOS) model and solved it exactly by means of the co-ordinate Bethe ansatz. This approach provided another derivation of the same result (1.1)–(1.4), since the 8V-model is embedded within the SOS-model.
1 Here we use the standard theta-functions [6], $\vartheta_i(u\,|\,q)$, $i = 1, \ldots, 4$, $q = e^{i\pi\tau}$, ${\rm Im}\,\tau > 0$, with the periods $\pi$ and $\pi\tau$. Our spectral parameter $u$ is shifted with respect to that in [1] by a half of the imaginary period, see Section 2.5.1 for further details.
Baxter's Q-matrix (or the Q-operator) possesses various exceptional properties and plays an important role in many aspects of the theory of integrable systems. A complete theory of the Q-operator in the 8V-model is not yet developed. However, for the simpler models related to the quantum affine algebra $U_q(\widehat{sl}_2)$ (where the fundamental L-operators [7] are intertwined by the R-matrix of the six-vertex model) the properties of the Q-operator are very well understood [8]. In this case the Q-operators (actually, there are two different Q-operators, $\mathbf{Q}_+$ and $\mathbf{Q}_-$) are defined as traces of certain monodromy matrices associated with infinite-dimensional representations of the so-called q-oscillator algebra. The main algebraic properties of the Q-operators can be concisely expressed by a single factorization relation
$$\mathbf{T}^+_j(u) = \mathbf{Q}_+\bigl(u + (2j+1)\eta\bigr)\,\mathbf{Q}_-\bigl(u - (2j+1)\eta\bigr), \tag{1.5}$$
where $\mathbf{T}^+_j(u)$ is the transfer matrix associated with the infinite-dimensional highest weight representation of $U_q(sl_2)$ with an arbitrary (complex) weight $2j$. Remarkably, this relation alone leads to a simple derivation of all functional relations involving various fusion transfer matrices and Q-operators [8,9]. For this reason Eq. (1.5) can be regarded as a fundamental fusion relation: once it is derived, no further algebraic work is required.
An important part of the theory of the Q-operators concerns their analytic properties with respect to a certain parameter, which we call here the field parameter. In the context of conformal field theory (considered in [8,9]) this is the vacuum parameter, which determines the Virasoro highest weight $\Delta$; in the six-vertex model it corresponds to the horizontal field. In fact, the very existence of two different solutions [8,10] of the TQ-equation (1.1) can be simply illustrated by the fact that the spectrum of the transfer matrix does not depend on the sign of the field, whereas the spectrum of the Q-operator does.
It is well known that it is impossible to introduce an arbitrary field parameter into the "zero-field" or "symmetric" eight-vertex model of [1] without destroying its integrability. However, such a parameter is intrinsically present in the solvable SOS-model. It does not enter the Boltzmann weights, but arises from a proper definition of the space of states of the model. To realize this, recall that the SOS-model [4] is an interaction-round-a-face model where the face variables $\ell_i$ (called the heights) take arbitrary integer values $-\infty < \ell_i < +\infty$. Its transfer matrix acts in an infinite-dimensional space of states even for a finite lattice. It has a continuous spectrum, parameterized by the eigenvalue of the operator which simultaneously increments all height variables, $\ell_i \to \ell_i + 1$, on the lattice. Indeed, taking into account the results of [11,12], it is not difficult to conclude that the calculations of [4] require only a very simple modification to deduce that the eigenvalues of the SOS transfer matrix enjoy the same TQ-equation (1.1), but require different periodicity properties
$$Q_\pm(u+\pi) = e^{\pm i\varphi}\, Q_\pm(u), \qquad Q_\pm(u+2\pi\tau) = q^{-2N} e^{\pm\psi} e^{-2iuN}\, Q_\pm(u), \tag{1.6}$$
where the exponent $\varphi$ is arbitrary. It is determined by the eigenvalue $\omega = e^{2i\eta\varphi/\pi}$ of the height translation operator$^2$ (the second exponent $\psi$ is dependent on $\varphi$). It is natural to assume that the functions $Q_\pm(u)$ solving these equations are eigenvalues of the Q-operators for the SOS-model. Of course, it would be very desirable to obtain their explicit definition (and generalize the algebraic result (1.5) to the SOS-model); however, many properties of these operators can already be deduced from the information about their eigenvalues.
2 In [4] Baxter restricted the parameter $\eta$ to the "rational" values $L\eta = m_1\pi + m_2\pi\tau$, $L, m_1, m_2 \in \mathbb{Z}$, and considered a finite-dimensional subspace of the whole space of states, regarding the values of heights modulo $L$. In this case the phase factors $\omega = e^{2i\eta\varphi/\pi}$ take quantized values $\omega^L = 1$ (see [11,13] for further discussion of this point). Apart from providing the conceptual advantage of a finite-dimensional space of states, the above restriction on $\eta$ and $\omega$ was not used anywhere else in [4] and, therefore, can be removed. The transfer matrix of the 8V-model (reformulated as the SOS-model) acts only in the finite-dimensional subspace of the SOS space of states, corresponding to a discrete set of exponents $\varphi = k\pi$ and $\psi = 0$ (the value of $N$ is assumed to be even).
In this paper we will develop the analytic theory of the functional relations for the SOS-model starting from the eigenvalue equations (1.1) and (1.6). Bearing in mind that the TQ-equation (1.1) arises from very non-trivial algebraic "fusion" relations [1], it is not surprising that it implies all other functional relations. The required calculations are essentially the same as those in [8,9], apart from trivial modifications arising in the context of lattice models.
The eigenvalues $Q_\pm(u)$ are two linearly independent "Bloch wave" solutions [8,10] of the finite difference equation (1.1) for the same $T(u)$. Their quantum Wronskian $W(\varphi)$, defined as
$$2i\,W(\varphi)\,f(u) = Q_+(u+\eta)\,Q_-(u-\eta) - Q_+(u-\eta)\,Q_-(u+\eta), \tag{1.7}$$
is a complicated function of $\varphi$, $\eta$ and $q$, depending on the eigenvalue $T(u)$. The Bloch solutions $Q_\pm(u)$ are well defined provided the exponent $\varphi$ does not take some "singular values" (see Eq. (2.17) below), where $W(\varphi)$ vanishes. Otherwise Eq. (1.1) has only one quasi-periodic solution, while the second linearly independent solution does not possess any simple periodicity properties.
All singular cases (in fact, they split into different classes) can be effectively studied with a limiting procedure starting from a non-singular value of $\varphi$. In the simplest case, when $\eta$ is generic and $\varphi$ approaches the points $\varphi = k\pi$, $k \in \mathbb{Z}$, the solutions $Q_+(u)$ and $Q_-(u)$ smoothly approach the same value (which for even $N$ coincides with the eigenvalue $Q(u)$ of the 8V-model).
A more complicated situation occurs when the field $\varphi$ tends to a singular value, say $\varphi = 0$, while the parameter $\eta$ simultaneously approaches some rational fraction of $\pi$, where the transfer matrix of the 8V-model has degenerate eigenvalues. The limiting value of $T(u)$ is always uniquely defined. However, if $T(u)$ is a degenerate eigenvalue, the limiting values of $Q_\pm(u)$ are not uniquely defined. They have complete exact strings of zeroes whose position can be made arbitrary by changing the direction of the two-parameter $(\eta, \varphi)$-limit. Obviously, this reflects a non-uniqueness of eigenvectors for degenerate states [13]. An immediate consequence of this phenomenon is that, for rational $\eta/\pi$, there is no unique algebraic definition of the Q-operator in the symmetric 8V-model. This explains an important observation of [14], that Baxter's two Q-operators, constructed in [1] and [2], are actually different operators, with different eigenvalues for degenerate eigenstates.
Further, the eigenvalues $Q_\pm(u)$, considered as functions of $\varphi$, have rather complicated analytic properties. Besides having the (relatively simple) singular points discussed above, they are multivalued functions with algebraic branching points in the complex $\varphi$-plane. The corresponding multi-sheeted Riemann surface appears to be extremely complicated; we were only able to numerically explore it for some particular eigenvalues.
It is easy to see that the roots of the Bethe ansatz equations (1.4), considered as functions of $\varphi$, satisfy a system of first-order ordinary differential equations, $du_k/d\varphi = U_k(u_1, u_2, \ldots, u_n)$, where $U_k$ are meromorphic functions of their arguments. Using these equations one can analytically continue any particular solution of (1.4) along a continuous path between two points, corresponding to the same value of $\varphi$ on different sheets of the Riemann surface. In general, the resulting set of roots $u_1, u_2, \ldots, u_n$ differs from the initial one, but, of course, satisfies exactly the same Eq. (1.4). In other words, different solutions of the Bethe ansatz equations and, therefore, different eigenvalues of the transfer matrix can be obtained from each other by analytic continuation in the parameter $\varphi$.
Guided by the above observation one might be tempted to suggest that the Bethe ansatz equations have only one solution, considered as a function of $\varphi$. Undoubtedly, that could be an elegant resolution to the problem of completeness, which traditionally attracts a lot of attention in the literature (see, e.g., the recent papers [13,15] and references therein). At the moment we cannot prove or disprove the above assertion. The analytic structure of the eigenvalues of the eight-vertex SOS model is quite complicated and certainly deserves further detailed studies.
This material was planned as an introductory part for an extended version of our previous work [16,17] devoted to the connection of the 8V-model with the Painlevé transcendents. However, in the course of writing, we realized that a review of the theory of the functional relations in the 8V-model could be of interest to a much wider audience than originally intended and deserves a separate publication. A detailed account of the results presented in [16,17] will be given in the second paper of this series [18], which is totally devoted to the special $\eta = \pi/3$ case of the eight-vertex model with an odd number of sites. There we will consider remarkable connections of this special model with various differential equations, including the celebrated Lamé, Mathieu, Painlevé III and Painlevé VI equations.
The organization of the paper is as follows. In Section 2 we present the analytic theory of the functional relations in the 8V/SOS-model. In Section 3 we discuss some applications of the quantum Wronskian relation. In particular, we show how it can be used for the analysis of the degenerate states. In Section 4 we completely classify eigenvalues of the transfer matrix of the symmetric 8V-model for small lattices of the size $N \le 4$. We then study the analytic properties of the eigenvalues with respect to the field variable with a combination of analytic and numerical techniques. In conclusion we briefly summarize the obtained results. Basic properties of the 8V-model are reviewed in Appendix A.
2. Functional relations in the eight-vertex SOS model
2.1. Overview
In this section we will outline the analytic theory of the functional relations in the SOS-model (which also covers the symmetric 8V-model). Actually most of the functional relations discussed below are quite universal and apply to a wider class of related models. They include the six-vertex model in a field [19], the restricted solid-on-solid (RSOS) model [20] and some integrable models of quantum field theory: the $c < 1$ conformal field theory [8] and the massive sine-Gordon model in a finite volume [21].
Let $T(u)$ and $Q(u)$ denote the eigenvalues of the transfer matrix and the Q-operator respectively, and let $\eta$ be an arbitrary real parameter in the range
$$0 < \eta < \pi/2. \tag{2.1}$$
In all subsequent derivations we will use only one general assumption about the properties of the eigenvalues:
We assume that $T(u)$ and $f(u)$ are entire periodic functions of the variable $u$,
$$T(u+\pi) = T(u), \qquad f(u+\pi) = f(u), \tag{2.2}$$
and that the function $Q(u)$ solving the TQ-equation,
$$T(u)\,Q(u) = f(u-\eta)\,Q(u+2\eta) + f(u+\eta)\,Q(u-2\eta), \tag{2.3}$$
is also an entire (but not necessarily periodic) function of $u$.

For every particular model the above requirements are supplemented by additional, model-specific analyticity properties of $Q(u)$ (such as, for example, the "imaginary period" relation (1.3) for the 8V-model). These properties will only be used in Sections 3 and 4; they are discussed at the end of this section.
As explained in the introduction, once the additional analyticity properties are fixed, the functional equation (2.3) completely determines all eigenvalues $T(u)$ and $Q(u)$. For certain applications, however, it is more convenient to use other functional equations in addition to (or instead of) (2.3). We will show that all such additional functional relations in the SOS-model (and in the related models mentioned above) follow in an elementary way from two ingredients:
(i) the TQ-equation itself (Eqs. (2.2) and (2.3) above), and
(ii) the fact that for the same eigenvalue $T(u)$ this equation has two different [8,10] linearly independent solutions for $Q(u)$ which are entire functions of $u$.
The only property of the function $f(u)$ essentially used in this section is its periodicity (2.2). For technical reasons we will also assume that $f(u-\eta)$ and $f(u+\eta)$ do not have common zeroes. This is a very mild assumption, excluding rather exotic row-inhomogeneous models, which are beyond the scope of this paper.
2.2. General functional relations
Since $T(u)$ is an entire function, Eq. (2.3) implies that the zeroes $u_1, u_2, \ldots, u_n$ of any eigenvalue $Q(u)$ satisfy the same set of the Bethe ansatz equations
$$\frac{f(u_k+\eta)}{f(u_k-\eta)} = -\frac{Q(u_k+2\eta)}{Q(u_k-2\eta)}, \qquad Q(u_k) = 0, \quad k = 1, \ldots, n, \tag{2.4}$$
where the number of zeroes, $n$, is determined by the model-specific analyticity properties.

For any given eigenvalue $T(u)$ introduce an infinite set of functions $T_k(u)$, $k = 3, 4, \ldots, \infty$, defined by the recurrence relation
$$T_k(u+\eta)\,T_k(u-\eta) = f(u+k\eta)\,f(u-k\eta) + T_{k-1}(u)\,T_{k+1}(u), \qquad k \ge 2, \tag{2.5}$$
where
$$T_0(u) \equiv 0, \qquad T_1(u) \equiv f(u), \qquad T_2(u) \equiv T(u). \tag{2.6}$$
This relation can be equivalently rewritten as
$$T(u)\,T_k(u+k\eta) = f(u-\eta)\,T_{k-1}\bigl(u+(k+1)\eta\bigr) + f(u+\eta)\,T_{k+1}\bigl(u+(k-1)\eta\bigr), \tag{2.7a}$$
or as
$$T(u)\,T_k(u-k\eta) = f(u+\eta)\,T_{k-1}\bigl(u-(k+1)\eta\bigr) + f(u-\eta)\,T_{k+1}\bigl(u-(k-1)\eta\bigr). \tag{2.7b}$$
Using the definition (2.5) one can easily express $T_k(u)$ in terms of $T(u)$ as a determinant
$$T_k(u) = \bigl(f^{(k)}(u)\bigr)^{-1} \det\bigl\| M_{ab}(u+k\eta) \bigr\|_{1 \le a,b \le k-1}, \qquad k \ge 2, \tag{2.8}$$
where the $(k-1)$ by $(k-1)$ matrix $M_{ab}(u)$, $1 \le a, b \le k-1$, is given by
$$M_{ab}(u) = \delta_{a,b}\,T(u-2a\eta) - \delta_{a,b+1}\,f\bigl(u-(2a+1)\eta\bigr) - \delta_{a+1,b}\,f\bigl(u-(2a-1)\eta\bigr), \tag{2.9}$$
while the normalization factor reads
$$f^{(k)}(u) = \prod_{\ell=0}^{k-3} f\bigl(u-(k-3-2\ell)\eta\bigr). \tag{2.10}$$
Finally, expressing $T(u)$ from (2.3) through the corresponding eigenvalue $Q(u)$ one arrives at the formula
$$T_k(u) = Q(u-k\eta)\,Q(u+k\eta) \sum_{\ell=0}^{k-1} \frac{f\bigl(u+(2\ell-k+1)\eta\bigr)}{Q\bigl(u+(2\ell-k)\eta\bigr)\,Q\bigl(u+(2\ell-k+2)\eta\bigr)}, \tag{2.11}$$
valid for $k = 1, 2, \ldots, \infty$. Note that the Bethe ansatz equations (2.4) guarantee that all the higher $T_k(u)$ with $k \ge 3$ are entire functions of $u$, as is $T(u)$. It is worth noting that these functions are actually eigenvalues of the "higher" transfer matrices, obtained through the algebraic fusion procedure [22]. In our analytic approach this information is, of course, lost. Nevertheless it will be useful to have in mind that the index $k$ in the notation $T_k(u)$ refers to the dimension of the "auxiliary" space in the definition of the corresponding transfer matrix. Another convenient scheme of notation for higher transfer matrices (used, e.g., in [9]) is based on (half-)integer spin labels $j$, such that $k = 2j+1$.
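The relations (2.5)–(2.11) are purely algebraic consequences of the TQ-equation, so they can be spot-checked numerically with arbitrary test functions. The following is a minimal sketch under stated assumptions: the functions `f` and `Q` below and the value of `ETA` are hypothetical stand-ins (not the theta-function expressions of the 8V-model), and only the structure of the formulas is being tested.

```python
import cmath

ETA = 0.37  # test value of the parameter eta (an assumption, not from the paper)


def f(u):
    # Hypothetical stand-in for f(u); the identities hold for any function.
    return (1.3 + cmath.cos(2 * u)) ** 2


def Q(u):
    # Hypothetical stand-in for Q(u), with zeros kept away from the test points.
    return cmath.exp(0.2j * u) * cmath.sin(u - 0.4 - 0.3j) * cmath.sin(u + 0.9 + 0.4j)


def T_k(k, u):
    """T_k(u) from the sum formula (2.11); gives T_0 = 0 and T_1 = f."""
    total = 0j
    for l in range(k):
        num = f(u + (2 * l - k + 1) * ETA)
        den = Q(u + (2 * l - k) * ETA) * Q(u + (2 * l - k + 2) * ETA)
        total += num / den
    return Q(u - k * ETA) * Q(u + k * ETA) * total


def close(a, b, tol=1e-9):
    """Relative comparison of two complex numbers."""
    return abs(a - b) <= tol * (1 + abs(a) + abs(b))


def fusion_holds(k, u):
    """Check the recurrence (2.5) at the point u."""
    lhs = T_k(k, u + ETA) * T_k(k, u - ETA)
    rhs = f(u + k * ETA) * f(u - k * ETA) + T_k(k - 1, u) * T_k(k + 1, u)
    return close(lhs, rhs)


def T_k_det(k, u):
    """T_k(u), k >= 2, from the determinant formula (2.8)-(2.10).

    The matrix (2.9) is tridiagonal, so the determinant is computed with the
    standard three-term recurrence for its leading principal minors.
    """
    x = u + k * ETA
    minor_prev, minor = 1, T_k(2, x - 2 * ETA)
    for a in range(2, k):
        off = f(x - (2 * a - 3) * ETA) * f(x - (2 * a + 1) * ETA)
        minor_prev, minor = minor, T_k(2, x - 2 * a * ETA) * minor - off * minor_prev
    norm = 1
    for l in range(k - 2):
        norm *= f(u - (k - 3 - 2 * l) * ETA)
    return minor / norm
```

For example, `fusion_holds(3, 0.15 + 0.05j)` compares the two sides of (2.5) for k = 3, and `T_k_det(4, u)` can be compared with `T_k(4, u)` at any point `u` away from the zeros of the test functions.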
In a generic case Eq. (2.3) has two linearly independent "Bloch wave" solutions $Q_\pm(u)$, defined by their quasi-periodicity properties,
$$Q_\pm(u+\pi) = e^{\pm i\varphi}\, Q_\pm(u), \tag{2.12}$$
where the exponent $\varphi$ depends on the eigenvalue $T(u)$. These solutions satisfy the quantum Wronskian relation
$$2i\,W(\varphi)\,f(u) = Q_+(u+\eta)\,Q_-(u-\eta) - Q_+(u-\eta)\,Q_-(u+\eta), \tag{2.13}$$
where $W(\varphi)$ does not depend on $u$. Indeed, equating the two alternative expressions for $T(u)$,
$$T(u)\,Q_+(u) = f(u-\eta)\,Q_+(u+2\eta) + f(u+\eta)\,Q_+(u-2\eta), \tag{2.14}$$
and
$$T(u)\,Q_-(u) = f(u-\eta)\,Q_-(u+2\eta) + f(u+\eta)\,Q_-(u-2\eta), \tag{2.15}$$
and writing $W(\varphi)$ as $W(\varphi|u)$ (to allow for its possible $u$-dependence, which cannot be ruled out just from the definition (2.13)) one gets $W(\varphi|u+\eta) = W(\varphi|u-\eta)$. On the other hand, Eqs. (2.13), (2.2) and (2.12) imply a different periodicity relation $W(\varphi|u+\pi) = W(\varphi|u)$. For generic real $\eta$, these two periodicity relations can only be compatible if $W(\varphi|u)$ is independent of $u$,
$$W(\varphi|u) \equiv W(\varphi). \tag{2.16}$$
When $\varphi$ and $\eta$ are in general position, the eigenvalues $Q_\pm(u)$ are locally analytic functions of $\eta$; therefore, by continuity, Eq. (2.16) at generic $\varphi$ holds also when $\eta/\pi$ is a rational number. However, when $\varphi$ takes special values (for example, in the symmetric 8-vertex model) Eq. (2.16) for rational $\eta/\pi$ cannot be established by the analytic arguments only.
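The first periodicity relation used above can be spelled out in one line. The following worked step is only an expanded version of the argument in the text, with $W(\varphi|u)$ defined by (2.13); it uses nothing beyond (2.14) and (2.15).

```latex
% Multiply (2.14) by Q_-(u), multiply (2.15) by Q_+(u), and subtract:
\begin{align*}
0 &= f(u-\eta)\bigl[Q_+(u+2\eta)\,Q_-(u) - Q_+(u)\,Q_-(u+2\eta)\bigr]
   - f(u+\eta)\bigl[Q_+(u)\,Q_-(u-2\eta) - Q_+(u-2\eta)\,Q_-(u)\bigr] \\
  &= 2i\, f(u-\eta)\, f(u+\eta)\,
     \bigl[\, W(\varphi|u+\eta) - W(\varphi|u-\eta) \,\bigr].
\end{align*}
% The two square brackets in the first line are the right-hand side of (2.13)
% evaluated at u+\eta and at u-\eta, respectively.
```

Since $f(u-\eta)f(u+\eta)$ does not vanish identically, this gives $W(\varphi|u+\eta) = W(\varphi|u-\eta)$.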
Obviously, the condition (2.12) defines $Q_\pm(u)$ up to arbitrary $u$-independent normalization factors. Using this freedom, it is convenient to assume the normalization$^3$ such that neither of $Q_\pm(u)$ vanishes identically (as a function of $u$) or diverges at any value of $\varphi$. Then the quantum Wronskian $W(\varphi)$ will take finite values, but still can vanish at certain isolated values of the exponent $\varphi$. These values are called singular in the sense that there is only one quasi-periodic solution (2.12), while the second linearly independent solution of (2.3) does not possess the simple periodicity properties (2.12). As argued in [8], the singular exponents take values in the "dangerous" set
$$\varphi_{\rm dang} = k\pi + \frac{\pi^2 \ell}{2\eta}, \qquad k, \ell \in \mathbb{Z}. \tag{2.17}$$
However, each eigenvalue has its own set of singular exponents, being a subset of (2.17).
Evidently, $Q(u)$ in (2.11) can be substituted by any of the two Bloch solutions $Q_\pm(u)$, so there are two alternative expressions for each $T_k(u)$. Further, multiplying (2.14) and (2.15) by $Q_-(u-2\eta)$ and $Q_+(u-2\eta)$ respectively, subtracting the resulting equations and using (2.13) one obtains
$$2i\,W(\varphi)\,T(u) = Q_+(u+2\eta)\,Q_-(u-2\eta) - Q_+(u-2\eta)\,Q_-(u+2\eta). \tag{2.18}$$
The last result, combined with the determinant formula (2.8), gives
$$2i\,W(\varphi)\,T_k(u) = Q_+(u+k\eta)\,Q_-(u-k\eta) - Q_+(u-k\eta)\,Q_-(u+k\eta), \tag{2.19}$$
where $k = 0, 1, 2, \ldots, \infty$.
All the functional relations presented above are general corollaries of the TQ-equation (2.2), (2.3).
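The Wronskian-type formula (2.19) can be checked against the TQ-equation and the fusion relations for any pair of Bloch functions. Below is a minimal sketch, assuming hypothetical test functions `q_plus` and `q_minus` (a phase factor times an arbitrary π-periodic function), arbitrary test values `ETA` and `PHI`, and the normalization W = 1, so that f(u) is defined through (2.13).

```python
import cmath

ETA = 0.41   # test value of eta (an assumption)
PHI = 0.77   # test value of the exponent phi (an assumption)


def q_plus(u):
    # Hypothetical Bloch solution with Q_+(u + pi) = exp(+i*phi) * Q_+(u).
    return cmath.exp(1j * PHI * u / cmath.pi) * (1 + 0.3 * cmath.cos(2 * u) + 0.2j * cmath.sin(2 * u))


def q_minus(u):
    # Hypothetical Bloch solution with Q_-(u + pi) = exp(-i*phi) * Q_-(u).
    return cmath.exp(-1j * PHI * u / cmath.pi) * (1 - 0.25 * cmath.cos(2 * u) + 0.1 * cmath.sin(4 * u))


def T_k(k, u):
    """T_k(u) defined by (2.19) with the normalization W = 1."""
    a, b = u + k * ETA, u - k * ETA
    return (q_plus(a) * q_minus(b) - q_plus(b) * q_minus(a)) / 2j


def f(u):
    """f(u) = T_1(u), i.e. the Wronskian relation (2.13) with W = 1."""
    return T_k(1, u)


def close(x, y, tol=1e-9):
    return abs(x - y) <= tol * (1 + abs(x) + abs(y))


def tq_holds(q, u):
    """TQ-equation (2.14)/(2.15) for the Bloch solution q, with T = T_2."""
    rhs = f(u - ETA) * q(u + 2 * ETA) + f(u + ETA) * q(u - 2 * ETA)
    return close(T_k(2, u) * q(u), rhs)


def fusion_holds(k, u):
    """Recurrence (2.5)."""
    lhs = T_k(k, u + ETA) * T_k(k, u - ETA)
    rhs = f(u + k * ETA) * f(u - k * ETA) + T_k(k - 1, u) * T_k(k + 1, u)
    return close(lhs, rhs)


def linear_fusion_holds(k, u):
    """Relation (2.7a)."""
    lhs = T_k(2, u) * T_k(k, u + k * ETA)
    rhs = (f(u - ETA) * T_k(k - 1, u + (k + 1) * ETA)
           + f(u + ETA) * T_k(k + 1, u + (k - 1) * ETA))
    return close(lhs, rhs)
```

All three checks reduce to the same three-term (Plücker-type) identity for 2 by 2 determinants built from `q_plus` and `q_minus`, which is why no property of the test functions beyond their definition is needed.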
2.3. Rational values of $\eta$
Let us now assume that
$$2L\eta = m\pi, \qquad 1 \le m \le L-1, \quad L \ge 2, \tag{2.20}$$
where $m$ and $L$ are mutually prime integers. Evidently,
$$2k\eta \ne 0 \pmod{\pi}, \qquad 1 \le k \le L-1. \tag{2.21}$$
Combining the expression (2.11) with (2.2) and (2.12), one immediately obtains the following functional relation,
$$T_{L+k}(u) = 2\cos(m\varphi)\,T_k\bigl(u+\tfrac{1}{2}\pi m\bigr) + T_{L-k}(u), \qquad k = 1, 2, \ldots, \tag{2.22}$$
which shows that, for the rational $\eta$ of the form (2.20), all higher $T_k(u)$ with $k > L$ are linear combinations of a finite number of the lower $T_k(u)$ with $k \le L$. This relation is a simple corollary of the TQ-equation. It always holds for the rational values of $\eta$ and does not require the existence of the second Bloch solution in (2.12) (indeed, Eq. (2.22) is independent of the sign of $\varphi$). Setting $k = 1$ in (2.22) one obtains
$$T_{L+1}(u) = 2\cos(m\varphi)\,f\bigl(u+\tfrac{1}{2}\pi m\bigr) + T_{L-1}(u). \tag{2.23}$$
3 In the context of the 8V/SOS-model this is the most natural normalization. The eigenvalues $Q_\pm(u)$ are factorized in products of theta functions (see (3.15) below) and the variation of $\varphi$ only affects positions of zeroes. Obviously, the transfer matrix eigenvalues, $T(u)$, do not have any singularities in $\varphi$.
This allows one to bring Eq. (2.5) with $k = L$ to the form
$$T_L(u+\eta)\,T_L(u-\eta) = \Bigl[f\bigl(u+\tfrac{1}{2}\pi m\bigr) + e^{im\varphi}\,T_{L-1}(u)\Bigr]\Bigl[f\bigl(u+\tfrac{1}{2}\pi m\bigr) + e^{-im\varphi}\,T_{L-1}(u)\Bigr], \tag{2.24}$$
where the periodicity (2.2) of the function $f(u)$ was taken into account. Thus, for the rational $\eta$, Eq. (2.5) with $k = 2, 3, \ldots, L-1$ together with Eq. (2.24) form a closed system of functional equations for a set of $L-1$ eigenvalues $\{T(u), T_3(u), \ldots, T_L(u)\}$. Given that all $T_k(u)$ with $k \ge 3$ are recursively defined through $T(u)$, this system of equations leads to a single equation involving $T(u)$ only. Indeed, substituting the determinant formulae (2.8) into (2.23) one obtains
$$\det\bigl\| \overline{M}_{ab}(u) \bigr\|_{1 \le a,b \le L} = 0, \tag{2.25}$$
where the $L$ by $L$ matrix reads
$$\overline{M}_{ab}(u) = M_{ab}(u) - \lambda\,\delta_{a,1}\delta_{b,L}\,f(u-3\eta) - \lambda^{-1}\,\delta_{a,L}\delta_{b,1}\,f(u+\eta), \tag{2.26}$$
with $M_{ab}(u)$ given by (2.9) and $\lambda = e^{im\varphi}$.
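The rational-case relations (2.22), (2.24) and (2.25) can be illustrated numerically. The sketch below makes several assumptions: the integers `L`, `M`, the exponent `PHI` and the Bloch functions `q_plus`, `q_minus` are hypothetical test choices, the normalization is W = 1, and $T_k(u)$ is taken from the Wronskian formula (2.19). Instead of evaluating the determinant in (2.25), the code exhibits a non-zero null vector $x_a = Q_+(u-2a\eta)$ of the corresponding matrix, which is equivalent to the determinant being zero.

```python
import cmath

L, M = 5, 2                      # coprime test integers with 1 <= M <= L - 1
ETA = M * cmath.pi / (2 * L)     # rational point (2.20): 2*L*eta = m*pi
PHI = 0.77                       # generic test value of the exponent phi


def q_plus(u):
    # Hypothetical Bloch solution with Q_+(u + pi) = exp(+i*phi) * Q_+(u).
    return cmath.exp(1j * PHI * u / cmath.pi) * (1 + 0.3 * cmath.cos(2 * u) + 0.2j * cmath.sin(2 * u))


def q_minus(u):
    # Hypothetical Bloch solution with Q_-(u + pi) = exp(-i*phi) * Q_-(u).
    return cmath.exp(-1j * PHI * u / cmath.pi) * (1 - 0.25 * cmath.cos(2 * u) + 0.1 * cmath.sin(4 * u))


def T_k(k, u):
    """T_k(u) from (2.19) with W = 1; T_1 = f and T_2 = T."""
    a, b = u + k * ETA, u - k * ETA
    return (q_plus(a) * q_minus(b) - q_plus(b) * q_minus(a)) / 2j


def f(u):
    return T_k(1, u)


def close(x, y, tol=1e-9):
    return abs(x - y) <= tol * (1 + abs(x) + abs(y))


def check_2_22(k, u):
    rhs = 2 * cmath.cos(M * PHI) * T_k(k, u + cmath.pi * M / 2) + T_k(L - k, u)
    return close(T_k(L + k, u), rhs)


def check_2_24(u):
    fs, t = f(u + cmath.pi * M / 2), T_k(L - 1, u)
    w = cmath.exp(1j * M * PHI)
    return close(T_k(L, u + ETA) * T_k(L, u - ETA), (fs + w * t) * (fs + t / w))


def twisted_matrix(u):
    """The L x L matrix of (2.26): the matrix (2.9) plus the two corner terms."""
    w = cmath.exp(1j * M * PHI)
    mat = [[0j] * L for _ in range(L)]
    for a in range(1, L + 1):
        mat[a - 1][a - 1] = T_k(2, u - 2 * a * ETA)
        if a > 1:
            mat[a - 1][a - 2] = -f(u - (2 * a + 1) * ETA)
        if a < L:
            mat[a - 1][a] = -f(u - (2 * a - 1) * ETA)
    mat[0][L - 1] += -w * f(u - 3 * ETA)
    mat[L - 1][0] += -f(u + ETA) / w
    return mat


def null_residual(u):
    """Largest entry of (matrix) * x for the vector x_a = Q_+(u - 2*a*eta)."""
    mat = twisted_matrix(u)
    x = [q_plus(u - 2 * a * ETA) for a in range(1, L + 1)]
    return max(abs(sum(row[b] * x[b] for b in range(L))) for row in mat)
```

Each row of the matrix equation is just the TQ-equation (2.14) at the point $u-2a\eta$; the corner terms account for the quasi-periodicity $Q_+(u+m\pi) = e^{im\varphi}Q_+(u)$.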

2.3.1. Non-zero quantum Wronskian
Continuing the consideration of the rational case (2.20), let us additionally assume that both quasi-periodic solutions (2.12) exist and that their quantum Wronskian (2.13) is non-zero. It is worth noting that the functions $Q_\pm(u)$ in this case cannot contain complete exact strings. A complete exact string (or, simply, a complete string) is a ring of $L$ zeroes $u_1, \ldots, u_L$, where each consecutive zero differs from the previous one by $2\eta$, closing over the period $\pi$,
$$u_{k+1} = u_k + 2\eta, \quad k = 1, \ldots, L, \qquad u_{L+1} = u_1 \pmod{\pi}. \tag{2.27}$$
It is easy to see that any such string manifests itself as a factor in the RHS of (2.13), but not in its LHS (unless, of course, $W(\varphi) = 0$).
It follows from (2.12) that
$$Q_\pm(u+\pi m) = e^{\pm im\varphi}\, Q_\pm(u). \tag{2.28}$$
Using (2.19) and (2.20) one easily obtains the two equivalent relations,
$$e^{+im\varphi}\,T_k(u) + T_{L-k}\bigl(u+\tfrac{1}{2}\pi m\bigr) = C(\varphi)\,Q_+(u+k\eta)\,Q_-(u-k\eta), \tag{2.29a}$$
and
$$e^{-im\varphi}\,T_k(u) + T_{L-k}\bigl(u+\tfrac{1}{2}\pi m\bigr) = C(\varphi)\,Q_+(u-k\eta)\,Q_-(u+k\eta), \tag{2.29b}$$
where $k = 0, 1, \ldots, L$ and
$$C(\varphi) = \frac{\sin(m\varphi)}{W(\varphi)}. \tag{2.30}$$
In particular, for $k = 0$ one gets
$$T_L\bigl(u+\tfrac{1}{2}\pi m\bigr) = C(\varphi)\,Q_+(u)\,Q_-(u). \tag{2.31}$$
We also quote one simple but useful$^4$ consequence of (2.29),
$$\log\frac{Q_+(u+k\eta)}{Q_-(u+k\eta)} - \log\frac{Q_+(u-k\eta)}{Q_-(u-k\eta)} = \log\frac{e^{+im\varphi}\,T_k(u) + T_{L-k}(u+\tfrac{1}{2}\pi m)}{e^{-im\varphi}\,T_k(u) + T_{L-k}(u+\tfrac{1}{2}\pi m)}. \tag{2.32}$$
This first-order finite difference equation relates the ratio $Q_+/Q_-$ with the eigenvalues of the (higher) transfer matrices.
Introduce the meromorphic functions
$$\Psi_\pm(u) = e^{\pm im\varphi} \sum_{\ell=0}^{L-1} \frac{f\bigl(u+(2\ell+1)\eta\bigr)}{Q_\pm(u+2\ell\eta)\,Q_\pm\bigl(u+(2\ell+2)\eta\bigr)}, \tag{2.33}$$
such that
$$T_L\bigl(u+\tfrac{1}{2}\pi m\bigr) = \bigl(Q_+(u)\bigr)^2\,\Psi_+(u) = \bigl(Q_-(u)\bigr)^2\,\Psi_-(u). \tag{2.34}$$
With this definition all the relations (2.29) reduce to a single relation which again can be written in two equivalent forms
$$\Psi_+(u) = C(\varphi)\,\frac{Q_-(u)}{Q_+(u)}, \qquad \Psi_-(u) = C(\varphi)\,\frac{Q_+(u)}{Q_-(u)}. \tag{2.35}$$
Obviously,
$$\Psi_+(u)\,\Psi_-(u) = \bigl(C(\varphi)\bigr)^2, \qquad \frac{\Psi_+(u)}{\Psi_-(u)} = \left(\frac{Q_-(u)}{Q_+(u)}\right)^2. \tag{2.36}$$
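The relations (2.29)–(2.36) lend themselves to a direct numerical spot check. The sketch below is an illustration under stated assumptions: `L`, `M`, `PHI` and the Bloch functions are hypothetical test choices, $W = 1$ so that $C(\varphi) = \sin(m\varphi)$, and the name `psi` stands for the functions defined in (2.33).

```python
import cmath

L, M = 5, 2                      # coprime test integers (assumption)
ETA = M * cmath.pi / (2 * L)     # 2*L*eta = m*pi, Eq. (2.20)
PHI = 0.77                       # generic test exponent (assumption)
HALF = cmath.pi * M / 2          # the shift pi*m/2 appearing in (2.29)-(2.34)


def q_plus(u):
    # Hypothetical Bloch solution, Q_+(u + pi) = exp(+i*phi) * Q_+(u).
    return cmath.exp(1j * PHI * u / cmath.pi) * (1 + 0.3 * cmath.cos(2 * u) + 0.2j * cmath.sin(2 * u))


def q_minus(u):
    # Hypothetical Bloch solution, Q_-(u + pi) = exp(-i*phi) * Q_-(u).
    return cmath.exp(-1j * PHI * u / cmath.pi) * (1 - 0.25 * cmath.cos(2 * u) + 0.1 * cmath.sin(4 * u))


def T_k(k, u):
    """T_k(u) from (2.19) with W = 1."""
    a, b = u + k * ETA, u - k * ETA
    return (q_plus(a) * q_minus(b) - q_plus(b) * q_minus(a)) / 2j


def f(u):
    return T_k(1, u)


def C():
    """Coefficient (2.30) in the normalization W = 1."""
    return cmath.sin(M * PHI)


def psi(sign, u):
    """Psi_+ (sign = +1) or Psi_- (sign = -1) from (2.33)."""
    q = q_plus if sign > 0 else q_minus
    total = sum(f(u + (2 * l + 1) * ETA) / (q(u + 2 * l * ETA) * q(u + (2 * l + 2) * ETA))
                for l in range(L))
    return cmath.exp(sign * 1j * M * PHI) * total


def close(x, y, tol=1e-9):
    return abs(x - y) <= tol * (1 + abs(x) + abs(y))


def check_2_29(k, u):
    w = cmath.exp(1j * M * PHI)
    shifted = T_k(L - k, u + HALF)
    ok_a = close(w * T_k(k, u) + shifted, C() * q_plus(u + k * ETA) * q_minus(u - k * ETA))
    ok_b = close(T_k(k, u) / w + shifted, C() * q_plus(u - k * ETA) * q_minus(u + k * ETA))
    return ok_a and ok_b
```

The sum in `psi` telescopes, because each term equals a difference of the ratio $Q_\mp/Q_\pm$ at two neighbouring points; this is the mechanism behind (2.35).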
2.3.2. The RSOS regime and its vicinity

Further reduction of the functional relations in the rational case (2.20) occurs for certain special values of the field $\varphi$ from the set
$$m\varphi = \pi(r+1), \qquad r = 0, 1, 2, \ldots. \tag{2.37}$$
Consider the effect of varying $\varphi$ in the relation (2.31). The eigenvalue $T_L(u)$ in the LHS will remain finite, and so will the eigenvalues $Q_\pm(u)$ in the RHS. The latter also do not vanish identically (as functions of $u$) at any value of $\varphi$ (see the discussion of our normalization assumptions before (2.17) above). Therefore the coefficient $C(\varphi)$, defined in (2.30), is always finite. This means that in the rational case (2.20), the quantum Wronskian, $W(\varphi)$, can only vanish at zeroes of the numerator in (2.30). However, the converse is not true: $W(\varphi)$ does not necessarily vanish when $C(\varphi) = 0$. Here we are interested in this latter case where
$$C(\varphi) = 0, \qquad W(\varphi) \ne 0, \tag{2.38}$$
with $\varphi$ from the set (2.37). By definition we call it the RSOS regime. The relations (2.29) and (2.31) reduce to
$$T_k(u) = (-1)^r\,T_{L-k}\bigl(u+\tfrac{1}{2}\pi m\bigr), \qquad k = 1, \ldots, L-1, \tag{2.39a}$$
and
$$T_L(u) = 0. \tag{2.39b}$$
All these relations can be written as a single relation (in two equivalent forms involving only $Q_+(u)$ or $Q_-(u)$, respectively),
$$\Psi_+(u) = \Psi_-(u) = 0, \tag{2.40}$$
with $\Psi_\pm(u)$ defined by (2.33).
4 Namely this relation with $k = 1$ was used in [23] to show that for rational values of $\eta$ the expression for the non-linear mobility for the quantum Brownian particle in a periodic potential obtained in [8] exactly coincides with that of [24] found from the thermodynamic Bethe ansatz.

The special "truncation" relations (2.39) exactly coincide with those appearing in the RSOS-model [20]. These were obtained [25,26] by the algebraic fusion procedure [27] and hold for all eigenvalues of the RSOS model. The above analysis shows that all eigenvalues of the RSOS model are non-singular. The quantum Wronskian of the Bloch solutions (2.12) is always non-zero (otherwise the coefficient $C(\varphi)$ in (2.29) would not have vanished). For this reason the solutions of the Bethe ansatz equations for the RSOS model cannot contain complete strings. Since, as argued in [13], the complete strings are necessary attributes of degenerate states, one arrives at a rather non-trivial statement: the spectrum of the transfer matrix in the RSOS model is non-degenerate.
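The truncation relations (2.39) follow from (2.19) and (2.28) alone, so they can be illustrated with test functions. This is a minimal sketch under assumptions: `L`, `M` and the function `bloch` (a phase factor times an arbitrary π-periodic function) are hypothetical choices, the normalization is $W = 1 \ne 0$, and the exponent is taken from the set (2.37).

```python
import cmath

L, M = 5, 2                      # coprime test integers (assumption)
ETA = M * cmath.pi / (2 * L)     # 2*L*eta = m*pi, Eq. (2.20)


def rsos_phi(r):
    """Exponent from the set (2.37): m*phi = pi*(r + 1)."""
    return cmath.pi * (r + 1) / M


def bloch(sign, phi, u):
    # Hypothetical Bloch solutions Q_+ (sign = +1) and Q_- (sign = -1);
    # the pi-periodic factor is arbitrary and chosen only for illustration.
    periodic = 1 + sign * 0.3 * cmath.cos(2 * u) + 0.2j * cmath.sin(2 * u)
    return cmath.exp(sign * 1j * phi * u / cmath.pi) * periodic


def T_k(k, phi, u):
    """T_k(u) from (2.19) with W = 1."""
    a, b = u + k * ETA, u - k * ETA
    return (bloch(+1, phi, a) * bloch(-1, phi, b)
            - bloch(+1, phi, b) * bloch(-1, phi, a)) / 2j


def truncation_holds(r, u, tol=1e-9):
    """Check (2.39a) for k = 1..L-1 and (2.39b) at the exponent rsos_phi(r)."""
    phi = rsos_phi(r)
    ok = abs(T_k(L, phi, u)) < tol                            # (2.39b)
    for k in range(1, L):
        lhs = T_k(k, phi, u)
        rhs = (-1) ** r * T_k(L - k, phi, u + cmath.pi * M / 2)
        ok = ok and abs(lhs - rhs) < tol * (1 + abs(lhs))     # (2.39a)
    return ok
```

At a generic exponent the same function `T_k(L, phi, u)` does not vanish, which is a quick way to see that the truncation is specific to the values (2.37).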
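Consider now the vicinity of the RSOS regime, when $\eta$ and $\varphi$ are approaching their limiting values given by (2.20) and (2.37), respectively. Interestingly, one can express some $\eta$- and $\varphi$-derivatives
$$\partial_\eta T_k(u) = \frac{\partial}{\partial\eta}\,T_k(u\,|\,\eta,\varphi), \qquad \partial_\varphi T_k(u) = \frac{\partial}{\partial\varphi}\,T_k(u\,|\,\eta,\varphi), \tag{2.41}$$
calculated at the RSOS point,
$$(\eta, \varphi) = \bigl(\pi m/2L,\ \pi(r+1)/m\bigr), \tag{2.42}$$
in terms of the corresponding values of $Q_\pm(u)$ and their first order $u$-derivatives
$$Q'_\pm(u) = \frac{\partial}{\partial u}\,Q_\pm(u\,|\,\eta,\varphi).$$
Using (2.19) one obtains
$$\partial_\eta\bigl[T_k(u) - (-1)^r\,T_{L-k}(u+\pi m/2)\bigr] + L\,\partial_u T_k(u) = \frac{L}{i\,W(\varphi)}\bigl[Q'_+(u+k\eta)\,Q_-(u-k\eta) - Q_+(u-k\eta)\,Q'_-(u+k\eta)\bigr], \tag{2.43}$$
$$\partial_\varphi\bigl[T_k(u) - (-1)^r\,T_{L-k}(u+\pi m/2)\bigr] = \frac{m}{2\,W(\varphi)}\bigl[Q_+(u+k\eta)\,Q_-(u-k\eta) + Q_+(u-k\eta)\,Q_-(u+k\eta)\bigr], \tag{2.44}$$
where the expressions in the RHS are calculated directly at the point (2.42). According to the definitions (2.6), $T_0(u)$ and $T_1(u)$ do not depend on $\eta$ and $\varphi$ at all; therefore, one can express $\eta$- and $\varphi$-derivatives of $T_{L-1}(u)$ and $T_L(u)$ at the RSOS point (2.42) in terms of the values of $Q_\pm(u)$ and $Q'_\pm(u)$.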
2.4. Zero field case

Consider now the zero field limit $\varphi = 0$. Let us return to the case of an irrational $\eta/\pi$ where the spectrum of the transfer matrix is non-degenerate. The eigenvalues $Q_\pm(u)$, corresponding to the same eigenstate, smoothly approach the same value at $\varphi = 0$. Moreover, adjusting a $\varphi$-dependent normalization of $Q_\pm(u)$ one can bring their small $\varphi$ expansion to the form
$$Q_\pm(u) = Q_0(u) \pm \varphi\,\overline{Q}_0(u)/2 + O(\varphi^2), \qquad \varphi \to 0, \tag{2.45}$$
where
$$Q_0(u) = Q_\pm(u)\big|_{\varphi=0}, \qquad \overline{Q}_0(u) = 2\,\frac{d\,Q_+(u)}{d\varphi}\bigg|_{\varphi=0} = -2\,\frac{d\,Q_-(u)}{d\varphi}\bigg|_{\varphi=0}. \tag{2.46}$$
From (2.12) it follows that
$$Q_0(u+\pi) = Q_0(u), \qquad \overline{Q}_0(u+\pi) = \overline{Q}_0(u) + 2i\,Q_0(u). \tag{2.47}$$
It is easy to see that the quasi-periodic part of $\overline{Q}_0(u)$ is totally determined by $Q_0(u)$,
$$\overline{Q}_0(u) = \frac{2iu}{\pi}\,Q_0(u) + \overline{Q}_0^{\rm (per)}(u). \tag{2.48}$$
However, the periodic part
$$\overline{Q}_0^{\rm (per)}(u+\pi) = \overline{Q}_0^{\rm (per)}(u), \tag{2.49}$$
can only be determined up to an additive term proportional to $Q_0(u)$. Indeed, consider the effect of an inessential normalization transformation
$$Q_\pm(u) \to e^{\mp\alpha\varphi}\,Q_\pm(u), \tag{2.50}$$
where $\alpha$ is a constant. The value of $Q_0(u)$ remains unchanged while the periodic part of $\overline{Q}_0(u)$ transforms as
$$\overline{Q}_0^{\rm (per)}(u) \to \overline{Q}_0^{\rm (per)}(u) - 2\alpha\,Q_0(u). \tag{2.51}$$
The quantum Wronskian relation (2.13) reduces to
$$Q_0(u+\eta)\,\overline{Q}_0(u-\eta) - Q_0(u-\eta)\,\overline{Q}_0(u+\eta) = -2i\,W'(0)\,f(u), \tag{2.52}$$
where
$$W'(0) = \frac{d\,W(\varphi)}{d\varphi}\bigg|_{\varphi=0}. \tag{2.53}$$
The expression (2.19) now becomes
$$-2i\,W'(0)\,T_k(u) = Q_0(u+k\eta)\,\overline{Q}_0(u-k\eta) - Q_0(u-k\eta)\,\overline{Q}_0(u+k\eta). \tag{2.54}$$
It is easy to see that at $\varphi = 0$ the TQ-equation (2.3) is satisfied if $Q(u)$ there is replaced by either of $Q_0(u)$ or $\overline{Q}_0(u)$. The same remark applies to the more general equation (2.11).
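The sign in the zero-field Wronskian relation can be traced through the small-$\varphi$ expansion. The following worked step assumes only the expansion $Q_\pm(u) = Q_0(u) \pm \varphi\,\overline{Q}_0(u)/2 + O(\varphi^2)$ and the fact that $W(0) = 0$ (because $Q_+$ and $Q_-$ coincide at $\varphi = 0$); it is a sketch of the intermediate algebra, not an independent derivation.

```latex
\begin{align*}
Q_+(u+\eta)\,Q_-(u-\eta) - Q_+(u-\eta)\,Q_-(u+\eta)
  &= \varphi\,\bigl[\,\overline{Q}_0(u+\eta)\,Q_0(u-\eta)
       - Q_0(u+\eta)\,\overline{Q}_0(u-\eta)\,\bigr] + O(\varphi^2), \\
2i\,W(\varphi)\,f(u)
  &= 2i\,\varphi\,W'(0)\,f(u) + O(\varphi^2).
\end{align*}
% Equating the terms linear in \varphi:
\begin{equation*}
Q_0(u+\eta)\,\overline{Q}_0(u-\eta) - Q_0(u-\eta)\,\overline{Q}_0(u+\eta)
  = -2i\,W'(0)\,f(u).
\end{equation*}
```

The same expansion applied to the Wronskian formula for $T_k(u)$, with the shifts $\pm\eta$ replaced by $\pm k\eta$, gives the corresponding zero-field expression for the higher eigenvalues.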
The Bethe ansatz equations (2.4) for the zeroes of $Q_0(u)$ are the standard equations [1] arising in the analysis of the symmetric 8V-model. Exactly the same equations also hold for the zeroes of $\overline{Q}_0(u)$, but their usefulness is very limited. Even though $\overline{Q}_0(u)$ is an entire function of $u$, it lacks the simple periodicity (cf. (2.47)) and, therefore, does not possess any convenient product representation. Moreover, the transformation (2.51) affects the position of zeros of $\overline{Q}_0(u)$, making them ambiguous. All this renders the Bethe ansatz equations for $\overline{Q}_0(u)$ useless. Fortunately, these equations are not really required for the determination of $\overline{Q}_0(u)$. Once the zeros of $Q_0(u)$ are known, the function $\overline{Q}_0(u)$ is explicitly calculated from (2.52) (see (3.29) below).
Additional functional relations arise in the rational case (2.20). These relations are straightforward corollaries of (2.29), (2.31) and (2.35). For instance, Eq. (2.29) gives
$$T_k(u) + T_{L-k}(u + m\pi/2) = C(0)\,Q_0(u+k\eta)\,Q_0(u-k\eta), \qquad \varphi = 0, \tag{2.55}$$
where
$$C(0) = \frac{m}{W'(0)}. \tag{2.56}$$
All these relations (with different k) can be equivalently rewritten as a single relation
$$\sum_{\ell=0}^{L-1} \frac{f(u+(2\ell+1)\eta)}{Q_0(u+2\ell\eta)\,Q_0(u+(2\ell+2)\eta)} = C(0), \qquad \varphi = 0, \tag{2.57}$$
which is the $\varphi = 0$ version of (2.35). Setting k = 0 in (2.55) one gets
$$T_L(u + m\pi/2) = C(0)\,\big(Q_0(u)\big)^2, \qquad \varphi = 0. \tag{2.58}$$
Thus, at $\varphi = 0$ the eigenvalue $T_L(u + m\pi/2)$ becomes a perfect square. It only has double zeroes, whose positions coincide with the zeroes of $Q_0(u)$.
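The step from the zero-field Wronskian relation (2.52) to the single relation (2.57) can be sketched as a telescoping sum. This is only a consistency check, not the authors' derivation. It assumes that the derivative-type solution picks up the quasi-periodicity $\overline{Q}_0(u+\pi) = \overline{Q}_0(u) - 2i\,Q_0(u)$ from (2.12) (the sign depends on the convention used in (2.46), which is not restated here), and it introduces the auxiliary ratio $R(u)$ purely for illustration.

```latex
% Auxiliary ratio (hypothetical notation, used only in this check)
R(u) = \frac{\overline{Q}_0(u)}{Q_0(u)}, \qquad
R(u+\pi) = R(u) - 2i \quad \text{(assumed quasi-periodicity)} .

% Divide (2.52) by Q_0(u+\eta) Q_0(u-\eta):
\frac{2i\,W'(0)\,f(u)}{Q_0(u+\eta)\,Q_0(u-\eta)} = R(u-\eta) - R(u+\eta) .

% Shift u -> u + (2\ell+1)\eta, sum over \ell, and use 2L\eta = m\pi from (2.20):
\sum_{\ell=0}^{L-1}
\frac{f(u+(2\ell+1)\eta)}{Q_0(u+2\ell\eta)\,Q_0(u+(2\ell+2)\eta)}
= \frac{R(u) - R(u+m\pi)}{2i\,W'(0)}
= \frac{2i\,m}{2i\,W'(0)}
= \frac{m}{W'(0)} = C(0) .
```

Under the stated assumption the constant on the right-hand side agrees with (2.56).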
As is well known, in the rational case (2.20) the transfer matrix of the 8V-model has a degenerate spectrum (for sufficiently large values of $N \ge 2L$). We would like to stress here that the above relations (2.55)–(2.58) hold only for non-degenerate states. Actually, the assumption made in the beginning of this subsection, that $Q_+(u)$ coincides with $Q_-(u)$ when $\varphi = 0$, is true only for non-degenerate states. Removing this assumption and taking the $\varphi \to 0$ limit in (2.31), while keeping $\eta$ fixed by (2.20), one obtains
$$T_L(u + m\pi/2) = C(0)\,\widetilde{Q}_+(u)\,\widetilde{Q}_-(u), \qquad \widetilde{Q}_\pm(u) = \lim_{\varphi\to 0} Q_\pm(u). \tag{2.59}$$
For a degenerate state the eigenvalues $\widetilde{Q}_+(u)$ and $\widetilde{Q}_-(u)$ can only differ by positions of complete exact strings. This ambiguity does not affect any transfer matrix eigenvalues $T_k(u)$, since the complete strings trivially cancel out from (2.11). In principle, the complete strings can take arbitrary positions; however, for $\widetilde{Q}_\pm(u)$ they take rather distinguished positions. Indeed, due to (2.59), the zeroes of $\widetilde{Q}_\pm(u)$ manifest themselves as zeroes of $T_L(u + m\pi/2)$, which are uniquely defined even for the degenerate states. From the above discussion it is clear that $T_L(u)$ has either double zeroes or complete strings of zeroes. Further analysis of the degenerate case is contained in Section 3.2.
2.5. Particular models
So far our considerations were rather general and covered several related models at the same
time. For each particular model, one needs to specify additional properties, namely, (i) the explicit
form of the function f(u) and (ii) detailed analytic properties of the eigenvalues $Q_\pm(u)$. In this
section we will do this for three different models: the 8V/SOS-model, the 6V-model and the
c < 1 conformal field theory.
2.5.1. The symmetric eight-vertex model

The basic properties of the 8V-model are briefly reviewed in Appendix A. Readers who are not well familiar with the subject will benefit from reading this appendix prior to the rest of the paper. Our notations are slightly different from those in Baxter's original papers [1–4]. The variables $q$, $\eta$, $v$ and $\rho$ used therein (hereafter denoted as $q_B$, $\eta_B$, $v_B$ and $\rho_B$) are related to our variables $q$, $\eta$, $v$ and $\rho$ as
$$q^2 = q_B = e^{-\pi K'_B/K_B}, \qquad \eta = \frac{\pi\eta_B}{2K_B}, \qquad v = \frac{\pi v_B}{2K_B}, \qquad \rho = \rho_B, \tag{2.60}$$
where $K_B$ and $K'_B$ are the complete elliptic integrals associated to the nome $q_B$. Here we fix the normalization of the Boltzmann weights as
$$\rho = 2\,\vartheta_2(0|q)^{-1}\,\vartheta_4(0|q^2)^{-1}, \tag{2.61}$$
where
$$\vartheta_i(u|q), \qquad i = 1, \ldots, 4, \qquad q = e^{i\pi\tau}, \qquad \mathrm{Im}\,\tau > 0, \tag{2.62}$$
are the standard theta functions [6] with the periods $\pi$ and $\pi\tau$.
We denote the transfer matrix T and the Q-matrix from [1,2] as $T^B(v)$ and $Q^B(v)$, remembering that our variable v is related to $v_B$ by (2.60). Below we often use a shifted spectral parameter
$$u = v - \frac{\pi\tau}{2}, \tag{2.63}$$
simply connected to the variable v in (2.60). We also consider the redefined matrices
$$T(u) = \big(-i\,q^{-1/4}\big)^N e^{ivN}\, T^B(v), \qquad Q(u) = e^{ivN/2}\, Q^B(v), \tag{2.64}$$
where N is the number of columns of the lattice. The eigenvalues T(u) and Q(u) of these new matrices enjoy the following periodicity properties
$$T(u+\pi) = T(u), \qquad T(u+\pi\tau) = (-q)^{-N} e^{-2iuN}\, T(u), \tag{2.65}$$
and
$$\text{8V-model:}\quad Q(u+\pi) = s\,e^{i\pi N/2}\, Q(u), \qquad Q(u+2\pi\tau) = q^{-2N} e^{-2iNu}\, Q(u). \tag{2.66}$$
Here the quantum number $s = \pm 1$ is the eigenvalue of the operator S, defined in (A.12). This operator always commutes with T(u) and Q(u).
Baxter's TQ-equation (Eq. (4.2) of [1] and Eq. (87) of [2]) now takes the form (2.3) with
$$f(u) = \big(\vartheta_4(u|q)\big)^N. \tag{2.67}$$
The main reason for the above redefinitions is to bring the TQ-equation to the universal form (2.3), where T(u) and f(u) are periodic functions of u (see Eq. (2.2)) for an arbitrary, odd or even, number of sites, N. This also helps to facilitate the considerations of the scaling limit in our next paper [18].
Comparing the first equation in (2.66) with the periodicity of the Bloch solutions (2.12) one concludes that the exponents $\varphi$ read
$$\varphi^{(8V)} = \begin{cases} 0 \pmod{\pi}, & N \text{ even}, \\ \pi/2 \pmod{\pi}, & N \text{ odd}. \end{cases} \tag{2.68}$$
Thus, for an even N the exponents of the symmetric 8V-model, with the cyclic boundary conditions, always belong to the "dangerous" set (2.17). For an odd N the exponents (2.68) fall into this set only for certain rational values of $\eta/\pi$. A notable example is the case $\eta = \pi/3$, considered in [16–18].
The imaginary period relations in (2.65) and (2.66) certainly deserve a detailed consideration. First, note that in (2.66) we only stated the periodicity with respect to the double imaginary period $2\pi\tau$, which always holds in all cases when the 8V-model has been exactly solved.$^{5}$ Actually, this is a rather overcautious statement which can be easily specialized further. For the following discussion assume a generic (i.e., irrational) value of $\eta/\pi$. Then for even N the Bloch solutions (2.12) always coincide (just as in the zero-field case of Section 2.4). For odd N there are always two linearly independent Bloch solutions for each eigenvalue T(u), one with $s = +1$ and one with $s = -1$ (recall that in this case each eigenvalue of the transfer matrix is doubly degenerate [28]).
The existence of the imaginary period imposes rather non-trivial restrictions on the properties of the eigenvalues. Indeed, the second relation in (2.65) immediately implies that the function
$$\widetilde{Q}(u) = r\,q^{N/2} e^{iuN}\, Q(u+\pi\tau), \tag{2.69}$$
where r is a constant, satisfies the TQ-equation (2.3) as well as Q(u). Further, if Q(u) is a Bloch solution
$$Q(u+\pi) = e^{i\varphi}\, Q(u) \tag{2.70}$$
with some $\varphi$, then $\widetilde{Q}(u)$ is also such a solution with the exponent
$$\widetilde{\varphi} = \varphi + \pi N \pmod{2\pi}. \tag{2.71}$$
Obviously, there are two options: either $\widetilde{Q}(u)$ is proportional to Q(u), or it is proportional to the other linearly independent Bloch solution with the negated exponent $-\varphi$. The first option is realized for even N,
$$\text{8V-model, } N \text{ even:}\quad Q(u+\pi\tau) = r\,q^{-N/2} e^{-iuN}\, Q(u). \tag{2.72}$$
The constant $r = \pm 1$ is then the eigenvalue of the spin-reversal operator R (cf. (A.37)). The second option requires the exponent $\varphi$ to be a half-an-odd integer fraction of $\pi$; it is realized for odd N,
$$\text{8V-model, } N \text{ odd:}\quad Q_\pm(u+\pi\tau) = q^{-N/2} e^{-iuN}\, Q_\mp(u). \tag{2.73}$$
The above relations (2.72) and (2.73) were derived for irrational values of $\eta/\pi$; however, they also hold in the rational case (2.20), if no additional degeneracy of the eigenvalues of the transfer matrix occurs (apart from the one related with the spin-reversal symmetry for odd N). The functional relation (2.31) can then be written in the form
$$T_L\big(u + \tfrac{1}{2} m\pi\big) = A\,e^{iNu}\, Q_+(u)\,Q_+(u+\pi\tau), \tag{2.74}$$
where A is a constant. This relation is identical to the one conjectured in [14].$^{6}$
5 Ref. [1] applies to rational $\eta$ and arbitrary values of N, while Ref. [2] applies to arbitrary $\eta$ and even values of N. It is reasonable to assume that (2.66) holds in general; however, the case of an arbitrary $\eta$ and an odd N has never been considered.
6 The conjecture of [14] also covers a special case of degenerate states for rational values of $\eta$, where the relation (2.73) holds for the eigenvalues of the Q-matrix of [1] for even N.
2.5.2. The solid-on-solid model

The main idea of this paper is to study deformations of the eigenvalues T(u) and Q(u) under continuous variations of the exponents $\varphi$ from their discrete values (2.68). As explained in the Introduction the resulting eigenvalues correspond to the unrestricted SOS-model. We will therefore assume the more general periodicity relations (1.6) for the Bloch wave solutions $Q_\pm(u)$, which hold for both odd and even N,
$$\text{SOS-model:}\quad Q_\pm(u+\pi) = e^{\pm i\varphi}\, Q_\pm(u), \qquad Q_\pm(u+2\pi\tau) = q^{-2N} e^{\pm\psi} e^{-2iuN}\, Q_\pm(u), \tag{2.75}$$
where the exponent $\varphi$ is arbitrary. The second exponent $\psi$ is not an independent parameter; it is determined by $\varphi$ (see the discussion in Section 3.1.2 below).
The second relation in (2.75) can be further refined for even N,
$$\text{SOS-model, } N \text{ even:}\quad Q_\pm(u+\pi\tau) = q^{-N/2} e^{\pm\psi/2} e^{-iuN}\, Q_\pm(u), \tag{2.76}$$
whereas the periodicity of T(u) remains the same (2.65) as in the 8V-model. However, there is no general SOS-model analog of (2.73), as it is specific to half-odd exponents only. As a result Eq. (2.65) is replaced with
$$\text{SOS-model, } N \text{ odd:}\quad T(u+\pi) = T(u), \qquad T(u+2\pi\tau) = q^{-4N} e^{-4iuN}\, T(u). \tag{2.77}$$
Strictly speaking the use of the term "SOS-model" here is justified for even N only [3]. Nonetheless, we will use this term to indicate arbitrary values of the field parameter $\varphi$ in general.
2.5.3. Six-vertex model in a horizontal field
The allowed vertex configurations of the six-vertex model form a subset of those shown in Fig. 12. Namely, the Boltzmann weights $\omega_7$ and $\omega_8$ are equal to zero. The remaining six weights will be parameterized as
$$\omega_1 = e^{+H-i\eta} a, \quad \omega_2 = e^{-H-i\eta} a, \quad \omega_3 = e^{+H+i\eta} b, \quad \omega_4 = e^{-H+i\eta} b, \quad \omega_5 = e^{iu-2i\eta} c, \quad \omega_6 = e^{iu-2i\eta} c, \tag{2.78}$$
where H stands for the horizontal field and
$$a = h(u+\eta), \qquad b = h(u-\eta), \qquad c = h(2\eta), \qquad h(u) = 1 - e^{2iu}. \tag{2.79}$$
The above parametrization is simply related to that given in Eq. (12) of [13] (where the vertical field V is set to zero). The TQ-equation (Eq. (11) of [13]) takes the form (2.2), (2.3), where
$$f(u) = \big(h(u)\big)^N. \tag{2.80}$$
The Bloch solutions (2.12), corresponding to the eigenvectors of the transfer matrix with n up-spins, can be written as
$$Q_\pm(u) = e^{\pm i\varphi u/\pi}\, A_\pm\big(e^{2iu}\big), \tag{2.81}$$
where $A_+(x)$ and $A_-(x)$ are polynomials in x of the degrees n and (N − n), respectively, and
$$\varphi = -\frac{i\pi H N}{2\eta} + \frac{\pi}{2}\,(N - 2n). \tag{2.82}$$
Introduce new variables$^{7}$
$$x = e^{2iu}, \qquad q = e^{2i\eta}, \qquad z = e^{2i\eta\varphi/\pi}. \tag{2.83}$$
Regarding x as a new spectral parameter instead of u and writing T(u) and f(u) as T(x) and f(x), respectively, one can rewrite (2.3) in the form
$$T(x)\,A_\pm(x) = z^{\pm 1} f\big(q^{-1}x\big)\, A_\pm\big(q^{2}x\big) + z^{\mp 1} f(qx)\, A_\pm\big(q^{-2}x\big), \tag{2.84}$$
where the polynomials $A_\pm(x)$ are defined in (2.81). This form is particularly convenient for the 6V-model.
2.5.4. Conformal field theory
The continuous quantum field theory version of Baxter's commuting transfer matrices of the lattice theory was developed in [8,9,29]. These papers were devoted to the c < 1 conformal field theory (CFT). The parameters $\beta$ and $p$ used there define the central charge c and the Virasoro highest weight $\Delta$,
$$c = 1 - 6\big(\beta - \beta^{-1}\big)^2, \qquad \Delta = \Big(\frac{p}{\beta}\Big)^2 + \frac{c-1}{24}. \tag{2.85}$$
They are related to our $\eta$ and $\varphi$ as
$$\pi\beta^2 = 2\eta, \qquad \varphi = 2\pi p/\beta^2. \tag{2.86}$$
The multiplicative spectral parameter $\lambda$ used in those papers is related to our variable u as
$$\lambda^2 = -e^{-2iu}. \tag{2.87}$$
The eigenvalues of the CFT Q-operators $Q_\pm(u)$ are entire functions of the variable u, satisfying the periodicity relation (2.12). Their leading asymptotics at large positive imaginary u read
$$\log Q_\pm(u) = \frac{A}{\cos\big(\frac{\pi\eta}{\pi-2\eta}\big)}\, e^{-i\pi u/(\pi-2\eta)} + O(1), \qquad u \to +i\infty, \quad |\mathrm{Re}\,u| < \frac{\pi}{2}, \tag{2.88}$$
where A is a known constant [8]. Here we assumed that $\eta$ does not belong to the set
$$\eta = \frac{\pi}{2}\Big(1 - \frac{1}{2k}\Big), \qquad k = 1, 2, \ldots, \infty. \tag{2.89}$$
At these special values of $\eta$ the theory contains logarithmic divergences and the asymptotics (2.88) should be replaced with
$$\log Q_\pm(u) = -2i(-1)^k A\,u\,e^{-2iuk}/\pi + C e^{-2iuk} + O(1), \qquad u \to +i\infty, \quad |\mathrm{Re}\,u| < \frac{\pi}{2}, \tag{2.90}$$
where C is a regularization-dependent constant. The factorization formulae read$^{8}$
$$Q_\pm(u) = e^{\pm i\varphi u/\pi}\, A_\pm(u), \qquad A_\pm(u) = \prod_{k=1}^{\infty}\Big(1 - e^{-2i(u-u_k^\pm)}\Big), \tag{2.91}$$
where the zeroes $u_1^\pm, u_2^\pm, \ldots$ accumulate at infinity along the straight line
$$u = -\frac{\pi}{2} + iy, \qquad y \to +\infty. \tag{2.92}$$
Finally, the function f(u) in the case of CFT should be set to one,$^{9}$
$$f(u) \equiv 1. \tag{2.93}$$
With these specializations the functional relations given above become identical to those previously obtained in [8,9,29].
7 The parameter q should not be confused with the nome q in the 8V-model.
8 Here we assumed that $0 < \eta < \pi/4$. When $\pi/4 < \eta < \pi/2$ the product in (2.91) should contain the standard Weierstrass regularization factors [8].
2.6. Related developments and bibliography
The literature on the functional relation in solvable models is huge; therefore it would not
be practical to mention all papers in the area. Our brief review is restricted only to a subset
of publications directly related to the eight-vertex/six-vertex models and associated models of
quantum field theory.
2.6.1. Transfer matrix relations
In the above presentation the entire functions $T_k(u)$ with $k \ge 3$ were defined by the recurrence relation (2.5), which allows one to express them solely in terms of T(u), as in (2.8). No other additional properties of $T_k(u)$ were used. However, as is well known, these functions are eigenvalues of the "higher" transfer matrices, usually associated with the so-called fusion procedure. This algebraic procedure provides a derivation of the functional relations for the "higher" transfer matrices based on decomposition properties of products of representations of the affine quantum groups. Originally, all these transfer matrix relations were obtained essentially in this way. We would like to stress that the logic of these developments was exactly opposite to that employed in our review. The goal was to find new techniques, independent of the TQ-relation, rather than to deduce everything from the latter. The first important contribution was made by Stroganov [30]. He gave an algebraic derivation of the first non-trivial relation in (2.5) (with k = 2),
$$T(u+\eta)\,T(u-\eta) - f(u+2\eta)\,f(u-2\eta) = O\big((u-u_0)^N\big) \tag{2.94}$$
in the vicinity of the point $u = u_0$ where the transfer matrices $T(u_0+\eta)$ and $T(u_0-\eta)$ become shift operators. Remarkably, this single relation alone contains almost all information about the eigenvalues T(u). To illustrate this point consider, for instance, the 6V-model. For a chain of the length N each eigenvalue T(u) is a trigonometric polynomial of the degree N, determined by N + 1 unknown coefficients. The mere fact that the LHS of (2.94) has an N-th order zero immediately gives N algebraic equations for these unknowns. Similar arguments, obviously, apply to the 8V-model. One additional equation is usually easy to find from some elementary considerations (e.g., from the large u asymptotics in the 6V-model). Further, in the thermodynamic limit, $N \to \infty$ with u kept fixed, Eq. (2.94) becomes a closed functional relation for the eigenvalues (its RHS vanishes). This is the famous "inversion relation" [30–32]. With additional analyticity assumptions it can be effectively used to calculate the eigenvalues of the transfer matrix at $N = \infty$. Recently, Eq. (2.94) was used to derive a new non-linear integral equation [33], especially suited for the analysis of high-temperature properties of lattice models.
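The N-th order zero in (2.94) can be seen explicitly on the simplest 6V-model eigenvalue. The sketch below is an illustration, not part of the original paper. It assumes the multiplicative variables $x = e^{2iu}$, $q = e^{2i\eta}$ and $f(x) = (1-x)^N$ of Section 2.5.3, and the trivial eigenvalue $T(x) = z f(x/q) + z^{-1} f(xq)$ of (3.11). Expanding the product by hand shows that the left-hand side of (2.94) equals f(x) times a regular cofactor, so for this eigenvalue it vanishes to order N at x = 1. The function names are hypothetical.

```python
import cmath

def f(x, N):
    """6V-model function f(x) = (1 - x)^N, with x = exp(2iu)."""
    return (1 - x) ** N

def T_trivial(x, q, z, N):
    """Trivial eigenvalue (3.11): T(x) = z f(x/q) + f(x q)/z."""
    return z * f(x / q, N) + f(x * q, N) / z

def stroganov_lhs(x, q, z, N):
    """T(u+eta) T(u-eta) - f(u+2 eta) f(u-2 eta), written in the variable x."""
    return (T_trivial(x * q, q, z, N) * T_trivial(x / q, q, z, N)
            - f(x * q * q, N) * f(x / (q * q), N))

def cofactor(x, q, z, N):
    """Factor multiplying f(x) after expanding the product by hand."""
    return z * z * f(x / (q * q), N) + f(x, N) + f(x * q * q, N) / (z * z)

if __name__ == "__main__":
    N = 5
    q = cmath.exp(2j * 0.37)       # arbitrary illustrative eta = 0.37
    z = cmath.exp(0.21 + 0.4j)     # arbitrary illustrative field variable
    for x in (0.3 + 0.1j, 1.001, -2.0 + 0.5j):
        lhs = stroganov_lhs(x, q, z, N)
        rhs = f(x, N) * cofactor(x, q, z, N)
        print(x, abs(lhs - rhs), abs(lhs))
```

The difference between the two sides stays at rounding level, and the left-hand side itself becomes small as x approaches 1, in line with the factor $(1-x)^N$.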
9 Again, we have assumed that does not fall into the set (2.89), otherwise f(u) = exp(4Ae2iuk /).
Soon after [30] Stroganov derived [34] a particular case of (2.23) for the 6-vertex model with $\eta = \pi/6$ (i.e., for L = 3 and m = 1). He also found that for the case of an odd number of sites$^{10}$ Eq. (2.25) takes the form
$$T(u-2\eta)\,T(u)\,T(u+2\eta) = f(u)f(u+2\eta)\,T(u-2\eta) + f(u-2\eta)f(u+2\eta)\,T(u) + f(u-2\eta)f(u)\,T(u+2\eta). \tag{2.95}$$
He then used this equation to obtain Bethe ansatz type equations for the zeroes of T(u) and to reproduce Lieb's celebrated result [35] for the residual entropy of the two-dimensional ice. Unfortunately, these results were left unpublished.
The ideas of [30,34] were further developed in the analytic Bethe ansatz [36] where the TQ-equation (or an analogous equation) is used essentially as a formal substitution to solve the transfer matrix functional equations. The notion of "higher" or "fused" R-matrices was developed in [22] from the point of view of representation theory. These R-matrices were calculated in [37] for the 6V-model, in [38–41] for the 8V-model and in [27] for the SOS-model. The functional relations (2.7) were given in [37] for the 6V-model and in [26] for the 8V/SOS-model. The determinant identity (2.25) was discussed in [26,42]. An algebraic derivation of the truncation relations (2.39) for the RSOS model [20] was given in [26]. A particular case of this truncation for the hard hexagon model [43] was previously discovered in [25]. An algebraic derivation of (2.23) in the zero-field six-vertex model is given in [44]. The idea of the calculation of the $\eta$- and $\varphi$-derivatives (2.44) at the RSOS point given in Section 2.3.2 is borrowed from [45] and [46].
Remarkably, the same functional equations (2.5) (along with all their specializations in the rational case) arise in a related, but different context of the thermodynamic Bethe ansatz [47]; see [48] for its application to the 8V-model. Usually this approach in lattice models is associated with non-linear integral equations. Here we refer to the functional form of these equations discovered in [49]. Further discussion of the correspondence of the functional relation method with the thermodynamic Bethe ansatz and its generalizations for excited states can be found in [21,29,50–53].
2.6.2. Q-matrix and TQ-relations
As noted before, a full algebraic theory of the Q-matrix in the 8V-model is not yet developed. The idea of the construction of the Q-matrix in terms of some special transfer matrices belongs to Baxter. It is a key element of his original solution of the 8V-model. Readers interested in details should familiarize themselves with Appendix C of [1] (along with the other four appendices and, of course, the main text of that paper, which contains a wealth of important information on the subject). The results of [1] only apply for certain rational values of $\eta$. The construction of [1] and the set of allowed values of $\eta$ were recently revised in [14]. A different construction for the Q-matrix, which works for an arbitrary $\eta$, was given in [3]. Some remarks on a comparison between the two Q-matrices are given in Appendix A.
There are many related solvable models connected with the R-matrix of the 8V-model but having different L-operators and different quantum spaces. The general structure of the functional relations in all such models remains the same. In particular, they all possess a TQ-relation (though it may contain different scalar factors and require different analytic properties of the eigenvalues). In [3] Baxter also presented an extremely simple explicit formula for the matrix elements of the Q-matrix for the zero-field 6V-model in the sector with N/2 up-spins (the "half filling"). However, no such expression is known for the 8V-model, or the other sectors of the 6V-model. The quantum space of the 6V-model is built from the two-dimensional highest weight representation of $U_q(sl_2)$ at every site of the lattice. Curiously enough, if this representation is replaced with the general cyclic representation (arising at roots of unity, $q^L = 1$) then all matrix elements of the Q-matrix can be explicitly calculated [54] as a simple product involving only a two-spin interaction.$^{11}$ Remarkably, the resulting Q-matrix exactly coincides with the transfer matrix of the chiral Potts model [55–57]; this allows one to view the latter as a "descendant" of the six-vertex model [54]. The generalization of this construction to the eight-vertex and the Kashiwara–Miwa model [58] is considered in [59]. Further developments of the theory of the Q-matrix and related topics (along with many important applications to various solvable models) can be found in [60–69].
10 In our notations this corresponds to $\varphi = \pi/2 \pmod{\pi}$.
Baxter's original idea of the construction of Q-operators, which utilizes traces of certain monodromy matrices, was extended in [8,9] for trigonometric models related with the quantum affine algebra $U_q(\widehat{sl}(2))$. It turned out that in the trigonometric case the situation is considerably simpler than for the 8-vertex model and the Q-operators coincide with some special transfer matrices. The corresponding L-operators are obtained as specializations of the universal R-matrix [70] to infinite-dimensional representations of the q-oscillator algebra in the auxiliary space. Although the calculations of [8,9] were specific to the continuous quantum field theory, the same procedure can readily be applied to lattice models (see, e.g., [71–74] for the corresponding results for the 6V-model). In the case of the 6V-model with non-zero horizontal field this construction leads to two Q-matrices,$^{12}$ whose eigenvalues $Q_\pm$ are precisely the Bloch wave solutions of the TQ-equation.
Note that the functional relations which involve bi-linear combinations of $Q_\pm$, namely (2.19), (2.29) and (2.31), are universal in the sense that they do not involve the model-specific function f(u). These relations were derived in [8,9] in the context of the conformal field theory. Similar relations previously appeared in the chiral Potts model [54,75,76], though the correspondence is not exact because there is no additive spectral parameter in that model. Eq. (2.19) in the eight-vertex and the XXX-models was found in [10] and [77]. A special case (2.74) of the relation (2.31) involving Baxter's original Q-matrix [1] for the 8V-model was conjectured in [14]. Another special (zero-field) case (2.55) of the same relation (2.31) in conformal field theory was conjectured in [52].
3. Quantum Wronskian relation and singular eigenvalues
The quantum Wronskian relation (2.13), naturally arising in the above analysis of the TQ-equation, is a very non-trivial functional relation. In this section we show how this relation can be used for the analysis of the eigenvalues. In particular, we consider a practical question of the calculation of $Q_-(u)$ from a known $Q_+(u)$. Next, we study certain singular eigenvalues in the zero-field limit.
11 The factorization of the matrix elements of the Q-matrix is typical for quantum space representations without highest and lowest weights.
12 As noted in [9], for the half-filled sector of the zero-field 6V-model these Q-matrices reduce to Baxter's expression [3] mentioned above, as they, of course, should.
3.1. Solving the quantum Wronskian relation

3.1.1. Six-vertex model (multiplicative spectral parameter)
Let us first consider the example of the 6V-model. It is convenient to present the polynomials $A_\pm(x)$, defined in (2.81), in a factorized form,
$$A_+(x) = \rho_+ \prod_{k=1}^{n}\big(1 - x/x_k^+\big), \qquad A_-(x) = \rho_- \prod_{k=1}^{N-n}\big(1 - x/x_k^-\big), \tag{3.1}$$
and rewrite the quantum Wronskian relation (2.13) as
$$z\,A_+(xq)\,A_-(x/q) - z^{-1} A_+(x/q)\,A_-(xq) = 2i\,W(z)\,f(x). \tag{3.2}$$
This single functional equation determines both unknown polynomials $A_\pm(x)$, as well as the function W(z), except for an obvious freedom to choose arbitrary x-independent factors $\rho_\pm$ without changing the normalization-independent combination
$$2i\,W(z)/(\rho_+\rho_-) = z - z^{-1}, \tag{3.3}$$
fixed by Eq. (3.2) at x = 0. There will be many solutions to (3.2) corresponding to different eigenvectors of the transfer matrix. Indeed, setting there $x = q x_k^+$ and $x = q^{-1} x_k^+$, one obtains two relations
$$z\,A_-(x_k^+)\,A_+(x_k^+ q^{2}) = 2i\,W(z)\,f(x_k^+ q), \qquad -z^{-1} A_-(x_k^+)\,A_+(x_k^+ q^{-2}) = 2i\,W(z)\,f(x_k^+ q^{-1}). \tag{3.4}$$
Taking their ratio, one comes back to the Bethe ansatz equations for the zeroes $x_1^+, x_2^+, \ldots, x_n^+$ of $A_+(x)$,
$$\frac{f(x_k^+ q)}{f(x_k^+ q^{-1})} = -z^{2}\,\frac{A_+(x_k^+ q^{2})}{A_+(x_k^+ q^{-2})}, \qquad k = 1, 2, \ldots, n. \tag{3.5}$$
Similar equations (with z replaced by $z^{-1}$) hold for the zeroes $x_1^-, x_2^-, \ldots, x_{N-n}^-$ of $A_-(x)$.
Given a solution of (3.5) (which defines $A_+(x)$) the corresponding polynomial $A_-(x)$ is uniquely determined by (3.2). For example, consider the case N = 2n. Comparing the leading power of x in (3.2) one gets
$$s \stackrel{\mathrm{def}}{=} \big(x_1^+ x_2^+ \cdots x_n^+\big)^{-1} = x_1^- x_2^- \cdots x_n^-. \tag{3.6}$$
Let us rewrite (3.2) as a functional difference equation
$$z\,F(x/q) - z^{-1} F(xq) = \frac{f(x)}{A_+(xq)\,A_+(x/q)} \tag{3.7}$$
for the rational function
$$F(x) = \frac{1}{2i\,W(z)}\,\frac{A_-(x)}{A_+(x)}. \tag{3.8}$$
Performing the partial fraction decomposition of the RHS of (3.7) and sharing the poles between F(xq) and F(x/q) one obtains
$$F(x) = \frac{1}{z(1-s^2)} \sum_{k=1}^{n} \frac{f(x_k^+ q)}{x_k^+\,A_+(x_k^+ q^{2})\,A'_+(x_k^+)}\,\frac{x - s^2 x_k^+}{x - x_k^+} = -\frac{z}{(1-s^2)} \sum_{k=1}^{n} \frac{f(x_k^+ q^{-1})}{x_k^+\,A_+(x_k^+ q^{-2})\,A'_+(x_k^+)}\,\frac{x - s^2 x_k^+}{x - x_k^+}, \tag{3.9}$$
where $A'_+(x)$ denotes the derivative of $A_+(x)$ with respect to x. The two alternative expressions given above are equivalent by virtue of the Bethe ansatz equations (3.5). The formula (3.9) can be easily generalized to arbitrary values of n. We leave this as an exercise for the reader.
The above derivation has several limitations. Evidently the field parameter z should not take the values $z = \pm 1$, where the denominator in (3.8) vanishes. Further, the above sharing of the poles in (3.7) is only possible if all the roots $x_k^+$ are distinct, finite and non-zero. Moreover, no pair of the roots may satisfy the condition $x_j^+/x_k^+ = q^{2}$. In all these cases the expression (3.9) becomes meaningless (typically the summand therein diverges).
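A minimal numerical sketch of the partial-fraction formula (3.9), for the smallest case N = 2, n = 1. It is an illustration under stated assumptions: $f(x) = (1-x)^N$, the normalization $\rho_+ = 1$, the convention $s = (x_1^+ \cdots x_n^+)^{-1}$, and a single Bethe root obtained by solving (3.5) by hand for N = 2. All function names are hypothetical. The check is that F(x) built from the first line of (3.9) satisfies the difference equation (3.7).

```python
import cmath

def f(x, N):
    """f(x) = (1 - x)^N for the 6V-model, x = exp(2iu)."""
    return (1 - x) ** N

def a_plus(x, roots):
    """A_+(x) with the normalization rho_+ = 1."""
    value = 1
    for r in roots:
        value *= 1 - x / r
    return value

def a_plus_prime(x, roots):
    """Derivative of A_+ with respect to x (product rule)."""
    total = 0
    for k, rk in enumerate(roots):
        term = -1 / rk
        for j, rj in enumerate(roots):
            if j != k:
                term *= 1 - x / rj
        total += term
    return total

def F(x, roots, q, z, N):
    """First line of (3.9); meant for N = 2n with distinct roots."""
    s = 1
    for r in roots:
        s /= r                      # s = 1 / (x_1^+ ... x_n^+)
    total = 0
    for r in roots:
        weight = f(r * q, N) / (r * a_plus(r * q * q, roots) * a_plus_prime(r, roots))
        total += weight * (x - s * s * r) / (x - r)
    return total / (z * (1 - s * s))

def residual_3_7(x, roots, q, z, N):
    """z F(x/q) - F(x q)/z - f(x) / (A_+(xq) A_+(x/q)); should vanish."""
    rhs = f(x, N) / (a_plus(x * q, roots) * a_plus(x / q, roots))
    return z * F(x / q, roots, q, z, N) - F(x * q, roots, q, z, N) / z - rhs

def bethe_root_N2(q, z):
    """One solution of (3.5) for N = 2, n = 1, found by hand."""
    return (1 - z * q) / (q - z)

if __name__ == "__main__":
    q = cmath.exp(0.6j)             # arbitrary illustrative values
    z = 1.3 * cmath.exp(0.5j)
    roots = [bethe_root_N2(q, z)]
    for x in (0.2 + 0.7j, -1.1 + 0.3j):
        print(x, abs(residual_3_7(x, roots, q, z, 2)))
```

For N = 2 the product $2i\,W(z)\,F(x)\,A_+(x)$ is then a first-degree polynomial with its zero at $1/x_1^+$, consistent with (3.6).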
The quantum Wronskian relation can also be solved directly for the coefficients of the polynomials (2.81)
$$A_+(x) = \sum_{k=0}^{n} a_k^+ x^k, \qquad A_-(x) = \sum_{k=0}^{N-n} a_k^- x^k. \tag{3.10}$$
This approach leads to a system of algebraic equations which is sometimes easier to analyze since it involves only symmetric functions of the Bethe roots, but not the roots themselves. As an example, consider the trivial eigenvalue
$$T(x) = z\,f\big(xq^{-1}\big) + z^{-1} f(xq) \tag{3.11}$$
corresponding to the ferromagnetic ground state (all spins down) of the six-vertex model. One solution of (3.2) is obvious,
$$A_+(x) = \rho_+ = 1. \tag{3.12}$$
The finite difference equation (3.7) can be readily solved,
$$\frac{A_-(x)}{2i\,W(z)} = z \sum_{k=0}^{N} \binom{N}{k} \frac{(-xq)^k}{z^2 - q^{2k}}. \tag{3.13}$$
If z takes one of the values $z = \pm q^{k}$, k = 0, …, N, the polynomial $A_-(x)$ reduces to a single power, $x^k$, and the corresponding Bloch solutions (2.81) become linearly dependent. These are the singular cases discussed in Section 2.2. Indeed, choosing the normalization
$$\rho_- = \prod_{k=1}^{N}\big(z^2 - q^{2k}\big) \tag{3.14}$$
so that $A_-(x)$ remains finite and does not vanish identically for all z, and taking into account the definition of z in (2.83), one can easily see from (3.3) and (3.12) that the singular exponents form a subset of (2.17), as expected.
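The closed form (3.13) can be checked numerically. The sketch below is illustrative only and assumes $f(x) = (1-x)^N$ and the reading of (3.13) with binomial coefficients and the factor $(-xq)^k$. It verifies two things for the ferromagnetic state with $A_+ = 1$: the difference equation (3.7), and the TQ-equation (2.84) taken with the lower signs for $A_-$. Function names are hypothetical.

```python
import cmath
from math import comb

def f(x, N):
    """f(x) = (1 - x)^N for the 6V-model."""
    return (1 - x) ** N

def F_minus(x, q, z, N):
    """A_-(x) / (2 i W(z)) as given by (3.13)."""
    return z * sum(comb(N, k) * (-x * q) ** k / (z * z - q ** (2 * k))
                   for k in range(N + 1))

def wronskian_residual(x, q, z, N):
    """(3.7) with A_+ = 1: z F(x/q) - F(x q)/z - f(x)."""
    return z * F_minus(x / q, q, z, N) - F_minus(x * q, q, z, N) / z - f(x, N)

def tq_residual(x, q, z, N):
    """(2.84), lower signs, with the trivial eigenvalue (3.11)."""
    T = z * f(x / q, N) + f(x * q, N) / z
    rhs = (f(x / q, N) * F_minus(q * q * x, q, z, N) / z
           + z * f(q * x, N) * F_minus(x / (q * q), q, z, N))
    return T * F_minus(x, q, z, N) - rhs

if __name__ == "__main__":
    N = 4
    q = cmath.exp(0.6j)             # arbitrary illustrative values
    z = 1.3 * cmath.exp(0.5j)       # chosen so that z^2 != q^(2k)
    for x in (0.2 + 0.7j, -1.1 + 0.3j):
        print(x, abs(wronskian_residual(x, q, z, N)), abs(tq_residual(x, q, z, N)))
```

The denominators $z^2 - q^{2k}$ make the singular values $z = \pm q^k$ visible directly: the k-th term dominates there, which is why the normalization (3.14) is needed to keep $A_-(x)$ finite.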
3.1.2. Eight-vertex/SOS model

The formula similar to (3.9) holds for the 8-vertex/SOS-model as well. The periodicity relations (2.75) imply that the entire functions $Q_\pm(u)$ can always be factorized as
$$Q_\pm(u) = e^{i\varphi_\pm u/\pi}\, A_\pm(u), \qquad A_\pm(u) = \prod_{k=1}^{N} h\big(u - u_k^\pm\big), \tag{3.15}$$
where $u_1^\pm, u_2^\pm, \ldots, u_N^\pm$ are zeroes of $Q_\pm(u)$ and the elementary factors
$$h(u) = \vartheta_1\big(u|q^2\big) \tag{3.16}$$
have the double imaginary period.$^{13}$ We will assume that the roots $u_k^\pm$ lie in the fundamental domain
$$0 < \mathrm{Re}\,u_k^\pm < \pi, \qquad -\pi|\tau| < \mathrm{Im}\,u_k^\pm < +\pi|\tau|, \qquad k = 1, \ldots, N. \tag{3.17}$$
There are two independent sets of the Bethe ansatz equations (2.4),
$$\frac{f(u_k^\pm + \eta)}{f(u_k^\pm - \eta)} = -e^{4i\eta\varphi_\pm/\pi}\,\frac{A_\pm(u_k^\pm + 2\eta)}{A_\pm(u_k^\pm - 2\eta)}, \qquad k = 1, 2, \ldots, N, \tag{3.18}$$
with all upper and with all lower signs for the zeroes of $Q_+(u)$ and $Q_-(u)$, respectively. The relations (2.75) impose the following constraints on the parameters entering (3.15),
$$\varphi_\pm = \pm\varphi + \pi N \pmod{2\pi}, \tag{3.19}$$
$$2\tau\varphi_\pm = \mp i\psi - 2\sum_{k=1}^{N} u_k^\pm + \pi N \pmod{2\pi}. \tag{3.20}$$
Combining these relations one obtains
$$\varphi_+ + \varphi_- = -2\pi m_2, \qquad \bar u_+ + \bar u_- = \pi m_1 + 2\pi\tau\, m_2, \qquad m_1, m_2 \in \mathbb{Z}, \tag{3.21}$$
where
$$\bar u_+ = \sum_{k=1}^{N} u_k^+, \qquad \bar u_- = \sum_{k=1}^{N} u_k^-. \tag{3.22}$$
Note that for any pair of eigenvalues $Q_\pm(u)$ the parameter $\psi$ in (2.75) can be regarded as a function of $\varphi$. Indeed, the Bethe ansatz equations (3.18) only contain the exponents $\varphi_\pm$, which up to multiples of $\pi$ coincide with $\pm\varphi$. Once these equations are solved for $u_k^\pm$ the exponent $\psi$ is determined from (3.20).
With the above notations the 8V/SOS-model analog of the formula (3.9) can be written as
13 This choice allows a uniform treatment of even and odd N, though for even N the zeroes of $Q_\pm$ split into N/2 equidistant pairs with the separation $\pi\tau$. Actually, we found this redundancy quite useful in controlling the consistency of numerical calculations. Such pairing occurs automatically, without placing any constraints on the positions of zeroes.
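$$G(u) = \frac{e^{-2i\varphi_+ u/\pi}\,h'(0)}{h(2\bar u_+)} \sum_{k=1}^{N} \frac{e^{2i\varphi_+ u_k^+/\pi}\, f(u_k^+ + \eta)}{Q_+(u_k^+ + 2\eta)\,Q'_+(u_k^+)}\,\frac{h(u + 2\bar u_+ - u_k^+)}{h(u - u_k^+)}$$
$$\phantom{G(u)} = -\frac{e^{-2i\varphi_+ u/\pi}\,h'(0)}{h(2\bar u_+)} \sum_{k=1}^{N} \frac{e^{2i\varphi_+ u_k^+/\pi}\, f(u_k^+ - \eta)}{Q_+(u_k^+ - 2\eta)\,Q'_+(u_k^+)}\,\frac{h(u + 2\bar u_+ - u_k^+)}{h(u - u_k^+)}, \tag{3.23}$$
where
$$G(u) = \frac{1}{2i\,W(\varphi)}\,\frac{Q_-(u)}{Q_+(u)} \tag{3.24}$$
and the prime now denotes the derivative with respect to u. This formula is a simple consequence of the Bethe ansatz equations (3.18) and the following elegant identity for the elliptic theta functions$^{14}$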

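$$\prod_{k=1}^{n} \frac{\vartheta_1(x - y_k)}{\vartheta_1(x - z_k)} = \frac{1}{\vartheta_1(\bar z - \bar y)} \sum_{k=1}^{n} \Bigg[\prod_{j \ne k} \frac{\vartheta_1(z_k - y_j)}{\vartheta_1(z_k - z_j)}\Bigg] \frac{\vartheta_1(z_k - y_k)\,\vartheta_1(x - z_k + \bar z - \bar y)}{\vartheta_1(x - z_k)}. \tag{3.25}$$
Here x is an independent variable, $y_1, \ldots, y_n$ and $z_1, \ldots, z_n$ are arbitrary constants,
$$\bar y = \sum_{k=1}^{n} y_k, \qquad \bar z = \sum_{k=1}^{n} z_k, \tag{3.26}$$
and for brevity the theta-function $\vartheta_1(x|q)$ is written as $\vartheta_1(x)$. To obtain (3.23) one needs to apply the identity (3.25) to the RHS of the quantum Wronskian relation, written in the form
$$G(u-\eta) - G(u+\eta) = \frac{f(u)}{Q_+(u+\eta)\,Q_+(u-\eta)}. \tag{3.27}$$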
The zero-field variant of (3.23) relates $Q_0(u)$ and $\overline{Q}_0(u)$ defined in (2.46). Let $u_k$ denote the zeroes of $Q_0(u)$,
$$Q_0(u) = \prod_{k=1}^{N} h(u - u_k), \qquad \zeta(u) = \frac{d \log h(u)}{du}. \tag{3.28}$$
From (2.52) one can show that
$$\overline{Q}_0(u) = 2\,Q_0(u)\Bigg[-\frac{iu}{\pi} + \sum_{k=1}^{N} \zeta(u - u_k)\,u'_k(0)\Bigg] + C\,Q_0(u), \tag{3.29}$$
where C is an arbitrary constant and
$$u'_k(0) = \frac{du_k}{d\varphi}\bigg|_{\varphi=0} = \frac{i\,f(u_k + \eta)\,W'(0)}{Q_0(u_k + 2\eta)\,Q'_0(u_k)}. \tag{3.30}$$
The unknown constant $W'(0)$, defined in (2.53), can be expressed through the zeros $u_k$ from the differential equations (4.12).
14 In the trigonometric limit $q \to 0$ this identity reduces to the modified variant of Ex. 2 on page 140 of [6], obtained if the constant term therein is smartly distributed among the terms in the sum.
3.2. Singular eigenvalues

When $\eta/\pi$ is a rational number the spectrum of the transfer matrix of the symmetric 8V-model is degenerate (for sufficiently large N). Solutions of the Bethe ansatz equations (2.4) for degenerate states are not unique. They contain arbitrary continuous parameters, which determine positions of the complete strings (2.27). There is nothing wrong with this. In particular, this is not an indication that the Bethe ansatz is "incomplete" [15]. Quite to the contrary, as explained in [13], the appearing continuous parameters are precisely those that are needed to describe the embedding of the corresponding eigenvectors into the eigenspace of the degenerate eigenvalue.
It is not immediately clear, however, how the eigenvalues $Q_\pm(u)$, which have no ambiguities at generic values of the parameters $\eta$ and $\varphi$, could acquire continuous degrees of freedom for the degenerate states at special values of $\eta$ and $\varphi$. The explanation is simple: the limiting values of $Q_\pm(u)$ are not uniquely defined; they depend on the details of the limiting procedure. Here we will not study this phenomenon in its full generality, but give a particular example. For simplicity we consider the 6V-model in a field, where most calculations can be done analytically. Assume the same notations as in Section 2.5.3. In particular, recall the definitions of the multiplicative spectral parameter x, the field variable z and the parameter q given in (2.83). For generic z and q the transfer matrix of the 6V-model is completely non-degenerate (the usual degeneracy associated with the reversal of all spins is broken in the presence of the field). Consider the case N = 4 and n = 2. This sector contains six different eigenvectors. The corresponding polynomials $A_\pm(x)$, defined in (2.81), are all of the second degree. Let us parameterize them as
$$A_+(x) = 1 + a_1 x + a_2 x^2, \qquad A_-(x) = 1 + b_1 x + b_2 x^2. \tag{3.31}$$
Substituting these expressions into the quantum Wronskian relation (3.2) with $\rho_+ = \rho_- = 1$, one gets four equations for the unknown coefficients $a_1, a_2, b_1, b_2$,
$$1 = a_2 b_2, \qquad z^2 = \frac{q^2 a_1 b_2 + 4q + a_2 b_1}{q^2 a_2 b_1 + 4q + a_1 b_2},$$
$$z^2 = \frac{q^4 b_2 + q^2 (a_1 b_1 - 6) + a_2}{q^4 a_2 + q^2 (a_1 b_1 - 6) + b_2}, \qquad z^2 = \frac{q^2 b_1 + 4q + a_1}{q^2 a_1 + 4q + b_1}. \tag{3.32}$$
Excluding $a_2$, $b_1$, $b_2$ one obtains a sixth-degree polynomial equation for $a_1$, which factors into two linear and two quadratic equations. One of the latter reads
$$q\big(1 - q^2 z^2\big)\,a_1^2 + \big[4q^2(1 - z^2) + z(1 - q^4)\big]\,a_1 + 4q(1 - z)\big(q^2 + z\big) = 0, \tag{3.33}$$
while the other is obtained by the substitution $z \to -z$. Altogether one gets six different solutions for the coefficient $a_1$. Once it is found, the remaining coefficients are unambiguously determined by (3.32).
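A small numerical sketch of this elimination for N = 4, n = 2, offered as a consistency check rather than the authors' procedure. It assumes $f(x) = (1-x)^4$, $\rho_\pm = 1$ (so that $2iW = z - z^{-1}$), and the quadratic (3.33) read as $q(1-q^2z^2)a_1^2 + [4q^2(1-z^2) + z(1-q^4)]a_1 + 4q(1-z)(q^2+z) = 0$. For each root $a_1$ the code gets $b_1$ from the $x^1$ equation in (3.32), obtains $a_2$ from the $x^3$ equation with $b_2 = 1/a_2$, and keeps the branch with the smaller Wronskian residual. Function names are hypothetical.

```python
import cmath

def a1_roots(q, z):
    """Both roots of the quadratic (3.33) for a1."""
    A = q * (1 - q * q * z * z)
    B = 4 * q * q * (1 - z * z) + z * (1 - q ** 4)
    C = 4 * q * (1 - z) * (q * q + z)
    d = cmath.sqrt(B * B - 4 * A * C)
    return [(-B + d) / (2 * A), (-B - d) / (2 * A)]

def wronskian_coeffs(a1, a2, b1, b2, q, z):
    """Coefficients of x^0..x^4 in
    z A+(xq) A-(x/q) - A+(x/q) A-(xq)/z - (z - 1/z)(1 - x)^4."""
    first = [1, a1 * q + b1 / q, a2 * q * q + a1 * b1 + b2 / (q * q),
             a1 * b2 / q + a2 * b1 * q, a2 * b2]
    second = [1, a1 / q + b1 * q, a2 / (q * q) + a1 * b1 + b2 * q * q,
              a1 * b2 * q + a2 * b1 / q, a2 * b2]
    binom = [1, -4, 6, -4, 1]
    return [z * u - v / z - (z - 1 / z) * c
            for u, v, c in zip(first, second, binom)]

def complete_solution(a1, q, z):
    """Return (residual, a2, b1, b2) for the better of the two a2 branches."""
    # x^1 equation of (3.32) is linear in b1
    b1 = (a1 * (1 - z * z * q * q) + 4 * q * (1 - z * z)) / (z * z - q * q)
    # x^3 equation with b2 = 1/a2 is quadratic in a2
    A = b1 * (z * z * q * q - 1)
    B = 4 * q * (z * z - 1)
    C = a1 * (z * z - q * q)
    d = cmath.sqrt(B * B - 4 * A * C)
    best = None
    for a2 in ((-B + d) / (2 * A), (-B - d) / (2 * A)):
        b2 = 1 / a2
        res = max(abs(c) for c in wronskian_coeffs(a1, a2, b1, b2, q, z))
        if best is None or res < best[0]:
            best = (res, a2, b1, b2)
    return best

if __name__ == "__main__":
    z = 0.8 + 0.3j                       # arbitrary illustrative field variable
    for q in (1j, cmath.exp(0.9j)):      # q = i and one generic value
        for a1 in a1_roots(q, z):
            res, a2, b1, b2 = complete_solution(a1, q, z)
            print(q, a1, res)
```

A residual at rounding level for both roots indicates that the reconstructed quadratic and the four equations (3.32) are mutually consistent at the chosen point.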
Consider the limit
$$\eta \to \frac{\pi}{4}, \qquad \varphi \to 0. \tag{3.34}$$
In terms of q and z it corresponds to $q \to i$, $z \to 1$. In this limit the two eigenvalues of the transfer matrix, $T^{(1)}(x)$ and $T^{(2)}(x)$, corresponding to two different solutions of (3.33), smoothly approach the same value
$$T(x) = 2x^4 - 12x^2 + 2 = 2\big(x^2 - b_0^2\big)\big(x^2 - b_0^{-2}\big), \qquad b_0 = 1 + \sqrt{2}. \tag{3.35}$$
To calculate $Q_\pm(u)$ one also needs to describe the first order deviations of $\eta$ and $\varphi$ from their limiting values (3.34). This requires two arbitrary small parameters $\epsilon_1$ and $\epsilon_2$, or just one such parameter and the ratio $\epsilon_1/\epsilon_2$. We found it convenient to use the following parametrization
$$\eta = \frac{\pi}{4} + \epsilon\big(b^4 - 6b^2 + 1\big) + O(\epsilon^2), \qquad \varphi = 8\epsilon\big(b^4 - 1\big) + O(\epsilon^2), \tag{3.36}$$
where $\epsilon \to 0$ and b is an arbitrary complex parameter, which is kept finite. The coefficients of the linear terms in $\epsilon$ have been specifically chosen to simplify subsequent expressions. The corresponding expansions for q and z simply follow from (2.83). Substituting them into Eq. (3.33), solving it for $a_1$ and then determining the remaining coefficients in (3.31) from (3.32), one obtains
$$A_+^{(1)}(x) = 1 - \frac{x^2}{b^2} + O(\epsilon), \qquad A_-^{(1)}(x) = 1 - b^2 x^2 + O(\epsilon),$$
$$A_+^{(2)}(x) = 1 - \frac{x^2}{\tilde b^2} + O(\epsilon), \qquad A_-^{(2)}(x) = 1 - \tilde b^2 x^2 + O(\epsilon), \tag{3.37}$$
where b is related to b by a self-reciprocal transformation

3 b2
,
b 2 =
1 3b2
(3.38)
which exchanges the two solutions. The corresponding eigenvalues and eigenvectors of the transfer matrix read
$$
\begin{aligned}
T^{(1)}(x) &= T(x) - 16\,\epsilon\,(1 + b^2)^2\, x\,(1 + x^2) + O(\epsilon^2),\\
T^{(2)}(x) &= T(x) + 32\,\epsilon\,(1 - b^2)^2\, x\,(1 + x^2) + O(\epsilon^2),
\end{aligned}
\tag{3.39}
$$
where $T(x)$ is given by (3.35) and
(a)

= | | + (a) | + | | | + O(),
a = 1, 2,
(3.40)
with
(1) =
i(1 b2 )
,
1 + b2
(2) =
i(1 + b2 )
.
2(b2 1)
(3.41)
Here we used self-explanatory notation for the spin-up, $|\uparrow\rangle$, and spin-down, $|\downarrow\rangle$, states of
the edge spins.
Note that the case $\eta = \pi/4$ corresponds to $m = 1$ and $L = 2$ in (2.20). In terms of the variable $x$
the complete string (2.27) for this case consists of two roots $x_1$, $x_2$, constrained by one relation
$x_2 = -x_1$. Obviously, at $\epsilon = 0$ the zeroes of $A_\pm(x)$, given by (3.37), are the complete 2-strings
at $x_1 = b^{\pm 1}$ or $x_1 = \tilde b^{\pm 1}$. Their positions are given in terms of the very same parameter $b$, which
defines the embedding of the eigenvectors (3.40) into the eigenspace of the degenerate eigenvalue
(3.35) in the sector with two spins up. Recall that the parameter $b$ can be arbitrary; it determines
the direction of the 2-variable limit (3.36).
It is interesting to see what happens in some particular cases of (3.36). First, consider the zero-field case when $\varphi \equiv 0$ from the very beginning and $\eta$ is approaching the value $\pi/4$. In Eq. (3.36)
this corresponds to $b^2 = 1$. As is obvious from (3.37) and (3.38), the limiting values of $A_\pm(x)$ then
coincide and the complete 2-string can take one of the two fixed positions with $x_1 = 1$ or $x_1 = i$.
The second interesting case is when $\eta$ is set to its rational value $\pi/4$ while the field $\varphi$ is
approaching zero from arbitrary values. In Eq. (3.36) this corresponds to $b = b_0^{\pm 1}$, where $b_0$ is
defined in (3.35). Note also the corresponding value of $\tilde b^2 = b_0^{\mp 2}$. We see that the complete
2-strings in this case occupy precisely the distinguished positions determined by Eq. (2.59).
Indeed, taking into account (3.35) it is easy to see that Eq. (2.59) is satisfied with $C(0) = 2$ and
$Q(x) \equiv A_\pm(x)$ (it does not matter which solution in (3.37) is taken, because $A_\pm^{(1)}(x) = A_\mp^{(2)}(x)$
when $b = b_0$). We would like to stress that the distinguished string positions arising from (2.59)
can only be achieved by considering the limit from arbitrary values of the field with fixed rational
values of $\eta$.
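The two special cases just discussed can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: it assumes that the linear coefficients in (3.36) are $b^4 - 6b^2 + 1$ for $\eta$ and $8(b^4 - 1)$ for the field, and that the map (3.38) acts on $b^2$ as $(3 - b^2)/(1 - 3b^2)$. All function names are ours and do not appear in the paper.

```python
import math

B0 = 1.0 + math.sqrt(2.0)  # b_0 from (3.35)


def eta_coeff(b2):
    """Assumed linear coefficient of (eta - pi/4) in (3.36), as a function of b^2."""
    return b2 * b2 - 6.0 * b2 + 1.0


def field_coeff(b2):
    """Assumed linear coefficient of the field in (3.36), as a function of b^2."""
    return 8.0 * (b2 * b2 - 1.0)


def partner(b2):
    """Assumed form of the self-reciprocal map (3.38), acting on b^2."""
    return (3.0 - b2) / (1.0 - 3.0 * b2)


def t_limit(x):
    """Limiting eigenvalue (3.35) in expanded form."""
    return 2.0 * x**4 - 12.0 * x**2 + 2.0


def t_limit_factored(x):
    """Limiting eigenvalue (3.35) in factored form."""
    return 2.0 * (x**2 - B0**2) * (x**2 - B0**-2)


if __name__ == "__main__":
    print("field coefficient at b^2 = 1:", field_coeff(1.0))
    print("eta coefficient at b = b_0:", eta_coeff(B0**2))
    print("partner of b_0^2 vs b_0^-2:", partner(B0**2), B0**-2)
    print("partner applied twice to 0.37:", partner(partner(0.37)))
```

With these assumptions the zero-field direction is $b^2 = 1$ (mapped to $-1$ by the transformation), and the fixed-$\eta$ direction is $b = b_0^{\pm 1}$, where the transformation sends $b_0^2$ to $b_0^{-2}$.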
Finally, note that degenerate eigenstates in the 6V- and 8V-models were also considered in
[13,15,78].
4. Analytic continuation of the eigenvalues
The eigenvalues of the T- and Q-matrices of the 8V/SOS-model have very interesting analytic
properties with respect to the field parameter . In this section we will study these properties by
a combination of analytic and numerical techniques. We consider the disordered regime of the
8V-model with real $q$ and $\eta$ in the range
$$0 < q < 1, \qquad \frac{\pi}{4} < \eta < \frac{\pi}{2}. \tag{4.1}$$
Note that the parameter $\tau$ in this case is purely imaginary, $\tau = i|\tau|$.
4.1. Bethe ansatz equation
We will show that different eigenvalues of the transfer matrix of the symmetric 8V-model can
be obtained from each other by the analytic continuation in the variable . In doing this we will,
obviously, need to consider as an independent variable. Therefore, for the eigenvalues Q (u)
we assume the SOS-model periodicity (2.75) which allows arbitrary values of . In practice it
is more convenient to work with a simply related variable, which interpolate the values of +
(remind that it is related to by (3.19)). We will denote this variable by the upright symbol .
The Weierstrass factorization for the eigenvalues was discussed in Section 3.1.2. It is convenient
to parameterize the zeroes of Q (u) as
u
(4.2)
+ ik .
k =
2
Note, that the product representations (3.15) has an obvious ambiguity. For example, the translation of any single root k+ by the period 2| | complemented by the 2 -shift of + ,
k+ k+ 2i,
+ + + 2,
(4.3)
leaves Q+ (u) unchanged (more precisely, Q+ (u) acquires an inessential scalar factor). Using this
freedom one can always bring all zeroes of Q (u) to the periodicity rectangular
| | Re k +| |,
(4.4)
Im k , k = 1, . . . , N.
2
2
With this convention the values of the exponents in (3.15) cannot, in general, be restricted to
any finite domain.
Introduce two functions
1
1 (2 + i|q 2 )
.
log
2i
1 (2 i|q 2 )
Consider the Bethe ansatz equations of the following general form
1 () =
1
3 ( + i|q)
log
,
2i
3 ( i|q)
N1 (k )
N

2 () =
(4.5)
2
,
2
(4.6)
2 (k j ) = nk +
j =1
where the numbers n1 , n2 , . . . , nN and are given, while the (complex) variables 1 , 2 , . . . , N
are considered as unknowns. The numbers nk take half-odd integer values for even N and integer
values for odd N . Taking into account (3.15) and (4.2) it is easy to see that the logarithmic form
of the Bethe ansatz equations (3.18) for Q (u) coincide with (4.6) provided k and therein are
replaced by k and .
Obviously, the numbers {nk } depend on the choice of branches of logarithms in (4.5). In the
considered case of the 8V-model, which involves elliptic functions, extra care is required to
define these branches. Since the functions 1,2 are periodic
i ( + i) = i (),
i = 1, 2
(4.7)
one only needs to specify them in the periodicity strip 0 Im < . We choose the cuts

1
1
| | + i
+ n , m +
| | + i
+ n + , m, n Z
m+
2
2
2
2
(4.8)
for the function 1 () and the cuts

m| | + i(n 2), m| | + i (n 1) + 2 , m, n Z
(4.9)
for 2 () (the latter are shown in Fig. 1). We will assume that the values of (4.5) appearing in
(4.6) are always taken on the principal sheet of the corresponding Riemann surface defined by
the condition
i () = i (),
i = 1, 2.
(4.10)
The values on the cuts are taken from the right (left) side for the upper (lower) cuts in Fig. 1,
as shown by arrows. Note that with these definitions the values of 1 () and 2 () on the real
axis of are unbounded (this does not happen in the trigonometric case). They monotonically
decrease (or increase) with the increase of . We should warn the reader that this property is
never achieved with ad hoc computer definitions of the logarithms in (4.5).
Every eigenvalue of the transfer matrix corresponds to some set15 of the numbers {nk }. Even
when the branches of the logarithms are completely fixed, there is always an ambiguity in {nk }
related to the re-definition of . Indeed, the transformation
k k ,
nk nk + m,
+m
2
,
2
m Z,
(4.11)
15 The numerical examples considered below suggest that these sets are the same for Q (u) and Q (u), provided the
+
phases of the logarithms in (4.6) are defined as described above.
Fig. 1. The principal Riemann sheet of the function 2 ().
leaves Eq. (4.6) unchanged.

Further, the numbers {nk } do not uniquely determine solutions of (4.6). Different solutions
corresponding to different eigenvalues could be related with the same set {nk }.16 In any case once
the numbers nk are fixed the roots k = k () solving (4.6) become functions of the complex
variable . It turns out that the latter are multivalued functions with algebraic branching points.
Below we will investigate their monodromy properties for various eigenvalues of the transfer
matrix.
4.2. Differential form of the Bethe ansatz equations
The behavior of the roots k () under variations of can be effectively studied with a differential form of the Bethe ansatz equations. Differentiating (4.6) with respect to one immediately
obtains a system of ordinary linear differential equations
N

k=1
A()j k
k () 2
= 2,
j = 1, . . . , N.
The N by N matrix A()j k is given by

A()j k = 2 (j
k ) + j k
N 1 (j )
(4.12)
N

2 (j
l )
(4.13)
l=1
() denote derivatives of the functions (4.5) with respect to their argument .
where 1,2
16 With the exception of a few simple eigenvalues (e.g. the band of largest eigenvalues [79]) these sets are very poorly
understood and their enumeration is a difficult (but, perhaps, not hopeless) problem. To our knowledge the only case
where this enumeration problem was completely solved [80] is the c < 1 conformal field theory (CFT) with .
This CFT can be obtained in the scaling limit of the eight-vertex (six-vertex) model.
Using the identity

k (x + y|q) k (x y|q)
+
k (x + y|q) k (x y|q)
(2x|q)
1 (2x|q)5k (x + y|q)5k (x y|q)
+ k 2 (0|q)3 (0|q)
,
= 4
4 (2x|q)
4 (2x|q)k (x + y|q)k (x y|q)
(4.14)
() are meromorphic doublewhere k = 1, 2, 3, 4 and 1,4 = 2,3 = 1, one can show that 1,2
periodic function of the variable ,
3 (0|q)2 ( + i|q)2 ( i|q)

,
2 (0|q)3 ( + i|q)3 ( i|q)
2 (2|q)4 (i|q)
,
2 () = +
21 (2 + i|q2 )1 (2 i|q2 )
1 () =
(4.15)
(4.16)
where two constants , are given by

=
4 (2|q)
,
24 (2|q)
22 (0|q)1 (2|q)
.
24 (2|q)
(4.17)
For our purposes we need to study the differential equations (4.12) only with the initial conditions corresponding to some eigenvalue of the symmetric 8V-model (typically the ground state
eigenvalue). In this case the matrix A() is, in general, non-singular. Eq. (4.12) can be then
solved for the derivatives k () and defines locally analytic functions k () of the variable .
Exceptions occur at certain root configurations, corresponding to special values of , where the
matrix A() becomes singular. These singular configurations correspond to the branching points
of the solutions in the complex -plane.
Let us rewrite (4.12) in the form
N

j () 2
adj

det A()
Aj k (),
= 2
(4.18)
k=1
where Aadj stands for the adjoint matrix and assume (0 ) = (1 (0 ), . . . , N (0 )) is such that

detA (0 ) = 0.
(4.19)
The type of the branching depends on the rank of A((0 )). The most important case is when

rank A (0 ) = N 1.
(4.20)
Expanding the determinant det A(()) near the point = 0 , one obtains
N

ci i () i (0 ) + higher order terms,
detA () =
(4.21)
i=1
where ci denote some constants. Putting this back into (4.18) and keeping first order terms only,
one gets
N

j ()
ci i () i (0 ) = Bj
i=1
(4.22)
where Bj denotes the RHS of (4.18) evaluated at = 0 . It is easy to see that all Bj are non-zero,
otherwise the rank of A(0 ) would have been less than N 1. Summing (4.22) over j with the
coefficients cj one obtains
2
N
N

ci i () i (0 )
=2
ci Bi .
(4.23)
i=1
i=1
Integrating over and substituting the result back to (4.21) one gets
N
1/2

detA() = 2
ci Bi
0 + O ( 0 ) .
(4.24)
i=1
Finally, from (4.22) it follows that at 0

j () j (0 ) = Dj

0 + O ( 0 ) ,

Dj = 2Bj 2
N

1/2
ci Bi
(4.25)
i=1
Therefore, the condition (4.20) implies the square root branching points for all j . In principle,
higher-order branching points are possible if the rank of the matrix A(0 ) drops below N 1.
However, we found that the solutions of the Bethe ansatz equations for N = 2, 3, 4 contain the
second order branching points only.
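The square-root behaviour derived in (4.21) to (4.25) can be seen in a one-variable toy model. The Python sketch below is not the Bethe ansatz system of the paper: it replaces it by the hypothetical scalar equation $g(x) = \phi$ with $g(x) = x - x^3/3$, whose "matrix" $g'(x) = 1 - x^2$ becomes singular at $x = 1$, i.e. at $\phi_0 = 2/3$. Near that point $g(x) \approx \phi_0 - (x-1)^2$, so the root should deviate from $1$ as $\sqrt{\phi_0 - \phi}$.

```python
import math


def g(x):
    """Toy scalar equation g(x) = phi; an illustrative stand-in, not Eq. (4.6)."""
    return x - x**3 / 3.0


def g_prime(x):
    """Derivative of g; it vanishes at x = 1, the analogue of det A = 0."""
    return 1.0 - x * x


PHI0 = g(1.0)  # branch point of the toy model, phi_0 = 2/3


def root_below(phi, x_start=0.5, steps=80):
    """Newton iteration for the root of g(x) = phi lying just below x = 1."""
    x = x_start
    for _ in range(steps):
        x -= (g(x) - phi) / g_prime(x)
    return x


def scaled_deviation(delta):
    """Return (1 - x) / sqrt(delta) at phi = phi_0 - delta."""
    x = root_below(PHI0 - delta)
    return (1.0 - x) / math.sqrt(delta)


if __name__ == "__main__":
    for delta in (1e-2, 1e-4, 1e-6):
        print(delta, scaled_deviation(delta))
```

The printed ratio approaches a constant as `delta` decreases, which is the scalar counterpart of the statement that condition (4.20) implies square-root branching points.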
Note that for even $N$ the eigenvalues $Q_\pm(u)$ for non-degenerate states possess the more restrictive periodicity (2.72). Their imaginary (quasi-)period is equal to $\pi\tau$ rather than $2\pi\tau$. This is
also true for the SOS-model, see Eq. (2.76). Therefore, one can split the roots k () into the two
subsets {1 , 2 , . . . , N/2 } and {N/2 , N/2+1 , . . . , N } obtained from each other by a uniform
shift of all roots by the period | |,
N
(4.26)
.
2
It is easy to see that under this substitution the system (4.6) reduces to only N/2 equations,
provided the phases {nN/2+1 , . . . , nN } are properly fixed in terms of {n1 , . . . , nN/2 }. Further, the
constraints (4.26) are also compatible with the differential equations (4.12). If these constraints
hold for the initial data they will continue to hold for an arbitrary . Therefore for even N one
needs to consider only a half of the Bethe roots, for instance, the first N/2 roots. The matrix A()
in (4.12) effectively reduces to an $N/2 \times N/2$ matrix. This considerably simplifies the numerical
analysis for even values of N .
N/2+k () k () = | |,
k = 1, 2, . . . ,
4.3. Overview of the procedure

Below we will give a complete classification of the eigenvalues for the symmetric 8V-model
with N = 2, 3, 4. In most cases we present explicit analytic expressions for the eigenvalues obtained by a direct diagonalization of the transfer matrix. Whenever purely analytic results are not
possible, we provide the required numerical values for a particular choice of the parameters $q$ and $\eta$.
Depending on convenience we will use either of the transfer matrices T(u) and TB (v) related by
(2.64), assuming that the parameters u and v are always connected by (2.63). The corresponding
eigenvalues, denoted here by (u) and B (v) respectively, are related as
N
.
(u) = iq 1/4 eiuN B (v), v = u +
(4.27)
2
In the previous sections we also considered the eigenvalues of the higher transfer matrices
$T_k(u)$, defined by (2.5). In this section we will denote them as $\Lambda^{(k)}(u)$. With these notations
$\Lambda(u) \equiv \Lambda^{(2)}(u)$.
The transfer matrix of the 8V-model commutes [81,82] with the XYZ-Hamiltonian
$$H_{XYZ} = -\frac{1}{2}\sum_{j=1}^{N}\Bigl(J_x\,\sigma_1^{(j)}\sigma_1^{(j+1)} + J_y\,\sigma_2^{(j)}\sigma_2^{(j+1)} + J_z\,\sigma_3^{(j)}\sigma_3^{(j+1)}\Bigr), \tag{4.28}$$
provided the constants Jx , Jy , Jz satisfy (A.22). Below we will assume the normalization
Jx = 1.
(4.29)
The eigenvalues of HXYZ can be found from (A.29) which relates this Hamiltonian to the logarithmic derivative of the transfer matrix TB (v).
For each $N$ there are $2^N$ eigenvalues $\Lambda_i(u)$, $i = 0, 1, \ldots, 2^N - 1$. Below we will assume a definite ordering of $\Lambda_i(u)$. We will first split them according to the momenta $P$ of the corresponding
eigenstates.17 Then, in each sector with a fixed value of $P$ we will arrange $\Lambda_i(u)$ according to
eigenvalues of the XYZ-Hamiltonian (4.28).
For every i = 0, . . . , 2N 1 let { (i) } denote the solution of (4.6), corresponding to the eigenvalue i . We will say that the contour Cij Cij ( ), [0, 1] connects two eigenvalues i and
j (corresponding to the values i and j ) if
Cij (0) = i ,
Cij (1) = j ,
Cij

(i) (j ) ,
Cij
i j ,
(4.30)
where we allowed for an ambiguity which can be compensated by an overall shift of all the numbers $n_k$. Indeed, the only effect of the transformation (4.11) is that the corresponding eigenvalue
of the transfer matrix $\Lambda(u)$ acquires an extra sign factor
$$\Lambda(u) \to (-1)^m \Lambda(u). \tag{4.31}$$
As a simple illustration consider the ground state eigenvalue. In the regime (4.1) all roots {k }
are real, so they can be arranged in an increasing sequence 1 < 2 < < N . For the ground
state the numbers $n_k$ take consecutive integer or half-odd-integer values
$$n_k = k - \frac{N+1}{2}, \qquad k = 1, \ldots, N. \tag{4.32}$$
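As a small illustration of (4.32), the following Python sketch lists the ground-state numbers for a given chain length. It assumes the reading $n_k = k - (N+1)/2$ of that equation; the function name is ours.

```python
from fractions import Fraction


def ground_state_numbers(n_sites):
    """Return n_k = k - (N + 1)/2 for k = 1..N as exact fractions (assumed form of (4.32))."""
    return [Fraction(2 * k - n_sites - 1, 2) for k in range(1, n_sites + 1)]


if __name__ == "__main__":
    for n_sites in (2, 3, 4):
        print(n_sites, [str(n) for n in ground_state_numbers(n_sites)])
```

The values come out as half-odd integers for even $N$ and integers for odd $N$, in agreement with the rule stated after (4.6), and they are symmetric about zero.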
Further, for even N the ground state eigenvalue 0 (u) of the 8V-model is non-degenerate. It
corresponds to = 0. At this point the roots solving (4.6) are symmetrically distributed on the
interval (| |, | |), such that k (0) = Nk+1 (0). Let us take this configuration as an initial
condition for (4.12) and trace its evolution for real . When increases in the positive direction
all the roots monotonously move in the negative direction. At = the leftmost root reaches
the boundary of the periodicity rectangular (4.4), 1 () = | |, while one of the middle roots
hits the origin N/2+1 () = 0. The remaining N 2 roots arrange symmetrically around the
origin. The value = belongs to the set of the exponents (2.68) of the symmetric 8V-model.
The resulting solution of the Bethe ansatz equation corresponds to the next-to-leading eigenvalue
of the transfer matrix, 1 (u). Thus, the contour C01 in this case is a straight line from = 0 to
nk = k
17 The momenta P are eigenvalues of the shift operator, TB () which cyclically shifts the edge spins j j 1 to
the left by one lattice step.
Fig. 2. The -dependence of the Bethe roots for the ground state at N = 8.
Fig. 3. The dependence between the exponents on for the ground state eigenvalue at N = 8.
= . Further, at = 2 the first root becomes

1 (2) = N (0) 2| |,
(4.33)
while the remaining roots take the original = 0 positions, but shifted one step left,
k (2) = k1 (0),
k = 2, . . . , N.
(4.34)
The whole process is illustrated in Fig. 2. Taking (4.3) into account, the resulting root configuration
is completely equivalent to the initial one at $\varphi = 0$. Thus the eigenvalue $\Lambda_0(u)$ is a periodic function
of $\varphi$ with the period $2\pi$.
It is instructive to examine what happens to the exponent in (2.75) for the same course
variations of . It can be calculated from (3.20). We plotted the result in Fig. 3. It is a 2 -periodic
function which vanishes at the values = k , k Z, corresponding the symmetric 8V-model, in
accordance with (2.68).
In the remainder of this section we analyze the solutions of the Bethe ansatz equations for the
small size chains with $N \le 4$. All numerical calculations are performed for the following case
$$q = \frac{1}{10}, \qquad \eta = \frac{7\pi}{20}. \tag{4.35}$$
This choice of the parameter $\eta$ is motivated by the following considerations. First, it allows
us to test the behavior of Bethe roots near the special point $\eta = \pi/3$, since the value (4.35) is
quite close to it, $\eta - \pi/3 \approx 0.05$. Second, it provides us with all advantages of the rational case
without running into the problems of degenerate states. Indeed, the value (4.35) corresponds to
m = 7 and L = 10 in (2.20), while the degenerate states can only occur for N > L. Below we
will apply the following procedure:
(i) Analytically diagonalize the transfer matrix of the symmetric 8V-model and thus determine
the eigenvalues $\Lambda_i(u)$, $i = 1, \ldots, 2^N$.
(ii) Calculate the corresponding higher eigenvalues $\Lambda_i^{(10)}(u)$ using the formula (2.8).
(iii) Use the functional relations (2.31) or (2.58) with $L = 10$ to find zeroes of eigenvalues
$Q_\pm(u)$. For even $N$ this is straightforward since these eigenvalues coincide; for
odd $N$, however, one needs to correctly share zeroes between $Q_+(u)$ and $Q_-(u)$.
(iv) Find the branching points of the eigenvalues with respect to the field variable $\varphi$.
(v) On the Riemann surface of the largest eigenvalue $\Lambda_0(u)$ try to find paths connecting it to
other eigenvalues $\Lambda_i(u)$ with the exponents $\varphi$ from the discrete set (2.68) of the symmetric
8V-model.
The case $N = 2$ is simple and does not really require the steps (ii) and (iii).
The results are presented in tables of two types. In the first one we list eigenvalues of operators
P, S, R and HXYZ (the latter are denoted as Ei ) for each eigenvalue i (u). Remind that we use
B
the normalization (4.29). We also present there numerical values of B
i = i (/2) and i =
B
i (/2) (remind that (v) and (u) are related by (4.27)). The second type tables contain the
Bethe roots and the values of the phases nk in (4.6).
Note that here we did not attempt to diagonalize the infinite dimensional transfer matrices
of the SOS-model at arbitrary field and compare the results against the corresponding exact
expressions for the eigenvalues. We hope to address this question in the future.
4.4. The case N = 2
This case is very simple and all eigenvalues can be calculated analytically. However, it is quite
illustrative and reveals some general properties of the Bethe ansatz equations (4.6).
Four eigenvalues of the transfer matrix $T_B(v)$ for $N = 2$ are given by
$$\Lambda_0^B = 2ab + c^2 + d^2, \qquad \Lambda_1^B = a^2 + b^2 + 2cd, \qquad \Lambda_2^B = a^2 + b^2 - 2cd, \qquad \Lambda_3^B = 2ab - c^2 - d^2. \tag{4.36}$$
Corresponding eigenvalues of the Hamiltonian are given by
$$E_0 = -J_x - J_y + J_z, \qquad E_1 = -J_x + J_y - J_z, \qquad E_2 = J_x - J_y - J_z, \qquad E_3 = J_x + J_y + J_z. \tag{4.37}$$
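The energies (4.37) can be confirmed by acting with the two-site Hamiltonian on the four Bell-type states. The Python sketch below is a hedged check, not part of the paper: it assumes that (4.28) carries the overall factor $-1/2$, that the two bonds of the periodic $N = 2$ chain coincide, and that the signs in (4.37) are as written above. The sample couplings $J_y \approx 0.58909583$, $J_z \approx -0.43155187$ are our own inference from the magnitudes in Table 1 with $J_x = 1$.

```python
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two 2x2 matrices (the first site is the slow index)."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def hamiltonian_n2(jx, jy, jz):
    """H = -(Jx s1 s1 + Jy s2 s2 + Jz s3 s3): the two coinciding bonds of the
    periodic N = 2 chain cancel the assumed prefactor 1/2 of (4.28)."""
    h = [[0j] * 4 for _ in range(4)]
    for coupling, s in ((jx, SX), (jy, SY), (jz, SZ)):
        ss = kron(s, s)
        for r in range(4):
            for c in range(4):
                h[r][c] -= coupling * ss[r][c]
    return h


def candidates(jx, jy, jz):
    """Unnormalized states in the basis |uu>, |ud>, |du>, |dd> with the energies of (4.37)."""
    return [
        ([0, 1, 1, 0], -jx - jy + jz),   # E_0
        ([1, 0, 0, 1], -jx + jy - jz),   # E_1
        ([1, 0, 0, -1], jx - jy - jz),   # E_2
        ([0, 1, -1, 0], jx + jy + jz),   # E_3
    ]


def max_residual(jx, jy, jz):
    """Largest component of H v - E v over the four candidate eigenpairs."""
    h = hamiltonian_n2(jx, jy, jz)
    worst = 0.0
    for vec, energy in candidates(jx, jy, jz):
        hv = [sum(h[r][c] * vec[c] for c in range(4)) for r in range(4)]
        worst = max(worst, max(abs(hv[r] - energy * vec[r]) for r in range(4)))
    return worst


if __name__ == "__main__":
    jx, jy, jz = 1.0, 0.58909583, -0.43155187  # inferred values, see the text above
    print("residual:", max_residual(jx, jy, jz))
    print("energies:", [e for _, e in candidates(jx, jy, jz)])
```

With the inferred couplings the four energies reproduce the magnitudes listed in the $E_i$ column of Table 1, with $E_0$ negative and the other three positive.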
For the symmetric 8V-model with even $N$ the eigenvalues $Q_i^+(u)$ and $Q_i^-(u)$ coincide, therefore
there is no need to distinguish them. Their analytic expressions read

2
2
2 (0|q) i
Q0 (u) = 1 u
q 1 u +
q =
e 4 3 (u|q),
(4.38)
2
2
2
2
2

2 (0|q) i
Q1 (u) = eiu 1 u + q 2 1 u q 2 =
e 2 2 (u|q),
(4.39)
2
2
2

2 (0|q) i
(4.40)
Q2 (u) = eiu 1 u + |q 2 1 u|q 2 = i
e 2 1 (u|q),
2
Table 1
Descriptive properties of the eigenvalues for N = 2
i
0
1
2
3
B
i
1.33767166
0.66200143
0.38998228
0.28568795
2.06595956
0.04265730
0.89156871
1.21704816
1
1
1
1
1
1
1
1
1
1
1
1
Ei
2.02064769
0.02064769
0.84245604
1.15754396
Table 2
Roots and phases for N = 2
i
n1 , n2
0
1
2
3
i /2
i
i i/2
i /2 i/2
i /2
0
i/2
i /2 i/2
{1/2, 1/2}
{1/2, 1/2}
{1/2, 1/2}
{3/2, 1/2}

Q3 (u) = 1

2
2
2 (0|q) i
q 1 u +
q =
e 4 4 (u|q).
u

2
2
2
(4.41)
The descriptive properties of the eigenvalues are given in Table 1. Table 2 contains the Bethe roots,
the values of the field parameter = + = and the phases nk for all eigenvalues.
Despite being entire functions of the spectral parameter u, the eigenvalues i (u) and Qi (u)
are multivalued functions of the field variable . Actually, for N = 2 their analytic properties
are relatively simple since they are determined by the properties of only one Bethe root 1 ().
Indeed, with an account of (4.26) the Bethe ansatz equations (4.6) reduce to a single equation

2
1
,
21 (1 ) = 2 ( + ) n1 +
(4.42)
2
which defines the function 1 (). Its branching points are determined by the condition (4.19),
which in this case reduces to
1 (1 ) = 0.
(4.43)
Solving this equation for 1 and substituting the result into (4.42) one gets potential locations of
the branching points (br) in the -plane.
(br) = 0 + 2m +
2
n,
2
m, n Z
(4.44)
where 1 (1 (0 )) = 0. No explicit analytic expressions for 0 and 1 (0 ) are available. Their

asymptotic expansions for small q read

5
2i sin 2
cos 4
3 5
q+q
+
+ O q , q 0,
0 = +
(4.45)
3
3

i
1
3
i q cos 2 + i q3
cos 2 cos 6 + O q5 , q 0. (4.46)
1 (0 ) = i
4
2
6
The numerical values
0 = + 0.46954959i,
1 (0 ) = 0.727625178i + i,
(4.47)
for the case (4.35) are well approximated by the asymptotic formulae (4.45) and (4.46).
Fig. 4. The principal sheet of the Riemann surface which contains the eigenvalue $\Lambda_0$ for $N = 2$.
Fig. 4 shows two sheets of the Riemann surface of the function 1 () corresponding to the
ground state eigenvalue 0 (u). The principal sheet is defined by the condition 1 (0) = i/2.
This sheet contains two cuts (0 , + i) and (0 , i) in the strip 0 < Re < 2 (they
are shown by solid lines). The same pattern is repeated with the period 2 . The second sheet can
be reached from the first one by crossing either of the solid line cuts in Fig. 4 from left to right.
This sheet contains two new branch cuts (shown by dashed lines) in addition to the same cuts as
on the first sheet.
To demonstrate that all the solutions of Bethe ansatz equations for N = 2 can be obtained
from each other by an analytic continuation in we performed numerical calculations using the
differential equations (4.12). For N = 2 it is a single equation
d1
= 2
d
1 (1 )
(4.48)
where 1 () is given by (4.15).

The results are illustrated in Fig. 4 by three contours C01 , C12 and C03 , which connect corresponding eigenvalues (we used the definition (4.30)). The contour C01 goes between points
on the first sheet while the contours C12 and C03 start on the first and end on the second sheet.
Note that the ending point of the contour C03 is 2 /(2) rather than 0 as it should be according
to Table 2. As explained above, this does not affect the Bethe roots, but changes the sign of the
eigenvalue as in (4.31),
C03
0 3 .
(4.49)
Apparently it is possible to find a different contour C03 ending at the correct point = 0 on some
other sheet of the same Riemann surface. We leave this as an exercise to the reader.
Table 3
Descriptive properties of the eigenvalues for N = 3
i
0
1
2
3
Ei
1.85546785
0.69792389
0.57877198
0.57877198
B
i
1.11943180
0.25256796
0.14650972
0.14650972
1.40606193
1.45228936
0.65945338
0.65945338
1
1
1
1
1
1
Note that in the trigonometric limit q = 0 the vertical distance between upper and lower cuts
vanishes and neighboring cuts join together. In the opposite low temperature limit q = 1 all cuts
move to infinity and disappear. In both cases the Riemann surface of the eigenvalues splits into
disjoint components, though in two different ways. In particular, for q = 0 the connected parts of
the Riemann surface no longer possess the $2\pi$-periodicity in $\varphi$.
4.5. The case N = 3
This case is quite important and will be analyzed in more detail. When $N$ is odd all eigenvalues of the transfer matrix of the 8V-model are doubly degenerate. So altogether there are four
different eigenvalues. Two of them have the momentum $P = 1$, while for the other two $P = \omega^{\pm 1}$,
$\omega = e^{2\pi i/3}$.
A direct diagonalization of the transfer-matrix leads to the following analytic expressions for
the two eigenvalues of TB (v) with P = 1,
3
B
0,1 (v) = 1 (2|q)
1 (v|q)1 (v + v0,1 |q)1 (v v0,1 |q)

1 (|q)1 ( + v0,1 |q)1 ( v0,1 |q)
(4.50)
where zeros v0,1 satisfy the following transcendental equations

42 (0|q) 32 (v0,1 |q)
32 (0|q) 42 (v0,1 |q)

1 + 7 2 ( 1) + 3 2 [1 + 4 2 ( + 1) + 2 (1 + 2 ) ]
=
,
( 1)[ (3 + 5) 3 1]
(4.51)
with the constants and defined in (A.20). For the choice (4.35) its solutions
v0 = 0.3749001333,
v1 = + 0.29695326i.
2
The remaining two eigenvalues
B
2,3 (v) =
read18
1 (2|q)2 1 (|q)1 (2v|q)

1 (2|q)1 (v 2|q)1 (v + |q)2
1
,
1 (|q)
1 (v|q)
(4.52)
(4.53)
correspond to $P = \omega^{\pm 1}$. Their numerical values in Table 3 are real and coincide with each other
only due to the choice of the symmetric point $v = \eta/2$ where the second term in (4.53) vanishes.
For a generic real v these eigenvalues are complex and non-degenerate.
18 Note that Eq. (4.51) for = /3 has a solution v = 0, corresponding to the simple ground state eigenvalue B (v) =
0
0
1 (v|q)N , considered in [16,17].
Table 4
Zeros of the ground state eigenvalues for N = 3
Zeros in u
Q+
0 (u)
Q
0 (u)
0 (u)
(3)
0 (u)

2
2
0.42746i
2
2
2
2
+ 0.42746i
2
2
+ 0.42746i
2
2
2 0.374900
2 1.068576
+ + 0.42746i
2
2
+
2
2
2 + 0.374900
2 + 1.068576
n1 , n2 , n3
2
+ 2
{1, 0, 1}
{1, 0, 1}
4.5.1. The eigenvalue 0

Consider the ground state eigenvalue $\Lambda_0$. The numerical calculation with Eq. (2.31) referred
to in step (iii) of Section 4.3 above requires a separation of zeroes between $Q_+(u)$ and $Q_-(u)$.
Since the total number of zeroes, $2N = 6$, is small we used a simple trial and error method.
Taking an arbitrary subset of 3 zeroes for $Q_+(u)$ and substituting the product (3.15) into the TQ-equation (2.14) one gets a relation which must be fulfilled identically in the variable $u$ at some
(yet undetermined) value of $\varphi_+$. If the subset of zeroes is chosen incorrectly then no such value
exists and the process should be repeated.
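The trial and error search just described amounts to running over all ways of assigning 3 of the 6 zeroes to $Q_+(u)$. A minimal Python sketch of that enumeration is given below. The callback `is_consistent` is a hypothetical placeholder for the actual test (substituting the product (3.15) into the TQ-equation and looking for a value of the exponent); that test is not implemented here.

```python
from itertools import combinations


def candidate_splits(zeros, size):
    """Yield (subset, rest) pairs: trial zeroes for Q_+ and the remaining ones for Q_-."""
    indices = range(len(zeros))
    for chosen in combinations(indices, size):
        subset = tuple(zeros[i] for i in chosen)
        rest = tuple(zeros[i] for i in indices if i not in chosen)
        yield subset, rest


def find_split(zeros, size, is_consistent):
    """Return the first split accepted by the user-supplied check, or None if there is none."""
    for subset, rest in candidate_splits(zeros, size):
        if is_consistent(subset, rest):
            return subset, rest
    return None


if __name__ == "__main__":
    labels = ["u1", "u2", "u3", "u4", "u5", "u6"]  # placeholder labels, not actual zeroes
    print(sum(1 for _ in candidate_splits(labels, 3)), "candidate splits")
```

For $2N = 6$ zeroes there are only 20 candidate subsets, which is why a brute-force search is practical at $N = 3$.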
The numerical results (with the choice (4.35)) are collected in Table 4 and plotted in Fig. 5.
Each of the eigenvalues $Q_\pm(u)$ has exactly $N$ zeros in the periodicity rectangle
$$0 \le \mathrm{Re}(u) \le \pi, \qquad -\pi|\tau| \le \mathrm{Im}(u) \le \pi|\tau|. \tag{4.54}$$
Any of the eigenvalues $\Lambda^{(k)}(u)$ has $2N$ zeros in the same domain (their imaginary period is $\pi\tau$).
The corresponding Bethe roots are
1
1+ = | |,
2
1
2+ = | | q0 ,
2
and the field parameter + = /2. The roots

+
or a shift
k = Nk
+
| |,
k = k+1
k = 1, . . . , N 1,
1
3+ = | | + q0 ,
2
q0 = 0.427465646
can be obtained from
k+
N
= 1+ + | |.
(4.55)
by either a reflection
(4.56)
Note also that the sets {k+ } and {k } precisely interlace each other (see, e.g., Fig. 10). The above
remarks relating zeroes of the ground state eigenvalues $Q_0^+(u)$ and $Q_0^-(u)$ apply for arbitrary
odd $N$.
4.5.2. The eigenvalue 1
This is the smallest eigenvalue for N = 3 (it has the largest eigenvalue of the Hamiltonian).
The numerical zeroes (for the case (4.35)) are presented in Table 5 and plotted in Fig. 6. As is
clearly seen from the picture, the zeroes of $Q_1^+(u)$ and $Q_1^-(u)$ form solitary 3-strings.
It is not difficult to check that these solutions of the Bethe ansatz equations are connected with
each other by an analytic continuation in . Consider the path from = 3/2 to = 3/2
along the real axis of (with a small imaginary part for a better convergence). It turns out that
the solution {1 , 2 , 3 } continued along this path transforms into another solution

+ +
k = 1 4| |, 2+ + 2| |, 3+ + 2| | ,
(4.57)
where i+ , i = 1, 2, 3 correspond to zeros of Q+
1 (u) from the Table 5. Remind that k and uk are
related by (4.2). It is easy to check that shifts by 2k| | in (4.57) can be compensated by a proper
Fig. 5. Zeros of Q+
0 (u), Q0 (u), 0 (u), 0 (u) for N = 3, marked by , , , , accordingly.
(3)
Table 5
Zeros for the smallest eigenvalue at N = 3
Zeros in u
Q+
1 (u)
Q
1 (u)
1 (u)
(3)
1 (u)
+ + 0.90837673
2
2
+ 0.90837673
2
2
+ 0.29655326i
2
2
2 + 1.10116874
+
2
2

2
2
2
2
+ 0.90837673
2
2
0.90837673
2
2
0.29655326i
2
2
2 1.10116874
n1 , n2 , n3
3
2
+ 3
2
{1, 0, 1}
{1, 0, 1}
choice of nk without changing the field + = 3/2. Table 5 contains the resulting values of nk
corresponding to the roots {1+ , 2+ , 3+ }.
The problem of the analytic continuation between different eigenvalues of the transfer matrix
is much more complicated. The most difficult part of this procedure is a numerical study of the
Riemann surface of the eigenvalues. For $N \ge 3$ the difficulty lies in the fact that the branching
condition (4.20) just defines an $(N-1)$-dimensional hyper-surface in the space of Bethe roots
which, by itself, does not correspond to any particular value of the field $\varphi$. To obtain the structure
of the branching points in the $\varphi$-plane one needs to determine how this hyper-surface intersects
with required solutions of the Bethe ansatz equations. In practice we used the differential equations (4.12) and numerically studied monodromy properties of the solutions when $\varphi$ was varied
around random loops in the complex plane.
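The idea of detecting branch points by continuing a solution around closed loops can be illustrated on a toy problem. The Python sketch below does not use the equations (4.12) of the paper: it integrates the hypothetical scalar analogue $dx/d\phi = 1/g'(x)$ for $g(x) = x - x^3/3$ with a fourth-order Runge-Kutta scheme along a circle around the branch point $\phi_0 = 2/3$. One turn should exchange the two roots that merge at $\phi_0$, and two turns should restore the starting root.

```python
import cmath
import math


def g(x):
    """Toy equation g(x) = phi; an illustrative stand-in for the Bethe ansatz equations."""
    return x - x**3 / 3.0


PHI0 = 2.0 / 3.0  # branch point of the toy model, where g'(1) = 0


def start_root(radius, steps=80):
    """Newton iteration for the real root just below x = 1 at phi = PHI0 - radius."""
    x = 0.5
    for _ in range(steps):
        x -= (g(x) - (PHI0 - radius)) / (1.0 - x * x)
    return x


def continue_around(x_start, radius, turns=1, steps_per_turn=4000):
    """Integrate dx/dphi = 1/g'(x) along phi = PHI0 + radius*exp(i*theta),
    with theta running from pi to pi + 2*pi*turns (classical RK4)."""
    def rhs(theta, x):
        dphi = 1j * radius * cmath.exp(1j * theta)  # d(phi)/d(theta) on the circle
        return dphi / (1.0 - x * x)

    n = steps_per_turn * turns
    h = 2.0 * math.pi * turns / n
    theta, x = math.pi, complex(x_start)
    for _ in range(n):
        k1 = rhs(theta, x)
        k2 = rhs(theta + h / 2, x + h * k1 / 2)
        k3 = rhs(theta + h / 2, x + h * k2 / 2)
        k4 = rhs(theta + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        theta += h
    return x


if __name__ == "__main__":
    r = 0.1
    x0 = start_root(r)
    print("start:", x0)
    print("after one turn:", continue_around(x0, r))
    print("after two turns:", continue_around(x0, r, turns=2))
```

In the multi-root setting of the paper the same test is applied to the whole set of Bethe roots; the toy version only shows how a non-trivial monodromy signals a branch point inside the loop.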
Remarkably, we found that the largest and smallest eigenvalues, $\Lambda_0(u)$ and $\Lambda_1(u)$, do indeed
correspond to different branches of the same (multivalued) function of $\varphi$. The contour $C_{01}$ in the
Fig. 6. Zeros of Q+
(3)
complex -plane connecting these eigenvalues is shown in Fig. 7. As evident from the figure, the
structure of branching cuts is extremely complicated. This figure shows the (upper half-plane)
cuts on just two sheets of the corresponding Riemann surface. The first sheet (solid line cuts)
contains the eigenvalue Q
0 (u) at = /2, while the second one (dashed line cuts) contains the
eigenvalue Q
(u)
at
=
3/2. Note that the contour C01 leaves the first sheet when it goes
1
under the cut at Re = 0 and arrives to the second of these sheets from under the (dashed line)
cut at Re = . In between these two points the contour crosses additional cuts on other
sheets of the Riemann surface, which are not shown in the picture.
Note that in the limit /3 the zeroes of Q
1 (u) form complete 3-strings which cancel out
from the TQ-relation. As a result one gets

1 (u) = 43 u + q 43 u q ,
3
3
.
3
(4.58)
The minus signs arise from the phase factors exp(2i/) = 1 at = 3/2. When /3
the parameter tends to zero (this parameter determines positions of several cuts in Fig. 7). In
addition to that the dashed line cut with Re = and its symmetric reflection in the lower half
plane both approach the real axes. Therefore, the contour C01 gets pinched twice in both vertical
and horizontal directions. As a result the complete string solution corresponding to 1 (u) for
= /3 is no longer analytically connected to the ground state eigenvalue 0 (u).
Fig. 7. Structure of the cuts on the principal sheet Riemann surface containing 0 (solid lines) and on the sheet containing
1 (dashed lines). Only the upper half-plane is shown as the arrangement of the cuts is reflection-symmetric with respect
to the real axis of .
4.5.3. The eigenvalues 2,3

It is enough to consider only one eigenvalue $\Lambda_2$ with $P = \omega$, since all results for $\Lambda_3$ can be
obtained by a complex conjugation. The analytic formula for $\Lambda_2^B(v)$ is given in (4.53). The Bethe
roots and phases for $Q_2^-(u)$ read

1 , 2 , 3 = {0.24023993, 0.45552631 0.75825341i},
{n1 , n2 , n3 } = {0, 0, 1}
(4.59)
and the exponent = /2. Similarly, for Q+

2 (u) one has
k+ = k+ + | |,
+ =
5
,
2
{n1 , n2 , n3 } = {0, 0, 1}.
(4.60)
For the eigenvalue 3 all i , and nk change their sign with respect to those for 2 .
The problem of the analytic continuation for these eigenvalues and, in particular, the question
about their connection to the ground state eigenvalue 0 (u) have not been considered.
4.6. The case N = 4
This is the last case where we systematically study all eigenvalues of the transfer matrix. As
before, the eigenvalues are first grouped according to their momenta, which in this case take
four possible values P = 1, −1, ±i. In total there are 16 eigenvalues, all non-degenerate. Below
we show that all the eigenvalues with P = ±1 (there are ten such eigenvalues) correspond to
different branches of the same multivalued function of the field variable, and explicitly present all paths on the
Riemann surface which connect these eigenvalues to each other.
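The sizes of the momentum sectors quoted in this section can be cross-checked by a simple count. The sketch below is only an illustration and is not part of the original analysis: it assumes that the momentum P is the eigenvalue of the cyclic shift acting on the 2^N spin configurations of the chain, and the function name `momentum_sector_dims` is hypothetical. It uses the character formula, in which the trace of the m-th power of the shift equals the number of configurations invariant under a shift by m sites.

```python
import cmath
from math import gcd


def momentum_sector_dims(n_sites):
    """Return {k: dimension} for shift eigenvalue exp(2*pi*i*k/n_sites).

    Assumption: the momentum sectors are the eigenspaces of the cyclic
    shift on the 2**n_sites spin configurations of a periodic chain.
    """
    dims = {}
    for k in range(n_sites):
        total = 0
        for m in range(n_sites):
            # Number of spin configurations invariant under a shift by m sites.
            fixed = 2 ** gcd(m, n_sites)
            total += cmath.exp(-2j * cmath.pi * k * m / n_sites) * fixed
        dims[k] = round((total / n_sites).real)
    return dims


dims = momentum_sector_dims(4)
```

For N = 4 the labels k = 0 and k = 2 correspond to P = 1 and P = −1, and k = 1, 3 to the two complex values of P; the resulting dimensions add up to 16.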
4.6.1. The sector P = 1
This sector contains 6 eigenvalues listed in Table 6. Three of them are given by the formula
Table 6
Properties of the eigenvalues with P = 1 for N = 4
i
0
1
2
3
4
5
Ei
2.75822456
2.00000000
1.17819166
0.72548274
0.86310373
2.03274182
B
i
1.30674491
0.94124717
0.61623116
0.14757534
0.12908441
0.05569128
2.36255547
0.45441276
0.92097245
1.33882675
1.21915423
0.61370849
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
Table 7
Bethe roots, values of the field and phases for P = 1 and N = 4
i
n1 , n2 , n3 , n4
0
1
2
3
4
5
1.41965608
i
i i/2
i 0.59503973i
i i/2
1.65004854 i/2
0.88292901
i /2
i /2
0.59503973i
0
0.65253655 i/2
0
0
0
{3/2, 1/2, 1/2, 3/2}

{3/2, 1/2, 1/2, 3/2}
{3/2, 1/2, 1/2, 3/2}
{3/2, 1/2, 1/2, 3/2}
{3/2, 1/2, 1/2, 3/2}
{5/2, 1/2, 1/2, 5/2}

2
4
4
2 xi + ( + 1)( 1)
B
=
a
+
b
+
2ab
a
+
b
i
1

2
2
2 2 ( + 1)[ ( 1) ( + 1)xi ] xi
2a b
,
( 1)2
(4.61)
where i = 0, 3, 5 and the constants x0 , x3 and x5 satisfy the same cubic equation

x 3 + x 2 ( + 1)2 2 + 1 3 2 + 3 2 + 2( 1)2 ( + 1)( 1)2 ( + 1) = 0
(4.62)
with , defined in (A.20). With the choice (4.35) this equation has three real roots
x0 = 12.98056849,
x3 = 3.41421744,
x5 = 9.56635105.
(4.63)
The remaining three eigenvalues are polynomially expressed through the Boltzmann weights
(A.13)
2

2
2
c + d 2 + a 2 + b2 + c2 + d 2 (ab + cd) + 4abcd,
B
(4.64)
1 = a +b
2
2

2
B
2
2
2
2
2
2 = a + b c + d + a + b + c + d (ab cd) 4abcd,
(4.65)
4
4
2 2
B
4 = a + b 2c d .
(4.66)
The corresponding eigenvalues of the XYZ-Hamiltonian (4.28) read

E1 = 2J1 ,
E2 = 2J2 ,
E4 = 2J3 .
(4.67)
Table 7 contains positions of the Bethe roots 1,2 , the values of the field and phases nk
appearing in (4.6). Note that, since N is even the roots satisfy (4.26) and it is enough to present
only two roots 1,2 .
4.6.2. The sector P = −1

This sector contains 4 eigenvalues. All of them are polynomial in Boltzmann weights and the
corresponding Bethe roots can be found explicitly
2 2
4
4
B
6 = 2a b c d ,

2
2

2 2
2 2
2
B
c + d2 ,
7 =2 a b +c d a +b
2

2
2
B
c + d 2 + a 2 + b2 + c2 + d 2 (ab cd) + 4abcd,
8 = a +b
2

2
2
B
c + d 2 + a 2 + b2 + c2 + d 2 (ab + cd) 4abcd.
9 = a +b
(4.68)
(4.69)
(4.70)
(4.71)
The associated eigenvalues of the XYZ-Hamiltonian are

E6 = 2J3 ,
E7 = 0,
E8 = 2J2 ,
E9 = 2J1 .
(4.72)
All numerical values are given in Tables 8 and 9.

Using the differential equations (4.12) we have found that all solutions of the Bethe ansatz
equations (4.6) corresponding to P = ±1 can be obtained from the ground state solution by
analytic continuation in the field variable. For the numerical calculations we always assumed the values (4.35).
Consider the principal sheet of the Riemann surface containing the ground state solution at = 0.
As explained before the cut structure on this sheet is always symmetric with respect to the real
axis and periodic with the period 2 . Therefore we consider only the upper half of the periodicity
strip. There is only one cut ( + 0.67535i, + i) on this sheet shown by solid lines in Fig. 8.
Encircling the branching point (as in the contour C12 ) brings one to the second sheet containing
four additional cuts (shown by dashed lines in Fig. 8). In this figure we also show the contours
which connect 0 with five other eigenvalues 1 , 2 , 3 , 4 , 6 . The change of sign for 6
is related to the transformation (4.11) and (4.31) with m = 1. The horizontal coordinates of
the cuts shown on the figures have been numerically fitted by considering a few different values
of in the vicinity = 7/20.
Paths to the other four eigenvalues 5 , 7 , 8 and 9 are more conveniently described on a
different sheet of the Riemann surface which contains the eigenvalue 4 at = 0. The structure
of cuts on this sheet and connecting contours are shown in Fig. 9. Again at the points 2 /(2)
and 2 2 /(2) we have to take into account (4.31) with m = 1. For 7 we used the shift
(4.3) to move the Bethe roots to the region (4.4).
Table 8
Properties of eigenvalues for P = −1 and N = 4
i
6
7
8
9
Ei
0.86310373
0
1.17819166
2.00000000
B
i
0.51124109
0.27935395
0.09456292
0.05570662
1.25718614
1.63561756
0.86905645
0.54254103
1
1
1
1
1
1
1
1
1
1
1
1
Table 9
Bethe roots, values of the field and phases with P = −1 at N = 4
i
n1 , n2 , n3 , n4
6
7
8
9
i /2 i/2
i /2 i(/2 )
i /2 i/2
i /2 i/2
i /2
i/2 + i(/2 )
0
i/2
0
0
{5/2, 3/2, 1/2, 1/2}

{3/2, 1/2, 1/2, 3/2}
{7/2, 1/2, 3/2, 3/2}
{5/2, 1/2, 1/2, 5/2}
Fig. 8. The principal sheet of the Riemann surface containing the eigenvalue 0 for N = 4.
Fig. 9. The sheet of the Riemann surface containing 4 for N = 4.
4.6.3. The sector P = ±i

There are three complex eigenvalues with P = i and three complex conjugate eigenvalues
with P = −i. Again it is sufficient to consider only P = i.

2
2

2 2
2 2
2
c d2 ,
B
(4.73)
10 = 2 a b c d + i a b

2
2
2
2
2
2
B
(4.74)
c2 d 2 ,
11 = (ab + cd) a + b c d + i a b

B
2
2
2
2
2
2
2
2
12 = (ab cd) a + b c d + i a b c d .
(4.75)
Note that for all of them the corresponding energies $E_i$ of the XYZ-Hamiltonian are zero. In
general, these eigenvalues are complex. However, at the symmetric point v = π/2, where a = b,
all eigenvalues are real. All information about these eigenvalues is collected in Tables 10 and 11.
Table 10
Eigenvalues with P = i at N = 4
i
Ei
10
11
12
0
0
0
B
i
0.12908441
0.09456292
0.05570662
1.25718614
0.92097245
0.54254103
i
i
i
1
1
1
1
1
1
Table 11
Bethe roots, values of the field and phases for P = i and N = 4
i
n1 , n2 , n3 , n4
10
11
12
1.50555656 i/2
1.89851386 i/2
3i /4 0.40566532i
0.79702853
1.55536378
3i /4 + 0.40566532i
{5/2, 1/2, 1/2, 3/2}

{3/2, 3/2, 1/2, 1/2}
{3/2, 3/2, 1/2, 1/2}
The parameters of the eigenvalues with the momentum P = −i are obtained by negating all
the Bethe roots, the values of the field and the phases $n_k$.
The problem of analytic continuation for these eigenvalues has not been considered.
4.7. The case N = 13
The purpose of this section is to analyze the distribution of zeros of the ground state eigenvalues for a sufficiently large value of N. We deliberately choose an odd value of N, for which $Q_+(u)$
and $Q_-(u)$ are linearly independent and their quantum Wronskian is non-zero. In this section it
is more convenient to work with Baxter's original normalization of the eigenvalues $Q_\pm^B(v)$ and
the variable v
Q (u) = eivN/2 QB
.
(4.76)
(v), v = u +
2
Introduce the linear combinations Q1,2 (v),

QB
(4.77)
(v) = Q1 (v) Q2 (v) /2,
such that
Q1,2 (v + ) = (1)(N1)/2 Q2,1 (v).
(4.78)
Numerical zeroes of eigenvalues can be easily calculated from the Bethe ansatz equations (4.6)
with the set of phases (4.32). Here we demonstrate an alternative numerical method which works
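well for the considered case. For any odd N the ground state eigenvalue $\Lambda_0^B(v)$ is an odd function
of v, having N zeros in its periodicity rectangle (2.65). Using the identity
$$ 2\,\vartheta_1(x+y\,|\,q)\,\vartheta_1(x-y\,|\,q) = \bar\vartheta_4(2x)\,\bar\vartheta_3(2y) - \bar\vartheta_3(2x)\,\bar\vartheta_4(2y), \qquad \bar\vartheta_i(x) \equiv \vartheta_i\Big(\frac{x}{2}\,\Big|\,q^{1/2}\Big), \tag{4.79} $$
it can be represented as
$$ \Lambda_0^B(v) = \vartheta_1(v\,|\,q) \sum_{k=0}^{(N-1)/2} t_k\, \bar\vartheta_3(2v)^{k}\, \bar\vartheta_4(2v)^{(N-1-2k)/2}, \tag{4.80} $$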
Table 12
Numerical values of the coefficients $c_k$ in (4.81) for N = 13 in the case (4.35)
k    c_k
0    1.00000000
1    5.63412047
2    15.52969161
3    26.42365231
4    28.59503534
5    18.21779332
6    5.231349086
Table 13
Bethe roots for the ground state at N = 13
i
1
2
3
4
1.97251218
1.50058366
1.30218100
1.15129255
5
6
7
8
1.00040409
0.80200143
0.33007292
0.64848502
9
10
11
12
13
0.91284796
1.07830969
1.22427541
1.38973713
1.65410007
where tk are constants. Similarly, Q1 (v) can be written as

Q1 (v) = 2N 3 (v)
(N1)/2

ck 3 (v)2k 4 (v)N12k ,
(4.81)
k=0
with some unknown constants ck . In [16] we used the representation (4.81) for = /3, however,
it is valid for the ground state eigenvalues with arbitrary values of (the case = /3 is special
because all coefficients ck can be calculated explicitly [16]). Substituting (4.80) and (4.81) into
the TQ-equation and evaluating it numerically for several values of the spectral parameter one
gets a bi-linear system of equations for the unknown coefficients tk and ck . For the choice (4.35)
its numerical solution is given in Table 12.
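The theta-function identity (4.79), which underlies the polynomial representation (4.80), can be checked numerically. The sketch below is a hedged illustration, not the authors' code: it reads (4.79) as 2ϑ1(x+y|q)ϑ1(x−y|q) = ϑ̄4(2x)ϑ̄3(2y) − ϑ̄3(2x)ϑ̄4(2y) with ϑ̄_i(x) = ϑ_i(x/2|q^{1/2}), assumes a real nome 0 < q < 1, and uses truncated Fourier series for the theta-functions. The helper names `theta`, `theta_bar` and `identity_gap` are hypothetical.

```python
import math


def theta(i, z, q, terms=40):
    """Jacobi theta-function theta_i(z|q), i = 1..4, from its Fourier series.

    Assumes a real nome 0 < q < 1; `terms` truncates the series.
    """
    if i == 1:
        return 2 * sum((-1) ** n * q ** ((n + 0.5) ** 2) * math.sin((2 * n + 1) * z)
                       for n in range(terms))
    if i == 2:
        return 2 * sum(q ** ((n + 0.5) ** 2) * math.cos((2 * n + 1) * z)
                       for n in range(terms))
    sign = 1 if i == 3 else -1
    return 1 + 2 * sum(sign ** n * q ** (n * n) * math.cos(2 * n * z)
                       for n in range(1, terms))


def theta_bar(i, x, q):
    """Barred theta-function, assumed to mean theta_i(x/2 | q**(1/2))."""
    return theta(i, x / 2, math.sqrt(q))


def identity_gap(x, y, q):
    """Absolute difference between the two sides of the identity (4.79)."""
    lhs = 2 * theta(1, x + y, q) * theta(1, x - y, q)
    rhs = (theta_bar(4, 2 * x, q) * theta_bar(3, 2 * y, q)
           - theta_bar(3, 2 * x, q) * theta_bar(4, 2 * y, q))
    return abs(lhs - rhs)


gap = identity_gap(0.7, 0.3, 0.2)
```

Applied to each pair of zeros ±v_k of the even function Λ/ϑ1, the identity turns every factor ϑ1(v+v_k)ϑ1(v−v_k) into a linear combination of ϑ̄3(2v) and ϑ̄4(2v), which is why a homogeneous polynomial of degree (N−1)/2 appears in (4.80).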
Using these results and (4.76)(4.78) one can find the Bethe roots k , k = 1, . . . , 13. The
and = /2.
results for k are given in Table 13, moreover, i+ = Ni+1
(3)
Further, zeros for 0 (u) and 0 (u) are as follows. The 2N = 26 zeros of 0 (u) read
uk =
,
2

r1 , r2 , r3 , r1 , r2 , r3 ,
2
(4.82)
where the signs can be chosen independently and ri stand for complex conjugates of ri . Simi(3)
larly, for 0 (u),

(4.83)
+ {s1 , s2 },
+ s3 , s4 , s3 , s4 .
2
2
2
2
2
The numerical constants ri , si are given in Table 14. The zeroes of eigenvalues are plotted in
Fig. 10.
Note that the sets {k+ } and {k } precisely interlace each other. This fact looks quite remarkable given these sets are obtained from each other by the uniform translation (4.56) over the
distance | | which is much larger than the spacing between roots. For a numerical study of
the Bethe ansatz equations on a large chain of even length (up to N = 512) we refer the reader
to [83].
uk =
Table 14
(3)
Parameters sk , rk which determine positions of zeros 0 (u) and 0 (u)
k
sk
rk
1
2
3
4
0.27442683
0.47123771
0.28395326 + 0.15112367i
0.32747706 + 0.34260021i
0.40999133 + 0.42208534i
0.49344215 + 0.21629447i
0.51703097 + 0.067457998i
Fig. 10. Zeros of Q+

(3)
5. Conclusion
In this paper we developed some new ideas in the classical subject of Baxter's celebrated
eight-vertex and solid-on-solid models. Our primary observation concerns a (previously unnoticed) arbitrary field parameter in the solvable solid-on-solid model. This parameter is analogous
to the horizontal field in the six-vertex model. This fact might not be so surprising to experts,
since all the hard work has been done before and one just needs to lay side-by-side the papers
[4,11,12] to realize that an arbitrary field parameter is, in fact, required to describe the continuous
spectrum of the unrestricted solid-on-solid model.
The introduction of an arbitrary field allowed us to develop a completely analytic theory of
the functional relations in the 8V/SOS-model. The solutions of the Bethe ansatz equations are
multivalued functions of the field variable, having algebraic branching points. It is plausible that
many (if not all) eigenvalues of the transfer matrix can be obtained from each other via analytic
continuation in this variable. To demonstrate this we performed a comprehensive study of
all eigenvalues for the 8V-model for small chains of length N ≤ 4 with a combination of
analytic and numeric techniques. In particular, we saw in these cases that the largest and smallest eigenvalues of the transfer matrix are always connected by analytic continuation. This
study was partially motivated by our attempts to understand properties of the eigenvalues in the
ferromagnetic regime [84], analytically connecting it with the disordered regime. Note that the
ferromagnetic regime eigenvalues are important in connection with the AdS/CFT correspondence
(see [85] and references therein); we hope to study them elsewhere.
The field parameter is also important in our future considerations [18] of the quantum field
theory limit of the 8V/SOS-model, where it becomes the massive sine-Gordon model. Note that
the connection between the largest and next-to-largest eigenvalues in this model was previously
studied [53] via the thermodynamic Bethe ansatz. The authors of [53] also considered the analytic continuation, but in a different variable, namely the scaling variable, which has no direct
analogue in the lattice theory.
The analytic structure of eigenvalues in the eight-vertex/SOS model certainly
deserves further study. A somewhat simpler (but still very interesting) structure arises in the six-vertex model and, especially, in the c < 1 conformal field theory [80]. In the latter case the
Riemann surface of the eigenvalues closes within each level subspace of the Virasoro module.
Acknowledgements
The authors thank R.J. Baxter, M.T. Batchelor, B.M. McCoy, K. Fabricius, P.A. Pearce,
S.M. Sergeev, F.A. Smirnov and M. Bortz for useful remarks. One of us (V.V.B.) thanks S.M.
Lukyanov and A.B. Zamolodchikov for numerous discussions of the analytic structure of eigenvalues in solvable models. This work was supported by the Australian Research Council.
Appendix A. The eight-vertex model
In this appendix we briefly summarize basic properties of the symmetric eight-vertex (8V)
model used in this paper. For more detailed information the reader should consult Baxter's
original publications [1–5].
Consider a square lattice of N columns and M rows, with the toroidal boundary conditions.
Each edge of the lattice carries a spin variable taking two values = + and = , corresponding to the spin-up and spin-down states of the edge. Each vertex is assigned with a
Boltzmann weight R(, |, ) depending on the spin states , , , of the surrounding
edges arranged as in Fig. 11. There are only eight allowed vertex configurations, shown in
Fig. 12, which have non-vanishing Boltzmann weights. These weights are not arbitrary; they are
parameterized by only four arbitrary constants a, b, c, d,
$$ \omega_1 = \omega_2 = a, \quad \omega_3 = \omega_4 = b, \quad \omega_5 = \omega_6 = c, \quad \omega_7 = \omega_8 = d. \tag{A.1} $$
The remaining eight configurations are forbidden; their Boltzmann weight is zero.
The partition function
$$ Z = \sum_{(\mathrm{spins})}\ \prod_{(\mathrm{vertices})} R(\cdot,\cdot\,|\,\cdot,\cdot) \tag{A.2} $$
is defined as the sum over all spin configurations of the whole lattice, where each configuration
is counted with the weight equal to the product of vertex weights over all lattice vertices.
Fig. 11. The arrangement of four spins around the vertex.
Fig. 12. Eight allowed vertex configuration and their Boltzmann weights. Thin lines represent the spin-up states and
the bold lines represent the spin-down states of the edge spins.
The vertex weight matrix R(, |, ) can be thought as a matrix acting in the direct product
of the two two-dimensional vector spaces C2 C2 , where the indices , refer to the first space
and the indices , to the second. It has an elegant representation [1]
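$$ R = \sum_{k=1}^{4} w_k\, (\sigma_k \otimes \sigma_k), \tag{A.3} $$
where $\sigma_k$, k = 1, 2, 3, 4, are the Pauli matrices
$$ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \sigma_4 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{A.4} $$
and
$$ w_1 = \frac{c+d}{2}, \quad w_2 = \frac{c-d}{2}, \quad w_3 = \frac{a-b}{2}, \quad w_4 = \frac{a+b}{2}. \tag{A.5} $$
The matrix R can conveniently be presented as a two-by-two block matrix acting in the first
space with the two-dimensional matrix blocks acting in the second space,
$$ R = \begin{pmatrix} R_{++} & R_{+-} \\ R_{-+} & R_{--} \end{pmatrix}, \tag{A.6} $$
where
$$ R_{++} = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad R_{--} = \begin{pmatrix} b & 0 \\ 0 & a \end{pmatrix}, \quad R_{+-} = \begin{pmatrix} 0 & d \\ c & 0 \end{pmatrix}, \quad R_{-+} = \begin{pmatrix} 0 & c \\ d & 0 \end{pmatrix}. \tag{A.7} $$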
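The equivalence of the Pauli-matrix form (A.3)–(A.5) and the block form (A.6)–(A.7) is easy to confirm by direct multiplication. The following sketch is an illustration only; the helper names `kron`, `r_matrix` and `block` are hypothetical, and it assumes the basis ordering (+, −) in both spaces, with the first space as the major index of the 4×4 matrix.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


# Pauli matrices of (A.4); sigma_4 is the 2x2 identity.
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
    4: [[1, 0], [0, 1]],
}


def r_matrix(a, b, c, d):
    """R = sum_k w_k sigma_k (x) sigma_k with w_k taken from (A.5)."""
    w = {1: (c + d) / 2, 2: (c - d) / 2, 3: (a - b) / 2, 4: (a + b) / 2}
    R = [[0j] * 4 for _ in range(4)]
    for k in range(1, 5):
        K = kron(PAULI[k], PAULI[k])
        for i in range(4):
            for j in range(4):
                R[i][j] += w[k] * K[i][j]
    return R


def block(R, alpha, alpha_p):
    """2x2 block of R for first-space indices (alpha, alpha_p); 0 is '+', 1 is '-'."""
    return [[R[2 * alpha + s][2 * alpha_p + sp] for sp in range(2)]
            for s in range(2)]


R = r_matrix(2.0, 3.0, 5.0, 7.0)
R_pp, R_pm, R_mp, R_mm = block(R, 0, 0), block(R, 0, 1), block(R, 1, 0), block(R, 1, 1)
```

With these conventions the four blocks reproduce the matrices listed in (A.7) for any values of a, b, c, d.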
The matrix R possesses simple symmetry relations, obvious from (A.3),
$$ R = (\sigma_k \otimes \sigma_k)\, R\, (\sigma_k \otimes \sigma_k), \qquad k = 1, 2, 3. \tag{A.8} $$
The row-to-row transfer matrix T acts in the space of states of N spins located on a horizontal
row of vertical edges. It is defined as the trace of the product of the two-by-two matrices (A.6),
$$ T = \mathrm{Tr}_{\mathbb{C}^2} \big( R^{(1)} R^{(2)} \cdots R^{(N)} \big), \tag{A.9} $$
where the operator entries (A.7) of the matrix $R^{(j)}$ act on the states of the j-th spin in the row.
The partition function (A.2) can be written as
$$ Z = \mathrm{Tr}\, T^{M}, \tag{A.10} $$
where M is the number of rows of the lattice. It follows from (A.8) that
$$ [T, S] = [T, R] = 0, \tag{A.11} $$
where the operators
$$ S = \sigma_3^{(1)} \otimes \sigma_3^{(2)} \otimes \cdots \otimes \sigma_3^{(N)}, \qquad R = \sigma_1^{(1)} \otimes \sigma_1^{(2)} \otimes \cdots \otimes \sigma_1^{(N)}, \tag{A.12} $$
are defined as the products of the local spin operators $\sigma_3^{(j)}$ and $\sigma_1^{(j)}$ over all sites, j = 1, . . . , N,
in a row. Note that for odd N these operators do not commute with each other (indeed,
$RS = (-1)^N SR$) and only one of them can be diagonalized simultaneously with T(v). Below
we will always assume the basis where the operator S is diagonal (then for even N the operator
R will be diagonal as well).
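The commutation relations (A.11) can be verified directly on a short chain. The sketch below is a hedged illustration rather than the authors' procedure: it builds the transfer matrix (A.9) from the blocks (A.7) for arbitrary numerical weights, using plain nested lists, and tests the two symmetries entry by entry. The function names are hypothetical; the index 0 stands for spin "+" and 1 for spin "−".

```python
from itertools import product


def blocks(a, b, c, d):
    """Blocks of (A.7): result[alpha][alpha_p] is a 2x2 matrix on the quantum spin."""
    return [[[[a, 0], [0, b]], [[0, d], [c, 0]]],
            [[[0, c], [d, 0]], [[b, 0], [0, a]]]]


def transfer_matrix(a, b, c, d, n_sites):
    """Transfer matrix (A.9) as a dense 2**N x 2**N nested list."""
    B = blocks(a, b, c, d)
    states = list(product(range(2), repeat=n_sites))
    T = []
    for s in states:
        row = []
        for sp in states:
            M = [[1, 0], [0, 1]]  # running product in the auxiliary space
            for j in range(n_sites):
                L = [[B[al][alp][s[j]][sp[j]] for alp in range(2)] for al in range(2)]
                M = [[sum(M[x][z] * L[z][y] for z in range(2)) for y in range(2)]
                     for x in range(2)]
            row.append(M[0][0] + M[1][1])  # trace over the auxiliary space
        T.append(row)
    return states, T


def commutes_with_S_and_R(a, b, c, d, n_sites, tol=1e-12):
    """Check [T, S] = [T, R] = 0 for S, R defined in (A.12)."""
    states, T = transfer_matrix(a, b, c, d, n_sites)
    index = {s: i for i, s in enumerate(states)}
    for i, s in enumerate(states):
        for j, sp in enumerate(states):
            # S is diagonal with eigenvalue (-1)**(number of down spins),
            # so T must vanish between states of different parity.
            if (sum(s) - sum(sp)) % 2 and abs(T[i][j]) > tol:
                return False
            # R flips every spin, so T must be invariant under a global flip.
            fi = index[tuple(1 - x for x in s)]
            fj = index[tuple(1 - x for x in sp)]
            if abs(T[fi][fj] - T[i][j]) > tol:
                return False
    return True


ok_odd = commutes_with_S_and_R(2.0, 3.0, 5.0, 7.0, 3)
ok_even = commutes_with_S_and_R(2.0, 3.0, 5.0, 7.0, 4)
```

The check uses generic weights, so it does not depend on the elliptic parametrization of a, b, c, d.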
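Following [1] we parameterize the Boltzmann weights a, b, c, d as
$$ \begin{aligned}
a &= \rho\, \vartheta_4(2\eta\,|\,q^2)\, \vartheta_4(v-\eta\,|\,q^2)\, \vartheta_1(v+\eta\,|\,q^2), \\
b &= \rho\, \vartheta_4(2\eta\,|\,q^2)\, \vartheta_1(v-\eta\,|\,q^2)\, \vartheta_4(v+\eta\,|\,q^2), \\
c &= \rho\, \vartheta_1(2\eta\,|\,q^2)\, \vartheta_4(v-\eta\,|\,q^2)\, \vartheta_4(v+\eta\,|\,q^2), \\
d &= \rho\, \vartheta_1(2\eta\,|\,q^2)\, \vartheta_1(v-\eta\,|\,q^2)\, \vartheta_1(v+\eta\,|\,q^2),
\end{aligned} \tag{A.13} $$
where we use the standard theta-functions [6]
$$ \vartheta_i(v\,|\,q), \quad i = 1, \ldots, 4, \qquad q = e^{i\pi\tau}, \quad \mathrm{Im}\,\tau > 0, \tag{A.14} $$
with the periods π and πτ. Apart from the simple change in the notation for the theta-functions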
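the above parametrization is the same as that given by Eq. (8) of [1]. The theta-functions H(z)
and Θ(z) of the nome $q_B$ therein are given by
$$ H(z) = \vartheta_1\Big( \frac{\pi z}{2K_B} \,\Big|\, q^2 \Big), \qquad \Theta(z) = \vartheta_4\Big( \frac{\pi z}{2K_B} \,\Big|\, q^2 \Big), \qquad q_B = q^2, \tag{A.15} $$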
where KB is the complete elliptic integral of the first kind of the modulus
k=
2 (0|qB )
.
3 (0|qB )
(A.16)
The variables v, and used in [1] (hereafter denoted as vB , B and B ) are related to those in
(A.13) by
vB
B
v=
(A.17)
,
=
,
= B .
2KB
2KB
With these definitions the transfer matrix (A.9) is the same as that given by Eqs. (6)–(8) of
Ref. [1]. We will denote it as $T^B(v)$, remembering that our variable v is related to Baxter's
variable $v_B$ by (A.17).
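Throughout this paper we fix the normalization factor in (A.13) as
$$ \rho = 2\, \vartheta_2(0\,|\,q)^{-1}\, \vartheta_4(0\,|\,q^2)^{-1}. \tag{A.18} $$
Note that the variables (A.5) can then be written as
$$ w_i = \frac{1}{2}\, \vartheta_1(2\eta\,|\,q)\, \frac{\vartheta_{5-i}(v\,|\,q)}{\vartheta_{5-i}(\eta\,|\,q)}, \qquad i = 1, \ldots, 4. \tag{A.19} $$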
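The passage from the parametrization (A.13) with the normalization (A.18) to the compact form (A.19) rests on standard theta-function identities, and it can be confirmed numerically. The sketch below is an illustration under stated assumptions: it reads (A.18) as ρ = 2/(ϑ2(0|q)ϑ4(0|q²)), takes the argument of the numerator in (A.19) to be v, assumes a real nome 0 < q < 1, and uses truncated Fourier series. All function names are hypothetical.

```python
import math


def theta(i, z, q, terms=40):
    """Jacobi theta-function theta_i(z|q), i = 1..4, real nome 0 < q < 1."""
    if i == 1:
        return 2 * sum((-1) ** n * q ** ((n + 0.5) ** 2) * math.sin((2 * n + 1) * z)
                       for n in range(terms))
    if i == 2:
        return 2 * sum(q ** ((n + 0.5) ** 2) * math.cos((2 * n + 1) * z)
                       for n in range(terms))
    sign = 1 if i == 3 else -1
    return 1 + 2 * sum(sign ** n * q ** (n * n) * math.cos(2 * n * z)
                       for n in range(1, terms))


def boltzmann_weights(v, eta, q):
    """Weights a, b, c, d of (A.13) with rho assumed as in (A.18)."""
    q2 = q * q
    rho = 2.0 / (theta(2, 0.0, q) * theta(4, 0.0, q2))
    a = rho * theta(4, 2 * eta, q2) * theta(4, v - eta, q2) * theta(1, v + eta, q2)
    b = rho * theta(4, 2 * eta, q2) * theta(1, v - eta, q2) * theta(4, v + eta, q2)
    c = rho * theta(1, 2 * eta, q2) * theta(4, v - eta, q2) * theta(4, v + eta, q2)
    d = rho * theta(1, 2 * eta, q2) * theta(1, v - eta, q2) * theta(1, v + eta, q2)
    return a, b, c, d


def w_from_weights(a, b, c, d):
    """Variables w_1..w_4 of (A.5)."""
    return [(c + d) / 2, (c - d) / 2, (a - b) / 2, (a + b) / 2]


def w_closed_form(v, eta, q):
    """Variables w_1..w_4 from the assumed reading of (A.19)."""
    return [0.5 * theta(1, 2 * eta, q) * theta(5 - i, v, q) / theta(5 - i, eta, q)
            for i in range(1, 5)]


v, eta, q = 1.0, 0.5, 0.2
w_direct = w_from_weights(*boltzmann_weights(v, eta, q))
w_formula = w_closed_form(v, eta, q)
max_gap = max(abs(x - y) for x, y in zip(w_direct, w_formula))
```

The same helper also makes it easy to see that at v = η all four w_i coincide, which is the property used when the Hamiltonian is extracted from the logarithmic derivative of the transfer matrix.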
We shall also make use of two invariants

w32 w22
1 (|q)4 (|q) 2
1 (2|q2 ) 2
cd
,
=
=
.
=
=
ab
2 (|q)3 (|q)
4 (2|q2 )
w42 w12
(A.20)
The transfer matrix TB (v) commutes [81,82] with the Hamiltonian of the XYZ-model (remind that we are assuming the periodic boundary condition),
1
(j ) (j +1)
(j ) (j +1)
(j ) (j +1)
Jx 1 1
+ Jy 2 2
+ Jz 3 3
2
N
HXYZ =
(A.21)
j =1
provided
Jx : Jy : Jz =
4 (2|q) 3 (2|q) 2 (2|q)

:
:
.
4 (0|q) 3 (0|q) 2 (0|q)
(A.22)
Following [82] let us show that this Hamiltonian is simply related to the logarithmic derivative
of the transfer matrix at v = η. It follows from (A.3) and (A.19) that
$$ \vartheta_1(2\eta\,|\,q)\, \frac{d}{dv} \log T^B(v) \Big|_{v=\eta} = N p_4'\, I + \sum_{j=1}^{N} \Big( p_1'\, \sigma_1^{(j)} \sigma_1^{(j+1)} + p_2'\, \sigma_2^{(j)} \sigma_2^{(j+1)} + p_3'\, \sigma_3^{(j)} \sigma_3^{(j+1)} \Big), \tag{A.23} $$
where I denotes the unit operator. Here we used new variables
$$ p_1 = \tfrac{1}{2}(w_1 - w_2 - w_3 + w_4), \qquad p_2 = \tfrac{1}{2}(-w_1 + w_2 - w_3 + w_4), \tag{A.24} $$
$$ p_3 = \tfrac{1}{2}(-w_1 - w_2 + w_3 + w_4), \qquad p_4 = \tfrac{1}{2}(w_1 + w_2 + w_3 + w_4), \tag{A.25} $$
with their v-derivatives evaluated at v = η denoted as
$$ p_i' = \frac{dp_i}{dv} \Big|_{v=\eta}. \tag{A.26} $$
Note that
$$ p_1 + p_2 + p_3 + p_4 = 2 w_4. \tag{A.27} $$
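One way to see where the combinations (A.24)–(A.25) come from is to multiply R by the permutation operator P of the two spaces: the product P·R is again a sum of the operators σ_k ⊗ σ_k, with the coefficients p_k. The sketch below illustrates this with arbitrary numerical w_k. It is an assumption of this illustration, not a statement from the text, that the variables p_k arise in exactly this way; the helper names are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
    4: [[1, 0], [0, 1]],
}


def pauli_sum(coeffs):
    """Return sum_k coeffs[k] * sigma_k (x) sigma_k for k = 1..4."""
    out = [[0j] * 4 for _ in range(4)]
    for k in range(1, 5):
        K = kron(PAULI[k], PAULI[k])
        for i in range(4):
            for j in range(4):
                out[i][j] += coeffs[k] * K[i][j]
    return out


def p_variables(w):
    """Combinations (A.24)-(A.25) of w_1..w_4 (dict with keys 1..4)."""
    return {
        1: (w[1] - w[2] - w[3] + w[4]) / 2,
        2: (-w[1] + w[2] - w[3] + w[4]) / 2,
        3: (-w[1] - w[2] + w[3] + w[4]) / 2,
        4: (w[1] + w[2] + w[3] + w[4]) / 2,
    }


# Permutation operator: swaps the two spin labels of a basis state.
PERM = [[1 if (i // 2, i % 2) == (j % 2, j // 2) else 0 for j in range(4)]
        for i in range(4)]

w = {1: 0.3, 2: -1.1, 3: 0.7, 4: 2.0}
p = p_variables(w)
lhs = matmul(PERM, pauli_sum(w))   # P * R
rhs = pauli_sum(p)                 # p_4 * I + sum_{k=1..3} p_k sigma_k (x) sigma_k
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
```

In this reading p_4 multiplies the identity, which matches the role of the term N p_4' I in (A.23).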
Using (A.19) one can readily show that
$$ 2 p_i' = \vartheta_1'(0\,|\,q)\, \frac{\vartheta_{5-i}(2\eta\,|\,q)}{\vartheta_{5-i}(0\,|\,q)}, \qquad i = 1, 2, 3. \tag{A.28} $$
Combining the last relation with (A.23) one can express the Hamiltonian (A.21) as

Jx 4 (0|q)
d
B

1 (2|q) log T (v) Np4 I .
HXYZ =
4 (2|q)1 (0|q)
dv
v=
(A.29)
In the main text of the paper we mostly use the spectral parameter u which is related to
variable v in (A.13) as
u = v /2.
(A.30)
TB (u + /2)
In (2.64) we also define the renormalized transfer-matrix T(u) differing from

by
a simple u-dependent factor. Of course, this factor could have been included in the normalization
of the Boltzmann weights (A.13), but we prefer not to do so and keep a clear distinction between
two alternative normalizations and, in fact, to use both of them. In particular, Baxter's normalization and the variable v are more suitable for writing explicit expressions for the eigenvalues
of the transfer matrix presented in Section 4.
There are two related, but different, constructions of the Q-matrix for the 8V-model, given
in Baxter's 1972 and 1973 papers [1] and [2]. The results of [1] apply for rational values of η and an
arbitrary number of sites N, while those of [2] apply to arbitrary η and even N. We will
denote the corresponding Q-matrices as $Q^{(72)}(v)$ and $Q^{(73)}(v)$. They obey the same TQ-equation
(Eq. (4.2) of [1] and Eq. (87) of [2]),
TB (v)QB (v) = (v )QB (v + 2) + (v + )QB (v 2),
(A.31)
where

N
B (v) = 1 (v|q) ,
(A.32)
and possess the same periodicity properties

QB (v + ) = SQB (v),
QB (v + 2 ) = qN e2ivN QB (v),
(A.33)
where S and R are defined in (A.12) and $Q^B(v)$ stands for either of $Q^{(72)}(v)$ or $Q^{(73)}(v)$. Nevertheless, as noticed in [14], these two Q-matrices do not coincide. Of course, they can only be
compared for rational η and even N, when both of them can be constructed. As shown in [14], the
matrices $Q^{(72)}(v)$ and $Q^{(73)}(v)$ have different eigenvalues for degenerate eigenstates of the transfer matrix. This phenomenon is obviously related to the non-uniqueness of the eigenvectors
for the degenerate eigenstates [13] and demonstrates the impossibility of a universal definition of the Q-matrix in the zero-field 8V-model at rational η (cf. [54]). However, as far as only
non-degenerate eigenstates are concerned, both operators $Q^{(72)}(v)$ and $Q^{(73)}(v)$ have identical
eigenvalues.
The difference between $Q^{(72)}(v)$ and $Q^{(73)}(v)$ can be traced back [14] to their commutativity
properties with the operator R. The matrix $Q^{(73)}(v)$ (defined for even N only) always commutes
with R. The matrix $Q^{(72)}(v)$ (defined for all values of N) does not commute with R, irrespective of whether N is odd or even. On the other hand, both these matrices commute with S, so
that the first equation in (A.33) immediately translates into the (real period) periodicity for the eigenvalues (the first relation in (2.66)). The second equation in (A.33) implies only the 2πτ-periodicity
relation (the second relation in (2.66)).
To complete our discussion of various constructions for the Q-matrix, note that in [5] Baxter
considered a modified version of his Q(73) matrix. There he used yet another set of parameters,
namely the variables v and , which we denote here as v (82) and (82) . They are related to v and
in (A.13) as
v (82) =
2i KB
(2v ),
(82) =
2i KB
( 2)
(A.34)
where KB is the same as in (A.17). Writing the Q-matrix of [5] as Q(82) (v (82) , (82) ) and that of
[2] as Q(73) (v, ), and using the explanations in Section 7 of [13] it is not difficult to verify that

Q(73) (v, ) = RDQ(82) v (82) , (82)
(A.35)
where the operator
i
D = e + 4 3 e 4 3 e + 4 3 e 4 3
(A.36)
is defined similarly to those in (A.12). Using the relations

Q(73) (v + ) = qN/2 eivN RQ(73) (v)
(A.37)
one can show that

Q(82) v (82) + 2KB = (qB )N/4 exp v (82) N/4KB RSQ(82) v (82)
(A.38)
RD = (1)N/2 DRS,
which is exactly Eq. (10.5.43a) of [5]. Note that the last equation contains the prefactor RS in
the RHS instead of just R in the second relation in (A.33).
The partition function (A.2) can be regarded as a function of four parameters w1 , w2 , w3 , w4 ,
defined in (A.5). So one can write it as Z(w1 , w2 , w3 , w4 ). When both M and N are even, it
possesses the symmetry
Z(w1 , w2 , w3 , w4 ) = Z(±wi , ±wj , ±wk , ±wl ),
(A.39)
where {i, j, k, l} is an arbitrary permutation of {1, 2, 3, 4} and all signs can be chosen independently. Using this symmetry one can always rearrange wi so that
w1 > w2 > w3 > |w4 |.
(A.40)
The partition function-per-site, κ(v), is defined as
$$ \log \kappa(v) = \lim_{M,N\to\infty} \frac{1}{MN} \log Z = \lim_{N\to\infty} \frac{1}{N} \log \Lambda_0^B(v), \tag{A.41} $$
where $\Lambda_0^B(v)$ is the maximum eigenvalue of the transfer-matrix $T^B(v)$. In the (unphysical)
regime (A.40) it reads [1]
log (v) = log c + 2

sinh2 ((B B )n) (cosh(nB ) cosh(nB ))
n=1
n sinh(2nB ) cosh(nB )
(A.42)
where
B = 2iv,
B = 2i,
B = i.
(A.43)
In this paper we consider the disordered regime19
, < v < ,
2
which corresponds to a different ordering of the variables wi , namely,
0<<
(A.44)
w4 > w1 > w2 > w3 > 0.
(A.45)
This regime can be mapped into (A.40) by the following transformation of the weights
$$ a' = \frac{a-b+c-d}{2}, \quad b' = \frac{a-b-c+d}{2}, \quad c' = \frac{a+b+c+d}{2}, \quad d' = \frac{a+b-c-d}{2}, \tag{A.46} $$
which is equivalent to
$$ w_1' = w_4, \quad w_2' = w_1, \quad w_3' = w_2, \quad w_4' = w_3, \tag{A.47} $$
where $w_1'$, $w_2'$, $w_3'$, $w_4'$ are defined by (A.5) with a, b, c, d replaced by a', b', c', d'. It is easy to
check that if $w_1, w_2, w_3, w_4$ belong to the regime (A.45) then $w_1', w_2', w_3', w_4'$ belong to (A.40).
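The statement that the transformation (A.46) cyclically relabels the variables w_i as in (A.47), and thereby maps the regime (A.45) into (A.40), can be checked with a few lines of arithmetic. The sketch below is illustrative; the function names are hypothetical, and the sample values are chosen only to satisfy the ordering (A.45).

```python
def to_w(a, b, c, d):
    """Variables (w1, w2, w3, w4) of (A.5)."""
    return ((c + d) / 2, (c - d) / 2, (a - b) / 2, (a + b) / 2)


def from_w(w1, w2, w3, w4):
    """Inverse of (A.5): recover the weights (a, b, c, d)."""
    return (w4 + w3, w4 - w3, w1 + w2, w1 - w2)


def transform_weights(a, b, c, d):
    """Transformation (A.46) of the Boltzmann weights."""
    return ((a - b + c - d) / 2,
            (a - b - c + d) / 2,
            (a + b + c + d) / 2,
            (a + b - c - d) / 2)


def in_regime_A45(w):
    w1, w2, w3, w4 = w
    return w4 > w1 > w2 > w3 > 0


def in_regime_A40(w):
    w1, w2, w3, w4 = w
    return w1 > w2 > w3 > abs(w4)


w = (3.0, 2.0, 1.0, 4.0)                     # satisfies (A.45)
w_prime = to_w(*transform_weights(*from_w(*w)))
```

For the sample point the transformed weight b' is negative, which is consistent with the regime (A.40) being referred to as unphysical.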
Putting (A.13) into (A.46) it is not very difficult to show that the transformed weights a , b , c ,
d can be parameterized by same formulae (A.13) provided one makes the following substitution
v
v /2
,
/2
,
1
,
1 ,
(A.48)
where

i
1 = i(i )1/2 exp

v(v ) + 3( ) + 2 .
(A.49)
Thus, to find the partition function-per-site in the regime (A.45) one needs to substitute parameters B , B and B in (A.42) by
B =
i( 2v)
,
B =
i( 2)
,
B =
i
,
(A.50)
and replace c by c . After straightforward calculations one obtains
log (v) = log 1 (v + |q) +
2i(v ) sinh(2xn ) sinh(2(v )xn )

+
,
n sinh(xn ) cosh(( 2)xn )
(A.51)
n=1
where xn = in/ . The above formula can be equivalently rewritten as

i( 3)

sinh(xn ) sinh(( 3)xn ) cosh(( 2v)xn )
.
+2
n sinh(xn ) cosh(( 2)xn )
log (v) = log 1 (v|q) +
(A.52)
n=1
19 See [5] for the classification of the regimes of the 8V-model. Beware that the variables $w_i$ therein are numbered
differently as compared to this work. Here we follow the original papers [1–4].
Note that for η = π/3 the second and third terms in the last expression vanish, so it simplifies
to
$$ \log \kappa(v) = \log \vartheta_1(v\,|\,q), \qquad \eta = \frac{\pi}{3}. \tag{A.53} $$
Recall that the above derivation was based on the symmetry (A.39), valid for even values of N
only. However, since in the disordered regime (A.45) the Boltzmann weights (A.13) are strictly
positive, the result (A.53) obviously does not depend on the way the limit N → ∞ is taken.
Eqs. (A.41) and (A.53) imply that the asymptotics of the largest eigenvalue of $T^B(v)$ in this
special case has an extremely simple form, $\Lambda_0^B(v) \simeq (\vartheta_1(v\,|\,q))^N$. Remarkably, as was argued
in [16,42,86], for odd values of N this asymptotics is exact even for a finite lattice,
$$ \Lambda_0^B(v) = (a+b)^N = \vartheta_1^N(v\,|\,q), \qquad N = 2n+1, \quad \eta = \frac{\pi}{3}. \tag{A.54} $$
We have also checked this fact analytically up to N = 15.
References
[1] R.J. Baxter, Partition function of the eight-vertex lattice model, Ann. Phys. 70 (1972) 193228.
[2] R.J. Baxter, Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. I. Some
fundamental eigenvectors, Ann. Phys. 76 (1973) 124.
[3] R.J. Baxter, Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. II. Equivalence to a generalized ice-type lattice model, Ann. Phys. 76 (1973) 2547.
[4] R.J. Baxter, Eight-vertex model in lattice statistics and one-dimensional anisotropic Heisenberg chain. III. Eigenvectors of the transfer matrix and Hamiltonian, Ann. Phys. 76 (1973) 4871.
[5] R.J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic Press, London, 1982.
[6] E. Whittaker, G. Watson, A Course of Modern Analysis, Cambridge Univ. Press, Cambridge, 1996.
[7] L.D. Faddeev, E.K. Sklyanin, L.A. Takhtajan, The quantum inverse problem method. 1, Theor. Math. Phys. 40
(1980) 688.
[8] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Integrable structure of conformal field theory. II. Q-operator
and DDV equation, Commun. Math. Phys. 190 (1997) 247278, hep-th/9604044.
[9] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Integrable structure of conformal field theory. III. The Yang
Baxter relation, Commun. Math. Phys. 200 (1999) 297324, hep-th/9805008.
[10] I. Krichever, O. Lipan, P. Wiegmann, A. Zabrodin, Quantum integrable models and discrete classical Hirota equations, Commun. Math. Phys. 188 (1997) 267304, hep-th/9604080.
[11] L.A. Takhtajan, L.D. Faddeev, The quantum method for the inverse problem and the XYZ Heisenberg model, Usp.
Mat. Nauk 34 (5) (1979) 1363, Russian Math. Surveys 34 (5) (1979) 1168 (English translation).
[12] G. Felder, A. Varchenko, Algebraic Bethe ansatz for the elliptic quantum group E, (sl2 ), Nucl. Phys. B 480 (1996)
485503.
[13] R.J. Baxter, Completeness of the Bethe ansatz for the six- and eight-vertex models, J. Stat. Phys. 108 (2002) 148.
[14] K. Fabricius, B.M. McCoy, New developments in the eight vertex model, J. Stat. Phys. 111 (2003) 323337.
[15] K. Fabricius, B.M. McCoy, Bethes equation is incomplete for the XXZ model at roots of unity, J. Stat. Phys. 103
(2001) 647678.
[16] V.V. Bazhanov, V.V. Mangazeev, Eight-vertex model and non-stationary Lame equation, J. Phys. A 38 (2005) L145
L153, hep-th/0411094.
[17] V.V. Bazhanov, V.V. Mangazeev, Eight vertex model and Painlev VI, J. Phys. A 39 (2006) 1223512243, hepth/0602122.
[18] V.V. Bazhanov, V.V. Mangazeev, Analytic theory of the eight vertex model II. Partial differential equations and
Painlev transcendents, in preparation.
[19] R.J. Baxter, Generalized ferroelectric model on a square lattice, Stud. Appl. Math. 1 (1971) 5169.
[20] G.E. Andrews, R.J. Baxter, P.J. Forrester, 8-vertex SOS model and generalized RogersRamanujan-type identities,
J. Stat. Phys. 35 (1984) 193266.
[21] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Quantum field theories in finite volume: Excited state energies, Nucl. Phys. B 489 (1997) 487531, hep-th/9607099.
[22] P.P. Kulish, N.Y. Reshetikhin, E.K. Sklyanin, YangBaxter equations and representation theory. I, Lett. Math.
Phys. 5 (1981) 393403.
[23] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, 1997, unpublished.
[24] P. Fendley, A.W.W. Ludwig, H. Saleur, Exact non-equilibrium transport through point contacts in quantum wires
and fractional quantum Hall devices, Phys. Rev. B 52 (1995) 89348950, cond-mat/9503172.
[25] R.J. Baxter, P.A. Pearce, Hard hexagons: Interfacial tension and correlation length, J. Phys. A 15 (1982) 897910.
[26] V.V. Bazhanov, N.Y. Reshetikhin, Critical RSOS models and conformal field theory, Int. J. Mod. Phys. A 4 (1989)
115142.
[27] E. Date, M. Jimbo, T. Miwa, M. Okado, Fusion of the eight vertex SOS model, Lett. Math. Phys. 12 (1986) 209215;
E. Date, M. Jimbo, T. Miwa, M. Okado, Lett. Math. Phys. 14 (1987) 97, Errata.
[28] K. Fabricius, B.M. McCoy, New developments in the eight vertex model. II. Chains of odd length, J. Stat. Phys. 120
(2005) 3770.
[29] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Integrable structure of conformal field theory, quantum KdV
theory and thermodynamic Bethe ansatz, Commun. Math. Phys. 177 (1996) 381398, hep-th/9412229.
[30] Y.G. Stroganov, A new calculation method for partition functions in some lattice models, Phys. Lett. A 74 (1979)
116118.
[31] A.B. Zamolodchikov, Z4 -symmetric factorized S-matrix in two spacetime dimensions, Commun. Math. Phys. 69
(1979) 165178.
[32] R.J. Baxter, The inversion relation method for some two-dimensional exactly solved models in lattice statistics, J.
Stat. Phys. 28 (1982) 141.
[33] M. Takahashi, Simplification of thermodynamic Bethe-ansatz equations, cond-mat/0010486.
[34] Y.G. Stroganov, 1979, unpublished.
[35] E.H. Lieb, Residual entropy of square ice, Phys. Rev. 162 (1967) 162172.
[36] N.Y. Reshetikhin, The functional equation method in the theory of exactly soluble quantum systems, Zh. Eksp. Teor.
Fiz. 84 (1983) 11901201.
[37] A.N. Kirillov, N.Y. Reshetikhin, Exact solution of the integrable XXZ Heisenberg model with arbitrary spin. I. The ground state and the excitation spectrum, J. Phys. A 20 (1987) 1565–1585.
[38] E.K. Sklyanin, Some algebraic structures connected with the Yang–Baxter equation, Funct. Anal. Appl. 16 (1982) 263–270.
[39] E.K. Sklyanin, Some algebraic structures connected with the Yang–Baxter equation. Representations of a quantum algebra, Funct. Anal. Appl. 17 (1983) 273–284.
[40] I. Krichever, A. Zabrodin, Vacuum curves of elliptic L-operators and representations of the Sklyanin algebra, in: Moscow Seminar in Mathematical Physics, in: American Mathematical Society Translations, Series 2, vol. 191, American Mathematical Society, Providence, RI, 1999, pp. 199–221.
[41] K. Fabricius, B.M. McCoy, Functional equations and fusion matrices for the eight vertex model, Publ. Res. Inst. Math. Sci. 40 (2004) 905–932.
[42] R.J. Baxter, Solving models in statistical mechanics, Adv. Stud. Pure Math. 19 (1989) 95–116.
[43] R.J. Baxter, Hard hexagons: Exact solution, J. Phys. A 13 (1980) L61–L70.
[44] R.I. Nepomechie, Functional relations and Bethe ansatz for the XXZ chain, J. Stat. Phys. 111 (2003) 1363.
[45] P. Fendley, Airy functions in the thermodynamic Bethe ansatz, Lett. Math. Phys. 49 (1999) 229–233.
[46] Y.G. Stroganov, The XXZ-spin chain with the asymmetry parameter Δ = −1/2. Computation of the simplest correlators, Teor. Mat. Fiz. 129 (2001) 345–359.
[47] C.N. Yang, C.P. Yang, Thermodynamics of a one-dimensional system of bosons with repulsive delta-function interaction, J. Math. Phys. 10 (1969) 1115–1122.
[48] M. Takahashi, M. Suzuki, One-dimensional anisotropic Heisenberg model at finite temperatures, Prog. Theor. Phys. 48 (1972) 2187–2209.
[49] A.B. Zamolodchikov, On the thermodynamic Bethe ansatz equations for reflectionless ADE scattering theories, Phys. Lett. B 253 (1991) 391–394.
[50] A. Klümper, P.A. Pearce, Conformal weights of RSOS lattice models and their fusion hierarchies, Physica A 183 (1992) 304–350.
[51] A. Kuniba, T. Nakanishi, J. Suzuki, Functional relations in solvable lattice models. I. Functional relations and representation theory, Int. J. Mod. Phys. A 9 (1994) 5215–5266, hep-th/9309137.
[52] P. Fendley, F. Lesage, H. Saleur, Solving 1D plasmas and 2D boundary problems using Jack polynomials and functional relations, J. Stat. Phys. 79 (1995) 799–819.
[53] P. Dorey, R. Tateo, Excited states by analytic continuation of TBA equations, Nucl. Phys. B 482 (1996) 639–659.
[54] V.V. Bazhanov, Y.G. Stroganov, Chiral Potts model as a descendant of the six-vertex model, J. Stat. Phys. 59 (1990) 799–817.
[55] G. von Gehlen, V. Rittenberg, Z(n)-symmetric quantum chains with an infinite set of conserved charges and Z(n) zero modes, Nucl. Phys. B 257 (1985) 351.
[56] H. Au-Yang, B.M. McCoy, J.H.H. Perk, S. Tang, M.-L. Yan, Commuting transfer matrices in the chiral Potts models: Solutions of star triangle equations with genus > 1, Phys. Lett. A 123 (1987) 219–223.
[57] R.J. Baxter, J.H.H. Perk, H. Au-Yang, New solutions of the star triangle relations for the chiral Potts model, Phys. Lett. A 128 (1988) 138–142.
[58] M. Kashiwara, T. Miwa, A class of elliptic solutions to the star-triangle relation, Nucl. Phys. B 275 (1986) 121–134.
[59] K. Hasegawa, Y. Yamada, Algebraic derivation of the broken $Z_N$-symmetric model, Phys. Lett. A 146 (1990) 387–396.
[60] M. Gaudin, La relation étoile-triangle d'un modèle elliptique $Z_N$, J. Phys. I 1 (1991) 351–361.
[61] V. Pasquier, M. Gaudin, The periodic Toda chain and a matrix generalization of the Bessel function recursion relations, J. Phys. A 25 (1992) 5243–5252.
[62] S.É. Derkachov, Baxter's Q-operator for the homogeneous XXX spin chain, J. Phys. A 32 (1999) 5299–5316.
[63] E.K. Sklyanin, Bäcklund transformations and Baxter's Q-operator, in: Integrable Systems: From Classical to Quantum (Montréal, QC, 1999), in: CRM Proc. Lecture Notes, vol. 26, American Mathematical Society, Providence, RI, 2000, pp. 227–250.
[64] T. Deguchi, K. Fabricius, B.M. McCoy, The $sl_2$ loop algebra symmetry of the six-vertex model at roots of unity, J. Stat. Phys. 102 (2001) 701–736.
[65] R.M. Kashaev, The non-compact quantum dilogarithm and the Baxter equations, J. Stat. Phys. 102 (2001) 923–936.
[66] K. Fabricius, B.M. McCoy, Evaluation parameters and Bethe roots for the six-vertex model at roots of unity, in: M. Kashiwara, T. Miwa (Eds.), MathPhys Odyssey 2001 (Hardcover), in: Progress in Mathematical Physics, vol. 23, Birkhäuser, Boston, 2002, pp. 119–144.
[67] F.A. Smirnov, Dual Baxter equations and quantization of affine Jacobian, math-ph/0001032.
[68] A. Zabrodin, Commuting difference operators with elliptic coefficients from Baxter's vacuum vectors, J. Phys. A 33 (2000) 3825–3850.
[69] K. Fabricius, B.M. McCoy, An elliptic current operator for the eight-vertex model, J. Phys. A: Math. Gen. 39 (2006) 14869–14886.
[70] S.M. Khoroshkin, A.A. Stolin, V.N. Tolstoy, Generalized Gauss decomposition of trigonometric R-matrices, Mod. Phys. Lett. A 10 (1995) 1375–1392.
[71] A.A. Belavin, A.V. Odesskii, R.A. Usmanov, New relations in the algebra of the Baxter Q-operators, Theor. Math. Phys. 130 (2002) 323–350.
[72] C. Korff, Auxiliary matrices on both sides of the equator, J. Phys. A: Math. Gen. 38 (2005) 47–67.
[73] C. Korff, A Q-operator identity for the correlation functions of the infinite XXZ spin-chain, J. Phys. A: Math. Gen. 38 (2005) 6641–6658.
[74] H. Boos, M. Jimbo, T. Miwa, F. Smirnov, Y. Takeyama, Hidden Grassmann structure in the XXZ model, hep-th/0606280.
[75] G. Albertini, B.M. McCoy, J.H.H. Perk, Eigenvalue spectrum of the superintegrable chiral Potts model, in: Integrable Systems in Quantum Field Theory and Statistical Mechanics, in: Advanced Studies in Pure Mathematics, vol. 19, Academic Press, Boston, MA, 1989, pp. 1–55.
[76] R.J. Baxter, V.V. Bazhanov, J.H.H. Perk, Functional relations for transfer matrices of the chiral Potts model, Int. J. Mod. Phys. B 4 (1990) 803–870.
[77] G.P. Pronko, Y.G. Stroganov, Bethe equations on the wrong side of the equator, J. Phys. A 32 (1999) 2333–2340.
[78] T. Deguchi, Construction of some missing eigenvectors of the XYZ spin chain at the discrete coupling constants and the exponentially large spectral degeneracy of the transfer matrix, J. Phys. A 35 (2002) 879–895.
[79] J.D. Johnson, S. Krinsky, B.M. McCoy, Vertical-arrow correlation length in the eight-vertex model and the low-lying excitations of the XYZ Hamiltonian, Phys. Rev. A 8 (1973) 2526–2547.
[80] V.V. Bazhanov, S.L. Lukyanov, A.B. Zamolodchikov, Higher-level eigenvalues of Q-operators and Schroedinger equation, Adv. Theor. Math. Phys. 7 (2003) 711–725.
[81] B. Sutherland, Two-dimensional hydrogen bonded crystals without the ice rule, J. Math. Phys. 11 (1970) 3183–3186.
[82] R.J. Baxter, One-dimensional anisotropic Heisenberg chain, Ann. Phys. 70 (1972) 323–337.
[83] M.T. Batchelor, M.N. Barber, P.A. Pearce, Bethe ansatz calculations for the eight-vertex model on a finite strip, J. Stat. Phys. 49 (1987) 1117–1163.
[84] B. Sutherland, Low-lying eigenstates of the one-dimensional Heisenberg ferromagnet for any magnetization and momentum, Phys. Rev. Lett. 74 (1995) 816.
[85] N. Beisert, M. Staudacher, J.A. Minahan, K. Zarembo, Stringing spins and spinning strings, JHEP 0309 (2003) 010.
[86] Y. Stroganov, The 8-vertex model with a special value of the crossing parameter and the related XYZ spin chain, in: Integrable Structures of Exactly Solvable Two-Dimensional Models of Quantum Field Theory (Kiev, 2000), in: NATO Science Series II: Mathematics, Physics and Chemistry, vol. 35, Kluwer Academic, Dordrecht, 2001, pp. 315–319.
Lattice gauge theory approach to spontaneous symmetry breaking from an extra dimension
Nikos Irges a, Francesco Knechtli b
a Department of Physics and Institute of Plasma Physics, University of Crete, GR-710 03 Heraklion, Crete, Greece
b CERN, Physics Department, TH Division, 1211 Geneva 23, Switzerland
E-mail addresses: irges@physics.uoc.gr (N. Irges), knechtli@mail.cern.ch (F. Knechtli).
Received 6 October 2006; accepted 15 January 2007
Available online 27 January 2007
N. Irges, F. Knechtli / Nuclear Physics B 775 [FS] (2007) 283–311
doi:10.1016/j.nuclphysb.2007.01.023
Abstract
We present lattice simulation results corresponding to an SU(2) pure gauge theory defined on the orbifold space $E_4 \times I_1$, where $E_4$ is the four-dimensional Euclidean space and $I_1$ is an interval, with the gauge symmetry broken to a U(1) subgroup at the two ends of the interval by appropriate boundary conditions. We demonstrate that the U(1) gauge boson acquires a mass from a Higgs mechanism. The mechanism is driven by two of the extra-dimensional components of the five-dimensional gauge field, which play respectively the role of the longitudinal component of the gauge boson and of a massive real physical scalar, the Higgs particle. Despite the non-renormalizable nature of the theory, we observe only a mild cut-off dependence of the physical observables. We also show evidence that there is a region in the parameter space where the system behaves in a way consistent with dimensional reduction.
1. Introduction
Spontaneous symmetry breaking (SSB) is the phenomenon where the ground state of a system does not access all of its available symmetry, apparently breaking the symmetry group to a subgroup. In the Standard Model (SM) this is a crucial mechanism: it is responsible not only for predicting the existence of a fundamental scalar field, the Higgs particle, but also for the gauge bosons and fermions acquiring a mass. The somewhat unsatisfactory fact about this mechanism in the SM is that the Higgs potential, which is the concrete object that drives SSB, is input by hand at tree level in the Lagrangian, simply because we do not have any more fundamental way
to generate it. There are of course many ideas suggesting an origin for the Higgs and its potential, one of the most elegant being that the Higgs field is the extra-dimensional component of a higher dimensional gauge field and that the potential is generated quantum mechanically [1]. The earliest scenarios considered the sphere $S^2$ as the extra-dimensional space [2–5]. In later applications the extra-dimensional space was taken to be non-simply connected, like $S^1$ or $T^2$ [6–8], so that the (non-contractible) Polyakov loops are non-trivial. This is the general context in which we would like to place the present work. Alternative ways to achieve SSB in gauge theories with extra dimensions are discussed in [9,10].
There are enough motivations to take this idea seriously besides the economical way of generating the Higgs with its potential, but the property that drew a lot of recent attention to these theories is the claimed possibility of all-order finiteness of the physical scalar mass. This sounds like a paradox from the beginning, since the very point which has kept many field theorists hesitant to take such an idea seriously is that higher (than four) dimensional gauge theories are non-renormalizable, and in a typical non-renormalizable theory one would expect that a mass parameter receives quantum corrections appearing in an arbitrary power of some
dimensionless quantity built out of a dimensionful coupling and the cut-off. This is to be compared with the renormalizable SM where the couplings are dimensionless and the Higgs mass
receives only a quadratic ultra-violet (UV) cut-off dependence under quantum corrections. Since
even this quadratic UV sensitivity has been viewed as a drawback, supersymmetric generalizations of the SM were introduced and analyzed in detail, where the power like cut-off sensitivity
is not present due to cancellations of infinities between superpartners. There is no doubt that
supersymmetry is an elegant solution to the problem but it could happen that it is not realized at
energies accessible in near future collider experiments so it is useful to be aware of alternative
solutions. Back then to extra dimensions: in the case where the extra (here fifth) dimension is compactified on a circle, one can carry out a one-loop calculation of the Higgs mass and verify its aforementioned finiteness [11,12], and one can even give all-order arguments to that effect [13–20]. The problem with this solution is that a simple circle compactification cannot be realistic for various reasons, the absence of chiral fermions being one of the main ones.
A way out is to compactify the extra dimension on an interval $I_1$ instead of a circle, which can support chiral fermions at the two ends of the interval. Since the interval can be obtained easily from the circle by orbifolding, i.e. by identifying points and fields in the circle theory under the $Z_2$ reflection operator $\mathcal{R}: x^5 \to -x^5$, we will use the name orbifold when we refer to such a theory. A characteristic property of this orbifold is that it is defined on a space with two four-dimensional boundaries, one at each of the fixed points of the reflection action, where the gauge symmetry is reduced, thus naturally differentiating the boundaries from the rest of the space, which we call the bulk. An unfortunate consequence for field theories defined in such spaces is that the arguments for the all-order finiteness of the Higgs mass are no longer applicable, because the bulk-boundary interactions appearing at higher orders in perturbation theory start to infect the finite bulk mass with cut-off dependence [21]. This is not unexpected; it is known that handling non-renormalizable theories analytically is not easy. In fact, there is no general prescription that can be used in these theories such that their predictions are trustworthy.
We would therefore like to start here a systematic investigation of higher dimensional orbifold gauge theories from the point of view of a lattice regularization [17,22,23]. The theory which will serve as our concrete example is an SU(2) gauge theory which has the symmetry broken (by boundary conditions) to its U(1) subgroup on the boundaries. This, we will argue, is a promising way of approaching extra-dimensional theories: the goal is a non-perturbative understanding of a class of non-renormalizable theories and the hope is a non-perturbative understanding of the SM Higgs mechanism.
It turns out that phenomenology puts surprisingly tight constraints on models, and we will concentrate here on the two most immediate ones that arise as one proceeds with the construction. The first is associated with the mere existence of a four-dimensional effective action. The second, closely related to SM phenomenology, is the expected hierarchy of masses between the physical Higgs and the massive gauge bosons in the broken phase of the SM.
1.1. Dimensional reduction
The first issue is to show that an extra-dimensional theory possesses in its parameter space a regime where it undergoes some kind of effective dimensional reduction. This can happen, in principle, in more than one way. Historically, the first mechanism of hiding the extra dimension is going to the Kaluza–Klein gauge and making the size of the extra dimension small enough, so that its effects are negligible up to a certain energy scale, while at the same time keeping the couplings perturbative so that the resulting effective theory can be useful for electroweak (or gravitational) physics. This is of course guaranteed as long as one treats the radius of the circle $R$, the cut-off¹ $\Lambda$ and the gauge coupling $g_5$ as independent parameters, which is perhaps justified to a certain extent; it depends essentially on how far one is willing to go in perturbation theory. In a non-perturbative formulation, on the other hand, it becomes obvious right away that these three parameters are tightly connected from the beginning and that, depending on where one sits in parameter space, a small change in one of the parameters could result in a dramatic change of the system. From this point of view, the region in parameter space where the theory describes the real world (if it exists) could be only a small patch, which makes one wonder whether the free stretching of the dimensionful parameters sometimes employed to make a model phenomenologically viable is a valid operation.
More recently, mechanisms of dimensional reduction which depend on localization rather than compactification have been proposed. In the context of gauge theories, all of these mechanisms involve in one way or another a strong coupling and therefore non-perturbative physics. One is the so-called layered phase originally proposed in [24]. The idea is to investigate a five-dimensional lattice model where the gauge coupling in the four-dimensional slices along the extra dimension is different from the gauge coupling that describes the interaction between the slices. This anisotropy could give rise to a new, so-called layered, phase where the static force is of Coulomb type in the four-dimensional slices and confining along the extra dimension, thus providing a localization mechanism. Some evidence for the existence of the layered phase in Abelian gauge theory was recently given in [25]. Another similar idea is due to [26], where it is assumed that the system has a phase in which the bulk is in a confined phase while the boundary (defined by a domain wall) is in a deconfined phase, which forces the boundary gauge fields to remain localized. This idea was investigated on the lattice [27] and it was found that the low-energy effective theory contains not only the localized zero-modes but also higher Kaluza–Klein modes.
A different mechanism of dimensional reduction was proposed in the context of the D-theory regularization of non-Abelian gauge theories [28–30]. Here a non-Abelian gauge theory in five dimensions arises as a low-energy effective description of a five-dimensional quantum link model. The size $L$ of four dimensions is taken to be infinity and the fifth dimension has a finite size $R$. If the five-dimensional theory happens to be in the Coulomb phase, the gluons are not massless but gain a mass which is exponentially small in the size of the extra dimension. Thus by making $R$ larger the correlation length given by the inverse gluon mass grows exponentially fast in $R$ and therefore the extra dimension disappears.
¹ Extra-dimensional theories are non-renormalizable and make sense only with a cut-off in place.
In this paper we will not be able to give a conclusive answer to this important issue, since it would involve a very extensive scan of the parameter space. We will, however, give some indirect evidence for dimensional reduction in the orbifold theory, and we will leave the question of which mechanism is responsible for it to future work. At a qualitative level, we will be able
to map the part of the parameter space of our model where simulations were carried out onto
the phase diagram of a known and well understood four-dimensional theory, the Abelian Higgs
model, which provides our first indirect evidence for an effective dimensional reduction. Our
more quantitative, but still indirect, evidence for dimensional reduction will be the measurement
of the static potential between two infinitely heavy charged particles placed on four-dimensional
slices located at the origin and in the middle of the fifth dimension, separated by a distance r.
Using the fact that we measure a massive U(1) gauge boson, we will fit the numerical data for the static potential to a five-dimensional Yukawa potential and to a four-dimensional Yukawa potential and compare the fits in both cases.
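The fit comparison described above can be illustrated with a minimal sketch. It assumes that the gauge boson mass $m$ is fixed from the spectrum, that the four-dimensional Yukawa shape is $e^{-mr}/r$, and that the five-dimensional shape is the massive propagator in four spatial dimensions, $m K_1(mr)/r$, up to normalization. These forms, the function names and the synthetic data are illustrative assumptions, not the fitting procedure or the measurements of this paper (the five-dimensional form actually used is derived in Appendix B).

```python
import math

def bessel_k1(x, steps=4000, t_max=12.0):
    """K_1(x) from K_1(x) = int_0^inf exp(-x cosh t) cosh t dt (trapezoid rule)."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(t)
    return total * h

def yukawa_4d(r, m):
    """Shape of a Yukawa potential in three spatial dimensions (assumed form)."""
    return math.exp(-m * r) / r

def yukawa_5d(r, m):
    """Shape of a Yukawa potential in four spatial dimensions (assumed form)."""
    return m * bessel_k1(m * r) / r

def fit_shape(rs, vs, shape, m):
    """Least-squares fit of V(r) = c0 + c1 * shape(r, m) at fixed mass m."""
    xs = [shape(r, m) for r in rs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_v = sum(vs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxv = sum((x - mean_x) * (v - mean_v) for x, v in zip(xs, vs))
    c1 = sxv / sxx
    c0 = mean_v - c1 * mean_x
    chi2 = sum((v - c0 - c1 * x) ** 2 for x, v in zip(xs, vs))
    return c0, c1, chi2

# Synthetic (not measured) potential of four-dimensional Yukawa form, lattice units.
MASS = 0.5
RS = [float(r) for r in range(1, 9)]
VS = [0.3 - 0.2 * yukawa_4d(r, MASS) for r in RS]
FIT_4D = fit_shape(RS, VS, yukawa_4d, MASS)
FIT_5D = fit_shape(RS, VS, yukawa_5d, MASS)
```

With data of four-dimensional form, the residual of the four-dimensional fit vanishes up to rounding while that of the five-dimensional fit does not; with real data one would compare the two residuals weighted by the statistical errors.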
1.2. Hierarchy of masses
A universal feature that perturbative five-dimensional orbifold pure gauge theories seem to possess is that the gauge bosons that survive on the boundaries are all massless and the mass of the Higgs turns out to be too low. In order to make some of the gauge bosons heavy one is forced to introduce fermions in the bulk [31], but even then the ratio of the Higgs mass to the gauge boson mass comes out below one [32,33] and is thus phenomenologically excluded, unless additional assumptions are made. This property has been traced to the fact that, since in five dimensions gauge invariance forbids a tree level Higgs mass, it has to be generated quantum mechanically. This one-loop mass turns out to be suppressed compared to the gauge boson mass, which is governed by a vacuum expectation value of order one. More concretely, at one loop the Higgs mass is found to scale as $g_5/R^{3/2}$, where $g_5$ is the five-dimensional gauge coupling, whereas the gauge boson mass is basically a dimensionless vacuum expectation value, obtained from minimizing the potential, divided by $R$ (see Appendix A). Realistic models require this expectation value to be very small, even though typical one-loop potentials tend to yield a value which is either zero or of order 0.1.
We will be able to compute the ratio of the Higgs mass to the gauge boson mass non-perturbatively and find that, contrary to perturbative expectations, a Higgs mechanism is at work, resulting in a non-zero gauge boson mass, for a large part of the parameter space (in fact for the whole range of parameters we were
able to scan). Again, disentangling the precise dependence of this ratio on compactification and
finite lattice size effects requires a larger scan of the parameter space, which will be the topic of
a future work.
1.3. Organization of the paper
In this paper we investigate numerically a one-dimensional subspace of the parameter space of a five-dimensional lattice gauge theory, starting from the vicinity of the first order phase transition and approaching the perturbative domain. In Section 2, using generic properties of the measured orbifold spectrum, we argue that the spectrum of the system in part of this region seems to be consistent with an effective dimensional reduction. In Section 3 we construct in detail the lattice regulated theory and in particular its observables. In Section 4 we present in detail our quantitative evidence for spontaneous symmetry breaking and dimensional reduction. In Section 5 we state our conclusions. Finally, for completeness we provide two detailed appendices with some background: Appendix A on 1-loop results and Appendix B on the derivation of the five-dimensional Yukawa potential.
2. The orbifold effective theory
We consider a pure SU(2) Yang–Mills (YM) theory in five dimensions with gauge potential $A_M$, $M = 0, 1, 2, 3, 5$. The fifth dimension is an interval, obtained by identifying points and fields on a circle of radius $R$ under the $Z_2$ reflection operator $\mathcal{R}: x^5 \to -x^5$. The projection breaks the gauge symmetry at the two ends of the interval according to
\[ SU(2) \to U(1). \tag{2.1} \]
In the picture of dimensional reduction as it is known in finite temperature field theory [34] the
five-dimensional fields are expanded in a Fourier series in the quantized momentum along the
compact dimension. At low energies, in our case much below the compactification scale given
by the inverse radius 1/R, we have an effective four-dimensional theory of the zero-modes (i.e.
constant along the fifth dimension) of the fields.
But the Fourier series that defines the Kaluza–Klein expansion breaks gauge invariance. This we cannot afford at the non-perturbative level. Here the particle spectrum is read off from correlations of five-dimensional, gauge invariant operators. These operators are classified according to specific symmetries. If dimensional reduction occurs, a mass gap in the various particle channels should be seen. The ground state of the operators which correspond to the $A_5$ gauge field component² has the quantum numbers of a complex scalar, and the ground state of the $A_\mu$, $\mu = 0, 1, 2, 3$, components³ has the quantum numbers of a U(1) gauge field from the point of view of four dimensions. Clearly, the lowest lying spectrum of the orbifold theory should coincide with that of the four-dimensional Abelian Higgs model,⁴ if dimensional reduction works like in finite temperature field theory. The spectra with the higher excitations included will be different, as in the five-dimensional theory they are sensitive to the compactification scale $1/R$. Before defining the orbifold theory in great detail and presenting extensive simulation results (which will be the topic of the next sections), one would hope to be able to exploit this similarity of spectra by mapping the phase diagram of the orbifold simulation onto the phase diagram of the Abelian Higgs model. Being able to do so would be a first piece of circumstantial evidence for an effective dimensional reduction.
² In Section 3.1.2 we will specify the gauge-invariant meaning of this, related to the Polyakov line.
³ In Section 3.2 we will construct a gauge-invariant operator that corresponds to the gauge bosons, very much in analogy with the four-dimensional SU(2) Higgs model [35].
⁴ We thank U.-J. Wiese for his comments in this respect.
⁵ For this discussion we assume the four-dimensional sizes of the lattice to be infinite.
Next, we define the parameter space of our model. In a lattice regularization, the cut-off is provided by the inverse lattice spacing, $\Lambda = a^{-1}$, and it preserves gauge invariance. That is why the lattice formulation is particularly useful to study non-renormalizable theories, which require a cut-off. The parameter space of the lattice theory is two-dimensional.⁵ One of the parameters is the (dimensionless) number of lattice points in the fifth dimension
\[ N_5 = \frac{\pi R}{a} \tag{2.2} \]
and the second is the dimensionless five-dimensional lattice coupling
\[ \beta = \frac{2N}{g_5^2}\, a = \frac{2N}{g_0^2}, \tag{2.3} \]
where $g_0$ is the dimensionless bare gauge coupling. For later reference, notice that in regions of the parameter space where $g_5$ has a mild cut-off dependence, $\beta$ is inversely proportional to $\Lambda$. All observables are computed in units of the lattice spacing, or equivalently in units of $R$, and are thus functions of $N_5$ and $\beta$. One can also define a derived parameter, the coupling
\[ \beta_4^{\rm eff} = \frac{1}{g_4^2} = \frac{\pi R}{g_5^2}, \tag{2.4} \]
for the effective four-dimensional theory. It is also possible in principle to use an anisotropic lattice [36], in which case one lattice gauge coupling (and a different lattice spacing) is defined for the fifth dimension ($\beta_5$) and a different one for the four-dimensional subspace ($\beta_4$). Here we will restrict ourselves to isotropic lattices, for which $\beta_4 = \beta_5 = \beta$.
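As a small numerical illustration of the parameter definitions, the sketch below evaluates $N_5 = \pi R/a$, $\beta = 2Na/g_5^2$ and $\beta_4^{\rm eff} = \pi R/g_5^2$ for given inputs. The function name and the input values are hypothetical, and the formulas are taken as read from Eqs. (2.2) to (2.4); the only purpose is to make the relation $\beta_4^{\rm eff} = N_5\,\beta/(2N)$ explicit, which for SU(2) and $N_5 = 4$ gives $\beta_4^{\rm eff} = \beta$.

```python
import math

def lattice_parameters(radius, spacing, g5_squared, n_colors=2):
    """Return (N_5, beta, beta_4_eff) for an isotropic lattice.

    radius     -- radius R of the circle being orbifolded
    spacing    -- lattice spacing a (same units as R)
    g5_squared -- dimensionful five-dimensional coupling g_5^2 (units of length)
    n_colors   -- N of SU(N)
    """
    n5 = math.pi * radius / spacing
    beta = 2 * n_colors * spacing / g5_squared
    beta4_eff = math.pi * radius / g5_squared
    return n5, beta, beta4_eff

# Hypothetical inputs chosen so that N_5 = 4 in lattice units (a = 1).
N5_EXAMPLE, BETA_EXAMPLE, BETA4_EXAMPLE = lattice_parameters(4 / math.pi, 1.0, 2.5)
```

For these inputs the two couplings coincide, in line with the statement that $\beta_4^{\rm eff} = \beta$ at $N_5 = 4$.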
From simulations of the five-dimensional SU(2) gauge theory on the orbifold $S^1/Z_2$ with the number of points in the extra dimension fixed to $N_5 = 4$, and on an isotropic lattice with an associated coupling $\beta$, the spectrum can be safely determined when $\beta$ is larger than a critical value $\beta_c = 1.5975$, which separates a confined ($\beta < \beta_c$) from a deconfined ($\beta > \beta_c$) phase. The spectrum measured in simulations corresponding to the deconfined phase consists of a massive Higgs ground state and a massive Abelian gauge boson, along with their excitations [23], which implies that our system in this region of parameter space crosses from a confined phase into a Higgs phase.⁶ In order to be able to make a comparison with the Abelian Higgs model we now recall a few well-known facts.
We summarize results from analytic [38,39] and numerical [40] studies of the four-dimensional Abelian Higgs model. The Euclidean action may be written as
\[ S = \sum_x \Big\{ \lambda\big[\phi^*(x)\phi(x) - 1\big]^2 + \phi^*(x)\phi(x) - \kappa \sum_\mu \big[\phi^*(x)\, e^{iqA_\mu(x)}\, \phi(x + a\hat\mu) + \text{c.c.}\big] \Big\} - \frac{\beta}{2} \sum_p U_p . \tag{2.5} \]
Here, $\phi(x)$ is the complex Higgs field, $q$ is its charge and $\kappa$ the hopping parameter, related to the bare mass $m_0$ through
\[ a^2 m_0^2 = \frac{1 - 2\lambda}{\kappa} - 8. \tag{2.6} \]
The analytic study in [38] was performed in the limit $\lambda \to \infty$, where the length of the Higgs field is frozen to 1. In the unitary gauge the action then reads
\[ S = -\kappa \sum_{x,\mu} 2\cos\big(qA_\mu(x)\big) - \frac{\beta}{2} \sum_p U_p . \tag{2.7} \]
Fig. 1. Phase diagram of the Abelian Higgs model for the Higgs field with charge $q = 2$. From studies at $\lambda = \infty$ [38].
⁶ If the four-dimensional effective theory were just a pure U(1) gauge theory (due to the orbifold breaking of the SU(2) symmetry) and dimensional reduction occurs, one should be able to map our results onto the four-dimensional Abelian model. For $N_5 = 4$ we have $\beta_4^{\rm eff} = \beta$. The pure compact U(1) gauge theory has a phase transition at $\beta_c = 1.01$ [37] and therefore we would end up in its Coulomb phase. Clearly, this cannot be sufficient to describe our system because, as already mentioned, in the deconfined phase we measure not only massive scalars but also a massive gauge boson.
There are two interesting cases. One is when the Higgs field is in the fundamental representation, that is, it has charge $q = 1$, and the second is when it has charge $q = 2$. For us it is the second case that is relevant since, as we will show in the next section, the lattice operator which is identified with the Higgs has charge two.
We inspect the action for $\lambda = \infty$ in the unitary gauge, Eq. (2.7), in the limit $\kappa \to \infty$. In this limit the gauge variable $A_\mu(x)$ can take, for $q = 2$, two values, 0 or $\pi$, corresponding to $Z_2$ gauge links $U(x, \mu) = \pm 1$. (For general charge $q$ it will be a $Z_q$ gauge theory.) The $Z_2$ gauge theory has a second order phase transition. The analytic study of [38] in the $\lambda = \infty$ case concludes that there are three distinct phases, sketched in Fig. 1. The main difference compared to the $q = 1$ case is a phase boundary that separates the Higgs from the confinement phase. The static potential in the Higgs phase is of Yukawa type. In the confinement phase it rises linearly (area law for Wilson loops), which is not the case in the confinement region for a fundamental Higgs.
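The reduction to a $Z_q$ gauge theory can be made explicit with a short sketch. Assuming the frozen-length hopping term is proportional to $\cos(qA_\mu)$, as in the unitary-gauge action, the limit of infinite hopping parameter forces $\cos(qA_\mu) = 1$, so that $A_\mu = 2\pi k/q$ and the links $e^{iA_\mu}$ are $q$-th roots of unity. The function name below is hypothetical.

```python
import cmath
import math

def frozen_link_values(q):
    """Angles A in [0, 2 pi) with cos(q A) = 1 and the corresponding links exp(i A)."""
    angles = [2 * math.pi * k / q for k in range(q)]
    return [(angle, cmath.exp(1j * angle)) for angle in angles]

Z2_LINKS = frozen_link_values(2)  # angles 0 and pi, links +1 and -1 up to rounding
Z3_LINKS = frozen_link_values(3)  # cube roots of unity
```

For $q = 2$ this reproduces the two allowed values 0 and $\pi$ of the gauge variable and the links $\pm 1$ quoted above.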
Clearly, this is the version of the Abelian Higgs model on which the orbifold theory can be
naturally mapped since the lowest lying spectra are the same and the part of the orbifold phase
diagram we have investigated can be recognized inside the phase diagram of the q = 2 Abelian
Higgs model.
3. Orbifold on the lattice
Gauge theories on the orbifold can be discretized on the lattice [17,22]. One starts with a gauge theory formulated on a five-dimensional torus with lattice spacing $a$ and periodic boundary conditions in all directions $M = 0, 1, 2, 3, 5$. The spatial directions ($M = 1, 2, 3$) have length $L$, the time-like direction ($M = 0$) has length $T$, and the extra dimension ($M = 5$) has length $2\pi R$. The coordinates of the points are labelled by integers $n \equiv \{n_M\}$ and the gauge field is the set of link variables $\{U(n, M) \in SU(N)\}$. The latter are related to a gauge potential $A_M$ in the Lie algebra of SU(N) by $U(n, M) = \exp\{aA_M(n)\}$. Embedding the orbifold action in the gauge field on the lattice amounts to imposing on the links the $Z_2$ projection
\[ (1 - \Gamma)\,U(n, M) = 0, \tag{3.1} \]
where $\Gamma = \mathcal{R}\,\mathcal{T}_g$. Here, $\mathcal{R}$ is the reflection operator that acts as $\mathcal{R}\,n = (n_\mu, -n_5) \equiv \bar n$ ($\mu = 0, 1, 2, 3$) on the lattice and as $\mathcal{R}\,U(n, \mu) = U(\bar n, \mu)$ and $\mathcal{R}\,U(n, 5) = U^\dagger(\bar n - \hat 5, 5)$ on the links. The group conjugation $\mathcal{T}_g$ acts only on the links, as $\mathcal{T}_g\,U(n, M) = g\,U(n, M)\,g^{-1}$, where $g$ is a constant SU(N) matrix with the property that $g^2$ is an element of the centre of SU(N). For SU(2) we will take $g = -i\sigma^3$. Only gauge transformations $\{\Omega(n)\}$ satisfying $(1 - \Gamma)\,\Omega = 0$ are consistent with Eq. (3.1). This means that at the orbifold fixed points, for which $n_5 = 0$ or $n_5 = N_5$, the gauge group is broken to the subgroup that commutes with $g$. For SU(2) this is the U(1) subgroup parametrized by $\exp(i\phi\sigma^3)$, where $\phi$ are compact phases.
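The algebraic statements about the conjugation matrix can be checked directly. The sketch below assumes $g = -i\sigma^3$ (the overall sign does not affect the conjugation) and verifies that $g^2 = -1$ lies in the centre of SU(2), that conjugation by $g$ leaves $\sigma^3$ invariant and flips the sign of $\sigma^1$ and $\sigma^2$, and that elements $\exp(i\phi\sigma^3)$ commute with $g$. The helper names are hypothetical.

```python
import cmath

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
SIGMA1 = [[0j, 1 + 0j], [1 + 0j, 0j]]
SIGMA2 = [[0j, -1j], [1j, 0j]]
SIGMA3 = [[1 + 0j, 0j], [0j, -1 + 0j]]
G = [[-1j, 0j], [0j, 1j]]      # g = -i sigma^3 (assumed sign convention)
G_INV = [[1j, 0j], [0j, -1j]]  # inverse of g

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(c, a):
    """Multiply every entry of a by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entry-wise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def conjugate_by_g(m):
    """Group conjugation m -> g m g^(-1)."""
    return matmul(matmul(G, m), G_INV)

def u1_element(angle):
    """exp(i * angle * sigma^3), an element of the unbroken U(1)."""
    return [[cmath.exp(1j * angle), 0j], [0j, cmath.exp(-1j * angle)]]
```

The generators that change sign under the conjugation are the ones that are projected out of the gauge transformations at the fixed points, which is why only the U(1) generated by $\sigma^3$ survives there.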
After the projection in Eq. (3.1), the fundamental domain is the strip $I_1 = \{n_\mu,\ 0 \le n_5 \le N_5\}$. The gauge-field action on $I_1$ is taken to be the Wilson action
\[ S_W^{\rm orb}[U] = \frac{\beta}{2N} \sum_p w(p)\, \mathrm{tr}\{1 - U(p)\}, \tag{3.2} \]
where the sum runs over all oriented plaquettes $U(p)$ in $I_1$. The weight $w(p)$ is 1/2 if $p$ is a plaquette in the $(\mu\nu)$ planes at $n_5 = 0$ and $n_5 = N_5$, and 1 in all other cases. Dirichlet boundary conditions are imposed on the gauge links
\[ U(n, \mu) = g\,U(n, \mu)\,g^{-1} \quad \text{at } n_5 = 0 \text{ and } n_5 = N_5. \tag{3.3} \]
The gauge variables at the boundaries are not fixed but are restricted to the subgroup of SU(N) invariant under $\mathcal{T}_g$. The Wilson action together with these boundary conditions reproduces the correct naive continuum gauge action and boundary conditions on the components of the five-dimensional gauge potential [17]. For example, for SU(2), $A^3_\mu$ (photon) and $A^{1,2}_5$ (Higgs) satisfy Neumann boundary conditions and $A^{1,2}_\mu$ and $A^3_5$ Dirichlet ones.
One of the main results of [17] was to show, through a geometrical construction, that the orbifold projection, Eq. (3.1), implies the absence of a boundary counterterm for the Higgs mass. Given the explicit breaking of the gauge invariance at the boundaries, a boundary mass term for $A_5$ is invariant under the unbroken gauge group. In the continuum this mass term would be $\mathrm{tr}\{[A_5, g][A_5, g^{-1}]\}$ evaluated at the boundaries $x_5 = 0$ and $x_5 = \pi R$. If present, such a term would imply a quadratic sensitivity of the Higgs mass to the cut-off. For the lattice action Eq. (3.2) this would require adding a boundary action term with an additional coefficient. As the lattice spacing $a$ changes, a fine tuning of this coefficient would be required to keep the Higgs mass finite. Fortunately this term is absent and the orbifold action is simply Eq. (3.2). Since five-dimensional theories are non-renormalizable, they make sense only as effective theories for energy scales much below the cut-off. On the lattice this is the Symanzik effective action [41–44], a continuum action which is a systematic expansion in the lattice spacing $a$. For the orbifold theory the Symanzik effective action is [17]
\[ S = -\frac{1}{2g_5^2} \int d^5z\, \mathrm{tr}\{F_{MN}F_{MN}\} + a\,b_1 \int_{z=\{0,\pi R\}} d^4x\, \mathrm{Re}\,\mathrm{tr}\{g F_{MN}F_{MN}\} + a\,b_2 \int_{z=\{0,\pi R\}} d^4x\, \mathrm{Re}\,\mathrm{tr}\{g F_{MN}\, g F_{MN}\} + a^2 c \int d^5z\, \mathrm{tr}\{D_L F_{MN} D_L F_{MN}\} + \cdots, \tag{3.4} \]
where $F_{MN}$ is the field strength tensor, $D_L$ its covariant derivative and the coefficients $b_1, b_2, c, \ldots$ are computable in perturbation theory. At 1-loop, $b_1 = 0$ and $b_2 = 0$ [11]. A boundary mass term for the Higgs would appear with a coefficient proportional to $1/a^2$.
3.1. Operators for the Higgs
If the fifth dimension were infinite, the gauge links $U(n, 5)$ would be gauge-equivalent to the identity, which corresponds to the continuum axial gauge $A_5 \equiv 0$. On the circle $S^1$ one can gauge-transform $U(n, 5)$ to an $n_5$-independent matrix $V(n_\mu)$ that satisfies $P = V^{2N_5}$, where $P = P(n_\mu)$ is the Polyakov line winding around the extra dimension at four-dimensional location $n_\mu$. Therefore an extra-dimensional potential $(A_5)_{\rm lat}$ can be defined on the lattice, through $V = \exp\{a(A_5)_{\rm lat}\}$, as
\[ a(A_5)_{\rm lat} = \frac{1}{4N_5}\big(P - P^\dagger\big) + O(a^3). \tag{3.5} \]
At finite lattice spacing the $O(a^3)$ corrections in Eq. (3.5) are neglected. One easily checks that $\mathcal{R}P = P^\dagger$ and so $\mathcal{R}(A_5)_{\rm lat} = -(A_5)_{\rm lat}$, as it should be to have the same transformation behaviour as $A_5$ in the continuum.
In order to construct the gauge potential $A_5$ on the orbifold $S^1/Z_2$ we start from the circle $S^1$ parametrized by the coordinates $n_5 = -N_5, \ldots, N_5 - 1$ ($n_5 = N_5 = \pi R/a$ is identified with $n_5 = -N_5$). We impose on the links building the Polyakov line $P$,
\[
P(n_\mu) = U((n_\mu, 0), 5)\cdots U((n_\mu, N_5 - 1), 5)\,U((n_\mu, -N_5), 5)\cdots U((n_\mu, -1), 5),
\tag{3.6}
\]
the orbifold projection Eq. (3.1). The result is
\[
P(n_\mu) = l(n_\mu)\,g\,l^\dagger(n_\mu)\,g^{-1},
\tag{3.7}
\]
with
\[
l(n_\mu) = U((n_\mu, 0), 5)\,U((n_\mu, 1), 5)\cdots U((n_\mu, N_5 - 1), 5).
\tag{3.8}
\]
The Polyakov line Eq. (3.7) on $S^1/Z_2$ is shown schematically in Fig. 2 and from it we define $(A_5)_{\rm lat}(n_\mu) = \{P(n_\mu) - P^\dagger(n_\mu)\}/(4N_5)$. Being anti-Hermitian, the field $(A_5)_{\rm lat}$ can be represented using the unit matrix and the Hermitian generators $T^A$ of SU(N):
\[
(A_5)_{\rm lat} = -ig_0\left(A_5^0\,1_N + A_5^A T^A\right).
\tag{3.9}
\]
In order to construct the Higgs field on $S^1/Z_2$ we have to project$^7$ $(A_5)_{\rm lat}$ onto the components $A_5^{\hat a}$ for which $gT^{\hat a}g^{-1} = -T^{\hat a}$ (the other generators have $gT^ag^{-1} = T^a$). The Higgs field is now defined as in the continuum,
\[
\Phi(n_\mu) = \left[a(A_5)_{\rm lat}(n_\mu), g\right] = 2iag_0\,A_5^{\hat a}(n_\mu)\,gT^{\hat a}.
\tag{3.10}
\]
Note that the commutator projects out the identity component of $(A_5)_{\rm lat}$. Under a gauge transformation $\Omega(n)$, $l(n_\mu) \to \Omega(n_\mu, 0)\,l(n_\mu)\,\Omega(n_\mu, N_5)^{-1}$. Since $[\Omega(n), g] = 0$ at the orbifold boundaries $n_5 = 0$ and $n_5 = N_5$, it follows that
\[
\Phi(n_\mu) \to \Omega(n_\mu, 0)\,\Phi(n_\mu)\,\Omega(n_\mu, 0)^{-1}.
\tag{3.11}
\]

Fig. 2. The Polyakov line $P$ on $S^1/Z_2$.

7 In the continuum the orbifold projection selects automatically the components $A_5^{\hat a}$. On the lattice, due to the lattice artifacts $O(a^3)$ in the definition Eq. (3.5), the wrong components $(A_5)^a_{\rm lat}$ can be non-zero.
The Higgs field transforms like a field strength tensor at the boundary. In the special case of gauge group SU(2) only a U(1) gauge symmetry survives at the boundaries. If we parameterize the U(1) boundary gauge transformations by $\Omega(n_\mu, 0) = \exp\{-i\omega(n_\mu)\sigma^3\}$, the Higgs field transforms as
\[
\Phi = \begin{pmatrix} 0 & h^* = \phi^1 - i\phi^2 \\ h = \phi^1 + i\phi^2 & 0 \end{pmatrix}
\;\longrightarrow\;
\begin{pmatrix} 0 & e^{-2i\omega}h^* \\ e^{2i\omega}h & 0 \end{pmatrix},
\tag{3.12}
\]
showing that it has charge 2 under the U(1) gauge group.
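Since $\mathrm{tr}\{\Phi\} = 0$, in order to extract the Higgs mass we define
\[
H(n_0) = \left(\frac{a}{L}\right)^3\sum_{n_1,n_2,n_3}\mathrm{tr}\{\Phi(n_\mu)\Phi^\dagger(n_\mu)\}
\tag{3.13}
\]
and build the connected correlation
\[
C(t) = \frac{a}{T}\sum_{n_0}\left[\langle H(n_0)H(n_0 + t/a)\rangle - \langle H(n_0)\rangle\langle H(n_0 + t/a)\rangle\right] \sim \mathrm{const}\times e^{-m_h t},
\tag{3.14}
\]
and the Higgs mass $m_h$ can be extracted from the effective masses $am_{h,\rm eff}(t + a/2) = \ln\{C(t)/C(t + a)\}$. Writing the correlation $C(t)$ as
\[
C(t) = \left\langle\left(H(t) - \langle H\rangle\right)\left(H(0) - \langle H\rangle\right)\right\rangle
\tag{3.15}
\]
one can see that it is a sum of positive and negative numbers of order one, the result being a small number due to cancellations. On the other hand, the variance
\[
(\Delta C)^2 = \left\langle\left(H(t) - \langle H\rangle\right)^2\left(H(0) - \langle H\rangle\right)^2\right\rangle - C^2
\tag{3.16}
\]
is a sum of positive numbers of order one minus a very small number and hence of order one. Furthermore, $\Delta C$ is essentially independent of $t$. This means that the error in the effective Higgs masses is approximately
\[
\Delta(am_{h,\rm eff}) \approx \Delta C\left[\left(\frac{1}{C(t)}\right)^2 + \left(\frac{1}{C(t + a)}\right)^2\right]^{1/2},
\tag{3.17}
\]
and, since $C(t) \sim e^{-m_h t}$, its $t$-dependence is
\[
\Delta(am_{h,\rm eff}) \sim e^{m_h t}.
\tag{3.18}
\]
In the above, $m_h$ is the plateau value of the Higgs mass, which is constant, and therefore one expects that the error in $m_h$ increases exponentially with $t$.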
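The exponential growth of the error in Eq. (3.18) can be checked with a small numerical sketch. It assumes a purely exponential correlator $C(t) = A\,e^{-m_h t}$ with a $t$-independent error $\Delta C$; the values of the mass, the amplitude and the error (`AM_H`, `AMPLITUDE`, `DELTA_C`) are hypothetical and are not taken from the simulations.

```python
import math

# Hypothetical inputs in lattice units (a = 1); not simulation data.
AM_H = 0.1        # plateau mass a*m_h
AMPLITUDE = 1.0   # amplitude of the exponential
DELTA_C = 1.0e-3  # t-independent error of C(t)

def correlator(t):
    """Model correlator C(t) = A exp(-m_h t)."""
    return AMPLITUDE * math.exp(-AM_H * t)

def effective_mass(c_t, c_next):
    """a*m_eff(t + a/2) = ln{C(t)/C(t + a)}."""
    return math.log(c_t / c_next)

def effective_mass_error(delta_c, c_t, c_next):
    """Error propagation for ln{C(t)/C(t + a)} with uncorrelated errors delta_c."""
    return delta_c * math.sqrt((1.0 / c_t) ** 2 + (1.0 / c_next) ** 2)

def error_table(t_values):
    """Rows (t, a*m_eff, error) for the model correlator."""
    rows = []
    for t in t_values:
        c_t, c_next = correlator(t), correlator(t + 1)
        rows.append((t,
                     effective_mass(c_t, c_next),
                     effective_mass_error(DELTA_C, c_t, c_next)))
    return rows

if __name__ == "__main__":
    for t, mass, err in error_table(range(0, 41, 10)):
        print(f"t/a = {t:2d}   a*m_eff = {mass:.4f}   error = {err:.3e}")
```

In this model the effective mass is flat while its error grows by a factor $e^{a m_h}$ per time step, which is the loss of signal described above.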
3.1.1. Variational technique

It turns out that the correlation function Eq. (3.14) suffers from a loss of significance in numerical simulation. This is, unfortunately, the common problem of signals, which are exponentially
small in the time t, with an almost constant variance, and hence constant statistical error. The
ratio of the signal to the error falls off exponentially in the time t ; in the large-t region, where
the leading exponential decay Eq. (3.15) due to the Higgs mass should be seen, the signal is lost.
This is reflected in the exponentially growing error of the effective masses Eq. (3.18).
To cure this problem we employ the variational technique of [45]. A basis of Higgs operators is
constructed and the best operator to capture the Higgs mass in Eq. (3.15), i.e. whose overlap with
the eigenstate of the Hamiltonian is the largest, will be a linear combination of these basis fields.
If the contributions from excited states are suppressed then the leading exponential Eq. (3.15)
can be extracted at smaller values of t , where the signal might not be lost.
Here we sketch how this works; more details can be found in [45]. We construct a set of Euclidean fields $O_i$, $i = 1, \ldots, r$, with the same quantum numbers as $H \equiv O_1$ in Eq. (3.13). Then we build the matrix correlation function
\[
C_{ij}(t) = \langle O_i(t)O_j(0)\rangle - \langle O_i(t)\rangle\langle O_j(0)\rangle,
\tag{3.19}
\]
and write the spectral decompositions
\[
\langle O_i(t)O_j(0)\rangle = \frac{1}{Z}\sum_{m,n} e^{-E_nT - t(E_m - E_n)}\,A^{(i)}_{nm}A^{(j)}_{mn},
\tag{3.20}
\]
\[
\langle O_i(t)\rangle = \frac{1}{Z}\sum_{n} e^{-E_nT}\,A^{(i)}_{nn},
\tag{3.21}
\]
where
\[
Z = \sum_n e^{-E_nT} \quad\text{and}\quad A^{(i)}_{mn} = \langle m|O_i(0)|n\rangle.
\tag{3.22}
\]
Here $m, n = 0, 1, 2, \ldots$ label the eigenstates $|m\rangle$ with energy eigenvalue $E_m$ of the Hamiltonian $\mathbb{H}$ and $T$ is the temporal size of the lattice. We use the same symbol $O_i$ to denote the Euclidean field and the corresponding operator in the Hamiltonian formulation.
There are two effects which derive from the finiteness of $T$ and the periodic boundary conditions in time (which imply taking the overall trace in the spectral decomposition) [46]. Firstly, if the operators $O_i$ have a non-vanishing expectation value, the connected correlation functions Eq. (3.19) have in general $t$-independent contributions
\[
e^{-m_hT}\left(A^{(i)}_{00} - A^{(i)}_{11}\right)\left(A^{(j)}_{00} - A^{(j)}_{11}\right),
\tag{3.23}
\]
where $m_h = E_1 - E_0$ is the mass gap, i.e. the Higgs ground state mass.
Secondly, in the limit that $T$, $t$ are both large and the difference $T - t$ is close to $t$, the matrix correlation function Eq. (3.19) has the leading behavior
\[
C_{ij}(t) \simeq \sum_{n>0}\left[e^{-(E_n - E_0)t} + e^{-(E_n - E_0)(T - t)}\right]A^{(i)}_{n0}A^{(j)*}_{n0}
 + \sum_{m,n>0} e^{-(E_n - E_0)T - (E_m - E_n)t}\,A^{(i)}_{nm}A^{(j)*}_{nm}.
\tag{3.24}
\]
Here we have assumed that $A^*_{mn} = A_{nm}$ and that $C_{ij}(t)$ is real, which is true if the operators $O_i$ are Hermitian. If we take $t = T/2$, the contribution of the second term in Eq. (3.24) goes like $\exp\{-[(E_n + E_m)/2 - E_0]T\}$ compared to $\exp\{-(E_n - E_0)T/2\}$ of the first term. The second term is hence subleading and can be neglected. We get
\[
C_{ij}(t) \simeq \bar{C}_{ij}(t) + \bar{C}_{ij}(T - t), \qquad \bar{C}_{ij}(t) = \sum_{n>0} A^{(i)}_{n0}A^{(j)*}_{n0}\,e^{-(E_n - E_0)t}.
\tag{3.25}
\]
The correlation function is a sum of two contributions and is symmetric about $t = T/2$.
First we assume that $T$ is large enough so that the $t$-independent contribution Eq. (3.23) is negligible and that $t$ is small enough so that the contribution from $T - t$ is also negligible. Then the correlations Eq. (3.19) have the $T = \infty$ behavior$^8$ (assumed in [45])
\[
C_{ij}(t) = \sum_{\alpha=1}^{\infty} A^{(i)}_{\alpha 0}A^{(j)*}_{\alpha 0}\,e^{-tW_\alpha}, \qquad W_\alpha = E_\alpha - E_0.
\tag{3.26}
\]
The lowest masses $W_\alpha$ can be extracted by solving the generalized eigenvalue problem
\[
C_{ij}(t)\,\psi_{\alpha,j}(t, t_0) = \lambda_\alpha(t, t_0)\,C_{ij}(t_0)\,\psi_{\alpha,j}(t, t_0),
\tag{3.27}
\]
where the correlation matrix is taken at variable time $t$ and at fixed time $t_0$ (we may set $t_0 = 0$). From a mathematical lemma proved in [45] it follows that
\[
\lambda_\alpha(t, t_0) = c_\alpha\,e^{-tW_\alpha}\left[1 + O\!\left(e^{-t\Delta W_\alpha}\right)\right], \qquad \alpha = 1, \ldots, r,
\tag{3.28}
\]
where $c_\alpha > 0$ and $\Delta W_\alpha = \min_{\beta\neq\alpha}|W_\alpha - W_\beta|$. One expects that $c_\alpha \sim e^{t_0W_\alpha}$ and that the coefficients of the correction terms in Eq. (3.28), which are due to the excited states, are suppressed. The masses can be extracted from
\[
aW_\alpha(t + a/2) = \ln\frac{\lambda_\alpha(t, t_0)}{\lambda_\alpha(t + a, t_0)}
\tag{3.29}
\]
at moderately large values of $t$.
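As an illustration of Eqs. (3.27)-(3.29), the following sketch solves the generalized eigenvalue problem for a basis of two operators, where it reduces to a quadratic equation. The overlaps and energy gaps (`OVERLAPS`, `GAPS`) are hypothetical inputs, the overlaps are taken real, and the correlator is the $T = \infty$ form of Eq. (3.26) truncated to two states, so that the effective masses reproduce the input gaps.

```python
import math

# Hypothetical two-state model in lattice units (a = 1); not simulation data.
OVERLAPS = [(1.0, 0.5), (0.3, 1.0)]  # OVERLAPS[alpha][i]: overlap of operator i with state alpha
GAPS = (0.1, 0.6)                    # W_alpha = E_alpha - E_0

def correlator_matrix(t, overlaps, gaps):
    """C_ij(t) = sum_alpha A_i^alpha A_j^alpha exp(-t W_alpha), real overlaps."""
    size = len(overlaps[0])
    return [[sum(a[i] * a[j] * math.exp(-t * w) for a, w in zip(overlaps, gaps))
             for j in range(size)] for i in range(size)]

def gevp_eigenvalues_2x2(c_t, c_t0):
    """Roots of det(C(t) - lambda C(t0)) = 0 for 2x2 matrices, largest first."""
    (a11, a12), (a21, a22) = c_t
    (b11, b12), (b21, b22) = c_t0
    qa = b11 * b22 - b12 * b21
    qb = -(a11 * b22 + a22 * b11 - a12 * b21 - a21 * b12)
    qc = a11 * a22 - a12 * a21
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    roots = [(-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)]
    return sorted(roots, reverse=True)

def effective_masses(t, t0, overlaps, gaps):
    """a*W_alpha(t + a/2) = ln{lambda_alpha(t, t0) / lambda_alpha(t + a, t0)}."""
    c_t0 = correlator_matrix(t0, overlaps, gaps)
    lam_t = gevp_eigenvalues_2x2(correlator_matrix(t, overlaps, gaps), c_t0)
    lam_next = gevp_eigenvalues_2x2(correlator_matrix(t + 1, overlaps, gaps), c_t0)
    return [math.log(x / y) for x, y in zip(lam_t, lam_next)]

if __name__ == "__main__":
    print(effective_masses(3, 0, OVERLAPS, GAPS))
```

With as many operators as states the eigenvalues are pure exponentials, $\lambda_\alpha(t, t_0) = e^{-(t - t_0)W_\alpha}$; with real data further states contribute the correction terms of Eq. (3.28). A general $r \times r$ basis would need a numerical eigenvalue solver instead of the quadratic formula.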
If the contribution from $T - t$ is not negligible, as happens when $t$ approaches $T/2$ (and we might be forced to go to such large values of $t$ to find a plateau), then a different formula has to be used. The starting point is Eq. (3.25). If we say that $\bar\lambda_\alpha(t, t_0)$ solves the generalized eigenvalue problem Eq. (3.27) for the matrix $\bar{C}$ defined in Eq. (3.25), then it is straightforward to show that
\[
\lambda_\alpha(t, t_0) = \bar\lambda_\alpha(t, t_0) + \bar\lambda_\alpha(T - t, t_0)
\tag{3.30}
\]
solves the generalized eigenvalue problem for the full matrix $C$. Using Eq. (3.28) for $\bar\lambda_\alpha$ we get the formula
\[
\lambda_\alpha(t, t_0) = 2c_\alpha\,e^{-W_\alpha T/2}\cosh\left[(T/2 - t)W_\alpha\right].
\tag{3.31}
\]
By using the ratio $r_{12} = \lambda_\alpha(t_1, t_0)/\lambda_\alpha(t_2, t_0)$ and the definitions
\[
x = e^{-W_\alpha}, \qquad \tau_1 = T/2 - t_1, \qquad \tau_2 = T/2 - t_2,
\tag{3.32}
\]
it is easy to arrive at the equation
\[
r_{12}\left(x^{\tau_2} + x^{-\tau_2}\right) - \left(x^{\tau_1} + x^{-\tau_1}\right) = 0
\tag{3.33}
\]
to be solved numerically (Newton-Raphson) for $x$. We take $t_0 = 0$, $t_1 = t$ and $t_2 = t + a$ for $t = a, 2a, \ldots, T/2 - a$.

8 Then in Eqs. (3.20) and (3.21) only the term $n = 0$ contributes.
3.1.2. Higgs operators

A basis of Higgs operators $O_i = H_i$, defined as in Eq. (3.13), can be constructed by modifying the definition of the Higgs field to create a set of fields $\Phi_i$. We can for example consider displaced Polyakov lines on $S^1/Z_2$ as shown in Fig. 3. The position where the displacement in one of the spatial directions $k = 1, 2, 3$ takes place can be varied along the extra coordinate $n_5 = 0, 1, \ldots, N_5/2$. The displacements at the other $n_5$ values are equivalent by symmetry with respect to the reflection $n_5 \to N_5 - n_5$. For the displacement at $n_5 = 0$ Eq. (3.7) is replaced by
\[
P^{(0)}(n_\mu) = \frac{1}{6}\sum_{\substack{n'_0 = n_0 \\ |n' - n| = a}} U(n_\mu, n'_\mu)\big|_{n_5=0}\; l(n'_\mu)\; U(n'_\mu, n_\mu)\big|_{n_5=N_5}\; g\,l^\dagger(n_\mu)\,g^{-1},
\tag{3.34}
\]
and similar expressions $P^{(n_5)}$ for $n_5 = 1, \ldots, N_5/2$. In Eq. (3.34) $U(n_\mu, n'_\mu)$ is the parallel transporter from the four-dimensional point $n'_\mu$ to $n_\mu$ in the indicated slice $n_5$ in the extra dimension.
Yet another possibility to create fields for the variational basis is to consider smeared Higgs fields. The simplest is shown in Fig. 4: the field $\Phi(n_\mu)$ (represented by a thick point) is replaced by a smeared field $\tilde\Phi(n_\mu)$ made of a linear combination of $\Phi(n_\mu)$ and the nearest neighbour fields in the three-dimensional space,
\[
\tilde\Phi(n_\mu) = (1 - \alpha)\,\Phi(n_\mu) + \frac{\alpha}{6}\sum_{\substack{n'_0 = n_0 \\ |n' - n| = a}} U(n_\mu, n'_\mu)\big|_{n_5=0}\,\Phi(n'_\mu)\,U(n'_\mu, n_\mu)\big|_{n_5=0}.
\tag{3.35}
\]
The smearing parameter is $\alpha$. The definition Eq. (3.35) ensures that $\tilde\Phi$ transforms as $\Phi$ under the gauge transformation in Eq. (3.11). Variants of the smearing technique can be found in [47]. The smearing procedure Eq. (3.35) can be iterated a number of times, which we take to be 3. We set $\alpha = 0.7$.

Fig. 3. The displaced Polyakov line $L$ on $S^1/Z_2$.

Fig. 4. A smearing procedure for the Higgs field $\Phi$.
Finally, the gauge links used to construct all the Higgs fields $\Phi_i$ are replaced by smeared gauge links. This is a very simple procedure to create extended operators. We have implemented APE smearing [48] for the spatial links, i.e. $U(z, k)$, $k = 1, 2, 3$, and $U(z, 5)$. The links are decorated with staples in the spatial directions $l = 1, 2, 3$, with the restriction $l \neq k$ for $U(z, k)$. The smeared link $\tilde{U}$ is obtained by adding to $(1 - \alpha_{\rm APE})U$ the sum of the decorating staples multiplied by $\alpha_{\rm APE}$/(number of staples), where $\alpha_{\rm APE}$ is the smearing parameter. The smeared link is projected back onto SU(2). The smearing of the gauge links is iterated 3 times with $\alpha_{\rm APE} = 0.75$.
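A single smearing step for one SU(2) link can be sketched as follows. An SU(2) matrix is stored as four real numbers $(a_0, a_1, a_2, a_3)$ with $U = a_0 + i\,a_k\sigma^k$ and $\det U = a_0^2 + a_1^2 + a_2^2 + a_3^2$. The text does not specify how the projection onto SU(2) is done; dividing by $\sqrt{\det}$ is an assumption made here. The staples are taken as given inputs (the products of three links that form them are not computed), and the numerical values are hypothetical.

```python
import math

def project_su2(q):
    """Project a0 + i a.sigma, stored as (a0, a1, a2, a3), onto SU(2).

    Assumed projection: division by sqrt(det) = sqrt(a0^2 + a1^2 + a2^2 + a3^2).
    """
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def ape_smear_link(link, staples, alpha):
    """(1 - alpha) U + alpha/(number of staples) * sum(staples), then projected."""
    n_staples = len(staples)
    blended = tuple((1.0 - alpha) * link[k]
                    + (alpha / n_staples) * sum(s[k] for s in staples)
                    for k in range(4))
    return project_su2(blended)

# Hypothetical link and staples (four staples, as for a link U(z, k) with l != k).
LINK = (1.0, 0.0, 0.0, 0.0)
STAPLES = [project_su2(q) for q in [(0.9, 0.1, 0.0, 0.2),
                                    (0.8, -0.1, 0.3, 0.0),
                                    (0.95, 0.0, 0.1, -0.1),
                                    (0.7, 0.2, 0.2, 0.1)]]

if __name__ == "__main__":
    smeared = ape_smear_link(LINK, STAPLES, alpha=0.75)
    print(smeared, sum(c * c for c in smeared))
```

The representation works because any real linear combination of SU(2) matrices is again of the form $a_0 + i\,a_k\sigma^k$ with real coefficients, so the projection is a simple rescaling. For a general SU(N) a different projection would be needed.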
3.2. Photon operators
In this section we describe the construction of operators in the SU(2) orbifold which create the gauge boson (photon) associated with the unbroken U(1) gauge group on the boundaries. For each Higgs field $\Phi_i$ constructed as described in Section 3.1.2 we define the SU(2)-valued quantity (in the following we suppress the index $i$)
\[
\hat\Phi(n_\mu) = \frac{\Phi(n_\mu)}{\sqrt{\det(\Phi(n_\mu))}}.
\tag{3.36}
\]
We note that from Eq. (3.10) and $g^\dagger = -g$ (which holds for SU(2)) it follows that $\hat\Phi^\dagger = -\hat\Phi$. For the three spatial directions $k = 1, 2, 3$ we define a decorated double link
\[
V(n_\mu, k) = U(n_\mu, k)\,\hat\Phi(n_\mu + a\hat{k})\,U^\dagger(n_\mu, k)\,\hat\Phi(n_\mu) \quad \text{(no sum over } k\text{)},
\tag{3.37}
\]
which transforms like $\Phi$ under gauge transformations. In analogy with the definition [35] in the standard SU(2) Higgs model, we define an operator for the photon field
\[
W_k(n_\mu) = -i\,\mathrm{tr}\{\sigma^3 V(n_\mu, k)\}.
\tag{3.38}
\]
We will consider this field projected to zero three-dimensional momentum
\[
W_k(n_0) = \left(\frac{a}{L}\right)^3\sum_{n_1,n_2,n_3} W_k(n_\mu).
\tag{3.39}
\]
It is not difficult to show that the operator $W_k(n_0)$ is real, invariant under the group conjugation $\mathcal{T}_g$ and odd under the three-dimensional parity transformation. In order to study the naive continuum limit, we define a continuum gauge potential at the boundaries through
\[
U(x, k) = e^{aA_k(x)} = 1_2 + aA_k(x) + O(a^2),
\tag{3.40}
\]
where $A_k = -ig_0A_k^3\sigma^3$. Here we use the continuum notation to label the lattice points, $x = na$. Using the definition of the lattice derivative, $\hat\Phi(x + a\hat{k}) = \hat\Phi(x) + a\,\partial_k\hat\Phi(x)$, and the properties $\hat\Phi(x)^2 = -1_2$ and $\{A_k(x), \hat\Phi(x)\} = 0$, we get up to $O(a^2)$ corrections
\[
W_k(x) = ia\,\mathrm{tr}\{\sigma^3\,\hat\Phi(x)\,\underbrace{\left(\partial_k + 2A_k(x)\right)}_{D_k}\,\hat\Phi(x)\}.
\tag{3.41}
\]
Consistently with Eq. (3.12) the covariant derivative $D_k$ has a charge 2 in front of $A_k$. In summary, in the naive continuum limit $W_k(x)$ corresponds to a covariant derivative term for the Higgs field and is hence a vector in the representation of spin 1.
The photon mass can be extracted as follows: for each Higgs field $\Phi_i$ in the variational basis discussed above we construct, using Eq. (3.38), the corresponding photon field $W_k^{(i)}(n_0)$. Then we build the matrix of connected correlation functions
\[
C^W_{ij}(t) = \frac{1}{3}\sum_k\left[\frac{a}{T}\sum_{n_0}\left\langle W_k^{(i)}(n_0 + t/a)\,W_k^{(j)}(n_0)\right\rangle
 - \left\langle\frac{a}{T}\sum_{n_0}W_k^{(i)}(n_0)\right\rangle\left\langle\frac{a}{T}\sum_{n_0}W_k^{(j)}(n_0)\right\rangle\right].
\tag{3.42}
\]
The expectation value of $W_k^{(i)}(n_0)$ is zero; nevertheless it is subtracted to possibly reduce the statistical noise. The matrix $C^W_{ij}(t)$ is treated like its counterpart for the Higgs to finally get the masses.
3.3. The static quark potential
We measure the potential between a static quark and a static antiquark placed in the four-dimensional slices as we vary the location $n_5 = 0, 1, \ldots, N_5/2$ of the slice. The potential is extracted from the expectation values of the traces of rectangular Wilson loops $W(r, t)$ of size $r$ in three-dimensional space and $t$ in the time direction. The Wilson loops are averaged over the three spatial directions, and smeared gauge links, constructed as explained in Section 3.1.2, are used for the spatial Wilson lines. The potential $V(r)$ is then defined as the plateau value at large time $t$ of the effective masses
\[
aV_{\rm eff}(r, t + a/2) = \ln\frac{\langle\mathrm{tr}\{W(r, t)\}\rangle}{\langle\mathrm{tr}\{W(r, t + a)\}\rangle}.
\tag{3.43}
\]
In the boundary slices of the orbifold the gauge links belong to the gauge group U(1), whereas in the bulk slices they are SU(2) gauge links. Since the links on the boundary are "colder" (i.e. the boundary four-dimensional plaquettes have a larger value) [22], the potential turns out to be more precise on the boundary. Also, as the value of $N_5$ is increased, the statistical errors on the potential increase. This calls for improving the extraction of the potential and there are methods to do this.
4. Numerical results
We present detailed simulation results of the SU(2) gauge theory on the orbifold. The main part of the simulations is a scan in $\beta$ at fixed $N_5 = 4$. The three-dimensional space size is $L/a = 8$ and the temporal size is $T/a = 96$. The large value of the latter was necessary in order to extract reliably the Higgs mass, which turns out to be very close to the 1-loop perturbative value. To check for finite $L$ effects we also performed simulations at $L/a = 12$ for some $\beta$ values. The algorithm is based on SU(2) heatbath and overrelaxation updates in the bulk and U(1) heatbath and overrelaxation updates on the boundaries. The statistics of the simulations at $L/a = 8$ varies between 90 000 and 260 000 measurements of the observables; at $L/a = 12$ it is 32 000 measurements. Each measurement is separated by one heatbath and $L/2a$ overrelaxation sweeps (i.e. updates of all the gauge links).
We have also run some simulations to measure the static potential in the four-dimensional slices along the extra dimension. For these runs the orbifold geometry was chosen to be $T/a = 32$ and $L/a = 16$ and we compare the potentials with $N_5 = 4$ and $N_5 = 6$ at $\beta = 1.609$. This value was chosen since the mass spectrum there resembles more closely what we expect in a compactification scenario.
4.1. Phase transition
In infinite volume or with periodic boundary conditions on a finite torus, five-dimensional SU(N) gauge theories have at least two phases [36,49,50], a confinement massive phase at small values of $\beta$ and a deconfinement or Coulomb massless phase at large values of $\beta$. For SU(2) the phase transition is located at $\beta_c = 1.64$ [36,49]. With orbifold boundary conditions this phase transition persists but the critical value $\beta_c$ depends on $N_5$ and on $L/a$. It is signalled by a jump in the expectation values of plaquettes, see [22]. In Fig. 5 the plot on the left-hand side shows what happens on the orbifold with $N_5 = 4$ and $L/a = 8$ with vacuum expectation values $\langle\mathrm{tr}\{\Phi\Phi^\dagger\}\rangle$, where for $\Phi$ we take some of the Higgs fields as explained in the legend and in Section 3. There is a clear discontinuity$^9$ in the vacuum expectation values at $\beta_c = 1.5975$.
There is a strong indication that the phase transition is of first order. The plot on the right-hand side of Fig. 5 shows the history$^{10}$ of the expectation value of the Polyakov line in one of the three-dimensional space directions evaluated at the boundary $n_5 = 0$ of the orbifold. As was shown in early Monte Carlo studies of SU(2) gauge theory at finite temperature [51,52], the expectation value of the Polyakov line is zero in the confined phase and it becomes non-zero in the deconfinement phase, thereby breaking spontaneously a $Z_2$ symmetry which changes the sign of the Polyakov line. In finite volume one actually observes in the Monte Carlo history jumps between the $Z_2$ states. Both of these behaviours can be seen on the right-hand side of Fig. 5. Until about bin number 400 the system was in the deconfined phase and then it changes into the confined phase.
The location of the phase transition depends not only on $N_5$ but also on the ratio $(L/a)/N_5$. In principle we would like to be in a situation where $N_5 \ll L/a$ in order to have a compact extra dimension. But the meaning of compactification will have to be qualified by looking at the results for the particle spectrum.

Fig. 5. The phase transition on the orbifold at $N_5 = 4$. The plot on the left-hand side shows the behaviour of the vacuum expectation values $\langle\mathrm{tr}\{\Phi\Phi^\dagger\}\rangle$ for different Higgs fields. The plot on the right-hand side shows a metastability in the Polyakov line in one of the three-dimensional directions right at the phase transition $\beta_c = 1.5975$.

9 The expectation value of the basic Higgs field Eq. (3.10) is actually constant for $\beta < \beta_c$.
10 The measurements for each simulation are blocked in 500 bins for the error analysis.
4.2. Higgs and photon spectra
In Figs. 6 and 7 the Higgs and photon masses for the ground state and the first excited state are shown as a function of $\beta$ in units of $1/R$ for the $N_5 = 4$, $L/a = 8$ and $T/a = 96$ orbifold geometry. It was not possible to extract these masses for $\beta < \beta_c = 1.5975$: in the confined phase the signal for the effective masses disappears immediately in noise. Right above the phase transition the signal for the particle masses is present.

Fig. 6. The ground and first excited state of the Higgs. Scan in $\beta$ at fixed $N_5 = 4$. Masses are in units of $1/R$. The dashed line represents the 1-loop result for the ground state.

Fig. 7. The ground and first excited state of the photon. Scan in $\beta$ at fixed $N_5 = 4$. Masses are in units of $1/R$.
Table 1
Finite volume study of the spectrum at $N_5 = 4$. The excited state of the Higgs could not be determined at $L/a = 12$

$\beta$   $L/a$   $m_h R$      $m_\gamma R$   $m_{\gamma'} R$
1.609     8       0.085(13)    0.138(9)       0.62(8)
1.609     12      0.075(10)    0.202(20)      0.48(8)
1.65      8       0.093(14)    0.132(6)       0.69(5)
1.65      12      0.120(14)    0.120(13)      0.61(5)
1.80      8       0.052(24)    0.079(4)       0.38(17)
1.80      12      0.048(13)    0.097(5)       0.70(6)
The first observation, looking at Fig. 6, concerns the Higgs ground state mass $m_h$. For all $\beta > \beta_c$ the Higgs mass is consistent with its value computed in 1-loop perturbation theory: for general gauge group $G$ = SU(N) this is
\[
m_h R = \frac{c}{\sqrt{N_5\,\beta}},
\tag{4.1}
\]
where $c = 3/(4\pi^2)\sqrt{N\zeta(3)C_2(G)/2}$ and $C_2(G) = N$ ($c = 0.1178$ for SU(2)). Here we have taken the continuum result in [11] and replaced the dimensionful coupling $g_5$ by its lattice definition Eq. (2.3).
The second observation, looking at Fig. 7, concerns the photon ground state mass $m_\gamma$. Contrary to the 1-loop prediction [31] the photon mass is non-zero for all $\beta > \beta_c$. The photon mass even increases as the phase transition is approached. This means that there is spontaneous symmetry breaking in the pure gauge theory. This is, to the best of our knowledge, the first non-perturbative evidence for the Higgs mechanism$^{11}$ originating from an extra dimension.
The third observation, looking at both Figs. 6 and 7, concerns the excited state masses for the Higgs, $m_{h'}$, and the photon, $m_{\gamma'}$. In perturbation theory, the first excited (Kaluza-Klein) states are expected to appear split from the ground states by $(\Delta m)R = 1$, the second with a mass splitting twice that, and so forth. In no range of $\beta$ do we see excited states at about 1 in units of $1/R$. Close to the phase transition the excited states are separated from the ground state, especially in the case of the photon. For larger $\beta$ they instead get closer in mass to the ground states. This is an indication that at fixed $N_5 = 4$ the system behaves more like a compact system close to the phase transition than at large $\beta$.
At this point one might worry about finite $L$ effects in the particle masses, especially for the photon mass, since its non-zero value contradicts perturbation theory. In Table 1 we offer a comparison at several values of $\beta$ of the particle masses between $L/a = 8$ and $L/a = 12$. One can see that finite $L$ effects are small and in most cases not significant. In particular there is no variation that could be explained with a behaviour $m \propto 1/L$, which is characteristic of finite volume effects.
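The 1-loop expression Eq. (4.1) is easy to evaluate numerically. The sketch below assumes that the formula reads $m_hR = c/\sqrt{N_5\beta}$ with $c = 3/(4\pi^2)\sqrt{N\zeta(3)C_2(G)/2}$ and $C_2(G) = N$; this reading reproduces the value $c = 0.1178$ quoted for SU(2). The function names are our own.

```python
import math

ZETA_3 = 1.2020569031595942  # Riemann zeta(3)

def one_loop_c(n_colors):
    """c = 3/(4 pi^2) sqrt(N zeta(3) C2(G) / 2) with C2(G) = N for SU(N)."""
    casimir = n_colors
    return 3.0 / (4.0 * math.pi ** 2) * math.sqrt(n_colors * ZETA_3 * casimir / 2.0)

def one_loop_higgs_mass(n_colors, n5, beta):
    """m_h R = c / sqrt(N5 beta), the assumed reading of Eq. (4.1)."""
    return one_loop_c(n_colors) / math.sqrt(n5 * beta)

if __name__ == "__main__":
    print(f"c(SU(2)) = {one_loop_c(2):.4f}")
    for beta in (1.609, 1.65, 1.80):
        print(f"beta = {beta}: m_h R (1-loop) = {one_loop_higgs_mass(2, 4, beta):.4f}")
```

The $\beta$ values in the loop are those listed in Table 1 at $N_5 = 4$, so the output can be set against the measured $m_hR$ column.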
4.3. Static potential in the four-dimensional slices

In this section we present simulation data for the static potential in the boundary slice, see Section 3.3, together with the results of various fits. Since we know from the results in Section 4.2 that the gauge boson associated with the U(1) gauge symmetry on the boundary is massive, the physically motivated fits are Yukawa potentials
\[
aV(r) = c_1\exp(-m_\gamma r)/(r/a) + c_0 \quad \text{in four dimensions},
\tag{4.2}
\]
\[
aV(r) = d_1K_1(m_\gamma r)/(r/a) + d_0 \quad \text{in five dimensions},
\tag{4.3}
\]
where $c_0, c_1$ and $d_0, d_1$ are the fit parameters and the photon mass $m_\gamma$ is the one measured in the simulations. In Appendix B we derive the five-dimensional form of Yukawa potentials. It is interesting to compare the Yukawa fits with Coulomb fits
\[
aV(r) = f_1/(r/a) + f_0 \quad \text{in four dimensions},
\tag{4.4}
\]
\[
aV(r) = e_1/(r/a)^2 + e_0 \quad \text{in five dimensions},
\tag{4.5}
\]
where $f_0, f_1$ and $e_0, e_1$ are the fit parameters.

11 Spontaneous symmetry breaking in this context goes back to works by Hosotani [6,7].

In Fig. 8 we present results from a simulation of the orbifold geometry $T/a = 32$, $L/a = 16$ and $N_5 = 4$. The statistics is 20 000 measurements of the potential. The photon mass is $am_\gamma = 0.108(7)$ (for the fits we neglect its error), measured in the simulations described in Section 4.2. In the fits only the points $r/a = 2, \ldots, 8$ are included. The $\chi^2$ per degree of freedom is listed in Table 2. The four-dimensional fits are excluded whereas the five-dimensional ones are very good.
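Because the photon mass is held fixed, each of the four forms Eqs. (4.2)-(4.5) is linear in its two parameters, so the fits reduce to weighted linear least squares. The sketch below illustrates this on synthetic data. It assumes uncorrelated errors (the treatment of correlations in the actual analysis is not described here), takes the five-dimensional Coulomb form to fall like $1/(r/a)^2$, and evaluates $K_1$ from its integral representation with a simple trapezoidal rule. All numerical inputs are hypothetical.

```python
import math

def bessel_k1(x, t_max=12.0, steps=4000):
    """K_1(x) = int_0^inf exp(-x cosh t) cosh t dt (trapezoidal rule).

    Intended for x of order 0.1 or larger, where the integrand is
    negligible at t_max.
    """
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # t = 0 endpoint carries weight 1/2
    for k in range(1, steps + 1):
        t = k * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(t)
    return h * total

# Shapes g(r/a, a*m) such that aV = p1 * g + p0.
SHAPES = {
    "4d Coulomb": lambda r, m: 1.0 / r,
    "4d Yukawa": lambda r, m: math.exp(-m * r) / r,
    "5d Coulomb": lambda r, m: 1.0 / r ** 2,
    "5d Yukawa": lambda r, m: bessel_k1(m * r) / r,
}

def linear_fit(r_values, v_values, errors, shape, mass):
    """Weighted least squares; returns (p1, p0, chi2 per degree of freedom)."""
    g = [shape(r, mass) for r in r_values]
    w = [1.0 / e ** 2 for e in errors]
    s = sum(w)
    sg = sum(wi * gi for wi, gi in zip(w, g))
    sgg = sum(wi * gi * gi for wi, gi in zip(w, g))
    sv = sum(wi * vi for wi, vi in zip(w, v_values))
    sgv = sum(wi * gi * vi for wi, gi, vi in zip(w, g, v_values))
    det = s * sgg - sg * sg
    p1 = (s * sgv - sg * sv) / det
    p0 = (sgg * sv - sg * sgv) / det
    chi2 = sum(wi * (vi - p1 * gi - p0) ** 2 for wi, gi, vi in zip(w, g, v_values))
    return p1, p0, chi2 / (len(r_values) - 2)

# Synthetic data generated from the 5d Coulomb form (hypothetical numbers).
R_VALUES = [2, 3, 4, 5, 6, 7, 8]
ERRORS = [0.01] * len(R_VALUES)
SYNTHETIC = [-0.3 / r ** 2 + 0.5 for r in R_VALUES]

if __name__ == "__main__":
    for name, shape in SHAPES.items():
        p1, p0, chi2_dof = linear_fit(R_VALUES, SYNTHETIC, ERRORS, shape, mass=0.108)
        print(f"{name:10s}  p1 = {p1:+.4f}  p0 = {p0:+.4f}  chi2/dof = {chi2_dof:.3g}")
```

The fit range $r/a = 2, \ldots, 8$ and the value 0.108 for the mass follow the text; the data themselves are synthetic, so the resulting $\chi^2$ values only illustrate how the forms are discriminated.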
The data at $N_5 = 4$ might suffer from the fact that the ratio between the cut-off and the compactification scale is only $R/a = N_5/\pi \approx 1.3$. In Fig. 9 we present results from a simulation of the orbifold geometry $T/a = 32$, $L/a = 16$ and $N_5 = 6$. The statistics is 9000 measurements of the potential. The photon mass is $am_\gamma = 0.173(18)$, taken from a simulation at the same $\beta$ and $N_5$ values but with $T/a = 96$ and $L/a = 12$. In Table 2 we can compare the $\chi^2$ per degree of freedom. We see that at $N_5 = 6$, together with the five-dimensional fits, also the four-dimensional Yukawa form is a good fit. The four-dimensional Coulomb fit is instead excluded. It should be noted that the errors on the potential data are comparable at both values of $N_5$.

Fig. 8. The static potential at $\beta = 1.609$ and $N_5 = 4$ between static charges on the boundary. The five-dimensional fits are clearly favoured.

Table 2
The values of $\chi^2$/(degrees of freedom) for the various fits to the static potential at the boundary at $\beta = 1.609$. A comparison between $N_5 = 4$ and $N_5 = 6$ is made

$N_5$   5d Coulomb   5d Yukawa   4d Coulomb   4d Yukawa
4       0.4          0.6         14           10
6       1.1          1.9         4.2          1.7

Fig. 9. The static potential at $\beta = 1.609$ and $N_5 = 6$ between static charges on the boundary. Besides the five-dimensional fits also a four-dimensional Yukawa fit can describe the data.
We also measured the static potential in the middle slice at $n_5 = 2$. The results of the fits confirm the conclusions from the fits in the boundary slice at $N_5 = 4$, that is, the four-dimensional forms are disfavoured. The $\chi^2$ values are now about 3 for the four-dimensional forms and 0.3 for the five-dimensional forms. At $N_5 = 6$ the statistical errors are too large and all the fits are equally good. As was discussed in Section 3.3 we have to improve the measurements of the static potential, especially for the middle slice. It is important to have precise data for the latter, since any difference with respect to the potential on the boundary might be a hint of localization effects.
4.4. Discussion
A certainly surprising feature of these models emerges when we look at the mass spectrum itself, measured in units of $1/R$. We have seen in Fig. 6 that when the compactification scale is lower than the cut-off scale by a fixed gap (i.e. for $N_5$ fixed), the Higgs ground state mass is consistent with its 1-loop perturbative value not only at large values of $\beta$ but also in the definitely non-perturbative region where $\beta$ is close to its critical value at the phase transition. Moreover, the photon mass also has a mild dependence on $\beta$, see Fig. 7. Since we do not have at present any powerful analytical tools to explain these observations, we will keep our discussion at a qualitative level. We would like to argue that one way to interpret this peculiar phenomenon is by a very mild cut-off dependence of the five-dimensional bare coupling $g_5^2$ in the non-perturbative regime.$^{12}$ The generic situation observed in our simulations, as far as the spectrum is concerned, can be translated into effective field theory language by saying that the theory possesses a region in its parameter space such that any operator $O^{(5+p)}$ of dimension $5 + p$ appearing in the effective action and effective operators$^{13}$ scales as
\[
\frac{Z(g_0, g_4)}{\Lambda^p}\,O^{(5+p)},
\tag{4.6}
\]
with $Z(g_0, g_4)$ a very slowly varying function of the dimensionless bare couplings $g_0 \sim g_5\sqrt{\Lambda}$ and $g_4 \sim g_5/\sqrt{R}$, at least as long as the ratio $\Lambda/(1/R)$ is kept fixed. In the orbifold theory, taking the Higgs mass as an example, we recall that a boundary counterterm is non-perturbatively excluded and thus all possible corrections come from either pure bulk effects or bulk effects with boundary insertions, both of which descend directly from the circle.

12 Note that this is not in contradiction with the perturbative, 1-loop expectation which wants the five-dimensional bare coupling to have a mild cut-off dependence for energy scales $E \gg 1/R$ where the theory is truly five-dimensional (see Eq. (3.13) of [53]).
In an attempt to predict the explicit cut-off dependence in Eq. (4.6), we note that naive dimensional analysis tells us that as $\beta$ decreases with $g_5$ fixed, the cut-off $\Lambda$ increases, see Eq. (2.3). The compactification scale is $1/R$ and a wide separation from the cut-off scale requires $\Lambda \gg 1/R$. Increasing $\beta$ while keeping the gap between the compactification and cut-off scales fixed would require decreasing $\Lambda$ (at fixed $g_5$) and therefore an increase of $R$, which drives the fifth dimension to its decompactification limit. A general lesson then is that for fixed $N_5$, moving towards the large $\beta$ regime is expected to enhance the cut-off effects (appearing as $E/\Lambda$ at low energies $E$ in the sense of an effective action) and decompactify the theory, whereas moving in the opposite direction, i.e. towards smaller $\beta$, is expected to suppress the cut-off effects and drive the system into a compactified but non-perturbative regime. Eventually the phase transition is reached at the critical value $\beta = \beta_c$, where the cut-off reaches its maximal value. There is a possible caveat though in this argument: we have implicitly assumed that it makes sense to vary the cut-off while keeping the dimensionful coupling $g_5$ fixed for all values $\beta_c < \beta < \infty$.
If we look at the spectrum of the excited states of the Higgs and the photon in Figs. 6 and 7, we see that as $\beta$ grows their masses decrease, thus deviating more and more from their expected perturbative value as Kaluza-Klein states of mass $\sim 1/R$. This fact could be explained with our argument that cut-off effects increase at large $\beta$. We cannot at the moment explain why these cut-off effects would affect the excited states but not the ground states.
Next, let us see what happens when we start changing $N_5$ while keeping $\beta$ fixed. The cut-off depends only on $\beta$ and is also fixed. Using Eq. (2.2) we obtain that
\[
\frac{N_5'}{N_5} = \frac{R'}{R},
\tag{4.7}
\]
which implies that increasing $N_5$ amounts to increasing also $R$. Thus, in a compactified scenario we expect to be able to approach an effective dimensional reduction by decreasing $N_5$. It is interesting to note that according to our potential data we observe the opposite effect: the fit to a four-dimensional Yukawa law becomes much better when we increase $N_5$ from 4 to 6 at fixed $\beta$, namely when we decompactify the extra dimension. Therefore, to the extent that this result can be considered as a firm physical property of the system in this region of the parameter space, we are led to the conclusion that the effective dimensional reduction we start observing at $N_5 = 6$ is more likely to be a consequence of a localization mechanism than an effect of compactification. It is possible that the region that corresponds to dimensional reduction from compactification is located at much smaller $N_5$, which would however require an anisotropic lattice to be probed.
Finally, regarding the applicability of the vicinity of the phase transition in model building, we would like to point out that even though it clearly corresponds to a strong coupling regime from the point of view of five dimensions, viewed from the point of view of the four-dimensional effective theory with an effective coupling defined as in Eq. (2.4), it could correspond to a weakly coupled regime, if for example large renormalization factors change Eq. (2.4).

13 This is Symanzik's analysis of cut-off effects.
5. Conclusions
The first results for the spectrum of the orbifold theory simulated on a lattice with an extra
dimension of the size of N5 = 4 lattice spacings, show that there is spontaneous symmetry breaking, which manifests in a massive gauge boson associated with the U (1) boundary gauge group.
This result was unexpected from computations at 1-loop in perturbation theory, where the photon
remains massless. Moreover, the Higgs mass is also measured in a large range of gauge coupling
and it is always close to its perturbative value. The excited state masses for the Higgs and the
photon are not at the scale of the inverse compactification radius, where they are expected to be
in a scenario of dimensional reduction like in finite temperature field theory. But close to the
orbifold phase transition the mass splitting between the ground states and the first excited states
increases. Data for the static potential strongly suggest five-dimensional potential forms, both
when the static potential is measured with the charges on the boundary slice or in the middle
slice along the extra dimension. We have also presented potential data at N5 = 6. They indicate
that a four-dimensional Yukawa form, with the vector boson mass equal to the measured photon
mass, is a good fit to the potential on the boundary together with the five-dimensional forms.
The results we have so far give a consistent picture and they motivate further work to explore the phase diagram of the orbifold theory. Despite its non-renormalizability, the theory makes finite predictions for the mass spectrum and we would like to understand better how to change the lattice parameters to study the scaling properties. This also requires technical improvements, both for the basis of Higgs operators and for the static potential.
Acknowledgements
We thank B. Bunk for his help in the construction of the programming code. We are grateful
to P. de Forcrand, M. Della Morte, P. Hasenfratz, Y. Hosotani, M. Lüscher and U.-J. Wiese
for discussions and helpful suggestions. We thank the Swiss National Supercomputing Centre
(CSCS) in Manno (Switzerland) for allocating computer resources to this project. N. Irges thanks
CERN for hospitality.
Appendix A. Higgs potential from extra dimension(s)
In this appendix we review the calculation of 1-loop potentials in orbifold gauge theories. This
material is already well known in the hep-ph community but since it is perhaps less familiar in
the lattice community, for completeness we reproduce here in detail the perturbative arguments
for the existence or not of spontaneous symmetry breaking, reproducing essentially some of the
results of [31].
Consider a (massive) free field theory for a one-component real scalar field $\phi$ in D dimensions
with action S. After a Euclidean rotation, the path integral that defines the vacuum energy $\Gamma$
is [8]:
$$ e^{-\Gamma} = \int [D\phi]\, e^{-S_E} = \frac{1}{\sqrt{\det[-\Box + M^2]}}, \qquad \Box = \partial_\mu \partial_\mu. \tag{A.1} $$
The mass (M) dependence of the above can be extracted by using the identity
$$ \log(\det A) = -\int_0^\infty \frac{dt}{t}\, \mathrm{tr}\, e^{-tA}. \tag{A.2} $$
We obtain
$$ \Gamma = \log \sqrt{\det[-\Box + M^2]} = -\frac{1}{2}\, \frac{\pi^{D/2}}{(2\pi)^D} \int_0^\infty dt\, t^{-\frac{D+2}{2}}\, e^{-tM^2}. \tag{A.3} $$
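For a fermionic free field we have
$$ e^{-\Gamma} = \det(\mathcal{D}), \qquad \mathcal{D} = \gamma_\mu \partial_\mu + M. \tag{A.4} $$
The non-zero eigenvalues of the Dirac operator come in complex conjugate pairs $\pm i\lambda + M$,
hence
$$ \det(\mathcal{D}) = \sqrt{\det(\mathcal{D}\mathcal{D}^\dagger)} = \sqrt{\det[-\Box + M^2]}. \tag{A.5} $$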
We can therefore summarize the contributions to the effective action, setting D = 4, as
$$ \Gamma = -\frac{1}{32\pi^2} \sum_I (-)^{F_I} \int_0^\infty dl\, l\, e^{-M_I^2/l}, \tag{A.6} $$
where $F_I$ is 0 for bosonic and 1 for fermionic degrees of freedom of mass $M_I$.

Let us apply the above to the case where there are d extra compact toroidal directions. For
the four-dimensional theory, after dimensional reduction, i.e. Kaluza-Klein decomposition, we
obtain the mass formula
$$ M_I^2 = m_I^2 + \sum_{i=1}^{d} \frac{(n_i + a_i^I)^2}{R_i^2}, \tag{A.7} $$
where
$R_i$ are the radii of the d circles of the torus.
$\{n_i\}$, i = 1, . . . , d, are integers.
$m_I^2$ is a (4 + d)-dimensional mass which remains in the $R_i \to \infty$ limit. This is zero for a pure
gauge theory.
The shifts $a_i^I$ originate from the possible failure of periodicity of the (4 + d)-dimensional
field $\Phi^I$:
$$ \Phi^I(x^\mu, y^i + 2\pi k_i R_i) = e^{2\pi i \sum_i k_i a_i^I}\, \Phi^I(x^\mu, y^i), \tag{A.8} $$
where $y^i$ are the coordinates of the circles.
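According to the above, for the $T^d$ torus we have the expression for the potential
$$ V_{\rm eff}^{T^d} = -\sum_I (-)^{F_I} \sum_{\{n_i\}} \frac{1}{32\pi^2} \int_0^\infty dl\, l\, \exp\!\left[-\sum_i \frac{(n_i + a_i^I)^2}{R_i^2\, l}\right]. \tag{A.9} $$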
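By commuting the integral with the sum over $n_i$ and then performing a Poisson resummation
using the formula
$$ \sum_{\{n_i\}} e^{-\pi n^T A n + 2\pi i b^T n} = \frac{1}{\sqrt{\det A}} \sum_{\{m_i\}} e^{-\pi (m-b)^T A^{-1} (m-b)}, \tag{A.10} $$
where A is a d × d (invertible) matrix and n and m are d-dimensional KK number vectors, we
find
$$ V_{\rm eff}^{T^d} = -\sum_I (-)^{F_I}\, \frac{\prod_i R_i}{32\, \pi^{\frac{4-d}{2}}} \sum_{\{n_i\}} e^{2\pi i \sum_i n_i a_i^I} \int_0^\infty dl\, l^{\frac{2+d}{2}}\, e^{-\pi^2 l \sum_i n_i^2 R_i^2}. \tag{A.11} $$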
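The resummation formula (A.10) can be checked numerically in the simplest case d = 1, where A and b are plain numbers. The following is only a sketch under that assumption; the function names and the truncation of the sums are illustrative choices, not part of the original computation.

```python
import math

def poisson_lhs(A, b, nmax=60):
    # Left-hand side of (A.10) for d = 1: sum_n exp(-pi A n^2 + 2 pi i b n).
    # The imaginary parts cancel between n and -n, so only the cosine survives.
    return sum(math.exp(-math.pi * A * n * n) * math.cos(2 * math.pi * b * n)
               for n in range(-nmax, nmax + 1))

def poisson_rhs(A, b, mmax=60):
    # Right-hand side of (A.10) for d = 1: A^(-1/2) sum_m exp(-pi (m - b)^2 / A).
    return sum(math.exp(-math.pi * (m - b) ** 2 / A)
               for m in range(-mmax, mmax + 1)) / math.sqrt(A)

for A, b in [(0.7, 0.3), (2.5, -1.2), (0.2, 0.5)]:
    print(A, b, poisson_lhs(A, b), poisson_rhs(A, b))
```

Both columns should agree to rounding accuracy. The right-hand side converges quickly when A is small, which is the regime where the original KK sum converges slowly.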
The terms with $n_i = 0$ give rise to a divergent contribution to the vacuum energy. They represent
a contribution to the cosmological constant and can be neglected for the present discussion. For
all other non-vanishing vectors n we perform the integral explicitly. This leads to the finite result
$$ V_{\rm eff}^{T^d} = -\sum_I (-)^{F_I} \Big(\prod_i R_i\Big) \frac{\Gamma\!\left(\frac{4+d}{2}\right)}{32\, \pi^{\frac{12+d}{2}}} \sum_{\vec n \neq 0} \frac{e^{2\pi i \sum_i n_i a_i^I}}{\left(\sum_i n_i^2 R_i^2\right)^{\frac{4+d}{2}}}. \tag{A.12} $$
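A.1. 5D SU(2) gauge theory on $S^1/Z_2$
For a 5D theory compactified on the 1-dimensional torus, the circle, we have d = 1. Then
$$ V_{\rm eff}^{S^1} = -\sum_I (-)^{F_I}\, R\, \frac{\Gamma\!\left(\frac{5}{2}\right)}{32\, \pi^{13/2}} \sum_{n \neq 0} \frac{e^{2\pi i n a^I}}{(n^2 R^2)^{5/2}} = -\frac{3}{64 \pi^6 R^4} \sum_I (-)^{F_I} \sum_{n=1}^{\infty} \frac{\cos(2\pi n a^I)}{n^5}. \tag{A.13} $$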
For a pure gauge theory we have in addition that $F_I = 0$. We then finally obtain for the vacuum
energy
$$ V_{\rm eff}^{S^1} = -\frac{3}{64 \pi^6 R^4} \sum_A \sum_{n=1}^{\infty} \frac{\cos(2\pi n a^A)}{n^5}, \tag{A.14} $$
where the index I has been changed to A, indicating the adjoint representation of some gauge
group. This potential may have a minimum that leads to an extra-dimensional version of the
Higgs mechanism.
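As a cross-check of the step from (A.12) to (A.14), the short sketch below evaluates the contribution of a single bosonic mode in both forms. It assumes the overall minus sign and the prefactors $\Gamma(5/2)R/(32\pi^{13/2})$ and $3/(64\pi^6R^4)$ as read from the equations above; the function names and the cutoff `nmax` are hypothetical.

```python
import math

def v_from_a12(a, R, nmax=2000):
    # (A.12) with d = 1 for one bosonic mode: sum over all n != 0.
    # Only the real part of exp(2 pi i n a) survives the symmetric sum.
    pref = math.gamma(2.5) * R / (32 * math.pi ** 6.5)
    s = sum(math.cos(2 * math.pi * n * a) / (n * n * R * R) ** 2.5
            for n in range(-nmax, nmax + 1) if n != 0)
    return -pref * s

def v_circle(a, R, nmax=2000):
    # (A.14) for one bosonic mode: sum over n >= 1 only.
    pref = 3.0 / (64 * math.pi ** 6 * R ** 4)
    return -pref * sum(math.cos(2 * math.pi * n * a) / n ** 5
                       for n in range(1, nmax + 1))

print(v_from_a12(0.3, 1.0), v_circle(0.3, 1.0))
```

The two numbers coincide because $\Gamma(5/2) = 3\sqrt{\pi}/4$ and the sum over $n \neq 0$ is twice the sum over $n \geq 1$. The $1/R^4$ scaling of the potential can be read off in the same way.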
There are different cases where a failure of periodicity can happen and a shift in the KK
formula is generated. One particularly interesting example is the presence of a Wilson line in an
extra-dimensional gauge theory
$$ a^A = q^A g_5 \oint_{S^1} \frac{dy}{2\pi}\, A_5^A, \tag{A.15} $$
where $A_5^A$ is the internal component of the gauge field with gauge coupling $g_5$ and $q^A$ the charge
of the Ath field under the corresponding generator. To be more concrete, let us look at an example. Consider a 5D SU(2) gauge theory compactified on the circle. In order to compute the
vacuum energy, according to Eq. (A.14) the only quantities we need to compute are the $a^A$. To
compute the $a^A$ we have to compute the eigenvalues of the mass-squared operator [31]
$$ -D_M D_M = -D_\mu D_\mu - D_5 D_5 $$
in the vacuum, with $D_M$ acting on fields $F = F^A T^A$ in the adjoint representation as
$$ D_M F = \partial_M F + [B_M, F] \tag{A.16} $$
(in Euclidean space), where
$$ B_M = \langle A_M \rangle. \tag{A.17} $$
$B_M$ is a constant background field given by the vacuum expectation value (vev) of $A_M$.

After orbifolding, which breaks SU(2) → U(1), the only fields that can take a vev are $A_5^1(x^5)$
and $A_5^2(x^5)$. The unbroken U(1) invariance can be used to rotate the vevs in such a way that we
have
$$ B_5^1 = H, \qquad B_5^2 = B_5^3 = 0, \qquad B_5 = -i g_5 B_5^A T^A. \tag{A.18} $$
The non-trivial part comes from the $D_5 D_5$ part of the operator acting on the $x^5$ dependent parts of
the gauge fields. Writing this out explicitly in the vacuum, acting on an adjoint field F,
$$ D_5 D_5 F = \partial_5 \partial_5 F + 2[B_5, \partial_5 F] + [B_5, [B_5, F]], \tag{A.19} $$
gives the operator
$$ (D_5 D_5)_{AB} = \begin{pmatrix} \partial_5 \partial_5 & 0 & 0 \\ 0 & \partial_5 \partial_5 - g_5^2 H^2 & -2 g_5 H \partial_5 \\ 0 & 2 g_5 H \partial_5 & \partial_5 \partial_5 - g_5^2 H^2 \end{pmatrix}. \tag{A.20} $$
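Since this operator does not mix different KK modes we can diagonalize it separately for each
level n.
Eigenvalues of $A_\mu^A$.
The matrix elements of the $D_5 D_5$ operator
$$ \langle f | D_5 D_5 | g \rangle = \int_0^{2\pi R} dx^5\, f(x^5)\, (D_5 D_5)\, g(x^5) \tag{A.21} $$
can be easily obtained by using the basis of orthonormal functions
$$ \frac{1}{\sqrt{\pi R}} \cos\!\left(\frac{n}{R} x^5\right), \quad \text{for } A, B = 3, \tag{A.22} $$
since $A_\mu^3$ is even under the orbifold, and
$$ \frac{1}{\sqrt{\pi R}} \sin\!\left(\frac{n}{R} x^5\right), \quad \text{for } A, B = 1, 2, \tag{A.23} $$
since $A_\mu^{1,2}$ are odd under the orbifold. The matrix we obtain for $n \neq 0$ is
$$ A_\mu^{A\,(n \neq 0)}: \quad \begin{pmatrix} \frac{n^2}{R^2} & 0 & 0 \\ 0 & \frac{n^2}{R^2} + \frac{\alpha^2}{R^2} & \frac{2 n \alpha}{R^2} \\ 0 & \frac{2 n \alpha}{R^2} & \frac{n^2}{R^2} + \frac{\alpha^2}{R^2} \end{pmatrix} \ \longrightarrow\ \frac{n^2}{R^2},\ \frac{(|n| + \alpha)^2}{R^2},\ \frac{(|n| - \alpha)^2}{R^2}, \tag{A.24} $$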
where
$$ \alpha = g_5 H R \tag{A.25} $$
and on the right we have given the eigenvalues. For n = 0, the $A_\mu^{1,2}$ do not have zero modes
but $A_\mu^3$ does have a zero mode whose eigenvalue is
$$ A_\mu^{3,(0)}: \quad \frac{\alpha^2}{R^2}. \tag{A.26} $$
Eigenvalues of ghosts.
We notice that the ghosts have the same $Z_2$ parity assignment as the gauge fields $A_\mu^A$; therefore the effect of the ghosts is just to reduce the number of polarization degrees of freedom as 4 → 2.
Eigenvalues of $A_5^A$.
Since the $Z_2$ parities of the $A_5^A$ are the opposite of those of $A_\mu^A$, we will obtain
$$ A_5^{1,2,(0)}: \quad \frac{\alpha^2}{R^2} \tag{A.27} $$
and
$$ A_5^{A\,(n \neq 0)}: \quad \frac{n^2}{R^2},\ \frac{(|n| + \alpha)^2}{R^2},\ \frac{(|n| - \alpha)^2}{R^2}. \tag{A.28} $$
By comparing with Eq. (A.7) we see that the non-zero shifts $a^I$ are given by $\pm\alpha$ in Eqs. (A.24)
and (A.28). As we have argued, only the non-zero modes are relevant for the vacuum energy. We
count 2 + 1 (2 from physical degrees of polarization of $A_\mu$, 1 from $A_5$) eigenvalues $(n + \alpha)^2/R^2$
(with $a^I = \alpha$) and an equal number of eigenvalues $(n - \alpha)^2/R^2$ (with $a^I = -\alpha$). Furthermore, as
can be seen from Eq. (A.14), the $a^I = -\alpha$ contribution is the same as the $a^I = +\alpha$ contribution.
We then obtain for the vacuum energy the final result
$$ V_{\rm eff}^{S^1}\Big|_{SU(2)} = -\frac{3 \cdot 3 \cdot 2}{64 \pi^6 R^4} \sum_{n=1}^{\infty} \frac{\cos(2\pi n \alpha)}{n^5}. \tag{A.29} $$
This potential could, in principle, break the remaining U(1) down to nothing on the branes. Plotting the potential one can see that it has two degenerate minima at $\alpha = 0$ and $\alpha = 1$, neither of
which breaks U(1).
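The location of the minima of (A.29) can be confirmed by a simple scan. The sketch below assumes the form $V = -\frac{3\cdot 3\cdot 2}{64\pi^6R^4}\sum_n \cos(2\pi n\alpha)/n^5$ with an overall minus sign; the grid spacing, the cutoff `nmax` and the function name are illustrative.

```python
import math

def v_su2(alpha, R=1.0, nmax=500):
    # One-loop potential (A.29), truncated at nmax KK levels.
    pref = 3 * 3 * 2 / (64 * math.pi ** 6 * R ** 4)
    return -pref * sum(math.cos(2 * math.pi * n * alpha) / n ** 5
                       for n in range(1, nmax + 1))

grid = [i / 200 for i in range(201)]          # alpha in [0, 1]
values = [v_su2(a) for a in grid]
vmin = min(values)
minima = [a for a, v in zip(grid, values) if abs(v - vmin) < 1e-12]
print(minima)
```

On this grid the lowest value is reached only at the two endpoints $\alpha = 0$ and $\alpha = 1$, while $\alpha = 1/2$ is a maximum.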
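A similar computation gives for the SU(3) → SU(2) × U(1) model the potential
$$ V_{\rm eff}^{S^1}\Big|_{SU(3)} = -\frac{3 \cdot 3 \cdot 2}{64 \pi^6 R^4} \sum_{n=1}^{\infty} \frac{1}{n^5} \left[ \cos(2\pi n \alpha) + 2 \cos(\pi n \alpha) \right]. \tag{A.30} $$
The latter can be seen by noticing that the eigenvalues for the adjoint of SU(3) are
$$ \frac{n^2}{R^2}, \qquad \frac{(n \pm \alpha)^2}{R^2}, \qquad 2 \times \frac{(n \pm \frac{\alpha}{2})^2}{R^2}. \tag{A.31} $$
Again, the minimum is at $\alpha = 0$, which does not break SU(2) × U(1).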
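The same kind of scan can be applied to the SU(3) potential (A.30). This sketch assumes the form $V = -\frac{3\cdot 3\cdot 2}{64\pi^6R^4}\sum_n [\cos(2\pi n\alpha) + 2\cos(\pi n\alpha)]/n^5$; since the $\cos(\pi n\alpha)$ term has period 2 in $\alpha$, the scan covers the interval [0, 2]. Names and cutoffs are hypothetical.

```python
import math

def v_su3(alpha, R=1.0, nmax=500):
    # One-loop potential (A.30), truncated at nmax KK levels.
    pref = 3 * 3 * 2 / (64 * math.pi ** 6 * R ** 4)
    return -pref * sum(
        (math.cos(2 * math.pi * n * alpha) + 2 * math.cos(math.pi * n * alpha)) / n ** 5
        for n in range(1, nmax + 1))

interior = [i / 100 for i in range(1, 200)]   # alpha in (0, 2)
v0 = v_su3(0.0)
print(v0, v_su3(1.0), min(v_su3(a) for a in interior))
```

In this scan every interior point lies above the value at $\alpha = 0$ (which is repeated at $\alpha = 2$), and $\alpha = 1$ comes out with a positive potential value.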
In fact, in some cases a useful criterion to see if it is possible to break the symmetry by
the Hosotani mechanism is to use the statement proven in [11]: The Hosotani mechanism does
not reduce the rank of H if the symmetry breaking global minimum is at $\alpha = 1$. To show this
statement, we can compute the Wilson line due to the vev $H = \frac{1}{g_5 R}$ of a scalar along the $T^A$
direction:
$$ W = P \exp\!\left( i g_5 \oint dx^5\, H\, T^A \right) = e^{2\pi i T^A}. \tag{A.32} $$
It is straightforward to show that $\exp(2\pi i T^A)$ is a diagonal matrix. In the case of SU(2):
$\exp(i\pi\sigma^1) = -1_2 = \exp(i\pi\sigma^2)$. For SU(N), the non-diagonal generators are obtained by embedding $\sigma^{1,2}$. Thus, we always have that
$$ [W, H_i] = 0, \tag{A.33} $$
where $H_i$ are the generators corresponding to the Cartan subalgebra of G, i.e. the Wilson
loop commutes with at least those generators and therefore it leaves at least a $U(1)_1 \times \cdots \times
U(1)_{{\rm rank}(H)}$ unbroken. From this it is clear that in the SU(2) model, U(1) cannot break further
with $\alpha = 1$. For the SU(3) model $\alpha = 1$ could break SU(2) × U(1) down to U(1) × U(1) by the
Hosotani mechanism. In both cases, it is not clear what $\alpha \neq 1$ would do.
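The SU(2) statement $\exp(i\pi\sigma^1) = -1_2 = \exp(i\pi\sigma^2)$ is easy to confirm numerically. The sketch below assumes $T^A = \sigma^A/2$ and $W = \exp(2\pi i \alpha T^A)$, and uses a truncated Taylor series for the matrix exponential; all helper names are hypothetical. The value $\alpha = 1/2$ is included only as a contrasting illustration.

```python
import math

SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(m, terms=60):
    # Truncated Taylor series sum_k m^k / k!, adequate for 2x2 matrices of norm ~ pi.
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def wilson_line(sigma, alpha):
    # W = exp(2 pi i alpha T) with T = sigma / 2 (assumed normalization).
    return expm([[1j * math.pi * alpha * x for x in row] for row in sigma])

def commutator_norm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - ba[i][j]) for i in range(2) for j in range(2))

w1 = wilson_line(SIGMA1, 1.0)
w2 = wilson_line(SIGMA2, 1.0)
print(w1, commutator_norm(w1, SIGMA3))
```

At $\alpha = 1$ the Wilson line is proportional to the identity and commutes with the Cartan generator $\sigma^3$. At $\alpha = 1/2$ it equals $i\sigma^1$ and does not.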
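Appendix B. The Yukawa potential in 5D
The potential is given by the integral
$$ V(x) = \int \frac{d^4 q}{(2\pi)^4}\, \frac{e^{i q \cdot x}}{q^2 + m^2}, \tag{B.1} $$
where $|x| = r$, $q = |q|\hat{q}$, $|\hat{q}| = 1$ and
$$ \hat{q}_1 = \cos\theta_1, \quad \hat{q}_2 = \sin\theta_1 \cos\theta_2, \quad \hat{q}_3 = \sin\theta_1 \sin\theta_2 \cos\phi, \quad \hat{q}_4 = \sin\theta_1 \sin\theta_2 \sin\phi, \tag{B.2} $$
where $0 \le \theta_j \le \pi$, $0 \le \phi \le 2\pi$. Then the potential can be written as
$$ V(x) = \frac{1}{(2\pi)^4} \int_0^\infty d|q|\, |q|^3 \int_0^{2\pi} d\phi \int_0^{\pi} d\theta_2\, \sin\theta_2 \int_0^{\pi} d\theta_1\, \sin^2\theta_1\, \frac{e^{i q \cdot x}}{q^2 + m^2}. \tag{B.3} $$
Choosing coordinates such that $x = r e_1$ with $e_1$ a unit vector, we have that $q \cdot x = |q| r \cos\theta_1$
and then
$$ V(x) = \frac{2\pi}{(2\pi)^4} \int_0^\infty d|q|\, \frac{|q|^3}{|q|^2 + m^2} \int_0^{\pi} d\theta_2\, \sin\theta_2 \int_0^{\pi} d\theta_1\, \sin^2\theta_1\, e^{i |q| r \cos\theta_1}. \tag{B.4} $$
Performing the angular integrals we obtain
$$ V(x) = \frac{1}{(2\pi)^2}\, \frac{1}{r} \int_0^\infty d|q|\, \frac{|q|^2}{|q|^2 + m^2}\, J_1(|q| r). \tag{B.5} $$
Finally, changing variables as $y = |q| r$ one has the left-over radial integral
$$ V(x) = \frac{1}{(2\pi)^2 r^2} \int_0^\infty dy\, \frac{y^2}{y^2 + (mr)^2}\, J_1(y), \tag{B.6} $$
which can be done, yielding the result
$$ V(x) = \frac{1}{(2\pi)^2 r^2}\, (mr) K_1(mr). \tag{B.7} $$
As $r \to 0$, the Bessel function $(mr) K_1(mr) \to 1$, so we have at short distances
$$ r \to 0: \quad V(x) = \frac{1}{(2\pi)^2 r^2}. \tag{B.8} $$
For large r on the other hand we have
$$ r \to \infty: \quad V(x) = \sqrt{\frac{\pi m}{2}}\, \frac{1}{(2\pi)^2}\, \frac{e^{-mr}}{r^{3/2}}. \tag{B.9} $$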
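The two limits (B.8) and (B.9) can be compared with the full expression (B.7) numerically. The sketch below evaluates $K_1$ from the standard integral representation $K_1(x) = \int_0^\infty e^{-x\cosh t}\cosh t\, dt$ with a plain trapezoidal rule, so that only the standard library is needed; the truncation point, step count and function names are illustrative assumptions.

```python
import math

def bessel_k1(x, steps=20000):
    # K_1(x) = int_0^inf exp(-x cosh t) cosh t dt, truncated where the
    # integrand is negligible and integrated with the trapezoidal rule.
    tmax = math.asinh(60.0 / x + 1.0)
    h = tmax / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(tmax)) * math.cosh(tmax))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(t)
    return h * total

def yukawa_5d(r, m):
    # Full result (B.7).
    return m * r * bessel_k1(m * r) / ((2 * math.pi) ** 2 * r ** 2)

def short_distance(r):
    # Limit (B.8).
    return 1.0 / ((2 * math.pi) ** 2 * r ** 2)

def long_distance(r, m):
    # Limit (B.9).
    return math.sqrt(math.pi * m / 2) * math.exp(-m * r) / ((2 * math.pi) ** 2 * r ** 1.5)

ratio_short = yukawa_5d(1e-3, 1.0) / short_distance(1e-3)
ratio_long = yukawa_5d(20.0, 1.0) / long_distance(20.0, 1.0)
print(ratio_short, ratio_long)
```

Both ratios approach 1 in their respective limits. At $mr = 20$ the long-distance form is still off by a few percent, reflecting the $1 + 3/(8mr)$ correction in the large-argument expansion of $K_1$.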
References
[1] S.R. Coleman, E. Weinberg, Phys. Rev. D 7 (1973) 1888.
[2] D.B. Fairlie, Phys. Lett. B 82 (1979) 97.
[3] D.B. Fairlie, J. Phys. G 5 (1979) L55.
[4] N.S. Manton, Nucl. Phys. B 158 (1979) 141.
[5] P. Forgacs, N.S. Manton, Commun. Math. Phys. 72 (1980) 15.
[6] Y. Hosotani, Phys. Lett. B 126 (1983) 309.
[7] Y. Hosotani, Ann. Phys. 190 (1989) 233.
[8] I. Antoniadis, K. Benakli, M. Quiros, New J. Phys. 3 (2001) 20, hep-th/0108005.
[9] G.R. Dvali, S. Randjbar-Daemi, R. Tabbash, Phys. Rev. D 65 (2002) 064021, hep-ph/0102307.
[10] N. Arkani-Hamed, A.G. Cohen, H. Georgi, Phys. Lett. B 513 (2001) 232, hep-ph/0105239.
[11] G. von Gersdorff, N. Irges, M. Quiros, Nucl. Phys. B 635 (2002) 127, hep-th/0204223.
[12] H.C. Cheng, K.T. Matchev, M. Schmaltz, Phys. Rev. D 66 (2002) 036005, hep-ph/0204342.
[13] N. Arkani-Hamed, L.J. Hall, Y. Nomura, D.R. Smith, N. Weiner, Nucl. Phys. B 605 (2001) 81, hep-ph/0102090.
[14] A. Masiero, C.A. Scrucca, M. Serone, L. Silvestrini, Phys. Rev. Lett. 87 (2001) 251601, hep-ph/0107201.
[15] G. von Gersdorff, N. Irges, M. Quiros, hep-ph/0206029.
[16] G. von Gersdorff, N. Irges, M. Quiros, Phys. Lett. B 551 (2003) 351, hep-ph/0210134.
[17] N. Irges, F. Knechtli, Nucl. Phys. B 719 (2005) 121, hep-lat/0411018.
[18] G. Martinelli, M. Salvatori, C.A. Scrucca, L. Silvestrini, hep-ph/0503179.
[19] C.S. Lim, N. Maru, K. Hasegawa, hep-th/0605180.
[20] Y. Hosotani, hep-ph/0607064.
[21] G. von Gersdorff, A. Hebecker, Nucl. Phys. B 720 (2005) 211, hep-th/0504002.
[22] F. Knechtli, B. Bunk, N. Irges, PoS LAT2005 (2005) 280, hep-lat/0509071.
[23] N. Irges, F. Knechtli, hep-lat/0604006.
[24] Y.K. Fu, H.B. Nielsen, Nucl. Phys. B 236 (1984) 167.
[25] P. Dimopoulos, K. Farakos, S. Vrentzos, hep-lat/0607033.
[26] G.R. Dvali, M.A. Shifman, Phys. Lett. B 396 (1997) 64, hep-th/9612128.
[27] M. Laine, H.B. Meyer, K. Rummukainen, M. Shaposhnikov, JHEP 0404 (2004) 027, hep-ph/0404058.
[28] S. Chandrasekharan, U.J. Wiese, Nucl. Phys. B 492 (1997) 455, hep-lat/9609042.
[29] R. Brower, S. Chandrasekharan, U.J. Wiese, Phys. Rev. D 60 (1999) 094502, hep-th/9704106.
[30] B. Schlittgen, U.J. Wiese, Phys. Rev. D 63 (2001) 085007, hep-lat/0012014.
[31] M. Kubo, C.S. Lim, H. Yamashita, Mod. Phys. Lett. A 17 (2002) 2249, hep-ph/0111327.
[32] C.A. Scrucca, M. Serone, L. Silvestrini, Nucl. Phys. B 669 (2003) 128, hep-ph/0304220.
[33] G. Panico, M. Serone, A. Wulzer, Nucl. Phys. B 739 (2006) 186, hep-ph/0510373.
[34] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, fourth ed., International Series of Monographs on Physics, vol. 113, Clarendon Press, Oxford, 2002.
[35] I. Montvay, Phys. Lett. B 150 (1985) 441.
[36] S. Ejiri, J. Kubo, M. Murata, Phys. Rev. D 62 (2000) 105025, hep-ph/0006217.
[37] G. Arnold, B. Bunk, T. Lippert, K. Schilling, Nucl. Phys. B (Proc. Suppl.) 119 (2003) 864, hep-lat/0210010.
[38] E.H. Fradkin, S.H. Shenker, Phys. Rev. D 19 (1979) 3682.
[39] K. Osterwalder, E. Seiler, Ann. Phys. 110 (1978) 440.
[40] K. Jansen, J. Jersak, C.B. Lang, T. Neuhaus, G. Vones, Phys. Lett. B 155 (1985) 268.
[41] K. Symanzik, Mathematical problems in theoretical physics, in: R. Schrader, et al. (Eds.), 6th Int. Conf. on Mathematical Physics, Berlin, West Germany, 11-21 August 1981, in: Lecture Notes in Physics, vol. 153, 1982, p. 47.
[42] K. Symanzik, Nucl. Phys. B 226 (1983) 187.
[43] K. Symanzik, Nucl. Phys. B 226 (1983) 205.
[44] M. Lüscher, hep-lat/9802029.
[45] M. Lüscher, U. Wolff, Nucl. Phys. B 339 (1990) 222.
[46] I. Montvay, P. Weisz, Nucl. Phys. B 290 (1987) 327.
[47] F. Knechtli, hep-lat/9910044.
[48] Ape Collaboration, M. Albanese, et al., Phys. Lett. B 192 (1987) 163.
[49] M. Creutz, Phys. Rev. Lett. 43 (1979) 553.
[50] B.B. Beard, et al., Nucl. Phys. B (Proc. Suppl.) 63 (1998) 775, hep-lat/9709120.
[51] L.D. McLerran, B. Svetitsky, Phys. Lett. B 98 (1981) 195.
[52] J. Kuti, J. Polonyi, K. Szlachanyi, Phys. Lett. B 98 (1981) 199.
[53] K.R. Dienes, E. Dudas, T. Gherghetta, Nucl. Phys. B 537 (1999) 47, hep-ph/9806292.
On the SU(2|1) WZNW model and its statistical mechanics applications
Hubert Saleur a,c,*, Volker Schomerus b
a Service de Physique Théorique, CEA Saclay, F-91191 Gif-sur-Yvette, France
b DESY Theory Group, DESY Hamburg, Notkestrasse 85, D-22603 Hamburg, Germany
c Physics Department, University of Southern California, Los Angeles, CA 90089-0484, USA
Received 27 November 2006; accepted 28 February 2007
Available online 21 March 2007
Abstract
Motivated by a careful analysis of the Laplacian on the supergroup SU(2|1) we formulate a proposal for
the state space of the SU(2|1) WZNW model. We then use properties of $\widehat{sl}(2|1)$ characters to compute the
partition function of the theory. In the special case of level k = 1 the latter is found to agree with the properly
regularized partition function for the continuum limit of the integrable sl(2|1) $3$-$\bar{3}$ super-spin chain. Some
general conclusions applicable to other WZNW models (in particular the case k = −1/2) are also drawn.
1. Introduction
The SU(2|1) WZNW model is a key example of the sigma models with supergroup targets that appear in the supersymmetric description of non-interacting disordered systems in low
dimensional statistical mechanics. The first occurrence of this model probably arose via a supersymmetrization of the path integral for two copies of the two-dimensional critical Ising model.
* Corresponding author at: Service de Physique Théorique, CEA Saclay, F-91191 Gif-sur-Yvette, France.
E-mail addresses: hubert.saleur@cea.fr (H. Saleur), volker.schomerus@desy.de (V. Schomerus).
doi:10.1016/j.nuclphysb.2007.02.031
H. Saleur, V. Schomerus / Nuclear Physics B 775 [FS] (2007) 312-340

It was shown in [1] (and, independently, in [2]) how a βγ system (with central charge c = −1)
could be introduced to cancel out the pair of free Majorana fermions (regrouped for convenience
into a Dirac fermion) path integrals

Z=
d d d d exp[S0 + S] = 1
where
S0 =
and

S =
(1.1)

d 2x
+

+ +
2
(1.2)

d 2 x m(x)
.
i
+
2
2
(1.3)
The theory without random mass, m(x) = 0, is obviously a free OSP(2|2) theory, which can be
considered as a SU(2|1) WZNW model at level k = −1/2.¹ Averaging over disorder produces
a marginally irrelevant current-current perturbation of this WZNW model. This is crucial to
understanding the (logarithmic) corrections to pure Ising model scaling. The deep infrared (IR)
behavior however is not changed by the disorder, which corresponds to the fact that (1.2) is a
simple free theory, with pure fermionic correlators identical to those of the usual Ising model.
The second occurrence of the SU(2|1) model is more involved. It arises in the study of (2 + 1)-dimensional spin-full electrons in the presence of a random (non-Abelian) gauge potential. The
supersymmetrization of the path integral for two copies of the Dirac fermions produces a free
OSP(4|4) theory which has been argued to flow to the SU(2|1) WZNW model at level one
under the action of the disorder [3]. The nature of the spectrum and correlation functions plays an
important role in the description of the electronic wave functions at that fixed point.
Previous works on the SU(2|1) model have focused on some correlation functions [4,5] and
on the construction of some characters [6], but a complete picture of the theory has been missing.
Indeed, the analysis of WZNW on supergroups is notoriously difficult, even for the simplest
case of GL(1|1) [7]. In a recent paper [8], we have shown how a careful study of the particle limit
(in particular, of the simultaneous left and right invariant actions on the space of functions on the
group) could provide considerable insight into this problem. Combining this insight with some
additional input from the representation theory of the current algebras allowed us to formulate
a complete proposal for the state space of the theory in the case of GL(1|1). The latter involves
a rather intricate mixing of left and right movers that is intimately related to the representation
theory of Lie superalgebras, in particular to the importance of indecomposable representations.
We were then able to check this proposal through an exact construction of the theory in the
continuum formulation.
The aim of this work is to extend the lessons we have learned in [8] to a non-Abelian setup,
using SU(2|1) as the simplest non-trivial example.² Once more, the analysis of the particle limit
(Section 2) along with some input from the representation theory of the sl(2|1) current algebra
(Section 3) shall provide all the necessary ingredients for the construction of the field theory
state space (Section 4), in close analogy to our previous investigation of the GL(1|1) model. In
the present case we shall not attempt to verify the structure of the state space through calculations
of correlators, though this would be possible as well (see [9]). Instead we shall use results on an
integrable sl(2|1) spin chain to test our continuum constructions. Such a spin chain was first
investigated in [10] as a discrete version of the SU(2|1) WZNW model. We shall see that both
approaches are consistent. The comparison, however, is a bit subtle, mainly due to the fact that
the supergroup SU(2|1) has an indefinite metric. While this poses no problem for the (algebraic)
conformal field theory analysis, the computation of the partition function on the lattice suffers
from divergences which need to be regularized. We shall do this through some appropriate analytic continuation. In this sense, our analysis also supports a particular prescription for extracting
information from spin chains with an indefinite metric.

1 Our conventions are such that the sub SU(2) algebra has level k. In part of the literature, the level is defined as
−2 times ours, so the free system in (1.1) has k = 1 there.
2 To be more precise, we shall consider the universal cover of SU(2|1) in which the Abelian, time-like circle is replaced
by the real line. We shall comment on this in much more detail in Section 4.
2. The minisuperspace analysis
The aim of this section is to decompose the space of functions on the supergroup SU(2|1)
into (generalized) eigenfunctions of the quadratic Casimir element in the regular representations.
Since the Casimir commutes with the generators, the eigenspaces may be decomposed into representations of the Lie superalgebra sl(2|1). It is therefore useful to have some background on the
representation theory of sl(2|1). We shall review a few known facts below before addressing the
harmonic analysis. More details can be found e.g. in [11,12].
2.1. The Lie superalgebra sl(2|1)
In this subsection we provide a short overview on finite dimensional representations of sl(2|1).
Rather than reproducing a complete list of such representations we shall focus on those that are
relevant below, namely on Kac modules and the projective covers of atypicals.
2.1.1. The defining relations of sl(2|1)
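The even part $g^{(0)} = gl(1) \oplus sl(2)$ of the Lie superalgebra g = sl(2|1) is generated by four
bosonic elements $H$, $E^\pm$ and $B$ which obey the commutation relations
$$ [H, E^\pm] = \pm E^\pm, \qquad [E^+, E^-] = 2H, \qquad [B, E^\pm] = [B, H] = 0. \tag{2.1} $$
In addition, there exist two fermionic multiplets $(F^+, F^-)$ and $(\bar{F}^+, \bar{F}^-)$ which generate the
odd part $g^{(1)}$. They transform as $(\pm\frac{1}{2}, \frac{1}{2})$ with respect to the even subalgebra, i.e.
$$ [H, F^\pm] = \pm\tfrac{1}{2} F^\pm, \qquad [H, \bar{F}^\pm] = \pm\tfrac{1}{2} \bar{F}^\pm, $$
$$ [E^\pm, F^\mp] = -F^\pm, \qquad [E^\pm, F^\pm] = [E^\pm, \bar{F}^\pm] = 0, \qquad [E^\pm, \bar{F}^\mp] = \bar{F}^\pm, $$
$$ [B, F^\pm] = \tfrac{1}{2} F^\pm, \qquad [B, \bar{F}^\pm] = -\tfrac{1}{2} \bar{F}^\pm. \tag{2.2} $$
Finally, the fermionic elements possess the following simple anti-commutation relations
$$ \{F^\pm, F^\mp\} = \{\bar{F}^\pm, \bar{F}^\mp\} = 0, \qquad \{F^\pm, \bar{F}^\pm\} = E^\pm, \qquad \{F^\pm, \bar{F}^\mp\} = B \mp H \tag{2.3} $$
among each other. Formulas (2.1)-(2.3) provide a complete list of relations in the Lie superalgebra sl(2|1).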
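The relations (2.1)-(2.3) can be checked in an explicit matrix realization. The sketch below uses 3 × 3 supermatrices in which the first two basis vectors are even and the third is odd. This particular realization, the sign conventions written out in the comments, and all helper names are assumptions made for illustration; they are not taken from the text.

```python
from fractions import Fraction

def unit(i, j):
    # Elementary 3x3 matrix with a single 1 in row i, column j.
    return [[Fraction(int((r, c) == (i, j))) for c in range(3)] for r in range(3)]

def add(a, b, sign=1):
    return [[a[r][c] + sign * b[r][c] for c in range(3)] for r in range(3)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def comm(a, b):
    return add(mul(a, b), mul(b, a), -1)

def anti(a, b):
    return add(mul(a, b), mul(b, a))

half = Fraction(1, 2)
ZERO = scale(0, unit(0, 0))
Ep, Em = unit(0, 1), unit(1, 0)
H = scale(half, add(unit(0, 0), unit(1, 1), -1))
B = add(scale(half, add(unit(0, 0), unit(1, 1))), unit(2, 2))   # supertraceless
Fp, Fm = unit(2, 1), unit(2, 0)        # F^+, F^-
Fbp, Fbm = unit(0, 2), unit(1, 2)      # Fbar^+, Fbar^-

checks = [
    comm(H, Ep) == Ep, comm(H, Em) == scale(-1, Em), comm(Ep, Em) == scale(2, H),
    comm(B, Ep) == ZERO, comm(B, Em) == ZERO, comm(B, H) == ZERO,
    comm(H, Fp) == scale(half, Fp), comm(H, Fm) == scale(-half, Fm),
    comm(H, Fbp) == scale(half, Fbp), comm(H, Fbm) == scale(-half, Fbm),
    comm(Ep, Fm) == scale(-1, Fp), comm(Em, Fp) == scale(-1, Fm),
    comm(Ep, Fbm) == Fbp, comm(Em, Fbp) == Fbm,
    comm(Ep, Fp) == ZERO, comm(Ep, Fbp) == ZERO, comm(Em, Fm) == ZERO, comm(Em, Fbm) == ZERO,
    comm(B, Fp) == scale(half, Fp), comm(B, Fm) == scale(half, Fm),
    comm(B, Fbp) == scale(-half, Fbp), comm(B, Fbm) == scale(-half, Fbm),
    anti(Fp, Fm) == ZERO, anti(Fbp, Fbm) == ZERO,
    anti(Fp, Fbp) == Ep, anti(Fm, Fbm) == Em,
    anti(Fp, Fbm) == add(B, H, -1), anti(Fm, Fbp) == add(B, H),
]
print(all(checks))
```

In this realization all listed brackets hold, which shows at least that the sign choices in (2.1)-(2.3) as written above are mutually consistent.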
2.1.2. Kac modules and irreducible representations

Kac modules [13] are the basic tool in the construction of irreducible representations. In the
case of g = sl(2|1), these form a 2-parameter family $\{b, j\}$ of 8j-dimensional representations.
We may induce them from the 2j-dimensional representations $(b - \frac{1}{2}, j - \frac{1}{2})$ of the bosonic
subalgebra $g^{(0)}$ by applying the pair $F^\pm$ of fermionic elements. Our label $b \in \mathbb{C}$ denotes a gl(1)-charge and spins of sl(2) are labeled by $j = \frac{1}{2}, 1, \ldots$. The dual construction, which promotes the
fermionic generators $\bar{F}^\pm$ to creation operators, yields anti-Kac modules $\overline{\{b, j\}}$ (b and j take the
same values as above). The bosonic content of (anti-)Kac modules may be read off rather easily
from their construction,
$$ \{b, j\}\big|_{g^{(0)}} \cong \overline{\{b, j\}}\big|_{g^{(0)}} \cong \left(b - \tfrac{1}{2}, j - \tfrac{1}{2}\right) \oplus (b, j) \oplus (b, j - 1) \oplus \left(b + \tfrac{1}{2}, j - \tfrac{1}{2}\right). \tag{2.4} $$
For generic values of b and j, the modules $\{b, j\}$ and $\overline{\{b, j\}}$ are irreducible and isomorphic. At
the points $\pm b = j$, however, they degenerate, i.e. the representations are indecomposable and no
longer isomorphic. In fact, Kac and anti-Kac modules are then easily seen to possess different
invariant subspaces. To be more precise, the (anti-)Kac modules $\{\pm j, j\}$ and $\overline{\{\pm j, j\}}$ are built
from two atypical representations such that
$$ \{\pm j, j\}: \ \{j\}_\pm \longrightarrow \{j - \tfrac{1}{2}\}_\pm, \qquad \overline{\{\pm j, j\}}: \ \{j - \tfrac{1}{2}\}_\pm \longrightarrow \{j\}_\pm. \tag{2.5} $$
The atypical irreducible representations $\{j\}_\pm$ that appear in these small diagrams are (4j + 1)-dimensional. With respect to the even subalgebra they decompose according to
$$ \{j\}_\pm\big|_{g^{(0)}} \cong \begin{cases} (j, j) \oplus (j + \tfrac{1}{2}, j - \tfrac{1}{2}), & \text{for } + \text{ and } j = \tfrac{1}{2}, 1, \ldots, \\ (-j, j) \oplus (-(j + \tfrac{1}{2}), j - \tfrac{1}{2}), & \text{for } - \text{ and } j = \tfrac{1}{2}, 1, \ldots. \end{cases} \tag{2.6} $$
For j = 0, only the trivial representation (0) occurs. It is also useful to introduce the characters
of these representations. By definition, these are obtained as
$$ \chi_R(z, \xi) = \mathrm{str}_R\left( \xi^B z^H \right), $$
where the super-trace extends over all states in the representation R of sl(2|1). For Kac modules
the character is rather simple. In fact, it factorizes,
$$ \chi_{\{b, j\}}(\xi, z) = \xi^{b - 1/2}\, \chi_F(\xi, z) \sum_{l = -j + 1/2}^{j - 1/2} z^l, $$
with a fermionic contribution $\chi_F$ that is independent of the Kac module under consideration,
$$ \chi_F(\xi, z) = 1 - \xi^{1/2} z^{1/2} - \xi^{1/2} z^{-1/2} + \xi. $$
The characters of atypical representations can be obtained easily from their decomposition formulas (2.6). We would like to pursue a rather different route here that uses the decomposition
(2.5) of Kac modules into atypicals. The first formula implies that
{j,j } (, z) = {j } (, z) {j 1/2} (, z).
(2.7)
Fig. 1. A graphical illustration of how characters for an atypical representation can be obtained as an infinite sum of
characters of Kac modules. Here the one-dimensional atypical identity 0 appears as a sum over {1/2, 1/2} (thin lines and
dots), {1, 1} (medium lines and dots), {3/2, 3/2} (thick lines and dots) etc. All spurious contributions (that is, the whole
tower but the origin) appear twice, and they disappear by cancellations of bosonic and fermionic degrees of freedom. The
diagram corresponds to the choice of plus sign in formula (2.8).
We can solve these equations for the characters of atypical representations by the following
infinite sums
{j 1/2} (, z) =
{j n/2,j +n/2} (, z).
(2.8)
n=0
One may check by explicit computation that the contributions from all but two bosonic multiplets cancel each other in the infinite sum through a mechanism that is visualized in Fig. 1. The
remaining two terms certainly agree with the decomposition formulas (2.6). Our derivation here
may seem like a rather complicated path for such a simple result, but we shall see later that the
same trick works for characters of atypical affine representations which are otherwise difficult to
obtain.
2.1.3. Projective covers of atypical irreducible modules
By definition, the projective cover of a representation $\{j\}_\pm$ is the largest indecomposable representation $P^\pm(j)$ which has $\{j\}_\pm$ as a subrepresentation (its socle). We do not want to construct
these representations explicitly here. Instead, we shall display how they are composed from atypicals. The projective cover of the trivial representation is an 8-dimensional module of the form
$$ P(0): \ \{0\} \longrightarrow \{\tfrac{1}{2}\}_+ \oplus \{\tfrac{1}{2}\}_- \longrightarrow \{0\}. \tag{2.9} $$
For the other atypical representations $\{j\}_\pm$ with $j = \frac{1}{2}, 1, \ldots$ one finds the following diagram,
$$ P^\pm(j): \ \{j\}_\pm \longrightarrow \{j + \tfrac{1}{2}\}_\pm \oplus \{j - \tfrac{1}{2}\}_\pm \longrightarrow \{j\}_\pm. \tag{2.10} $$
These representation spaces are (16j + 4)-dimensional. Let us agree to absorb the superscript ±
on P into the argument, i.e. $P^\pm(j) = P(\pm j)$, wherever this is convenient.
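The dimensions quoted in this subsection can be cross-checked by simple counting. The sketch below assumes the bosonic content (2.4) and (2.6) and the composition diagrams (2.5), (2.9) and (2.10) in the form of a list of constituents; the function names are hypothetical, and a formal spin −1/2 is assigned dimension 0.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def dim_spin(j):
    # Dimension 2j + 1 of the spin-j representation of sl(2).
    return int(2 * j + 1)

def dim_kac(j):
    # Bosonic content (2.4): (b-1/2, j-1/2) + (b, j) + (b, j-1) + (b+1/2, j-1/2).
    return dim_spin(j - HALF) + dim_spin(j) + dim_spin(j - 1) + dim_spin(j - HALF)

def dim_atypical(j):
    # Bosonic content (2.6): (j, j) + (j + 1/2, j - 1/2).
    return dim_spin(j) + dim_spin(j - HALF)

def dim_projective(j):
    # Constituents of (2.9) for j = 0 and of (2.10) for j >= 1/2.
    if j == 0:
        return 2 * dim_atypical(Fraction(0)) + 2 * dim_atypical(HALF)
    return 2 * dim_atypical(j) + dim_atypical(j + HALF) + dim_atypical(j - HALF)

spins = [Fraction(n, 2) for n in range(1, 11)]
for j in spins:
    print(j, dim_kac(j), dim_atypical(j), dim_projective(j))
```

For every spin in the list the counts reproduce 8j for Kac modules, 4j + 1 for atypicals and 16j + 4 for projective covers, and the two atypical constituents in (2.5) add up to the dimension of the Kac module.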
2.2. Functions on the supergroup SU(2|1)
Now we are prepared to analyze the space of functions on the supergroup SU(2|1). For this
purpose, let us introduce coordinates through the following explicit decomposition of elements
U SU(2|1),
U = ei F eizB gei F .
Here, the bosonic base SU(2) R is parametrized by an element g SU(2)

= S3 along with the
time-like variable z. In these coordinates, the generators of the right regular action read
0
RE = R E
RF = i ,
+ ,
1
1
1
1
0
RH = RH
+ + + ,
RB = iz + + ,
2
2
2
2
0

iz/2 1/2
0
RF = ie
D(1/2) (g) + i RE + i iz RH
(2.11)
(2.12)
(2.13)
0 are the generators of the right regular representation of SU(2). They act on the matrix
where RX
j
elements Dab (g), a, b = j, j + 1, . . . , j , according to

j
j
j
j
0
0
Dab (g) = bDab (g),
RE
(j + b + 1)(j b)Da(b+1) (g),
RH
+ Dab (g) =

j
j
0
RE
(j + b + 1)(j b)Dab (g).
Da(b+1) (g) =
Matrix elements with j = 1/2 appear as coefficients in the differential operators RF and their
0 plays an important role in checking that the above generators
behavior under the action of RX
of the right regular representation obey the defining relations of sl(2|1). Formulas for the left
regular representation may be obtained similarly,
LE = L0E ,
LF = i ,
1
1
1
1
LH = L0H + + + ,
LB = iz + + + + ,
2
2
2
2

j
LF = ieiz/2 D(1/2) (g) + i L0E + i iz L0H .
(2.14)
(2.15)
(2.16)
It is probably not necessary to stress that left and right generators (anti-)commute with respect to
each other.
By construction (see however [9]), the generators of the left and right regular representation
act on the space of all Grassmann valued functions with square integrable coefficients on the
bosonic base, i.e. on the space

L2 SU(2|1) := L2 SU(2) R ( , )
where ( , ) denotes the Grassmann algebra that is generated by our four fermionic coordinates and . With respect to the left regular action, the space of square integrable functions
can be shown to decompose as follows,

4j {b, j } {b, j }
L2 SU(2|1)
=L
j =1/2 b
=j

(2j + 1) Pj+ Pj 2j Pj+ Pj .
(2.17)
Here, the summation runs over j = 0, 1/2, 1, . . . , $\{b, j\}$ denotes the typical representations of
sl(2|1) and $P_j^\pm$ are the projective covers of the atypical representations $\{j\}_\pm$. Most of our conventions can be found e.g. in [11]. A prime on a representation means that the degree is inverted,
i.e. that fermionic vectors become bosonic and vice versa. The result is a special case of the general observation made in [14] and it generalizes a similar decomposition we described in [8] for
the left regular action of gl(1|1). The interested reader can find an explicit proof in Appendix
A. Let us comment that the decomposition of the left regular action displays the same violation
of the Peter-Weyl theorem as in the case of GL(1|1). In particular, since the quadratic Casimir
is not diagonalizable in the projective covers $P_j^\pm$, the Laplacian on the supergroup SU(2|1) can
only be brought into Jordan normal form. The blocks can reach a rank up to three.
The functions on our supergroup carry another (anti-)commuting action of the Lie superalgebra g by left derivations. There is a corresponding decomposition which is certainly identical
to the decomposition above. A more interesting problem is to decompose the space of functions with respect to the graded product g g in which the first factor acts through the left
regular action while for the second factor we use the right regular action. In the typical sector,
the 4j |4j -dimensional multiplicity spaces in the first line of Eq. (2.17) get promoted to typical
representations of the right regular action, i.e.

L2 SU(2|1)
{b, j }L {b, j }R J
=LR
(2.18)
j =1/2 b
=j
where J is a single indecomposable, containing all the atypical building blocks. Its structure
may be summarized by the following picture
{ 12 } { 12 }+
{1} { 12 }+
{0} {0}
{0} { 12 }+ { 12 } {0}
{ 12 } { 12 }+
{ 12 }+ {0}
(2.19)
{0} {0}
This diagram is the natural extension of the corresponding picture for GL(1|1). It extends to
infinity in both directions and combines all the atypical sectors into a single indecomposable
representation. Note that by construction, each projective cover in the decomposition of the right
regular representation appears with the correct multiplicity. We shall see below how this picture
is modified in the full quantum theory.
3. Representation theory of the affine algebra

The previous analysis of the particle limit applies to all sigma models on SU(2|1), but the
information it provides is usually not sufficient to reconstruct the entire field theory
from it. This is very different for the WZNW model, in which the entire spectrum can be generated from particle wave functions through current algebra symmetries. We need some facts on
the representation theory of the sl(2|1) current algebra and shall provide them in the following
section. All the results we collect here are well known from [6,15-18]. Their derivation, however, is somewhat original. In particular, we shall use a simple, but highly efficient prescription
to construct characters of atypical representations of $\widehat{sl}(2|1)$ through infinite sums over typicals.
This extends the formula (2.8) we have discussed in Section 2 to an infinite dimensional setting, thereby generalizing a trick that was first proposed in the context of the gl(1|1) current
algebra [19].
3.1. Some basic ingredients
Irreducible representations of the affine sl(2|1) algebra can be built over the irreducible
typical representations $\{b, j\}$ with j = 1/2, . . . , k/2 as well as over the atypicals $\{j\}_\pm$ with
j = 0, 1/2, 1, . . . , k/2. Ground states in the former set of representations possess conformal dimension
$$ h_{\{b, j\}} = \frac{j^2 - b^2}{k + 1}, $$
while the conformal dimension for ground states in the latter set vanishes. Following the work
[15] of Bowcock et al. we shall divide these representations into three different classes. The
generic class I representations occur for {b, j } with b
= jm where we defined
jm := j + m(k + 1)
for m integer.
Class II representations include those erected over $\{b, j\}$ with $b = j_m^\pm$, $m \neq 0$, along with the
sectors generated from atypicals $\{j\}_\pm$, $j \neq 0$. The vacuum representation that is generated from
the atypical $\{0\}$ is the only member of the final class, which we denote as class IV for historical
reasons. Our aim is to describe the singular vectors in the corresponding Verma modules and to
provide the associated formulas for the super-characters
$$\chi_R(q, z, \xi) := \mathrm{str}_R\left( q^{L_0 - c/24}\, \xi^{B_0}\, z^{H_0} \right)$$
of irreducible representations. The results we describe first appeared in [15].
Before we start our discussion of characters let us quickly recall that it is possible to construct
sl(2|1) currents in terms of decoupled bosonic and fermionic variables. To be more precise, we
introduce a set of bosonic currents $e^\pm(z)$, $h(z)$, $b(z)$ and assume them to satisfy the operator
product expansions of an affine sl(2) algebra at level $k - 1$. In addition, let us introduce two sets
of fermionic fields $p_a$ and $\theta_a$ obeying the canonical relations
$$\theta_a(z_1)\, p_b(z_2) \sim \frac{\delta_{ab}}{z_1 - z_2} + \dots .$$
Then we can construct an sl(2|1) current algebra at level $k$ through the following prescription,
E + (z) = e+ (z) + : 1 p 2 :(z),

1
H (z) = h(z) + : 1 p 1 2 p 2 :(z),
2

1
B(z) = b(z) : 1 p 1 + 2 p 2 :(z),
2
F + = 2 e+ (z) + 1 (b + h)(z) : 1 2 p 2 :(z),
F = 1 e (z) + 2 (b h)(z) + : 1 2 p 2 :(z).
E (z) = e (z) + : 1 p 2 :(z),

F + (z) = p 2 (z),
F (z) = p 1 (z),
(3.1)
Since the fermionic fields $\theta_a$ and $p_a$ are supposed to commute with the bosonic fields $e^\pm(z)$, $h(z)$
and $b(z)$, the characters of typical representations factorize, with the factors $\theta_1(y, q)$ arising from
the fermionic pairs. The shift $j \to j - 1/2$ in the bosonic contribution may be traced back to a
similar shift in the labeling of typical sl(2|1) representations, see Eq. (2.4).
3.2. Typical (class I) representations
The generic class I representations have no singular vectors except for the ones that arise
through the representations of a bosonic su(2) current algebra at level $k - 1$. In this sense, they
may be considered the typical representations of the affine sl(2|1) algebra. The statement implies
a precise expression for the characters of class I representations,
$$\chi^{I}_{\{b,j\}}(q, z, \xi) = \frac{q^{-b^2/(k+1)}\, \xi^{b}}{\eta^3(q)}\; \theta_1\big(-z^{1/2}\xi^{1/2}, q\big)\, \theta_1\big(-z^{1/2}\xi^{-1/2}, q\big)\; \chi^{k-1}_{j-1/2}(z, q) \tag{3.2}$$
where
$$\theta_1(y, q) = -i\, y^{1/2} q^{1/8} \prod_{n=1}^{\infty} \big(1 - q^n\big)\big(1 - y q^n\big)\big(1 - y^{-1} q^{n-1}\big) \tag{3.3}$$
and $b \neq j_m^\pm$ and $1/2 \le j \le k/2$. We also recall that the su(2) characters are given by
$$\chi^{k-1}_{j-1/2}(z, q) = q^{\frac{j^2}{k+1} - \frac{1}{8}}\, z^{j - 1/2}\; \frac{\sum_{a \in \mathbb{Z}} q^{(k+1)a^2 + 2aj} \big( z^{a(k+1)} - z^{-a(k+1) - 2j} \big)}{\prod_{n=1}^{\infty} \big(1 - z q^n\big)\big(1 - z^{-1} q^{n-1}\big)\big(1 - q^n\big)} .$$
We shall use the symbol $\{b, j\}^{\wedge}$ for these irreducible representations of the affine algebra. The
formulas are easy to understand: they follow directly from the representation (3.1) of the sl(2|1)
current algebra. In fact, each pair of fermionic fields contributes a factor $\theta_1/\eta$ while the bosonic
sl(2) and u(1) current algebras are responsible for the characters $\chi^{k-1}$ and an additional factor
$\eta^{-1}$, respectively.
3.3. Atypical (class II) representations
Nothing prevents us from evaluating the previous character formulas at the points $b = j_m^\pm$. But
the resulting functions turn out to be the characters of indecomposable representations $\{j_m^\pm, j\}^{\wedge}$
which contain one fermionic singular multiplet. In order to state this more precisely, let us consider in more detail the set of atypical labels,
$$A := \big\{\, \{j_m^\pm, j\} \;\big|\; 1/2 \le j \le k/2;\ m \in \mathbb{Z} \,\big\} .$$
The set $A$ is visualized in Fig. 2. Our picture shows clearly that the projection to the $b$-coordinate
of each element in $A$ is injective and hence it can be used to enumerate our atypical labels.
Note, however, that values $b \in \frac{k+1}{2}\mathbb{Z}$ are omitted. This motivates introducing an improved
enumeration map $\nu$ from $A$ to non-zero half-integers which is defined by
$$\begin{aligned}
\nu\big(\{j_m^-, j\}\big) &= j_m^- - m + 1/2 && \text{for } m > 0, \\
\nu\big(\{j_m^+, j\}\big) &= j_m^+ - m && \text{for } m \ge 0, \\
\nu\big(\{j_m^-, j\}\big) &= j_m^- + |m| && \text{for } m \le 0, \\
\nu\big(\{j_m^+, j\}\big) &= j_m^+ + |m| - 1/2 && \text{for } m < 0 .
\end{aligned}$$
Fig. 2. The set $A$ of atypical labels for the affine sl(2|1) algebra. Even though the sl(2) spin $j$ is cut off at $j = k/2$, there
exist infinitely many atypical labels (black dots) which are in one to one correspondence with the atypical labels of the
finite dimensional algebra sl(2|1) (central black and pink dots). This correspondence is formalized by our map $\nu$.
By construction, $\nu$ is not only an injection but its image now also consists of all non-zero half-integers. We may view $\nu$ as an affine version of the enumeration map $\{\pm j, j\} \mapsto \pm j$ for
representations of sl(2|1).
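The enumeration of atypical labels described above can be checked by brute force. The following is a minimal sketch in Python, assuming the reading $j_m^\pm = \pm j + m(k+1)$ and the four-case shift rule for the enumeration map stated above; the function names `atypical_labels`, `nu` and `check_enumeration` are hypothetical helpers introduced only for this illustration. It verifies, for a finite window of $m$, that the $b$-values are pairwise distinct, that they avoid multiples of $(k+1)/2$, and that the map is an order-preserving bijection onto the non-zero half-integers in the corresponding range.

```python
from fractions import Fraction as F

def atypical_labels(k, m_max):
    """List (b, j, m, sign) for the labels {j_m^±, j}, j = 1/2..k/2, |m| <= m_max."""
    out = []
    for m in range(-m_max, m_max + 1):
        for two_j in range(1, k + 1):
            j = F(two_j, 2)
            for sign in (+1, -1):
                # assumed convention: j_m^± = ±j + m(k+1)
                out.append((sign * j + m * (k + 1), j, m, sign))
    return out

def nu(b, m, sign):
    """Enumeration map: shift b so that the gaps at b in (k+1)/2 * Z are closed."""
    if sign > 0:
        return b - m if m >= 0 else b + abs(m) - F(1, 2)
    return b - m + F(1, 2) if m > 0 else b + abs(m)

def check_enumeration(k, m_max):
    labels = atypical_labels(k, m_max)
    bs = [b for b, _, _, _ in labels]
    images = [nu(b, m, s) for b, _, m, s in labels]
    top = 2 * m_max * k + k                      # largest value of |2*nu| in the window
    expected = {F(n, 2) for n in range(-top, top + 1) if n != 0}
    injective_in_b = len(set(bs)) == len(bs)
    skips_omitted = all((2 * b) % (k + 1) != 0 for b in bs)
    onto = set(images) == expected and len(images) == len(expected)
    # order preserving: listing the images in increasing b gives a sorted list
    ordered = sorted(images) == [nu(b, m, s) for b, _, m, s in sorted(labels)]
    return injective_in_b and skips_omitted and onto and ordered
```

Calling `check_enumeration(k, m_max)` for a few small levels exercises all four branches of the map, since both signs and both signs of $m$ occur in the window.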
At first sight, the enumeration $\nu$ of atypical labels for our sl(2|1) current algebra may seem
like a rather technical device. But there is more to it. We recall that atypical labels $\{\pm j, j\}$ of the
finite dimensional algebra can also be enumerated by non-zero half-integers, i.e. $\{\pm j, j\} \mapsto \pm j$. Our claim now is that the
atypical class I representation with label $\{j_m^\pm, j\}$ behaves very similarly to its finite dimensional
counterpart, the atypical Kac module of sl(2|1) enumerated by the same half-integer $\nu(\{j_m^\pm, j\})$, in the sense that
$$\chi^{I}_{\{j_m^\pm, j\}}(q, z, \xi) = \chi^{II}_{\{\nu(\{j_m^\pm, j\})\}}(q, z, \xi) - \chi^{II}_{\{\nu(\{j_m^\pm, j\}) - \frac{1}{2}\operatorname{sgn} j_m^\pm\}}(q, z, \xi). \tag{3.4}$$
This formula is a rather central result for the representation theory of our current algebra. Let
us stress that it is the affine version of a corresponding equality (2.7) between characters of
sl(2|1) representations. As in the finite dimensional setup, Eq. (3.4) emerges from the existence
of fermionic singular vectors in atypical class I representations. In the case $m = 0$, it claims that
the only such singular vectors are those that appear in the atypical Kac module spanned by the
ground states. When $m \neq 0$, however, the ground states form a typical representation and the
singular vectors appear only on the $|m|$th level of the class I module.
At first it may seem a bit surprising that affine representations $\{j_m^\pm, j\}^{\wedge}$ with $m \neq 0$ behave so
similarly to the Kac modules of sl(2|1). In the next subsection we shall understand this behavior
in terms of spectral flow symmetries in the representation theory of the current algebra. Before
any study of spectral flow automorphisms, it might be useful to illustrate the similarity between
atypical representations of the current algebra and their finite dimensional counterpart more explicitly, at least for one example. To this end, let us focus on the representation $\{k/2 + 1, k/2\}^{\wedge}$
which we claim to be a close cousin of the sl(2|1) representation $\{k/2 + 1/2, k/2 + 1/2\}$. By
construction, the ground states of the current algebra representation transform in the typical multiplet $\{k/2 + 1, k/2\}$ and they possess conformal weight $h = -1$. From these vectors we can
generate states with vanishing conformal weight with the help of modes in the current algebra.
Such modes transform in the 8-dimensional adjoint representation $\{0, 1\}$ of sl(2|1). Through
decomposition of the tensor product between $\{k/2 + 1, k/2\}$ and $\{0, 1\}$ one finds that the Verma
module over $\{k/2 + 1/2, k/2 + 1/2\}$ contains an atypical sl(2|1) multiplet with conformal weight
$h = 0$. In fact, the results of [12] imply that the latter transforms according to the projective cover
$P(k/2 + 1/2)$, see Eq. (2.10). Not all of these states survive when we descend from the Verma
module to the class I representation. This step involves removing bosonic singular vectors, and
a moment of reflection shows that such vectors with $h = 0$ exist and that they transform in
the submodule $\{k/2 + 1, k/2 + 1\}$ of $P(k/2 + 1/2)$. Hence, the states with $h = 0$ in our class I
representation decompose into the Kac module $\{k/2 + 1/2, k/2 + 1/2\}$ plus a number of typical
representations. The fermionic singular vectors that are responsible for Eq. (3.4) transform in
the subrepresentation $\{k/2, k/2\}$ of $\{k/2 + 1/2, k/2 + 1/2\}$, giving rise to the identity (3.4) with
$j_1^- = k/2 + 1$ and $j = k/2$.
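The conformal weight used in this example follows from the ground-state formula $h_{\{b,j\}} = (j^2 - b^2)/(k+1)$ of Section 3.1. As a small sketch in exact rational arithmetic (the helper names `conformal_weight` and `atypical_weight` are hypothetical, and the convention $j_m^\pm = \pm j + m(k+1)$ is assumed), one can confirm that $\{k/2+1, k/2\}$ has $h = -1$ at every level, and that at the atypical points the weight is the integer $-m(\pm 2j + m(k+1))$.

```python
from fractions import Fraction as F

def conformal_weight(b, j, k):
    """h_{b,j} = (j^2 - b^2)/(k+1) for ground states in the typical multiplet {b, j}."""
    return (F(j) ** 2 - F(b) ** 2) / (k + 1)

def atypical_weight(j, m, sign, k):
    """Weight at b = j_m^± = ±j + m(k+1) (assumed convention for j_m^±)."""
    return conformal_weight(sign * F(j) + m * (k + 1), j, k)
```

For instance, `conformal_weight(F(k, 2) + 1, F(k, 2), k)` evaluates to `-1` for any positive integer `k`, so one unit of mode excitation is needed to reach states with vanishing weight.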
Before we draw some conclusions from Eq. (3.4), let us quickly comment on our notations.
Note that for $j = 1/2$ and $m = 0$ the above formula involves the character $\chi^{II}_{\{0\}}$ of a representation which has a somewhat special status. In fact, it cannot be obtained as quotient of one of
the indecomposable representations $\{j_m^\pm, j\}^{\wedge}$, unlike the representations with characters $\chi^{II}_{\{n/2\}}$,
$n \neq 0$. Instead, it arises as a submodule of the representations $\{\pm 1/2, 1/2\}^{\wedge}$. Our discussion suggests that $\chi^{II}_{\{0\}}$ must be the character of the vacuum representation. In the terminology of Bowcock
et al. the latter is a class IV representation. Thus, we shall also write $\chi^{IV}_{\{0\}}$ for this quantity.
Even though Eq. (3.4) is not a closed formula for the characters of class II representations, we
can now use the same trick as in Section 2.1.2 and write characters of class II representations as
an infinite sum of class I characters,
$$\chi^{II}_{\{\pm j \mp 1/2\}}(q, \xi, z) = - \sum_{n=0}^{\infty} \chi^{I}_{\nu^{-1}(\pm j \pm n/2)}(q, \xi, z) \tag{3.5}$$
for $j = 1/2, 1, 3/2, \dots$, where the overall sign is the one obtained by iterating Eq. (3.4). Note that the map $\nu$ is invertible on all non-zero half-integers and it
furnishes the label of the Kac module that sits at the bottom of the corresponding class I representation. By inserting our explicit formulas for class I characters we find
1 (z1/2 1/2 , q)1 (z1/2 1/2 , q)
3 (q)1 (z, q)

za(k+1)j
za(k+1)j
2
q (k+1)a 2aj j
.
1 + q a z1/2 1/2 1 + q a z1/2 1/2
II
{j
} (q, z, ) = i
(3.6)
aZ
Character formulas of this type have to be used with some care: Before the denominators are
expanded, one should split the summation over $a$ into two parts. The one arising from positive
values of $a$ can be converted into a power series right away. In all terms with non-positive $a$,
however, one must first reduce the fraction by $q^a$ such that the subsequent expansion contains
only non-negative powers of $q$. In the end, we recover the known results on the representation
theory of the affine sl(2|1) algebra [6,18]. Our derivation was based on three ingredients: the
decoupling formulas (3.1) for bosonic and fermionic generators, the structure (2.5) of atypical
Kac modules for sl(2|1) and the fact that atypical class I representations with $m \neq 0$ decompose
in the same way as in the case of $m = 0$. We shall argue now that the last ingredient emerges
from spectral flow symmetries in the representation theory of affine sl(2|1).
3.4. Spectral flow symmetries
The affine sl(2|1) algebra admits several interesting automorphisms. We shall be mainly concerned with two such spectral flow automorphisms $\gamma_\pm$. By construction, $\gamma_\pm$ are defined on the
entire current algebra, but for our purposes it is sufficient to know how they act on the generators
$B_0$, $H_0$, $L_0$,
$$\gamma_\pm(B_0) = B_0 \pm k/2, \qquad \gamma_\pm(H_0) = H_0 + k/2, \qquad \gamma_\pm(L_0) = L_0 + H_0 \mp B_0 .$$
From these formulas we may infer how (super-)characters behave under the action of $\gamma_\pm$ and
this in turn is sufficient to determine how spectral flow automorphisms map representations of
the current algebra onto each other. Along with $\gamma_\pm$ we shall also be interested in the composite
automorphism $\gamma = \gamma_+ \circ \gamma_-^{-1}$ which acts as
$$\gamma(B_0) = B_0 + k, \qquad \gamma(H_0) = H_0, \qquad \gamma(L_0) = L_0 - 2B_0 - k .$$
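The composite action on the zero modes can be reproduced mechanically. The sketch below assumes the sign assignment $\gamma_\pm(B_0) = B_0 \pm k/2$, $\gamma_\pm(H_0) = H_0 + k/2$, $\gamma_\pm(L_0) = L_0 + H_0 \mp B_0$ and represents each automorphism by the images of the basis $(L_0, H_0, B_0, 1)$, written as coefficient vectors. The names `apply`, `compose`, `gamma_pm` and `gamma_pm_inverse` are hypothetical and exist only for this check.

```python
from fractions import Fraction as F

# An automorphism is stored as the images of the basis (L0, H0, B0, 1),
# each image being a coefficient vector in that same basis.
IDENTITY = ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1))

def apply(auto, vec):
    """Apply the linear map `auto` to the coefficient vector `vec`."""
    out = [F(0)] * 4
    for coeff, image in zip(vec, auto):
        for i in range(4):
            out[i] += coeff * image[i]
    return tuple(out)

def compose(a, b):
    """Return a∘b, i.e. X -> a(b(X))."""
    return tuple(apply(a, image) for image in b)

def gamma_pm(k, sign):
    """Assumed action: L0 -> L0 + H0 ∓ B0, H0 -> H0 + k/2, B0 -> B0 ± k/2."""
    half = F(k, 2)
    return ((1, 1, -sign, 0), (0, 1, 0, half), (0, 0, 1, sign * half), (0, 0, 0, 1))

def gamma_pm_inverse(k, sign):
    """Inverse map: L0 -> L0 - H0 ± B0, H0 -> H0 - k/2, B0 -> B0 ∓ k/2."""
    half = F(k, 2)
    return ((1, -1, sign, 0), (0, 1, 0, -half), (0, 0, 1, -sign * half), (0, 0, 0, 1))

def gamma(k):
    """Composite automorphism gamma_+ ∘ gamma_-^{-1}."""
    return compose(gamma_pm(k, +1), gamma_pm_inverse(k, -1))
```

With these conventions `gamma(k)` sends $L_0 \mapsto L_0 - 2B_0 - k$, $H_0 \mapsto H_0$ and $B_0 \mapsto B_0 + k$; the constant shifts of $L_0$ coming from the $H$ and $B$ directions cancel in $\gamma_\pm$ themselves, which is why no constant appears there.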
Any automorphism of the current algebra gives rise to a map between representations and hence
to a map between characters. From the action on the zero modes $B_0$, $H_0$ and $L_0$ we can easily
read off that
$$\gamma\chi(q, \xi, z) = \xi^{k} q^{-k}\, \chi\big(q, q^{-2}\xi, z\big), \qquad \gamma_\pm\chi(q, \xi, z) = \xi^{\pm k/2} z^{k/2}\, \chi\big(q, q^{\mp 1}\xi, qz\big), \tag{3.7}$$
for all characters $\chi$ of the sl(2|1) current algebra. If $R$ is any representation of affine sl(2|1) and $\chi_R$ is
its character, then the image $\omega R$ of $R$ under an automorphism $\omega$ obeys
$$\chi_{\omega R}(q, \xi, z) = \omega\chi_R(q, \xi, z).$$
Given the character $\chi_R$ of some representation $R$, we can use Eqs. (3.7) to compute its image under the above automorphisms $\omega = \gamma_\pm, \gamma$. This in turn allows us to recover uniquely the
representations $\gamma_\pm R$ and $\gamma R$.
In the following we shall spell out the action of our spectral flow automorphisms on the
class I and II representations we have studied above. Our rather compact notations allow us to
summarize the results for the spectral flow automorphisms $\gamma_\pm$ in a single line,
$$\gamma_\pm \{n/2\}^{\wedge} = \{n/2 \pm k/2\}^{\wedge}, \qquad \gamma_\pm \{b, j\}^{\wedge} = \{b \pm k/2 \pm 1/2,\; k/2 + 1/2 - j\}^{\wedge}. \tag{3.8}$$
To verify our assertions, the reader is invited to convert them into identities between supercharacters and to check these identities by direct computation. The formulas become somewhat
more explicit if we label irreducible representations according to the representation their ground
states transform in. Under the action of $\gamma_\pm$ one finds
$$\begin{aligned}
\{b, j\} &\longrightarrow \{b \pm k/2 \pm 1/2,\; k/2 + 1/2 - j\} && \text{for } b \neq \pm(j - k - 1), \\
\{b, j\} &\longrightarrow \{k/2 + 1/2 - j\}_\mp && \text{for } b = \pm(j - k - 1), \\
\{j\}_\pm &\longrightarrow \{\pm(j + k/2 + 1/2),\; k/2 + 1/2 - j\}, \\
\{j\}_\mp &\longrightarrow \{k/2 - j\}_\pm && \text{for } j \neq 0, \\
\{0\} &\longrightarrow \{k/2\}_\pm .
\end{aligned} \tag{3.9}$$
The third line, for example, tells us that the image of the irreducible representation generated
from the atypical representation $\{j\}_+$ under the action of $\gamma_+$ is an irreducible representation
whose ground states transform in the typical representation $\{j + k/2 + 1/2,\, k/2 + 1/2 - j\}$. The
latter may be obtained from the corresponding Verma module by removing singular vectors on
some excited levels.
We also want to spell out analogous formulas for the automorphism $\gamma = \gamma_+ \circ \gamma_-^{-1}$. In the
compact notation, its action is given by
$$\gamma \{b, j\}^{\wedge} = \{b + k + 1, j\}^{\wedge}, \qquad \gamma \{n/2\}^{\wedge} = \{n/2 + k\}^{\wedge}. \tag{3.10}$$
Note that the symmetry $\gamma$ maps sectors whose ground states transform in an atypical representation $\{j\}_\pm$ of the Lie superalgebra sl(2|1) into sectors with typical spaces of ground states
according to the following rules,
$$\begin{aligned}
\gamma^m:\ \{j\}_\pm &\longrightarrow \{j_m^\pm,\; j\} && \text{for } \pm m > 0, \\
\gamma^m:\ \{j\}_\pm &\longrightarrow \{j_m^\pm \pm 1/2,\; j + 1/2\} && \text{for } \pm m < 0 .
\end{aligned}$$
Hence, the existence of the spectral flow symmetries explains why the representations $\{j_m^\pm, j\}^{\wedge}$
behave like atypical representations of the affine sl(2|1) algebra: they are simply related to the
sectors erected over atypical sl(2|1) representations by an automorphism.
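The rules for $\gamma^m$ can be cross-checked against the enumeration of atypical labels from Section 3.3. The sketch below is an illustration under stated assumptions, not a derivation: it assumes that $\gamma^m$ shifts the half-integer label $n/2$ of a class II sector by $mk$, that $j_m^\pm = \pm j + m(k+1)$, and that the enumeration map has the four-case form used above. It then inverts the map by table lookup. The helper names are hypothetical. In the second rule the check is restricted to $j < k/2$, since the spin $j + 1/2$ of the image must not exceed $k/2$; this restriction is an assumption of the sketch.

```python
from fractions import Fraction as F

def nu(b, m, sign):
    """Enumeration map on labels {j_m^±, j} (assumed four-case form)."""
    if sign > 0:
        return b - m if m >= 0 else b + abs(m) - F(1, 2)
    return b - m + F(1, 2) if m > 0 else b + abs(m)

def nu_inverse_table(k, m_max):
    """Map each half-integer nu to the typical ground-state label (b, j)."""
    table = {}
    for m in range(-m_max, m_max + 1):
        for two_j in range(1, k + 1):
            j = F(two_j, 2)
            for sign in (+1, -1):
                b = sign * j + m * (k + 1)
                table[nu(b, m, sign)] = (b, j)
    return table

def flowed_label(k, j, sign, m, table):
    """Ground-state label reached from {j}_± after shifting its label by m*k."""
    return table[sign * j + m * k]

def rule(k, j, sign, m):
    """Image of {j}_± under gamma^m according to the two rules in the text."""
    jm = sign * j + m * (k + 1)
    if sign * m > 0:
        return (jm, j)
    return (jm + sign * F(1, 2), j + F(1, 2))

def check_rules(k, m_max):
    table = nu_inverse_table(k, m_max + 1)
    for m in [x for x in range(-m_max, m_max + 1) if x != 0]:
        for two_j in range(1, k + 1):
            j = F(two_j, 2)
            for sign in (+1, -1):
                if sign * m < 0 and two_j == k:
                    continue  # j + 1/2 would exceed k/2; excluded by assumption
                if flowed_label(k, j, sign, m, table) != rule(k, j, sign, m):
                    return False
    return True
```

For example, at $k = 3$ the sector built on $\{1/2\}_-$ is sent by $\gamma$ to label $5/2$, whose ground states sit in $\{3, 1\}$, in agreement with $\{j_1^- - 1/2,\, j + 1/2\}$.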
3.5. Modular transformation and S-matrix
We would like to conclude this section on the representation theory of the sl(2|1) current
algebra with a few comments on modular properties of the characters. In the following we shall
consider the characters as functions of $\tau$, $\sigma$ and $\rho$. They are related to the variables we used above
through $q = \exp 2\pi i\tau$, $z = \exp 2\pi i\sigma$ and $\xi = \exp 2\pi i\rho$, as usual. From our explicit formula (3.2)
for characters of class I representations it is easy to infer the auxiliary formula
I
{b=0,j
}
1
, ,

e
=
i
ik 2
2
i 2
2
2

j =1/2
2
4jj I
sin
(, , ).
k+1
k + 1 {b=0,j }
Note that the right-hand side contains an explicit $\tau$ dependence which, if we demand that the
modular transform be interpreted in a conventional sense and closes onto characters, suggests
the contribution of a continuous spectrum of exponents. The need is confirmed by the modular
transformation of the characters for $\{b, j\}$ representations with $b \neq 0$, which requires an integral
representation of $e^{2\pi i b^2/((k+1)\tau)}$ etc. After some Gaussian integration, we find
k

2
4ibb
ik
2
4jj
1
2 2 )
(
I
db e k+1 {b
= ie 2
sin
, ,
,j } (, , )

k
+
1
k
+
1
j =1/2
(3.11)

where we formally evaluated integrals of the type $\int dx\, \exp(-i\tau x^2) = \sqrt{\pi/(i\tau)}$ [of course, the integrals are naively divergent as $\operatorname{Im}\tau > 0$].
I
{b,j
}
Modular transformations of the class II and IV characters are a bit more cumbersome to work
out. They can be attacked rather efficiently using our representations (3.5) as infinite sums of class
I characters. Here we shall content ourselves with the example of the class IV representation at
$k = 1$. If we also set $\sigma = \rho = 0$ we find that
i
IV
{0}
(1/ ) =
db
I
( )
cos b {b ,1/2}
(3.12)
where the contour has to avoid the poles. Rotating into the purely imaginary direction gives
a contribution from poles which is easily identified with $\chi^{IV}_{\{0\}}$. The remaining integral can be
factored in terms of $\chi^{I}_{\{0,1/2\}}$,

IV
IV
{0}
(1/ ) = {0}
( ) +
q2
d
( ).
I
cosh {0,1/2}
(3.13)
We thus recover by this very elementary means the results of [6] obtained through use of the
Mordell integral [20]. The construction of modular invariants using these transformation formulas is a complex problem, which we shall address later in the case k = 1.
4. The state space and partition functions
Our aim now is to formulate a proposal for the state space of the sl(2|1) WZNW model.
We shall then verify our claim in the special case k = 1 through a free field representation of the
model. The third subsection is devoted to the partition function of the theory. The latter forgets all
information about the complicated way in which irreducible blocks are glued together to build J.
We then specialize once more to k = 1 and comment on the global topology of the target space.
4.1. The proposal for integer level k
It is now rather straightforward to come up with a proposal for the state space of the WZNW
model on SU(2|1). In fact, we can simply start from formula (2.18) and make it symmetric
with respect to the action of our spectral flow symmetry. The invariance under the action of
$\gamma$ should be considered as an additional input. In principle, the spectral flow symmetry of the
sl(2|1) current algebra could be broken by the physical couplings of the theory. Since this did
not happen for the GL(1|1) WZNW model, it seems natural to propose
$$\mathcal{H}_{CFT} = \bigoplus_{j=1/2}^{k/2}\; \bigoplus_{b \neq j_m^\pm} \{b, j\}^{\wedge}_L \otimes \{b, j\}^{\wedge}_R \;\oplus\; J \tag{4.1}$$
where J is a single indecomposable representation of the two (anti-)commuting super current
algebras that contains all the atypical contributions. It is composed from the atypical building
blocks $\{l_1\}^{\wedge} \otimes \{l_2\}^{\wedge}$ in the same way as in the minisuperspace theory. To obtain the corresponding
diagram one simply has to replace $\{j\}_\pm = \{\pm j\}$ with $\{\pm j\}^{\wedge}$.
By construction, all the sl(2|1) currents act on the state space (4.1) and they obey periodic
boundary conditions. This applies in particular to the fermionic fields. One can find a second,
closely related theory in which only bosonic fields are periodic. In order to construct its state
space, we need to revisit our discussion of spectral flow symmetries. As we have mentioned
above, the automorphisms we have investigated in the previous section all extend to the entire
current algebra. In particular, they map fermionic modes with integer mode numbers onto each
other, i.e. they respect periodic boundary conditions on the fermionic currents. There exists yet
another important isomorphism $\tilde\gamma$ that intertwines between integer and half-integer mode numbers
for the fermionic generators. It can be considered as the square root of the automorphism $\gamma$. On
the bosonic zero modes, the new isomorphism is given by
$$\tilde\gamma(B_0) = B_0 + k/2, \qquad \tilde\gamma(H_0) = H_0, \qquad \tilde\gamma(L_0) = L_0 - B_0 - k/2 .$$
$\tilde\gamma$ extends to the full current algebra such that it acts trivially on the bosonic sl(2) currents and it
shifts modes of the fermionic currents by $\pm 1/2$, as usual.
Our isomorphism $\tilde\gamma$ induces a map between representations of the current algebra with integer
fermionic modes and a new type of representations in which fermionic generators have half-integer mode numbers. According to the usual terminology, the former class of representations
forms the R sector while the latter belongs to the NS sector of the theory. The theory with state
space (4.1) includes exclusively R sector representations in which all currents obey periodic
boundary conditions. Another option is to consider a theory that encompasses both R and NS
sector with the state space given by
$$\widetilde{\mathcal{H}}_{CFT} = \mathcal{H}^{R}_{CFT} \oplus \mathcal{H}^{NS}_{CFT} = \mathcal{H}_{CFT} \oplus \tilde\gamma\, \mathcal{H}_{CFT} .$$
Note that the NS sector has exactly the same intricate structure as the R sector since the former
is the image of the latter under the action of $\tilde\gamma$. In the following we shall refer to both models
as WZNW model on SU(2|1). Even though it seems natural to include the NS sector, it is not
required by all applications.
4.2. Free field representation at k = 1
So far, the main motivation for our proposal (4.1) came from the harmonic analysis on
SU(2|1). By construction, we are guaranteed to recover the correct state space of the particle
limit when we send the level k to infinity. Our formula (4.1) applies to finite k and it suggests
that field theory effects would merely truncate the spin $j$ to values $j \le k/2$ and then make the
whole theory symmetric with respect to spectral flow. We are now going to test this proposal in
the extreme quantum case, namely at k = 1. At this point, the WZNW model admits a free field
representation that we are going to spell out momentarily (it seems to have appeared first in [5]).
It involves a pair of symplectic fermions $\eta_1$, $\eta_2$, and a pair of bosons $\phi$, $\phi'$. While the boson $\phi$
comes with the usual metric, $\phi'$ is assumed to be time-like. For their propagators this means
$$\langle \eta_1(z, \bar z)\, \eta_2(w, \bar w) \rangle = -\ln |z - w|^2, \qquad \langle \phi(z, \bar z)\, \phi(w, \bar w) \rangle = -\ln |z - w|^2, \qquad \langle \phi'(z, \bar z)\, \phi'(w, \bar w) \rangle = \ln |z - w|^2 .$$
Note that the central charge of this free field theory is $c = -2 + 1 + 1 = 0$ and hence it agrees
with the central charge of SU(2|1) WZNW models. We shall begin our discussion of the WZNW
model with explicit formulas for the currents. In order to construct the four bosonic currents, we
need to split the space-like bosonic field $\phi(z, \bar z) = \phi(z) + \bar\phi(\bar z)$ into its chiral components. Our
bosonic currents then read,
$$E^{\pm}(z) = e^{\pm\sqrt{2}\, i\phi(z)}, \qquad H(z) = \frac{1}{\sqrt{2}}\, i\partial\phi(z), \qquad B(z) = \frac{1}{\sqrt{2}}\, i\partial\phi'(z) .$$
The necessity to split $\phi$ into its chiral components means that the boson is compactified to the
so-called self-dual radius, as usual in the free field representation of the SU(2) WZNW model
at level k = 1. In addition, the following expressions for the four fermionic currents also involve
the chiral components $\phi'(z)$ and $\bar\phi'(\bar z)$ of the time-like bosonic field $\phi'(z, \bar z) = \phi'(z) + \bar\phi'(\bar z)$,
V (z) = e
1 i((z)+ (z))
2
1 (z),
W (z) = e
1 i((z) (z))
2
2 (z).
(4.2)
Similarly, one may spell out the anti-holomorphic generators of the sl(2|1) current algebra. It
is rather easy to check that the above expressions give rise to fields with the correct operator
product expansions. Let us note that the free field representation we consider in this section has
to be distinguished clearly from the Kac–Wakimoto type construction (3.1) we have used earlier
to construct the characters at integer levels k. We shall comment on this a bit more later on.
It is possible to check that fields of dimension zero can be organized exactly as it is suggested
by our diagram (2.19). We shall just sketch the relevant arguments because a full proof is rather
laborious to write down. Let us consider the left part of the diagram only and identify the fields
that make up the various blocks of the composition series. Clearly, the $\{0\} \otimes \{0\}$ representation at
the top corresponds to the field $\eta_1 \eta_2$. From here we can act with the fermionic currents $W^\pm$, $V^\pm$
and arrive at expressions for the two blocks on the intermediate level of the diagram,
{1/2} {0}:
{0} {1/2}+ :
2 i / 2
e
2 ,
i /
2 i / 2
e
e
1 ,
ei/
ei
i 2
2 2 ,
1.
1
From the previous formulas we can read off the fields that make up the topmost representation
{1/2} {1/2}+ in our diagram,

2 ei / 2
ei/ 2 ei / 2 2
ei /
1

{1/2} {1/2}+ :
(4.3)
1.
ei 2 2 2
ei 2 1
Acting with the holomorphic fermionic currents V (z) we arrive at the following formulas for
fields that belong to the multiplet

i/ 2 e3i / 2

2 2 e
(+ )
/
2 ei / 2

ei
i
2
i
2
{1} {1/2}+ :
(4.4)
2 e
e

1

ei 2
i
2
2 e
on the intermediate level of the diagram. Our notation means that every product of the three
holomorphic fields on the left-hand side with the three anti-holomorphic fields on the right-hand
side is part of this 9-dimensional block. Similarly, we can now descend to the bottom of the
diagram,

/ 2
i /
2 ei / 2
ei/ 2 ei
e

{1/2} {1/2}+ :
(4.5)
1.
2 ei 2
ei 2
Finally, the representation {0} {0} in center bottom position is represented by the identity
field. It is easy but laborious to check that the different representations are connected by the
action of the left and right

generators as indicated in the diagram. In checking this, notice that

i
2
i
2
2 e
2 e
up to a total derivative.
There are a number of interesting further comments and observations that we would like to
make. Let us begin with a brief comment on the relation with Kac–Wakimoto like representations
of the form (3.1). As discussed in [9], a naive evaluation of the action of the latter on vertex
operators leads to a much simpler picture in which the atypical sector is a smooth deformation
of the typical part. In particular, there is no mixing between left and right movers as in the case
of the representation J. In order to see the latter, the screening charge of the Kac–Wakimoto
representation must be taken into account (see [9] for details). The free field representation we
have employed in this subsection is much simpler to use, but it is restricted to k = 1.
The free field representation also allows us to illustrate very explicitly how atypical fields of
dimension $h = 0$ are embedded into sectors with ground states in typical multiplets once their
spin exceeds $k/2$. Take, for instance, the field $O = e^{i\sqrt{2}(\phi - \phi')}\eta_2$ from the $\{1\}^{\wedge}$ representation
and observe that
$$E^{-}(z)\, e^{i\sqrt{2}(\phi - \phi')}\eta_2(w) = \frac{1}{(z - w)^2}\, {:}e^{-i\sqrt{2}\phi(z)}\, e^{i\sqrt{2}(\phi - \phi')}\eta_2(w){:} + \dots . \tag{4.6}$$
Thus $O$ is not a highest weight of the current algebra, and applying a current operator can
give rise to a field of dimension $h = -1$, namely the field $P = e^{-i\sqrt{2}\phi'}\eta_2$. This field belongs to the typical
representation {3/2, 1/2} in the sector {1} with affine highest weight
)/ 2
i(3
Q=e
2 2 . One may easily generalize these observations to all representations in
the complex, hence confirming our analysis based on spectral flow.
Let us finally come to the most important point, which concerns the possible construction of
consistent theories³ that are realized on a subspace of our state space (4.1). Note that the state space does
contain a large number of fermionic singular vectors that we decided not to decouple, partly
because the minisuperspace analysis suggested that it was unnatural to do so. But to a certain
extent one should consider (4.1) as some kind of maximal choice from which other models can
be obtained by consistent decoupling of singular vectors. In general there can be several such
reduced theories. Once more we may use our free field representation for the k = 1 model to
illustrate nicely how this works. Note that the expressions (4.2) for the currents only contain
derivatives of the fermionic fields. Hence, we do not spoil the sl(2|1) current algebra symmetry
if we decide to work with a model in which the fundamental fields are e.g. $\phi$, $\phi'$, $\eta_2$ and $\partial\eta_1$.
Since $\eta_1$ is not part of this model, some of the sectors we discussed above no longer appear.
This concerns the sectors $\{0\} \otimes \{0\}$ and $\{1/2\}_- \otimes \{1/2\}_+$ on the top floor of J and the sector
$\{0\} \otimes \{1/2\}_+$ on the intermediate floor. In the resulting model, there is still a single atypical
sector that comprises all the irreducible atypicals, but it is reduced to two floors and has the
shape of a saw blade. Obviously, a similar analysis applies to the theory that contains $\partial\eta_2$ instead
of $\eta_2$. But we can even go one step further and drop both $\eta_1$ and $\eta_2$ so that only fermionic
derivatives remain. What results is a model whose atypical sector decomposes into an infinite
sum of irreducibles. The latter are the sectors that appear on the bottom floor of J, i.e. $\{0\} \otimes \{0\}$
and $\{1/2\}_- \otimes \{1/2\}_+$ from our list above. All others need either $\eta_1$ or $\eta_2$. Similar phenomena
are possible at other levels. We shall see another explicit example in Section 5.2. Let us stress,
however, that the free field construction at k = 1 does not include the defining {0, 1/2} field of
the WZNW model. A careful study of the Knizhnik–Zamolodchikov equations [35] shows that
consistency in the presence of the {0, 1/2} sector requires the identity field to be embedded into
an indecomposable sector. In this sense, the fully truncated atypical sector we have just described
cannot be embedded into the sl(2|1) WZNW model.
³ Consistency in this paragraph refers to the existence of genus zero correlators obeying the usual factorization constraints. The construction and behavior of torus amplitudes is not addressed.
4.3. Partition functions
We now go back to the full theory based on our proposal (4.1). We would like to compute
the partition function of the model, with and without inclusion of the NS sector. Since partition functions are obtained by taking the trace over the state space, the details of the action of
fermionic generators in the atypical sector J do not show up in the result. In other words, the
contribution from the indecomposable J is the same as if we took the trace over a sum
of its irreducible components. The latter can be resummed as follows,

II
II
II
II
II
II
strJ q L0 +L0 =
2{}
(q)
{}
(q) {+1/2}
(q)
{}
(q) {1/2}
(q)
{}
(q)
Z/2

II

II
II
II
{}
(q)
{1/2}
(q)
{}
(q) {+1/2}
(q)
Z/2
II

II
II
II
{}
(q)
{1/2}
(q)
{}
(q) {+1/2}
(q)
=1/2,1,...
II

II
II
II
{}
(q)
{+1/2}
(q)
{}
(q) {1/2}
(q)
=1/2,1,...
k/2

mZ j =1/2
I
I
{j
(q)
{j
(q).
,j }
,j }
m
In the last step we have inserted the relation (3.4) between characters of class I and class II
representations and we used the isomorphism $\nu$ to convert the sum over non-zero half-integers
into a sum over $m$ and $j$. Our result shows that the contribution from the atypical representations
agrees exactly with the part that we omitted from the typical sector of the theory. Hence, the full
partition function becomes
$$Z(q) = \sum_{j=1/2}^{k/2}\; \sum_{b \in \mathbb{R}} \chi^{I}_{\{b,j\}}(q)\, \chi^{I}_{\{b,j\}}(\bar q). \tag{4.7}$$
Let us also briefly discuss how the partition function is modified when we want to include the
NS sector. In that case, the trace extends over both $\mathcal{H}_{CFT}$ and its image under the spectral flow $\tilde\gamma$.
The modular invariant partition function $\tilde Z(q)$ of this theory contains four different terms, two
in which $(-1)^F$ is inserted and two in which it is not. It is customary to label the corresponding
contributions with R, sR, NS and sNS where the small s signals the insertion of $(-1)^F$. With
these notations, the standard super-characters we have discussed throughout the previous section
should all carry a superscript sR. It is easy to find explicit formulas for the other three sets of
(super-)characters using the relation
$$\chi^{sNS}(q, \xi, z) = \tilde\gamma\chi^{sR}(q, \xi, z) = \xi^{k/2} q^{-k/2}\, \chi^{sR}\big(q, q^{-1}\xi, z\big)$$
to convert sR characters into sNS super-characters. The same prescription is used when we construct NS characters from the R sector, only that we have to replace the R super-characters by
ordinary characters. The partition function $\tilde Z(q)$, finally, has the same form as Eq. (4.7) with an
additional summation over all four types of terms.
Let us illustrate the previous results in the case of k = 1 again. The (sR) characters of this
theory take a particularly simple form, as was first observed by Bowcock et al. in [6],
$$\chi^{I}_{\{b,1/2\}}(q, z, \xi) = \frac{q^{-b^2/2}\, \xi^{b}}{\eta(q)} \Big[ \chi_{1/2}(q, \xi)\, \chi_0(q, z) - \chi_0(q, \xi)\, \chi_{1/2}(q, z) \Big] \tag{4.8}$$
where $\chi_0$ and $\chi_{1/2}$ are SU(2) level one characters for spin 0 and 1/2 respectively. This expression allows us to determine the modular invariant physical partition function Z involving
periodic or antiperiodic boundary conditions for the fermions along both periods of the torus.
The doubly periodic sector ($\chi^{sR}$) gives a vanishing contribution for characters $\chi_{\{b,1/2\}}$ since the
super-dimension of the horizontal Kac modules vanishes. We are left with three contributions,
which read respectively
$$\chi^{R}_{\{b,1/2\}}(q) = 2\, \frac{q^{-b^2/2}}{\eta(q)}\, \chi_{1/2}(q)\, \chi_0(q), \tag{4.9}$$
$$\chi^{NS}_{\{b,1/2\}}(q) = \frac{q^{-b^2/2}}{\eta(q)} \Big[ \chi_0^2(q) + \chi_{1/2}^2(q) \Big], \tag{4.10}$$
$$\chi^{sNS}_{\{b,1/2\}}(q) = \frac{q^{-b^2/2}}{\eta(q)} \Big[ \chi_0^2(q) - \chi_{1/2}^2(q) \Big]. \tag{4.11}$$
Their modular transformations are easy to obtain for $b = 0$,
$$\chi^{R}_{\{0,1/2\}}(-1/\tau) = \frac{1}{\sqrt{-i\tau}}\, \chi^{sNS}_{\{0,1/2\}}(\tau), \qquad \chi^{NS}_{\{0,1/2\}}(-1/\tau) = \frac{1}{\sqrt{-i\tau}}\, \chi^{NS}_{\{0,1/2\}}(\tau), \qquad \chi^{sNS}_{\{0,1/2\}}(-1/\tau) = \frac{1}{\sqrt{-i\tau}}\, \chi^{R}_{\{0,1/2\}}(\tau). \tag{4.12}$$
Obviously, the $\tau$ dependence of these formulas originates in the $1/\eta$ term in the characters, and
has to be compensated by a similar factor coming from the $b$ sum in order to obtain a modular
invariant quantity.
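The transformations (4.12) can be tested numerically. The following sketch assumes the standard specialized su(2) level-one characters $\chi_0 = \sum_n q^{n^2}/\eta$ and $\chi_{1/2} = \sum_n q^{(n+1/2)^2}/\eta$, and the reading of (4.9)–(4.11) at $b = 0$ in which the R, NS and sNS characters are $2\chi_0\chi_{1/2}/\eta$ and $(\chi_0^2 \pm \chi_{1/2}^2)/\eta$. All function names are hypothetical, and the truncation orders are ad hoc choices that are ample for $\operatorname{Im}\tau$ of order one.

```python
import cmath
import math

def qpow(tau, p):
    """q^p with q = exp(2*pi*i*tau), for an arbitrary real exponent p."""
    return cmath.exp(2j * math.pi * tau * p)

def eta(tau, terms=200):
    """Dedekind eta function as a truncated product."""
    q = qpow(tau, 1)
    prod, qn = 1.0 + 0j, 1.0 + 0j
    for _ in range(1, terms):
        qn *= q
        prod *= (1 - qn)
    return qpow(tau, 1 / 24) * prod

def chi0(tau, n_max=30):
    """su(2)_1 spin-0 character at z = 1 (assumed standard form)."""
    return sum(qpow(tau, n * n) for n in range(-n_max, n_max + 1)) / eta(tau)

def chi_half(tau, n_max=30):
    """su(2)_1 spin-1/2 character at z = 1 (assumed standard form)."""
    return sum(qpow(tau, (n + 0.5) ** 2) for n in range(-n_max, n_max + 1)) / eta(tau)

def chi_R(tau):
    return 2 * chi_half(tau) * chi0(tau) / eta(tau)

def chi_NS(tau):
    return (chi0(tau) ** 2 + chi_half(tau) ** 2) / eta(tau)

def chi_sNS(tau):
    return (chi0(tau) ** 2 - chi_half(tau) ** 2) / eta(tau)

def s_transform_defects(tau):
    """Absolute violations of the three relations in (4.12) at the point tau."""
    factor = 1 / cmath.sqrt(-1j * tau)   # principal branch; Re(-i*tau) > 0 here
    return (
        abs(chi_R(-1 / tau) - factor * chi_sNS(tau)),
        abs(chi_NS(-1 / tau) - factor * chi_NS(tau)),
        abs(chi_sNS(-1 / tau) - factor * chi_R(tau)),
    )
```

Evaluating `s_transform_defects` at a generic point in the upper half plane should return three numbers at the level of floating-point rounding. The common factor $1/\sqrt{-i\tau}$ is exactly the one produced by the single $\eta$ in the denominators.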
The question we want to ask now is what kind of sum over $b$ one should consider. It seems at
first sight that, since we are dealing with SU(2|1), the $b$ number should be discrete, in agreement
with the imaginary exponential appearing in Section 2.2 (note that this is also compatible with
invariance under action of the spectral flow). It is likely that such a theory would make sense
in genus zero. Difficulties arise, however, when we try to establish modular invariance of the
partition function, i.e. consider the theory in genus one. Indeed, even if $b$ is discretized, there is
no reason to truncate its range, and thus the naive spectrum of conformal weights is unbounded
from below, and exhibits arbitrarily large negative dimensions. This should not come as a surprise
since the metric on the group is not positive definite, and thus the naive functional integral in
the WZNW model is divergent. What is required to obtain a physical partition function on the
torus (one that could be compared with the spectrum of some lattice Hamiltonian, say) is some
sort of analytic continuation.
This raises some interesting questions on which we would like to digress briefly. For a compactified time-like boson, partition functions would involve sums of the form
$$\sum_n e^{\alpha n^2}, \qquad \mathrm{Re}\,\alpha > 0.$$
This sum is obviously divergent, but one could be tempted to give it a meaning by analytic
continuation from a similar sum with $\mathrm{Re}\,\alpha < 0$. An equivalent problem arises when we try to
continue a theta function such as
$$\theta(\tau) = \sum_n e^{i\pi\tau n^2}$$
into the lower half plane $\mathrm{Im}\,\tau < 0$. It is known that this continuation is not possible, since the
function has singularities which are dense on the real axis (a quick proof is obtained by first
observing that $\theta$ is singular for $\tau$ an even integer, and then using modular transformations).
Theta functions have a natural boundary, and are simple examples of lacunary functions, i.e.
almost all their Fourier coefficients are zero [21]. The partition function of a compactified
time-like boson is thus a formal object from which it is hard to extract physical meaning. On the
other hand, without compactification, the partition function can easily be analytically continued.
Indeed, replacing the discrete sum by an integral we have
$$\int dx\, e^{i\pi\tau x^2} = \frac{1}{\sqrt{-i\tau}},$$
which can be continued in the lower half plane since it has a single cut along the negative imaginary axis.
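The Gaussian integral above can be checked numerically. The following is only a sketch: the function names, the sample value of $\tau$ and the integration window are illustrative choices, not taken from the paper. It compares a trapezoidal estimate for $\mathrm{Im}\,\tau > 0$ with the closed form $1/\sqrt{-i\tau}$, and then evaluates the closed form at a point in the lower half plane, where the integral itself no longer converges.

```python
import cmath
import math


def gaussian_integral(tau, half_width=10.0, steps=40_000):
    """Trapezoidal estimate of the integral of exp(i*pi*tau*x**2) over the real line.

    Only meaningful for Im(tau) > 0, where the integrand decays.
    """
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(1j * math.pi * tau * x * x)
    return total * h


def closed_form(tau):
    """1/sqrt(-i*tau) on the principal branch.

    The branch cut of cmath.sqrt lies on the negative real axis of -i*tau,
    i.e. on the negative imaginary axis of tau.
    """
    return 1.0 / cmath.sqrt(-1j * tau)


tau_upper = 0.3 + 0.8j   # illustrative point with Im(tau) > 0
tau_lower = 0.3 - 0.8j   # illustrative point with Im(tau) < 0

# Difference between numerical integral and closed form in the upper half plane.
print(abs(gaussian_integral(tau_upper) - closed_form(tau_upper)))
# The closed form remains finite below the real axis, away from the cut.
print(closed_form(tau_lower))
```

The discrete theta sum has no analogue of the second evaluation, which is the distinction the text draws between the compactified and the decompactified time-like boson.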
We thus restrict to the theory with continuous spectrum of b, and propose the simplest partition
function

$$Z_{k=1} = \Big( \big|\chi^{R}_{\{0,1/2\}}\big|^2 + \big|\chi^{NS}_{\{0,1/2\}}\big|^2 + \big|\chi^{sNS}_{\{0,1/2\}}\big|^2 \Big) \int db\,(q\bar q)^{-b^2/2} \tag{4.13}$$
which we interpret through analytic continuation, up to an irrelevant phase, as

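$$Z^{\rm phys}_{k=1} = \Big( \big|\chi^{R}_{\{0,1/2\}}\big|^2 + \big|\chi^{NS}_{\{0,1/2\}}\big|^2 + \big|\chi^{sNS}_{\{0,1/2\}}\big|^2 \Big) \int d\beta\,(q\bar q)^{\beta^2/2} \tag{4.14}$$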

2 2
= |0 | + |1/2 |
2

d (q q)

2 /2
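This object is obviously modular invariant since, from a direct calculation of the integrals,
$$\int d\beta\,(q\bar q)^{\beta^2/2} = \frac{1}{\sqrt{\tau\bar\tau}} \Big[ \int d\beta\,(q\bar q)^{\beta^2/2} \Big]_{\tau\to -1/\tau} \tag{4.15}$$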
hence compensating the factors coming from the $\eta$ functions in the characters. We note that the
spectrum of conformal weights in the periodic sector is a continuum starting at h = 1/8. The
field with h = 0 does not appear.
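The compensation mechanism can be illustrated numerically. The sketch below rests on two assumptions that are not taken from the paper: the Gaussian integral is evaluated in closed form as $1/\sqrt{2\,\mathrm{Im}\,\tau}$ (using $q = e^{2\pi i\tau}$), and it is paired with a single factor $1/|\eta(\tau)|^2$, which is the standard free boson combination rather than the full character content of the model. All function names are illustrative.

```python
import cmath
import math


def beta_integral(tau):
    """Integral over beta of (q qbar)^(beta^2/2) with q = exp(2*pi*i*tau).

    The integrand is exp(-2*pi*Im(tau)*beta^2), so the result is 1/sqrt(2*Im(tau)).
    """
    return 1.0 / math.sqrt(2.0 * tau.imag)


def eta(tau, terms=200):
    """Dedekind eta function from its product formula (truncated)."""
    q = cmath.exp(2j * math.pi * tau)
    product = 1 + 0j
    for n in range(1, terms + 1):
        product *= 1 - q ** n
    return cmath.exp(2j * math.pi * tau / 24) * product


def s_transform(tau):
    """Modular S transformation tau -> -1/tau."""
    return -1.0 / tau


def boson_combination(tau):
    """Integral divided by |eta|^2: the combination expected to be S-invariant."""
    return beta_integral(tau) / abs(eta(tau)) ** 2


tau = 0.3 + 0.8j  # illustrative modular parameter
# The integral alone picks up a factor |tau| under S ...
print(beta_integral(s_transform(tau)) / beta_integral(tau), abs(tau))
# ... which is cancelled by the transformation of 1/|eta|^2.
print(boson_combination(tau), boson_combination(s_transform(tau)))
```

Invariance under $\tau \to \tau + 1$ is immediate in this sketch, since neither $\mathrm{Im}\,\tau$ nor $|\eta(\tau)|$ changes.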
In conclusion, the requirement for our theory to possess a physical partition function has
forced us to let b be continuous. Geometrically, this amounts to a decompactification of the timelike circle. Hence we are led to consider the universal cover of SU(2|1) so that we can perform
an analytical continuation on the number b. One may interpret the prescription that leads to the
physical partition function as an effective change of the target space along the lines advocated
in [22]. Let us also stress that, while our arguments were based on the k = 1 theory, it is clear
that a similar reasoning can be carried out for other levels.
The argument leading to b being continuous also seems to exclude the smaller theories where
part of the complex J is dropped, at least for general values of k. We shall see an exception in
the case $k = -1/2$ later.
5. Some selected applications
The following section contains some selected applications of our general analysis. In the
first subsection we shall compare our results with studies of the continuum limit of the integrable
sl(2|1) $3$-$\bar 3$ spin chain [10]. The agreement we find supports a new interpretation of lattice results.
The second subsection is devoted to the $k = -1/2$ theory, which was not included above, but we
shall see that it shares many of the structures we uncovered throughout the last few sections.
5.1. The $3$-$\bar 3$ super-spin chain revisited
In [10] an integrable sl(2|1) invariant super-spin chain was studied using both analytical and
numerical techniques. Its Hamiltonian acts on the tensor product $(3\otimes\bar 3)^{\otimes L}$ where $3$ and $\bar 3$ stand
for the representations $\{1/2\}_\pm$ in our previous terminology. It was argued that in the continuum
limit this chain flows to a SU(2|1) WZNW model at level k = 1. At the time, the WZNW model
on the supergroup SU(2|1) had not been constructed and it seems instructive to revisit the issue
now on the basis of our improved understanding of the continuum field theory. We shall see that
the suggested identification with the continuum limit of the spin chain can be maintained, but
some of the lattice results receive an interesting reinterpretation.
Let us begin by reviewing briefly some results on the spectrum of the lattice model. In [10]
it was found analytically that this spectrum exhibits a unique ground state at $h = \bar h = 0$, which
lies in the single true singlet of the model, i.e. it is an sl(2|1) invariant state that is not part
of a larger indecomposable representation. This ground state corresponds to an extremely degenerate solution of the Bethe ansatz equations where all roots collapse to the origin. Besides
the ground state, many excited states were also found. The lowest lying state above the ground
state corresponds to a filled sea of some (non-complex conjugate) string complexes. The rest of
the spectrum is given by excitations obeying the usual pattern of holes and shifts of the sea. The
scaled energies of these excitations over the ground state were found analytically to be
L
1
1 1
E = + (N+ + N )2 + (D+ + D )2
2v
4 2
8
+ CN (L)(N+ N )2 + CD (L)(D+ D )2
(5.1)
where $C_N(L) \to 0$, $C_D(L) \to \infty$ for $L \to \infty$.

Here, $N_\pm$, $D_\pm$ are quantum numbers characterizing the Bethe ansatz solution. In the continuum
limit, the quantity (5.1) is expected to converge to $x = h + \bar h$ (all weights in the $c = 0$ theory),
as usual. Formula (5.1) indicates an infinite degeneracy of the level $h = \bar h = 1/8$ (obtained with
$N_+ = N_- = D_+ = D_- = 0$, say) in the limit $L \to \infty$. Numerical studies confirm this behavior: indeed, they show that an infinite number of levels converges to $h = \bar h = 1/8$ as $L$ increases.
This was already interpreted in [10] as indicating the existence of a continuum of conformal
weights starting at $h = \bar h = 1/8$ in the thermodynamic limit.
Although an analytical study of the asymptotic corrections to (5.1) seems still out of reach,
numerical studies in a closely related model suggest that the leading contributions to $C_N$ and $C_D$
can be well fitted by the formulas
c
4
(5.2)
+ ,
CD (L) ln L +
ln L
c
for large number 2L of lattice sites. When these leading terms are plugged back into the formula
(5.1) for the spectrum of the lattice model, we see that the second line resembles very much
the spectrum of a free boson which has been compactified to a circle with radius square of the
order ln L. In other words, if we assume that Eqs. (5.2) are correct, the contribution from the
antisymmetric sector (i.e. from excitations for which $N_+ - N_-$ or $D_+ - D_-$ are non-zero)
to the partition function can be estimated as
1 (e/R+mR/2)2 (e/RmR/2)2
Zanti =
q
q
e,m

R
R 2 |m m |2
1
=
exp
2 Im
2 Im

CN (L)
m,m
R
1

2 Im
(5.3)
where $\eta(q)$ is Dedekind's eta function, as before. The divergence is proportional to $R$, i.e. to the
size of the target space, as expected. We conclude that in the lattice model, the contribution from
the antisymmetric sector to the partition function
multiplies the contribution from the symmetric
sector in Eq. (5.1) by a term of the order of ln L. The ground state meanwhile, being a very degenerate Bethe ansatz solution, does not come with such an extra factor. The generating function
of levels in the periodic sector will therefore have the form

1
(q q)
1/8 +
Z R = 1 + cst ln L
(5.4)
Im P P
where the dots represent excitations from the symmetric sector in (5.1).
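The Poisson resummation underlying the estimate (5.3) can be checked numerically. The sketch below assumes the standard free boson normalization $h = \frac12(e/R + mR/2)^2$, $\bar h = \frac12(e/R - mR/2)^2$, which may differ from the conventions of (5.3) by a rescaling of $R$; the function names, cutoffs and sample values are illustrative. The common factor $1/|\eta|^2$ is omitted on both sides.

```python
import cmath
import math


def momentum_winding_sum(R, tau, cutoff=30):
    """Sum over (e, m) of q^h qbar^hbar in the assumed normalization."""
    log_q = 2j * math.pi * tau
    log_qbar = -2j * math.pi * tau.conjugate()
    total = 0j
    for e in range(-cutoff, cutoff + 1):
        for m in range(-cutoff, cutoff + 1):
            h = 0.5 * (e / R + m * R / 2) ** 2
            hbar = 0.5 * (e / R - m * R / 2) ** 2
            total += cmath.exp(log_q * h + log_qbar * hbar)
    return total


def resummed_sum(R, tau, cutoff=30):
    """Same quantity after Poisson resummation over e (sum over m and m')."""
    t2 = tau.imag
    total = 0.0
    for m in range(-cutoff, cutoff + 1):
        for mp in range(-cutoff, cutoff + 1):
            total += math.exp(-math.pi * R * R * abs(m * tau - mp) ** 2 / (2 * t2))
    return R / math.sqrt(2 * t2) * total


tau = 0.2 + 0.9j  # illustrative modular parameter
# Both forms of the lattice sum agree at moderate radius.
print(momentum_winding_sum(1.7, tau), resummed_sum(1.7, tau))
# At large radius only m = m' = 0 survives, leaving R / sqrt(2 Im tau).
print(resummed_sum(6.0, tau), 6.0 / math.sqrt(2 * tau.imag))
```

The second comparison shows the linear growth with $R$ that the text identifies with the size of the target space.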
The first conclusion we draw is that the contribution of the continuum completely overrides
the one from the discrete state (as would be the case in any quantum mechanics problem with
discrete states and a continuum with delta function normalizable states), and that a properly
normalized partition function does not see the singlet with $h = \bar h = 0$. The resulting object is in
good agreement with our conjectured partition function (4.14).
Reference [10] contained various failed attempts to build a conformal field theory containing
both the continuum of representations {b, j = 1/2} and a single identity field associated with the
representation {0}. Given our new insight into the continuum model, the problems to incorporate
the singlet state may not come as a complete surprise. Although the free field construction at
k = 1 suggests the possibility of smaller theories, the study of modular invariants (as well as of
four point functions, as we mentioned above) seems to preclude the appearance of the singlet
representation on its own, i.e. without being part of a big indecomposable with vanishing superdimension. In addition all the states we found in the continuum approach were non-normalizable.
Both observations lead us to speculate that Eq. (4.14) represents the full operator content of the
continuum limit, and that there is no discrete state associated with a true singlet. Put differently,
the new investigation suggests that the true singlet observed on the lattice is an artifact of the
regularization and does not belong to the continuum limit. Our new interpretation of the lattice
results receives additional support from the very singular nature of the Bethe ansatz solution that
corresponds to the singlet state. It would be interesting to check further the decoupling of the true
singlet by studying the scaling behavior of matrix elements of lattice regularized current algebra
generators.
There is one more potential objection one might raise. Note that in our continuum theory
fermionic and bosonic states are perfectly paired so that the Witten index of the SU(2|1) WZNW
model is guaranteed to vanish. Meanwhile, for our lattice spin chain on the space $(3\otimes\bar 3)^{\otimes L}$
one finds an excess by one for the number of bosonic states over the number of fermionic ones.
Hence, the Witten index is non-zero on the lattice and one would naively expect the same to be
true for the continuum limit, in conflict with what we have proposed above. In order to resolve
this issue, we suggest that there exist different spin chains which give rise to the same continuum
limit while possessing an excess of fermionic states over bosonic. More concretely, while we do
not understand the whole structure yet, we have found$^4$ that the ground states of integrable chains
of the type $(3\otimes\bar 3)^{\otimes L}\otimes 3$ and $\bar 3\otimes(3\otimes\bar 3)^{\otimes L}$ scale to conformal weight $h = 0$ as well (in fact the
ground state energy is given exactly by $E_0 = \text{length}\times e_0$, where $e_0$ has no finite size correction
and is the same for all chains), but this time they come in the representation $3$ (respectively $\bar 3$).
Once we sum over the various lattice models, the balance between bosonic and fermionic states
may be restored even before taking the continuum limit.
5.2. The WZNW model at $k = -1/2$
Our investigation above was restricted to integer level $k$. But as we have mentioned before,
there are some fractional values of $k$, in particular $k = -1/2$, which play an important role for
applications. While we are not prepared to give a systematic account of fractional level theories,
we would like to discuss briefly a model with $k = -1/2$. Our analysis will lead to the remarkable
conclusion that the basic structure of this model is essentially the same as for integer $k$, only that
there exist several components within the atypical sector, each of them being modeled after J.
In the case $k = -1/2$, the relevant representation theory of the sl(2|1) current algebra is
particularly simple. In fact, all relevant representations can be obtained from the vacuum sector
{0} through application of spectral flow symmetries. It is not difficult to show that at $k = -1/2$
the automorphism 2 is inner, i.e. 2 id. This means that application of 2 does not lead to
any new representations. The remaining non-trivial automorphisms are of the form +n with
n Z and = 0, 1. We shall denote the corresponding irreducible representations of the sl(2|1)
current algebra by

(n, )
= +n {0} .
By construction, this set closes under fusion. In fact, the fusion product simply amounts to a
composition of the associated automorphism.
With the exception of the sectors labeled by n = 0, 1, the representations {(n, )} do not
contain a highest or lowest weight. The representation {(0, 0)} is to be identified with the vacuum representation. {(0, 1)} = {0, 1/2} is the only other admissible representation at $k = -1/2$.
4 We thank F. Essler for kindly exploring this question numerically.
It is generated from the 4-dimensional typical multiplet {0, 1/2} of ground states with conformal
weight $h = 1/2$. In addition, there are four more highest/lowest weight representations which
are erected over the atypical discrete series representations {(1, )} = {(, 1/4)} and {(1, } = {(+, 1/4)},
corresponding to a negative spin $j = -1/4$. The choice of the sign
in the first argument of the bracket determines whether the representation is highest ($-$) or
lowest ($+$) weight. The subscript, on the other hand, corresponds to the two different choices of
the parameter $b$ that make these representations atypical. All four representations possess ground
states of conformal weight $h = 0$. In all other representations {(n, )} with $|n| \geq 2$, the conformal
weight is unbounded from below.
Since we can generate every representation from {0}, it suffices to display the character of
the vacuum representation,

1 3 (q, 1/2 ) 4 (q, 1/2 )
+
.
{0} (q, z, ) =
2 4 (q, z1/2 ) 3 (q, z1/2 )
We shall explain the origin of this formula in a moment. Characters of all the other representations are obtained from the vacuum character {0} through

n
n n+1
{(n, )} (q, z, ) = +n {0} (q, z, ) = 4 2 z 4 q 2 {0} q, q n2 , q nz z .
To derive the above character formula and for the subsequent discussion we note that the sl(2|1)
current algebra at level $k = -1/2$ possesses a free field representation which employs the same
free fields as in the case of the k = 1 theory, i.e. two free bosonic fields and with spacelike and time-like signature, respectively, and a pair of symplectic fermions 1 , 2 . The bosonic
sl(2|1) currents read
1
i

E + (z) = e2i (z) 2 1 (z)1 (z),
(5.5)
H (z) = (z),
2
2
1
i

E (z) = e2i (z) 2 2 (z)2 (z),
(5.6)
B(z) = (z).
2
2
Note that, unlike in the case of k = 1, the bosonic currents involve the symplectic fermions and
the time-like free boson. For the fermionic currents one finds
1
1

V + (z) = ei((z)+ (z)) 1 (z),
V (z) = ei((z) (z)) 2 (z),
2
2
1 i((z)+ (z))
1 i((z) (z))
+
W (z) = e
2 (z),
W (z) = e
1 (z).
2
2
As in the case of the k = 1 theory, the free field construction determines a consistent model with
an sl(2|1) current algebra symmetry. If we do not include the symplectic fermions (note that once
more the currents only involve derivatives), but only their derivatives then the state space reads

(n, ) (n, ) .
Hk=1/2 =
n,
Since the spectral flow automorphisms and correspond to multiplication with the fields
i
e 2 ( ) ,
ei ,
it is fairly easy to write down at least one field in each sector of the model,

n
n+2
(n, ) (n, ) contains ei 2 i 2 .
The space $H_{k=-1/2}$ contains R sector representations only, but it is certainly possible to include
the NS sector by adding the image under the spectral flow . Since this works just in the same
way as above, we shall not repeat the discussion here.
Even though all the representations we are working with are atypical, the state space decomposes into irreducible building blocks. This is quite different from the structure of the atypical
sectors J we described above. On the other hand, it is very similar to one of the consistent
theories with k = 1 that we described at the end of Section 4.2. In the k = 1 theory, the singular
vectors of the indecomposable block J were decoupled by restricting to a theory that contained
only derivatives of the fermionic fields. Conversely, the experience from $k = 1$ suggests that in
the $k = -1/2$ case we may be able to construct a theory with a more complicated atypical sector
by including one or both of the symplectic fermion fields 1 and 2 [23].
We claim that in case we include both fermionic zero modes we end up with an atypical
structure that decomposes into four different blocks, each of them being built in the same way as
our sector J . We shall present the analysis only for the block that contains the vacuum sector
{0} . The other three sectors are obtained by acting with , + and + . Let us start our
discussion with the field 1 2 . Any action with V , W and E will remove one of the two
fermionic zero modes and hence 1 2 sits at the top of a sector {0} . The action of V + , W +
and E + takes us from here into a set of fields which all contain a factor 1 . These fields can be
shown to belong to a sector that is isomorphic to {(2, 0)}
= +2 {0} . Further application of V ,
W and E bring us to a set of fields that contain only derivatives of fermions. These form a
subrepresentation {0} at the bottom of our atypical representation. A similar analysis applies
if we act with V , W and E first. This time, we descend to {0} via the sector {(2, 0)}
=
+2 {0} . Continuing along this line of thoughts, one can see that the sectors {(2n, 0)}, n Z,
form the composition series for an indecomposable representation J with {(2n, 0)} in place
of {n/2}. The state space of the maximal theory therefore decomposes into four indecomposable
blocks. Once more, there are two intermediate theories, each of which has four saw-blade shaped
atypical sectors. They are obtained if we omit either 1 or 2 (but not their derivatives, of course)
from the above maximal theory.
Even though we are not prepared to analyze WZNW models for generic fractional levels, it
is remarkable that the structure we have first uncovered in our minisuperspace limit, re-appears
even for $k = -1/2$. It seems very likely that the same is true for a generic choice of the level.
6. Conclusions and outlook
An obvious conclusion of our study is that WZNW models on supergroups are interesting
examples of logarithmic CFTs, much richer than anticipated in earlier works. Gurarie (see [24] and references therein), for instance, argued that super WZNW models with $c = 0$
could be considered as made of two decoupled component theories with opposite values of the
central charge, an observation justified in part by the fact that in the GL(1|1) WZNW model,
the stress energy tensor belongs naturally to a four-dimensional GL(1|1) multiplet in which L0
is diagonalizable, and hence $T$ has no non-trivial logarithmic partner. The SU(2|1) WZNW model
clearly does not obey any such decoupling. In fact, restricting to the right moving current algebra
as in [24] we see that the identity field belongs to a projective representation of the zero mode
algebra on which the Casimir, and hence $L_0$, is not diagonalizable. Applying $L_{-2}$ to this representation produces a Virasoro Jordan cell at level $h = 2$ and a non-trivial logarithmic partner
of the stress energy tensor. This can be seen quite explicitly in the case $k = 1$ where, within the
free field representation (and similarly to the case of symplectic fermions), the field
t (z) := :1 (z)2 (z)T (z):
is a logarithmic partner of
2
2
1
1
T := :1 (z)2 (z): : (z) : + : (z) :.
2
2
Note that the whole structure of indecomposables is in fact much more complicated than envisioned in [24] when the interplay of left and right current algebras is taken into account.
Even though the structure of the state space is rather difficult when analyzed with respect to
the combined left and right action, it is surprisingly simple once we restrict to either the left or the
right action alone. Note that the Lie superalgebra sl(2|1) has a large number of indecomposables
(see e.g. [12]) from which only a very distinguished sub-class does actually occur within the
state space of our model. In fact, we have seen above that all states (both in the minisuperspace
theory and the full field theory) transform according to the so-called projective representations
of sl(2|1), i.e. either in typicals or in projective covers of atypicals. This is not to say, however,
that other representations of sl(2|1) have no relevance for sigma models on supergroups. In addition to the left and right regular representation there is yet one more important symmetry that
arises from the adjoint action of sl(2|1) on the state space. With respect to the latter, states can
transform in other indecomposables. The underlying mathematical structure turns out to be quite
intriguing and will be described elsewhere. Since the adjoint action is left unbroken by maximally symmetric boundary conditions, the resulting decomposition should have applications, in
particular to the study of boundary conditions for sigma models on supergroups.
As a final comment let us point out one generic feature we have encountered in both GL(1|1)
and SU(2|1), namely that the contribution of the indecomposable sector J simply makes up
for the subtractions in the atypical sectors of the theory, so that the partition function sees only
contributions from Kac modules, and has a simple factorized form. This behavior is sufficient
for a modular invariant partition function but it is not necessary. The potential existence of different versions of the theory where only parts of the complex J appear, requires more study.
We note that some hints in this direction are provided by the study of four point functions. In
the case k = 1, the four point function of the fields in the {0, 1/2} representation has been studied in detail. It turns out that the KZ equations factorize, and that it is possible to decouple one
conformal block. Two blocks remain, leading to logarithmic dependence, and indicating that the
identity field remains part of an indecomposable representation. This suggests that the smallest
theory, where the complex J is reduced to an infinite sum of irreducibles, cannot appear in the
sl(2|1) WZNW model. In the case $k = -1/2$ meanwhile it is possible to decouple two conformal
blocks, leaving only the identity field, and indicating that the smallest theory does make sense
this time, a feature consistent with the free field representation and the modular invariant. The
sl(2|1) WZNW model at fractional level and the explicit construction of consistent theories with
a truncated atypical sector certainly deserve a more systematic investigation.
Acknowledgements
We thank Fabian Essler, Gerhard Götz, Thomas Quella and Anne Taormina for interesting
conversations. V.S. would like to thank the SPhT for the warm hospitality during several stays.
This work was partially supported by the EU Research Training Network grants "Euclid", contract number HPRN-CT-2002-00325, and "ForcesUniverse", contract number MRTN-CT-2004-005104.
Appendix A. The right regular representation

In this appendix we would like to prove the decomposition formula for the right regular representation. We shall use the same notations that were introduced in Section 2.2. In order to
analyze the decomposition of the space of functions under the right regular action of sl(2|1), we
shall first study its restriction to the Lie sub-superalgebra gl(1|1). More precisely, we shall make
use of the following embedding

= F ,
(E) = B H,
(N) = B + H.
+ = F +,
The main technical lemma of this section implies that under the action of gl(1|1), the space H of
functions on the supergroup SU(2|1) decomposes into projectives only.
Lemma. Under the action of RX R(X) of the generators X gl(1|1), the space H of functions
of SU(2|1) decomposes according to
H
=
j

j
P(2b + 1) 2 P(2b) P(2b 1) T .
b=j
Here, T is a direct sum of typical gl(1|1) representations and P(a) denotes the projective cover
of the atypical irreducible a.
Before we prove this statement, let us formulate two consequences for the right regular representation of sl(2|1). To begin with, let us recall from [12] that an sl(2|1) representation
descends on a projective representation of the embedded gl(1|1) algebra if and only if is projective. Our lemma claims that the gl(1|1) action on H contains only projectives. Hence, the
same must be true for the right regular action of sl(2|1).
Proof of Lemma. For the proof it will be useful to introduce the following odd functions

1/2
:= eiz/2 D(1/2) g 1 ,
= eiz .
It is not difficult to see that the space H is spanned by functions of the form
n,j
j
Fab = einz Dab (g)( , + , ),
where ( , + , ) is an arbitrary element in the algebra generated by the arguments. It is very

easy to describe explicitly the space of functions which are organized in atypicals of gl(1|1). The
latter is characterized by the vanishing of RE ,

AR = H iz + Rh0 + = 0 .
(A.1)
We can easily solve the equation for and describe the space AR explicitly. In fact, it is spanned
by the functions
b,j
j
Fab = eibz Dab (g)( , + , ).
On the subspace AR the other generators of gl(1|1) simplify to

RN = (2iz + + ),
R + = i+ ,
1/2
= ieiz/2 D(1/2) (g)
(A.2)
0
i RE
.
(A.3)
for all AR . The representation of gl(1|1) can be restricted to the space AR of all elements
AR such that = . A short look on the action of the gl(1|1) generators reveals that
AR
j

j
P(2b + 1) P(2b).
b=j
Similarly, we see that

AR /AR =
j

j
P(2b) P(2b 1).
b=j
Since all representations are projective we conclude that

AR =
j

j
P(2b + 1) 2 P(2b) P(2b 1).
b=j
This concludes the proof of our lemma.
References
[1] D. Bernard, (Perturbed) conformal field theory applied to 2D disordered systems: An introduction, hep-th/9509137.
[2] Ch. Mudry, C. Chamon, X.-G. Wen, Phys. Rev. B 53 (1996) R7638;
Ch. Mudry, C. Chamon, X.-G. Wen, Nucl. Phys. B 466 (1996) 383.
[3] M.J. Bhaseen, J.S. Caux, I.I. Kogan, A.M. Tsvelik, Disordered Dirac fermions: The marriage of three different approaches, Nucl. Phys. B 618 (2001) 465–499, cond-mat/0012240.
[4] Z. Maassarani, D. Serban, Non-unitary conformal field theory and logarithmic operators for disordered systems, Nucl. Phys. B 489 (1997) 603–625, hep-th/9605062.
[5] A.W.W. Ludwig, A free field representation of the osp(2|2) current algebra at level k = −2, and Dirac fermions in a random SU(2) gauge potential, cond-mat/0012189.
[6] P. Bowcock, M. Hayes, A. Taormina, Characters of admissible representations of the affine superalgebra sl(2|1), Nucl. Phys. B 510 (1998) 739–764, hep-th/9705234.
[7] L. Rozansky, H. Saleur, Quantum field theory for the multivariable Alexander–Conway polynomial, Nucl. Phys. B 376 (1992) 461–509.
[8] V. Schomerus, H. Saleur, The GL(1|1) WZW model: From supergeometry to logarithmic CFT, hep-th/0510032.
[9] G. Götz, T. Quella, V. Schomerus, The WZNW model on PSU(1, 1|2), hep-th/0610070.
[10] F.H.L. Essler, H. Frahm, H. Saleur, Continuum limit of the integrable sl(2|1) $3$-$\bar 3$ superspin chain, Nucl. Phys. B 712 (2005) 513–572, cond-mat/0501197.
[11] L. Frappat, P. Sorba, A. Sciarrino, Dictionary on Lie Algebras and Superalgebras, Academic Press, San Diego, CA, 2000. Extended and corrected version of the E-print hep-th/9607161.
[12] G. Götz, T. Quella, V. Schomerus, Representation theory of sl(2|1), hep-th/0504234.
[13] V.G. Kac, Lie superalgebras, Adv. Math. 26 (1977) 8–96.
[14] A. Hüffmann, On representations of supercoalgebras, J. Phys. A 27 (1994) 6421–6432, hep-th/9403100.
[15] P. Bowcock, A. Taormina, Representation theory of the affine Lie superalgebra sl(2|1, C) at fractional level, Commun. Math. Phys. 185 (1997) 467–493, hep-th/9605220.
[16] M. Hayes, A. Taormina, Admissible sl(2|1, C)(k) characters and parafermions, Nucl. Phys. B 529 (1998) 588–610, hep-th/9803022.
[17] A.M. Semikhatov, A. Taormina, Twists and singular vectors in sl(2|1) representations, Theor. Math. Phys. 128 (2001) 1236–1251, hep-th/0311166.
[18] A.M. Semikhatov, A. Taormina, I.Y. Tipunin, Higher level Appell functions, modular transformations, and characters, Commun. Math. Phys. 255 (2005) 469–512, math.qa/0311314.
[19] L. Rozansky, H. Saleur, S and T matrices for the U(1|1) WZW model: Application to surgery and three manifolds invariants based on the Alexander–Conway polynomial, Nucl. Phys. B 389 (1993) 365–423, hep-th/9203069.
[20] L.J. Mordell, Acta Math. 61 (1933) 323.
[21] T. Apostol, Modular Functions and Dirichlet Series in Number Theory, vol. 41, Springer-Verlag, Berlin, 1997.
[22] M. Bocquet, D. Serban, M.R. Zirnbauer, Disordered 2d quasiparticles in class D: Dirac fermions with random mass, and dirty superconductors, Nucl. Phys. B 578 (2000) 628.
[23] F. Lesage, P. Mathieu, J. Rasmussen, H. Saleur, Nucl. Phys. B 686 (2004) 313.
[24] V. Gurarie, A.W.W. Ludwig, Conformal field theory at central charge c = 0 and two-dimensional critical systems with quenched disorder, hep-th/0409105.
Renormalization group flows for the second $Z_N$ parafermionic field theory for $N$ odd
Vladimir S. Dotsenko a,b, Benoit Estienne a,b,*
a Laboratoire de Physique Théorique et Hautes Energies, 1 Université Pierre et Marie Curie, Paris-6, France
b CNRS, Université Denis Diderot, Paris-7, Boîte 126, Tour 25, 5ème étage, 4 place Jussieu,
F-75252 Paris Cedex 05, France

Received 1 March 2007; accepted 7 March 2007
Abstract
Using the renormalization group approach, the Coulomb gas and the coset techniques, the effect of
slightly relevant perturbations is studied for the second parafermionic field theory with the symmetry ZN ,
for N odd. New fixed points are found and classified.
Parafermionic conformal field theories describe systems enjoying conformal symmetry and a
cyclic symmetry ZN .
The first series of parafermionic conformal field theories appeared in 1985 [1]. Since then they
have been well studied and applied in various domains [2–4]. The second parafermionic series
$Z_N^{(2)}$ has been developed fairly recently [5–8] and it still awaits its applications.
In the case of the first series $Z_N^{(1)}$, a single conformal theory is associated with a given $N$. This is
different for the second series: for a given N , there exist an infinity of unitary conformal theories
(2)
ZN (p), with p = N 1, N 2, . . . . These theories correspond to degenerate representations
of the corresponding parafermionic chiral algebra. They are much richer in their content of
physical fields, as compared to the theories of the first series. They are also much more complicated. But, on the other hand, the presence of the parameter $p$, for a given $Z_N$, opens a way to
* Corresponding author at: Laboratoire de Physique Théorique et Hautes Energies, Université Pierre et Marie Curie,
Paris-6, France.
E-mail addresses: dotsenko@lpthe.jussieu.fr (V.S. Dotsenko), estienne@lpthe.jussieu.fr (B. Estienne).
1 Unité Mixte de Recherche UMR 7589.
doi:10.1016/j.nuclphysb.2007.03.018
V.S. Dotsenko, B. Estienne / Nuclear Physics B 775 [FS] (2007) 341–364
reliable perturbative studies. In particular, it allows one to study the renormalization group flows in
the space of these conformal theory models, under various perturbations.
In the theory of second order phase transition, it is widely accepted that fixed points of the
renormalization group should be described by conformal field theories. Saying it differently, a
CFT describes the critical point of some statistical system. In order to investigate the behavior
of the renormalization group in the vicinity of a fixed point, it is useful to study the effects of
slightly relevant perturbations of the corresponding conformal field theory.
In this paper we shall present results for the renormalization group flows of the $Z_N^{(2)}(p)$ theories with $N$ odd, being perturbed by two slightly relevant fields. In a previous letter [9] we have
studied the case of the $Z_5^{(2)}(p)$ theories. These results are here generalized to any odd integer $N$,
and more details are given.
The details of the $Z_N^{(2)}$ parafermionic theory with $N$ odd can be found in [6]. The $q$ charge
of $Z_N^{(2)}$ takes values in $Z_N$, so that in the Kac table of this theory one finds the $Z_N$ neutral fields,
of $q = 0$, the $q = 1, \ldots, N-1$ doublets, and the $Z_2$ disorder fields. The symmetry of the theory
is actually $D_N$, which is made of $Z_N$ rotations and the $Z_2$ reflections in $N$ different axes. These
last symmetry elements amount to the charge conjugation symmetry: $q \to -q$.
(2)
(2)
In this paper we will treat the case N odd. The first non-trivial ZN theory with N odd is Z3 ,
and its renormalization by slightly relevant fields has already been treated in [12,13]. So we will
analyze the case N 5. We expect different results from the case N = 3 since
(2)
Z3 (p) =
SU k (2) SU 4 (2)
SU k+4 (2)
(0.1)
is a SU(2) coset with a shift parameter 4, while the N 5 theories are SO(N ) cosets with a shift
parameter 2:
(2)
ZN (p) =
SOk (N ) SO2 (N )
.
SOk+2 (N )
(0.2)
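As a quick consistency check on the coset (0.2), its central charge can be computed from the Sugawara formula $c = k\,\dim g/(k + h^\vee)$. The sketch below assumes the standard values $\dim so(N) = N(N-1)/2$ and $h^\vee = N - 2$ for $N \geq 5$; the function names are illustrative and the relation between the level $k$ and the parameter $p$ is deliberately left out, since it is not stated in this section.

```python
from fractions import Fraction


def c_so(N, k):
    """Sugawara central charge of the so(N) current algebra at level k.

    Assumes dim so(N) = N(N-1)/2 and dual Coxeter number N - 2 (valid for N >= 5).
    """
    return Fraction(k * N * (N - 1), 2 * (k + N - 2))


def c_coset(N, k):
    """Central charge of the coset SO_k(N) x SO_2(N) / SO_{k+2}(N)."""
    return c_so(N, k) + c_so(N, 2) - c_so(N, k + 2)


for N in (5, 7):
    for k in (1, 2, 3, 10, 1000):
        # Exact rational value, followed by the large-level bound N - 1.
        print(N, k, c_coset(N, k), N - 1)
```

The level-2 factor contributes exactly $N - 1$, and the coset central charge approaches this value from below as $k$ grows, which is the regime where a perturbative treatment in a small parameter becomes plausible.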
1. Perturbing $Z_N^{(2)}$
A conformal theory can be seen as the field theory describing the critical point of some statistical system, i.e. the fixed point of the renormalization group. In order to get some insight into
the neighborhood of this critical point, one can study the effects of slightly relevant perturbations
of the corresponding conformal field theory. To do so, one needs to identify a set of slightly relevant fields: spinless fields with anomalous dimension $d = 2 - 2\epsilon$, $\epsilon \ll 1$. The action takes the
following form:
$$A = A_0 + \sum_i g_i \int d^2x\, \Phi_i(x). \tag{1.1}$$
For the perturbed theory to be renormalizable these fields should not produce additional slightly relevant fields when fused together: the set of slightly relevant fields must close with respect to the fusion rules.
Considering slightly relevant fields allows one to use perturbation theory. In the leading approximation in $\epsilon$, the renormalization group equations $\dot{g}_i = \beta_i(g)$ are obtained directly from the fusion rules of the relevant fields [10].
We want to perturb the second parafermionic theory, which we will denote as $Z_N^{(2)}$. First we need to identify a set of slightly relevant fields of this theory, which close under the operator algebra.
Slightly relevant fields are fields with conformal dimension $\Delta = \bar\Delta = 1 - \epsilon$, with $\epsilon \ll 1$ being a small positive parameter. Since there are no fields with negative dimension, slightly relevant fields are necessarily Virasoro primary. Note that this does not necessarily mean that they are $Z_N$ primary.
The perturbatively well-controlled domain of the $Z_N^{(2)}(p)$ theories is that of $p \gg 1$, giving a small parameter $\epsilon \sim 1/p$. This is similar to the original perturbative renormalization group treatment of minimal models for Virasoro algebra based conformal theory [10,11].
Since we want to conserve the $Z_N$ symmetry, we demand these fields to be neutral w.r.t. $Z_N$. Slightly relevant neutral fields can be of 2 sorts:
- a $Z_N$ primary, singlet ($q = 0$). We will denote these fields as $S$;
- a $Z_N$ neutral descendant of a doublet: $A = A^{q_1}_{-x_1} \cdots A^{q_n}_{-x_n} D^q$, with the neutrality condition $\sum_i q_i + q = 0 \bmod N$.
1.1. Singlet $S$
The Kac formula for $Z_N^{(2)}(p)$ has been given in [6]; it can be found in Appendix A. The conformal dimension of a primary singlet $S_{(\vec n|\vec n')}$ is given by
$$ \Delta_{S(\vec n|\vec n')} = \frac{((p+2)\vec n - p\,\vec n')^2 - 4\vec\rho^{\,2}}{4p(p+2)} = \frac{(\vec n - \vec n')^2}{4} - \epsilon\,\frac{\vec n'^{\,2} - \vec n^{\,2}}{2} + O(\epsilon^2). \tag{1.2} $$
There are infinitely many solutions to $\Delta = 1$ as $p \to \infty$. But we want a closed algebra of slightly relevant fields, with a bounded number of fields that does not depend on $p$. This is similar to the case of minimal models, in which there are many slightly relevant fields: all the $\Phi_{n,n+3}$. But the field $\Phi_{1,3}$ alone forms a closed algebra: its fusion does not generate the other fields $\Phi_{n,n+3}$ with $n > 1$, simply because the $\alpha_+$ side of $\Phi_{1,3}$ is trivial ($n = 1$). We will do a similar treatment here. In order to help ensure the closure of the algebra, we will impose the following condition: $\vec n = (1, 1, \ldots, 1)$, i.e. we demand the $\alpha_+$ side to be trivial.
There remains one unique singlet:
$$ S = S_{(1,1,\ldots,1|3,1,1,\ldots,1)}, \tag{1.3} $$
$$ \Delta_S = 1 - N\epsilon + O(\epsilon^2). \tag{1.4} $$
1.2. Fundamental descendant of a doublet $D^Q$
By fundamental descendant we mean a $Z_N$ descendant that is still Virasoro primary. The doublets $D^{Q=2q}$, $Q = 0, 1, \ldots, \frac{N-1}{2}$, have a non-trivial boundary term in their dimension.
Any $Z_N$ fundamental descendant $A = A^{q_1}_{-x_1} \cdots A^{q_n}_{-x_n} D^Q$ that satisfies the neutrality condition $\sum_i 2q_i + Q = 0 \bmod N$ necessarily has a gap $\sum_i x_i$ equal to the fundamental gap $\delta_Q = \frac{Q(Q-N)}{2N} \bmod 1$. The conformal dimension of such a descendant is (cf. Appendix B)
$$ \Delta_{A(1,1,\ldots|\vec n')} = \frac{((p+2)\vec\rho - p\,\vec n')^2 - 4\vec\rho^{\,2}}{4p(p+2)} + B_Q + \delta_Q. \tag{1.5} $$
We want $\Delta_A$ to be smaller than 1. This condition drastically reduces the admissible fields. The details of the analysis are given in Appendix B, and we find that there is one single neutral descendant of a doublet that is slightly relevant:
$$ A = \begin{cases} A^{-1}_{-\frac{2}{5}}\, \Phi_{(11|13)} & \text{for } N = 5, \\ A^{-1}_{-\frac{2}{N}}\, \Phi_{(111\ldots|121\ldots)} & \text{for } N \geq 7. \end{cases} \tag{1.6} $$
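Finally we have two $Z_N$ neutral fields which are slightly relevant. Since these are the only ones with a trivial $\alpha_+$ side, they necessarily form a closed algebra amongst all the slightly relevant fields. They are
$$ S = \Phi_{(111\ldots|311\ldots)}, \qquad A = \begin{cases} A^{-1}_{-\frac{2}{5}}\, \Phi_{(11|13)} & \text{for } N = 5, \\ A^{-1}_{-\frac{2}{N}}\, \Phi_{(111\ldots|121\ldots)} & \text{for } N \geq 7. \end{cases} \tag{1.7} $$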
We observe that the case $N = 5$ is slightly different from the case $N \geq 7$. This is somewhat conventional, caused by the notations adopted in the labeling of the primary fields (we could have redefined $\tilde\omega_r = 2\omega_r$ to absorb it). But we chose to keep the usual notations for the $B_r$ weights. In the following we will preferably treat the case $N \geq 7$ when we explicitly write the field $A$, the case $N = 5$ being treated in exactly the same way. In particular the final results hold in both cases: we observe the same phase diagram, with the same structure for the fixed points.
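The conformal dimensions of the fields (1.7) are given by (A.6). The dimensions of the fields $S$ and $A$ have the following values:
$$ \Delta_S = 1 - N\epsilon, \qquad \Delta_A = 1 - (N-2)\epsilon. \tag{1.8} $$
We have defined $\epsilon$ as follows:
$$ \epsilon = \frac{1}{p+2} \simeq \frac{1}{p}. \tag{1.9} $$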
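The values in (1.8) can be cross-checked against the Kac formulas (1.2) and (1.5). The sketch below is only an illustration: it assumes the standard orthonormal realization of the $B_r$ fundamental weights (long roots of squared length 2), and it assumes $B_Q + \delta_Q = 1/2$ for the doublet carrying $A$ (the value quoted for $N = 5$ in Section 2.1.3, taken here to hold for $N \geq 7$ as well). All function names are hypothetical.

```python
from fractions import Fraction


def weight_vector(labels):
    """Orthonormal-basis components of sum_i labels[i] * omega_i for B_r.

    Assumed convention: omega_i = e_1 + ... + e_i for i < r,
    omega_r = (e_1 + ... + e_r) / 2 (spinor weight).
    """
    r = len(labels)
    vec = [Fraction(0)] * r
    for i, n in enumerate(labels, start=1):
        for j in range(r):
            if i < r and j < i:
                vec[j] += n
            elif i == r:
                vec[j] += Fraction(n, 2)
    return vec


def norm2(v):
    return sum(x * x for x in v)


def kac_vertex_part(n, n_prime, p):
    """Vertex-operator part of the dimension, as in Eqs. (1.2) and (1.5)."""
    rho = weight_vector([1] * len(n))
    vn, vp = weight_vector(n), weight_vector(n_prime)
    diff = [(p + 2) * a - p * b for a, b in zip(vn, vp)]
    return (norm2(diff) - 4 * norm2(rho)) / (4 * p * (p + 2))


def delta_S(r, p):
    """Dimension of the singlet S = (1,1,...|3,1,...)."""
    return kac_vertex_part([1] * r, [3] + [1] * (r - 1), p)


def delta_A(r, p):
    """Dimension of A, assuming B_Q + delta_Q = 1/2 (see lead-in)."""
    n_prime = [1, 3] if r == 2 else [1, 2] + [1] * (r - 2)
    return kac_vertex_part([1] * r, n_prime, p) + Fraction(1, 2)
```

With $\epsilon = 1/(p+2)$ taken exactly, both relations in (1.8) come out as exact identities in $p$ under these assumptions, not only as leading-order statements.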
Perturbing with the fields $S$ and $A$ corresponds to taking the action of the theory in the form:
$$ A = A_0 + \frac{2g}{\pi} \int d^2x\, S(x) + \frac{2h}{\pi} \int d^2x\, A(x) \tag{1.10} $$
where $g$ and $h$ are the corresponding coupling constants; the additional factors $\frac{2}{\pi}$ are added to simplify the coefficients of the renormalization group equations which follow; $A_0$ is assumed to be the action of the unperturbed $Z_N^{(2)}(p)$ conformal theory.
It will be shown below that the operator algebra of the fields $S$ and $A$ is of the form:
$$ S(x')S(x) = \frac{D_{SSA}}{|x'-x|^{4\Delta_S - 2\Delta_A}}\, A(x) + \cdots, \tag{1.11} $$
$$ A(x')A(x) = \frac{D_{AAA}}{|x'-x|^{2\Delta_A}}\, A(x) + \cdots, \tag{1.12} $$
$$ S(x')A(x) = \frac{D_{SSA}}{|x'-x|^{2\Delta_A}}\, S(x) + \cdots. \tag{1.13} $$
Only the fields which are relevant for the renormalization group flows are shown explicitly in the r.h.s. of Eqs. (1.11)-(1.13). For instance, the identity operator is not shown in the r.h.s. of (1.11) and (1.12) while it is naturally present there. The operator algebra constants in (1.11) and (1.13) should obviously be equal, as the two equations can be related to a single correlation function $\langle S(x_1)S(x_2)A(x_3) \rangle$.
Assuming the operator algebra expansions in (1.11)-(1.13), one finds, in a standard way, the following renormalization group equations for the couplings $g$ and $h$:
$$ \frac{dg}{d\xi} = 2N\epsilon\, g - 4D_{SSA}\, gh, \tag{1.14} $$
$$ \frac{dh}{d\xi} = 2(N-2)\epsilon\, h - 2D_{AAA}\, h^2 - 2D_{SSA}\, g^2. \tag{1.15} $$
These equations derive from the potential:
$$ -\frac{c(g,h)}{24} = N\epsilon\, g^2 + (N-2)\epsilon\, h^2 - 2D_{SSA}\, g^2 h - \frac{2}{3} D_{AAA}\, h^3. \tag{1.16} $$
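That the flow equations (1.14) and (1.15) are the gradient of the cubic potential on the r.h.s. of (1.16) can be checked exactly with rational arithmetic. The following is a minimal sketch, not part of the original derivation: the helper `exact_derivative` is a hypothetical name, and it uses a Richardson-extrapolated central difference, which is exact for polynomials of degree at most 4. The numerical values of the constants in the test are arbitrary placeholders.

```python
from fractions import Fraction


def potential(g, h, N, eps, D_SSA, D_AAA):
    """Right-hand side of Eq. (1.16)."""
    return (N * eps * g**2 + (N - 2) * eps * h**2
            - 2 * D_SSA * g**2 * h - Fraction(2, 3) * D_AAA * h**3)


def beta_g(g, h, N, eps, D_SSA, D_AAA):
    """Right-hand side of Eq. (1.14)."""
    return 2 * N * eps * g - 4 * D_SSA * g * h


def beta_h(g, h, N, eps, D_SSA, D_AAA):
    """Right-hand side of Eq. (1.15)."""
    return 2 * (N - 2) * eps * h - 2 * D_AAA * h**2 - 2 * D_SSA * g**2


def exact_derivative(f, x, step=Fraction(1)):
    """Richardson-extrapolated central difference.

    For a polynomial of degree <= 4 the leading error terms cancel,
    so the result is the exact derivative when Fractions are used.
    """
    d1 = (f(x + step) - f(x - step)) / (2 * step)
    d2 = (f(x + 2 * step) - f(x - 2 * step)) / (4 * step)
    return (4 * d1 - d2) / 3
```

Used on the potential at an arbitrary rational point, the two partial derivatives reproduce `beta_g` and `beta_h` term by term.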
These are valid up to and including the first non-trivial order of the perturbations in $g$ and $h$.
The problem now amounts to justifying the operator algebra expansions in (1.11)-(1.13) and to calculating the constants $D_{SSA}$ and $D_{AAA}$.
An efficient method for calculating the operator product expansions and defining the corresponding coefficients is the Coulomb gas formalism [19].
Calculating directly the expansions of the products of the slightly relevant operators (1.7) encounters a problem: the explicit form of the Coulomb gas representation for the $Z_N^{(2)}(p)$ theory is not known. We shall get around this problem by using the coset representation for the $Z_N^{(2)}$ theory and the related techniques. In particular, we shall generalize the method developed in papers [12,13] for the SU(2) cosets.
2. Relating $Z_{2r+1}^{(2)}(p)$ and the $WB_r$ theories
The idea is to realize the parafermionic theory in terms of some simpler conformal theories. To do so we start with the coset representing $Z_N^{(2)}(p)$ [14]:
$$ \frac{SO_k(N) \times SO_2(N)}{SO_{k+2}(N)}, \qquad p = N - 2 + k. \tag{2.1} $$
Here $SO_k(N)$ is the orthogonal affine algebra of level $k$. This coset is a particular case of a symmetric coset $G_k \times G_l / G_{k+l}$, with a shift parameter $l = 2$. Generally speaking, the higher the shift parameter, the more complex the theory. For instance the number of sectors increases with this shift parameter, as the chiral algebra becomes richer. Following [12,13], we decompose the coset with shift $l = 2$ in terms of several simpler $l = 1$ cosets:
$$ Z_N^{(2)}(p) \otimes \frac{SO_1(N) \times SO_1(N)}{SO_2(N)} = \frac{SO_k(N) \times SO_1(N)}{SO_{k+1}(N)} \otimes \frac{SO_{k+1}(N) \times SO_1(N)}{SO_{k+2}(N)}. \tag{2.2} $$
The two $l = 1$ coset factors in the r.h.s., as well as the additional coset factor in the l.h.s., correspond to the $WB_r$ theories with rank $r = \frac{N-1}{2}$:
$$ WB_r^{(p)} = \frac{SO_k(2r+1) \times SO_1(2r+1)}{SO_{k+1}(2r+1)}, \qquad p = 2r - 1 + k. \tag{2.3} $$
Eq. (2.2) reads:
$$ Z_N^{(2)}(p) \otimes WB_r^{(2r)} = WB_r^{(p)} \otimes WB_r^{(p+1)}. \tag{2.4} $$
This equation relates the representations of the corresponding algebras. It could be re-expressed in terms of characters of representations, as is usually done in the analyses of cosets. But this equation also allows one to relate the conformal blocks of correlation functions. In doing so one relates the chiral (holomorphic) factors of physical operators. This latter approach has been developed and analyzed in great detail in the papers [12,13], for the SU(2) coset theories.
A $WB_r^{(p)}$ theory is a special case of $W$ theories. They have been defined in [15]. The $WB_r$ chiral algebra is made of $r$ bosonic currents $W^{(2k)}$ with conformal dimension $2k$, $k = 1, \ldots, r$, and one fermionic current with dimension $r + 1/2$, and its central charge is
$$ c_{WB_r^{(p)}} = \left(r + \frac{1}{2}\right)\left(1 - \frac{2r(2r-1)}{p(p+1)}\right). \tag{2.5} $$
A direct consequence of (2.4) is an equality for the central charges:
$$ c_{Z_N^{(2)}(p)} + c_{WB_r^{(2r)}} = c_{WB_r^{(p)}} + c_{WB_r^{(p+1)}}. \tag{2.6} $$
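The central charge relation (2.6) and the formula (2.5) can be verified directly from the cosets (2.1) and (2.3). The sketch below uses the standard Sugawara central charge $k \dim G/(k + h^\vee)$ with $\dim SO(N) = N(N-1)/2$ and $h^\vee = N - 2$ (valid for $N \geq 5$); these inputs are textbook facts rather than statements of the present paper, and the function names are illustrative only.

```python
from fractions import Fraction


def sugawara_c(N, k):
    """Central charge of the affine SO(N) algebra at level k (N >= 5)."""
    dim = Fraction(N * (N - 1), 2)
    h_dual = N - 2
    return k * dim / (k + h_dual)


def coset_c(N, k, l):
    """Central charge of the coset SO_k(N) x SO_l(N) / SO_{k+l}(N)."""
    return sugawara_c(N, k) + sugawara_c(N, l) - sugawara_c(N, k + l)


def c_Z(N, p):
    """Z_N^(2)(p) from the coset (2.1), with p = N - 2 + k."""
    return coset_c(N, p - (N - 2), 2)


def c_WB(r, p):
    """WB_r^(p) from the coset (2.3), with p = 2r - 1 + k."""
    return coset_c(2 * r + 1, p - (2 * r - 1), 1)


def c_WB_formula(r, p):
    """Closed form quoted in Eq. (2.5)."""
    return (r + Fraction(1, 2)) * (1 - Fraction(2 * r * (2 * r - 1), p * (p + 1)))
```

In this computation the extra factor $WB_r^{(2r)}$ on the l.h.s. of (2.4) has central charge exactly 1 for every rank.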
2.1. The $N = 5$ case
For the sake of simplicity we first treat the $N = 5$ case. The coset decomposition of $Z_5^{(2)}(p)$ is
$$ Z_5^{(2)}(p) \otimes WB_2^{(p=4)} = WB_2^{(p)} \otimes WB_2^{(p+1)}. \tag{2.7} $$
Our first step will be to identify $Z_5^{(2)}(p)$ primary fields in $WB_2^{(p)} \otimes WB_2^{(p+1)}$. These fields $\Phi_{(\vec n|\vec n')}$ are characterized by the $Z_5^{(2)}(p)$ Kac formula which fixes their conformal dimensions [6]:
$$ \Delta_{(\vec n|\vec n')} = \frac{((p+2)\vec n - p\,\vec n')^2 - 10}{4p(p+2)} + B, \tag{2.8} $$
$$ n_1 + n_2 < p + 2, \tag{2.9} $$
$$ n'_1 + n'_2 < p. \tag{2.10} $$
On the other hand, $WB_2^{(p)}$ primary fields also obey a Kac formula: the primary field $\Phi_{(\vec n|\vec n')}$ has conformal dimension [15]:
$$ \Delta_{(\vec n|\vec n')} = \frac{((p+1)\vec n - p\,\vec n')^2 - \frac{5}{2}}{2p(p+1)} + b, \tag{2.11} $$
$$ n_1 + n_2 < p + 1, \tag{2.12} $$
$$ n'_1 + n'_2 < p. \tag{2.13} $$
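When identifying fields between $Z_5^{(2)}(p) \otimes WB_2^{(p=4)}$ and $WB_2^{(p)} \otimes WB_2^{(p+1)}$, one obvious relation is the equality of the total conformal dimension. Using the Kac formulas of $Z_5^{(2)}(p)$ and $WB_2^{(p)}$, one can check the following identity:
$$ \tilde\Delta^{WB_r^{(p)}}_{(\vec n|\vec k)} + \tilde\Delta^{WB_r^{(p+1)}}_{(\vec k|\vec n')} = \tilde\Delta^{Z_N^{(2)}(p)}_{(\vec n|\vec n')} + \left(\vec k - \frac{\vec n + \vec n'}{2}\right)^2. \tag{2.14} $$
Here $\tilde\Delta_{(\vec n|\vec n')} = \Delta_{(\vec n|\vec n')} - B$ is the dimension of $\Phi_{(\vec n|\vec n')}$ minus the boundary term: it corresponds to the Coulomb gas vertex operator part of the dimension.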
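The identity (2.14) is a purely algebraic statement about the two Kac formulas, and it can be checked exactly for $N = 5$. The sketch below assumes the orthonormal realization $\omega_1 = (1, 0)$, $\omega_2 = (\frac12, \frac12)$ of the $B_2$ weights, for which $\vec\rho^{\,2} = 5/2$ in agreement with the constants 10 and $\frac52$ in (2.8) and (2.11); the function names are hypothetical.

```python
from fractions import Fraction

RHO = (Fraction(3, 2), Fraction(1, 2))  # Weyl vector of B_2 (assumed basis)


def b2_vector(labels):
    """Components of n1*omega_1 + n2*omega_2 in the orthonormal basis."""
    n1, n2 = labels
    return (Fraction(n1) + Fraction(n2, 2), Fraction(n2, 2))


def norm2(v):
    return sum(x * x for x in v)


def tilde_delta_wb(p, n, n_prime):
    """Eq. (2.11) without the boundary term b, for WB_2^(p)."""
    diff = [(p + 1) * a - p * b for a, b in zip(b2_vector(n), b2_vector(n_prime))]
    return (norm2(diff) - norm2(RHO)) / (2 * p * (p + 1))


def tilde_delta_z(p, n, n_prime):
    """Eq. (2.8) without the boundary term B, for Z_5^(2)(p)."""
    diff = [(p + 2) * a - p * b for a, b in zip(b2_vector(n), b2_vector(n_prime))]
    return (norm2(diff) - 4 * norm2(RHO)) / (4 * p * (p + 2))


def identity_holds(p, n, k, n_prime):
    """Check Eq. (2.14) for one choice of labels."""
    lhs = tilde_delta_wb(p, n, k) + tilde_delta_wb(p + 1, k, n_prime)
    mid = [(a + b) / 2 for a, b in zip(b2_vector(n), b2_vector(n_prime))]
    shift = norm2([a - b for a, b in zip(b2_vector(k), mid)])
    return lhs == tilde_delta_z(p, n, n_prime) + shift
```

The check does not use any admissibility condition on the labels: the relation holds for arbitrary weights, which is why the extra term on the r.h.s. is the only place where $\vec k$ enters.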
For operators, the coset relation (2.7) together with (2.14) motivates the following statement:
$$ \Phi^{(Z_5^{(2)}(p))}_{(\vec n|\vec n')} \otimes \Phi^{(WB_2^{(4)})}_{(\vec s|\vec s')} = \sum_{\vec k} a_{\vec k}\, \Phi^{(WB_2^{(p)})}_{(\vec n|\vec k)} \otimes \Phi^{(WB_2^{(p+1)})}_{(\vec k|\vec n')}. \tag{2.15} $$
The operators in this relation could be primaries or their descendants.
In other words, for any field from $Z_5^{(2)}(p)$, there are fields in $WB_2^{(p)}$, $WB_2^{(p+1)}$ and $WB_2^{(4)}$ such that the products of the fields as in (2.15) have the same correlation functions.
From (2.15) it appears that there exist some selection rules in $WB_2^{(p)} \otimes WB_2^{(p+1)}$: only diagonal cross-products are pertinent in this analysis. By diagonal we mean products of the form $\Phi_{(\vec n|\vec k)} \otimes \Phi_{(\vec k'|\vec n')}$ with $\vec k = \vec k'$. These features are discussed in much detail in the paper [12].
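Eq. (2.4) should in fact read:
$$ Z_5^{(2)}(p) \otimes WB_2^{(4)} = P\left(WB_2^{(p)} \otimes WB_2^{(p+1)}\right) \tag{2.16} $$
where $P$ is a projector. In terms of primary fields, $P$ projects the product of all fields $\{\Phi^{(p)}_{(\vec n|\vec k)} \otimes \Phi^{(p+1)}_{(\vec q|\vec n')}\} = WB_r^{(p)} \otimes WB_r^{(p+1)}$ to the subspace
$$ P\left(WB_2^{(p)} \otimes WB_2^{(p+1)}\right) = \left\{\Phi^{(p)}_{(\vec n|\vec k)} \otimes \Phi^{(p+1)}_{(\vec k|\vec n')}\right\}. \tag{2.17} $$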
2.1.1. The chiral algebra
It is interesting to take a closer look at the descendants of the identity, since they contain the chiral algebra of the theory. In particular one should be able to build the stress-energy tensor of the second parafermionic theory in terms of fields living in $WB_2^{(p)} \otimes WB_2^{(p+1)}$. More precisely, Eq. (2.15) enforces the following assumption: any chiral current $\Lambda(z)$ of $Z_5^{(2)}(p)$ should have a decomposition of the following form:
$$ \Lambda \otimes \Phi^{(p=4)} = \sum_{\vec k} a_{\vec k}\, \Phi^{(p)}_{(1,1|\vec k)} \otimes \Phi^{(p+1)}_{(\vec k|1,1)} \tag{2.18} $$
where the fields involved can be either primary fields or their descendants.
At level $\Delta = 0$, we get the trivial
$$ I^{Z_5^{(2)}(p)} \otimes I^{(p=4)} = I^{(p)} \otimes I^{(p+1)}. \tag{2.19} $$
At level $\Delta = 1$, there is not much freedom either: we have only one field of conformal dimension 1 in both $Z_5^{(2)}(p) \otimes WB_2^{(p=4)}$ and $WB_2^{(p)} \otimes WB_2^{(p+1)}$:
$$ I^{Z_5^{(2)}(p)} \otimes \Phi^{(p=4)}_{(1,1|3,1)} = \Phi^{(p)}_{(1,1|2,1)} \otimes \Phi^{(p+1)}_{(2,1|1,1)}. \tag{2.20} $$
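Things get more interesting for $\Delta = 2$. One can prove the following decompositions:
$$ T^{(p=4)} = \frac{p}{5(p+4)}\, T^{(p)} + \frac{p+2}{5(p-2)}\, T^{(p+1)} + \sqrt{\frac{2(p+5)(p-3)}{5(p+4)(p-2)}}\; \Phi^{(p)}_{(1,1|1,3)} \Phi^{(p+1)}_{(1,3|1,1)}, \tag{2.21} $$
$$ T^{Z_5^{(2)}(p)} = \frac{4}{5}\,\frac{p+5}{p+4}\, T^{(p)} + \frac{4}{5}\,\frac{p-3}{p-2}\, T^{(p+1)} - \sqrt{\frac{2(p+5)(p-3)}{5(p+4)(p-2)}}\; \Phi^{(p)}_{(1,1|1,3)} \Phi^{(p+1)}_{(1,3|1,1)}. \tag{2.22} $$
Here the cross term is the diagonal product with $\vec k = (1,3)$, whose two factors have dimensions $\frac{p-2}{p+1}$ and $\frac{p+4}{p+1}$, adding up to 2.
One can check that these fields obey the required OPEs:
$$ T^{(p=4)}(z)\, T^{(p=4)}(0) = \frac{1/2}{z^4} + \frac{2T^{(p=4)}}{z^2} + \frac{\partial T^{(p=4)}}{z} + O(1), \tag{2.23} $$
$$ T^{Z_5^{(2)}(p)}(z)\, T^{Z_5^{(2)}(p)}(0) = \frac{c(5,p)/2}{z^4} + \frac{2T^{Z_5^{(2)}(p)}}{z^2} + \frac{\partial T^{Z_5^{(2)}(p)}}{z} + O(1), \tag{2.24} $$
$$ T^{Z_5^{(2)}(p)}(z)\, T^{(p=4)}(0) = O(1). \tag{2.25} $$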
A few remarks are in order at this point. We are dealing with the holomorphic part of the fields only. So when doing the expansions of the products of $WB_2^{(p)}$ and $WB_2^{(p+1)}$ operators, one proceeds as follows:
(1) one works with the square roots of the constants;
(2) one keeps in these expansions the diagonal cross-products only: the products of $WB_2^{(p)}$ and $WB_2^{(p+1)}$ operators which appear in coset relations for operators, due to Eq. (2.16);
(3) one needs to know some $WB_2^{(p)}$ algebra constants, for which the Coulomb gas is known (cf. Section 3).
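The leading $z^{-4}$ terms of the OPEs (2.23)-(2.25) give three scalar consistency conditions on the coefficients of the two stress tensors, and these can be checked exactly. The sketch below is an illustration under stated assumptions: $T^{(p)}$ and $T^{(p+1)}$ are taken to commute, the dimension-2 cross term is taken to be normalized to a unit two-point function and to be orthogonal to both stress tensors, and the coefficient of the cross term is read as a single square root over the whole fraction. The $Z_5^{(2)}(p)$ central charge is obtained from (2.5) and (2.6). Names are hypothetical.

```python
from fractions import Fraction


def c_WB2(p):
    """Eq. (2.5) for r = 2."""
    return Fraction(5, 2) * (1 - Fraction(12, p * (p + 1)))


def c_Z5(p):
    """Central charge of Z_5^(2)(p) from Eq. (2.6)."""
    return c_WB2(p) + c_WB2(p + 1) - c_WB2(4)


def coefficients(p):
    """Coefficients of T^(p), T^(p+1) and squared cross-term coefficient in T^(p=4)."""
    alpha = Fraction(p, 5 * (p + 4))
    beta = Fraction(p + 2, 5 * (p - 2))
    gamma2 = Fraction(2 * (p + 5) * (p - 3), 5 * (p + 4) * (p - 2))
    return alpha, beta, gamma2


def central_terms(p):
    """Twice the z^-4 coefficients of the T4 T4, TZ TZ and TZ T4 OPEs."""
    alpha, beta, gamma2 = coefficients(p)
    c1, c2 = c_WB2(p), c_WB2(p + 1)
    t4_t4 = alpha**2 * c1 + beta**2 * c2 + 2 * gamma2
    tz_tz = (1 - alpha)**2 * c1 + (1 - beta)**2 * c2 + 2 * gamma2
    tz_t4 = alpha * (1 - alpha) * c1 + beta * (1 - beta) * c2 - 2 * gamma2
    return t4_t4, tz_tz, tz_t4
```

Under these assumptions the three numbers returned by `central_terms` are expected to be $1$, $c(5,p)$ and $0$, and the coefficients of $T^{(p)}$ and $T^{(p+1)}$ in (2.21) and (2.22) add up to 1.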
2.1.2. Singlets
Neutral primary fields in $Z_5^{(2)}(p)$ are referred to as singlets. They belong to the simplest sector of the $N = 5$ second parafermionic theory, and they enjoy a zero boundary term in their conformal dimensions. $\Phi_{(\vec n|\vec n')}$ is a singlet when $n'_1 - n_1 = 0 \bmod 2$ and $n'_2 - n_2 = 0 \bmod 4$. We will denote such a field $S_{(\vec n|\vec n')}$. We remark that $\vec k = \frac{\vec n + \vec n'}{2}$ is then an admissible weight of $B_2$. Eqs. (2.14) and (2.15) lead to
$$ S_{(\vec n|\vec n')} \otimes I^{(p=4)} = \Phi^{(p)}_{(\vec n|\frac{\vec n + \vec n'}{2})} \otimes \Phi^{(p+1)}_{(\frac{\vec n + \vec n'}{2}|\vec n')}. \tag{2.26} $$
In the l.h.s. of (2.26) the field in $WB_2^{(p=4)}$ is the identity $I$ so it can be dropped. The notation $\Phi^{(p)}$ stands for fields living in $WB_2^{(p)}$.
In general neutral operators are expected to have a simple decomposition. In particular they will have a trivial $I$ term in $WB_2^{(4)}$, allowing one to express them in $WB_2^{(p)} \otimes WB_2^{(p+1)}$ only. The fields we will use to perturb $Z_N^{(2)}(p)$ are neutral since we want to keep the $Z_N$ symmetry.
For instance we have the following identification:
$$ S = S^{Z_5^{(2)}(p)}_{(11|31)} = \Phi^{(p)}_{(11|21)} \otimes \Phi^{(p+1)}_{(21|31)}. \tag{2.27} $$
2.1.3. Doublets
Doublets $D^Q$ are charged primary fields of $Z_5^{(2)}(p)$ w.r.t. the $Z_5$ symmetry. They belong to a less trivial sector than singlets, and they have a non-trivial boundary term in their dimension. Therefore their decomposition in (2.15) requires a non-trivial field in $WB_2^{(p=4)}$, to make up for the missing boundary terms in Eq. (2.14).
But the $Z_5^{(2)}(p)$ fields we perturb with have to be neutral to conserve the $Z_5$ symmetry. Thus we are interested in neutral descendants of doublets, like $A^{-Q}_{-\delta_Q + n} D^Q$ with fundamental gap $\delta_Q = \frac{Q(Q-N)}{2N} \bmod 1$.
Taking into account both the boundary term and the descendant gap, Eqs. (2.14) and (2.15) give:
$$ A^{-Q}_{-\delta_Q} D^Q_{(\vec n|\vec n')} \otimes I^{(p=4)} = \sum_{\vec k} a(\vec k)\, \Phi^{(WB_2^{(p)})}_{(\vec n|\vec k)} \otimes \Phi^{(WB_2^{(p+1)})}_{(\vec k|\vec n')} \tag{2.28} $$
where the sum can be restricted, using (2.14), to $\vec k$ obeying $\left(\vec k - \frac{\vec n + \vec n'}{2}\right)^2 = B_Q + \delta_Q$, so that the sum is actually finite.
Let us take an example: the doublet $D^{q=1}_{(11|13)}$.
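This is one of the 2 fundamental $q = 1$ ($Q = 2$) doublets in $Z_5^{(2)}(p)$. The structure of its module is such that there is only one neutral descendant with gap $\delta_Q = \frac{2}{5}$, because one of the degeneracy conditions reads $A^{1}_{-\frac{2}{5}} D^{-1}_{(11|13)} = A^{-1}_{-\frac{2}{5}} D^{1}_{(11|13)}$ [6]. Applying (2.28) here gives:
$$ A^{-1}_{-\frac{2}{5}} D^{1}_{(11|13)} = \sum_{\vec k} a_{\vec k}\, \Phi^{(WB_2^{(p)})}_{(11|\vec k)} \otimes \Phi^{(WB_2^{(p+1)})}_{(\vec k|13)}. \tag{2.29} $$
Here the sum occurs for $(\vec k - (1, 2))^2 = B_Q + \delta_Q = \frac{1}{2}$, whose solutions are
$$ \vec k = \begin{cases} (1, 1), \\ (1, 3), \\ (2, 1). \end{cases} \tag{2.30} $$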
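The list (2.30) can be reproduced by a brute-force scan over $B_2$ weights. The following sketch assumes the orthonormal realization $\omega_1 = (1, 0)$, $\omega_2 = (\frac12, \frac12)$ and restricts to labels $k_i \geq 1$; the function name and the search bound `max_label` are illustrative choices.

```python
from fractions import Fraction


def b2_vector(labels):
    """Components of k1*omega_1 + k2*omega_2 in the orthonormal basis."""
    k1, k2 = labels
    return (Fraction(k1) + Fraction(k2, 2), Fraction(k2, 2))


def intermediate_weights(n, n_prime, target, max_label=8):
    """All k with k_i >= 1 and (k - (n + n')/2)^2 == target."""
    mid = [(a + b) / 2 for a, b in zip(b2_vector(n), b2_vector(n_prime))]
    solutions = []
    for k1 in range(1, max_label + 1):
        for k2 in range(1, max_label + 1):
            kv = b2_vector((k1, k2))
            dist2 = sum((x - m) ** 2 for x, m in zip(kv, mid))
            if dist2 == target:
                solutions.append((k1, k2))
    return solutions


# Doublet D_(11|13): the intermediate weights must sit at squared distance 1/2.
print(intermediate_weights((1, 1), (1, 3), Fraction(1, 2)))
```

All three solutions have odd $k_2$, so none of them is excluded by the Neveu-Schwarz requirement on intermediate fields.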
We note here that the fields in $WB_r$ are of two sorts: Neveu-Schwarz (if $n_r - n'_r$ is even) and Ramond (if $n_r - n'_r$ is odd). Ramond fields have a boundary term in their conformal dimension, $B_r = \frac{1}{16}$, therefore Eq. (2.14) excludes Ramond fields when we decompose a neutral field of $Z_N^{(2)}(p)$, so that $\vec k$ must also obey $k_r = 1 \bmod 2$.
(2.29) and (2.30) sum up to:
$$ A^{-1}_{-\frac{2}{5}} D^{1}_{(11|13)} = a\, \Phi^{(WB_2^{(p)})}_{(11|11)} \otimes \Phi^{(WB_2^{(p+1)})}_{(11|13)} + b\, \Phi^{(WB_2^{(p)})}_{(11|13)} \otimes \Phi^{(WB_2^{(p+1)})}_{(13|13)} + c\, \Phi^{(WB_2^{(p)})}_{(11|21)} \otimes \Phi^{(WB_2^{(p+1)})}_{(21|13)}. \tag{2.31} $$
The coefficients $a$, $b$, $c$ are still to be determined at this point. Several methods can be used to calculate them. One can use the expression of the stress-energy tensor $T^{(p=4)}$ and demand that the field $A^{-1}_{-\frac{2}{5}} D^{1}_{(11|13)} \otimes I^{(p=4)}$ has the right conformal dimension w.r.t. $WB_2^{(p=4)}$, i.e. $\Delta^{(p=4)} = 0$. Another way to determine these constants is through the fusion rules of $Z_5^{(2)}(p)$: imposing the fusion rule $A \otimes A \to A$, or $S \otimes S \to A$, will fix $a$, $b$, $c$ uniquely:
$$ S(z) \otimes S(0) \to \frac{\sqrt{D_{SSA}}}{z^{2\Delta_S - \Delta_A}} \left(A(0) + O(z)\right), \tag{2.32} $$
$$ A(z) \otimes A(0) \to \frac{1}{z^{2\Delta_A}} \left(I + O(z)\right). \tag{2.33} $$
Together with (2.27), (2.32) and (2.33) allow one to express the coefficients $a$, $b$, $c$ of Eq. (2.31) in terms of algebra constants of $WB_2$.
Injecting (2.31) in (2.33) gives:
$$ a^2 + b^2 + c^2 = 1. \tag{2.34} $$
Putting (2.31), (2.27) in (2.32) gives:
$$ \sqrt{D^{(p+1)}_{(21|31)(21|31)(11|13)}} = a \sqrt{D_{SSA}}, \tag{2.35} $$
$$ \sqrt{D^{(p)}_{(11|21)(11|21)(11|13)}\, D^{(p+1)}_{(21|31)(21|31)(13|13)}} = b \sqrt{D_{SSA}}, \tag{2.36} $$
$$ \sqrt{D^{(p)}_{(11|21)(11|21)(11|21)}\, D^{(p+1)}_{(21|31)(21|31)(21|13)}} = c \sqrt{D_{SSA}}. \tag{2.37} $$
For clarity we have adopted the following notations: $D^{(p)}_{(\ldots)}$ stands for a fusion constant of $WB_2^{(p)}$, while $D_{(\ldots)}$ corresponds to a $Z_5^{(2)}(p)$ constant.
Knowing the $WB_2$ algebra constants $D^{(\ldots)}_{(\ldots)}$ then allows one to determine $a$, $b$, $c$ and then $D_{SSA}$ and $D_{AAA}$. The problem of evaluating these $Z_5^{(2)}(p)$ algebra constants has been reduced to the calculation of some $WB_2^{(p)}$ algebra constants.
As was said above, the chiral factor operators are related to the conformal block functions, not to the actual physical correlators. On the other hand, the coefficients of the operator algebra expansions are defined by the three-point functions. These latter are factorizable into holomorphic and antiholomorphic functions. So when the relation is established on the level of chiral factor operators, for the holomorphic three-point functions, this relation can then be easily lifted to the relation for the physical correlation functions. In other words, with the relations for the chiral factor operators one should be able to define the square roots of the physical operator algebra constants.
In Section 3 we will calculate those $WB_2$ constants we need, to obtain for $A$ and $S$ the following decomposition at leading order in $\epsilon$:
$$ S = \Phi^{(p)}_{(11|21)} \otimes \Phi^{(p+1)}_{(21|31)}, \qquad A = \frac{1}{\sqrt{2}}\, \Phi^{(p)}_{(11|11)} \otimes \Phi^{(p+1)}_{(11|13)} + \frac{1}{\sqrt{2}}\, \Phi^{(p)}_{(11|13)} \otimes \Phi^{(p+1)}_{(13|13)}. \tag{2.38} $$
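2.2. The case $N \geq 7$
This construction can be generalized to the case $N = 2r + 1$ with $r \geq 3$. For the $Z_{2r+1}^{(2)}(p)$ parafermionic theory the coset relation reads:
$$ Z_{2r+1}^{(2)}(p) \otimes WB_r^{(2r)} = P\left(WB_r^{(p)} \otimes WB_r^{(p+1)}\right). \tag{2.39} $$
We recall that the two $Z_{2r+1}^{(2)}(p)$ slightly relevant fields $S$ and $A$ are:
$$ S = S_{(111\ldots|311\ldots)}, \qquad A = A^{-1}_{-\frac{2}{N}} D_{(111\ldots|121\ldots)}. \tag{2.40} $$
Using the same method as for the $N = 5$ case we obtain:
$$ S = \Phi^{(p)}_{(111\ldots|211\ldots)} \otimes \Phi^{(p+1)}_{(211\ldots|311\ldots)}, \qquad A = \frac{1}{\sqrt{2}}\, \Phi^{(p)}_{(111\ldots|111\ldots)} \otimes \Phi^{(p+1)}_{(111\ldots|121\ldots)} + \frac{1}{\sqrt{2}}\, \Phi^{(p)}_{(111\ldots|121\ldots)} \otimes \Phi^{(p+1)}_{(121\ldots|121\ldots)}. \tag{2.41} $$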
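2.3. Summary
We want to perturb $Z_N^{(2)}(p)$ with the fields $A$ and $S$. We need to know the algebra constants $D_{SSS}$, $D_{SSA}$, $D_{SAA}$ and $D_{AAA}$. Here are the relations obtained with the coset construction (2.2):
$$ \sqrt{D_{SSS}} = \sqrt{D^{(p)}_{(111\ldots|211\ldots)(111\ldots|211\ldots)(111\ldots|211\ldots)}\, D^{(p+1)}_{(211\ldots|311\ldots)(211\ldots|311\ldots)(211\ldots|311\ldots)}}, \tag{2.42} $$
$$ \sqrt{D_{SSA}} = a^{-1} \sqrt{D^{(p+1)}_{(211\ldots|311\ldots)(211\ldots|311\ldots)(111\ldots|121\ldots)}}, \tag{2.43} $$
$$ \sqrt{D_{SAA}} = \frac{b}{a} \sqrt{D^{(p+1)}_{(211\ldots|311\ldots)(121\ldots|121\ldots)(121\ldots|121\ldots)}}, \tag{2.44} $$
$$ \sqrt{D_{AAA}} = a \sqrt{D^{(p+1)}_{(111\ldots|121\ldots)(111\ldots|121\ldots)(111\ldots|121\ldots)}} + \frac{b^2}{a} \sqrt{D^{(p+1)}_{(121\ldots|121\ldots)(121\ldots|121\ldots)(111\ldots|121\ldots)}} + \frac{c^2}{a} \sqrt{D^{(p+1)}_{(211\ldots|311\ldots)(211\ldots|311\ldots)(111\ldots|121\ldots)}}. \tag{2.45} $$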
3. Calculation of the $WB_r$ algebra constants
The $WB_r$ Coulomb gas is known [15], therefore we have integral representations of the fusion algebra constants. Unfortunately we do not know how to calculate the most general form of these integrals. We will show in this part how to obtain the constants we need.
3.1. The $WB_r$ Coulomb gas
We need to calculate some fusion algebra constants of $WB_r^{(p)}$. For these theories the Coulomb gas representation is made of $r$ bosonic fields $\varphi_i$, quantized with a background charge, and the Ising model fields: $\psi$ (free fermion) and $\sigma$ (spin operator) [15].
The details about the $WB_r$ Coulomb gas are given in Appendix D. Three-point functions have the following form:
$$ \langle V_1(0) V_2(1) V_3(\infty) \rangle = \Big\langle V_1(0) V_2(1) \prod_{a=1}^{r} \frac{1}{k_a^+!} \prod_{i=1}^{k_a^+} \int d^2 u_i^{(a)}\, V_a^+\big(u_i^{(a)}, \bar u_i^{(a)}\big)\; \frac{1}{k_a^-!} \prod_{j=1}^{k_a^-} \int d^2 v_j^{(a)}\, V_a^-\big(v_j^{(a)}, \bar v_j^{(a)}\big)\; V_3(\infty) \Big\rangle \tag{3.1} $$
where $k_a^\pm$ are the numbers of screening operators $V_a^\pm$ required to ensure the neutrality condition:
$$ \sum_a k_a^+ \vec e_a = \sum_i (n_{1i} + n_{2i} - n_{3i} - 1)\, \vec\omega_i, \qquad \sum_a k_a^- \vec e_a = \sum_i (n'_{1i} + n'_{2i} - n'_{3i} - 1)\, \vec\omega_i. \tag{3.2} $$
As usual in the Coulomb gas approach, the vertex operators representing the primary fields have non-trivial normalizations.
We will denote as $C^{c}_{ab}$ the fusion constants obtained in the Coulomb gas representation (i.e. 3-point functions) and $D_{abc}$ the actual $WB_r$ constants:
$$ C^{c}_{ab} = \langle V_a(0) V_b(1) V_c(\infty) \rangle, \tag{3.3} $$
$$ D_{abc} = \langle \Phi_a(0) \Phi_b(1) \Phi_c(\infty) \rangle. \tag{3.4} $$
These two quantities are related by
$$ C^{c}_{ab} = N_a N_b N_c^{-1} D_{abc}, \tag{3.5} $$
$D_{abc}$ being symmetric under any permutation of $a$, $b$, $c$ and $N_a$ the normalization of the vertex $V_a$:
$$ N_a^2 = C^{I}_{aa} = \langle V_a(0) V_a(1) \rangle. \tag{3.6} $$
So that

N2
ka

1
+ (a) (a)
d2 u(a)
= V (0)V (1)
i
i Va ui , u
+
ka ! i=1
a=1

ka

1
2 (a) (a) (a)
d vj Va vj , vj
V20 () .
ka ! j =1
r

(3.7)
Unfortunately we do not know how to calculate these integrals in the general case.
3.2. Some easy integrals
Evaluating the general form of the integrals (3.7) could prove quite involved. Luckily we are interested in algebra constants involving fields with relatively small indices, so that the number of screening operators should remain reasonable. Furthermore, since one of the screening operators is fermionic, we can already predict the vanishing of some integrals: whenever a three-point function requires an odd number of fermionic screening operators, the corresponding constant will obviously be zero (at least in the Neveu-Schwarz sector).
This is the case for the following $WB_r$ constants:
$$ \forall (\vec n, \vec m), \qquad D_{(\vec n|\vec m)(\vec n|\vec m)(111\ldots|211\ldots)} = 0, \tag{3.8} $$
$$ D_{(\vec n|\vec m)(\vec n|\vec m)(211\ldots|311\ldots)} = 0. \tag{3.9} $$
Going back to the fields $A$ and $S$ in $Z_N^{(2)}(p)$, this implies the following trivial results:
$$ D_{AAS} = 0, \tag{3.10} $$
$$ D_{SSS} = 0. \tag{3.11} $$
3.3. Some other integrals
Integrals involving only a few screening operators can be calculated exactly. This is the case for the following algebra constants:
$$ \forall (\vec n, \vec m), \qquad C^{(\vec n|\vec m)}_{(\vec n|\vec m)(111\ldots|121\ldots)}. \tag{3.12} $$
For all these constants the neutrality condition reads:
$$ \sum_a k_a^+ \vec e_a = 0, \qquad \sum_a k_a^- \vec e_a = \vec\omega_2 \tag{3.13} $$
which fixes the number of screening operators:
$$ \vec k^+ = (0, 0, 0, \ldots, 0), \qquad \vec k^- = (1, 2, 2, \ldots, 2). \tag{3.14} $$
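The screening numbers in (3.14), and the parity argument behind (3.8) and (3.9), follow from expanding a weight on the simple roots of $B_r$. The sketch below assumes the orthonormal realization $\vec e_a = \epsilon_a - \epsilon_{a+1}$ for $a < r$ and $\vec e_r = \epsilon_r$, and it assumes that both screening operators attached to the short root $\vec e_r$ carry the fermion, so that the fermionic count is $k_r^+ + k_r^-$. Function names are hypothetical.

```python
from fractions import Fraction


def screening_numbers(labels):
    """Coefficients k_a in sum_i labels[i] * omega_i = sum_a k_a * e_a for B_r.

    In the orthonormal basis the coefficient k_a is the partial sum of the
    first a components of the weight.
    """
    r = len(labels)
    vec = [Fraction(0)] * r
    for i, m in enumerate(labels, start=1):
        for j in range(r):
            if i < r and j < i:
                vec[j] += m          # omega_i = eps_1 + ... + eps_i
            elif i == r:
                vec[j] += Fraction(m, 2)  # spinor weight
    counts, running = [], Fraction(0)
    for component in vec:
        running += component
        counts.append(running)
    return counts


def fermionic_count(plus_labels, minus_labels):
    """Total number of short-root screenings, k_r^+ + k_r^- (assumed fermionic)."""
    return screening_numbers(plus_labels)[-1] + screening_numbers(minus_labels)[-1]
```

Here `labels` stands for the combination $n_{1i} + n_{2i} - n_{3i} - 1$ of (3.2). For a constant of type (3.12) the minus side is $\vec\omega_2$, giving $(1, 2, \ldots, 2)$ and an even fermionic count, while the fields $(111\ldots|211\ldots)$ and $(211\ldots|311\ldots)$ lead to odd counts.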
(111...|121...)
For instance let us calculate C(111...|121...)(111...|121...) :

(111...|121...)
C(111...|121...)(111...|121...)

2
2
1
exp
i
.
(0)

exp
i
.
(0)

=
(2!)r1
2
2

(1) (1)
e1
(1)
d2 u1 exp i . u1 , u 1
2

(a) (a)
ea
(a)
d2 ui exp i . ui , u i
2
2ar1 i=1,2

(r) (r)
er
(r) (r) (r)
d2 ui ui u i exp i . ui , u i
2
i=1,2

2

.
exp 2
0 + i . ()
2
(r)
(r)
(3.15)
(r1)
(r1)
This integral can be evaluated by first integrating over (u1 , u2 ), then over (u1
, u2
),
etc.
Proceeding in this fashion we find the following results (we give only the leading order in
= p1 because that is all we need for the renormalization group method):
2r1
= 2(2r 1)
,

2r1
(121...|121...)
C(121...|121...)(111...|121...) = 2(2r 1)
2,

2r + 1 2r1
(211...|311...)
C(211...|311...)(111...|121...) =
,
2

2r1
(111...|211...)
C(111...|211...)(111...|121...) = 2r
.

(111...|121...)
C(111...|121...)(111...|121...)
(3.16)
(3.17)
(3.18)
(3.19)
3.4. Some more involved integrals
On the other hand, the calculation of $C^{(121\ldots|121\ldots)}_{(121\ldots|121\ldots)(121\ldots|121\ldots)}$ and $C^{(211\ldots|311\ldots)}_{(211\ldots|311\ldots)(121\ldots|121\ldots)}$ is a bit more involved. Because the number of screening operators is twice as large, the same method will not work:
$$ \vec k^+ = (1, 2, 2, \ldots), \qquad \vec k^- = (1, 2, 2, \ldots). \tag{3.20} $$
Instead we will calculate the following 4-point correlation function, and use it to derive a simpler expression for these constants:
$$ f(z, \bar z) = \langle \Phi_a(0)\, \Phi_{(111\ldots|121\ldots)}(z, \bar z)\, \Phi_{(121\ldots|111\ldots)}(1)\, \Phi_a(\infty) \rangle, \tag{3.21} $$
$\Phi_a$ being an arbitrary field.
This function is single-channeled, therefore it factorizes: $f(z, \bar z) = f(z) \bar f(\bar z)$,
$$ f(z) = \frac{P(z)}{z^{\Delta_{(111\ldots|121\ldots)}} (1 - z)^2}, \tag{3.22} $$
$P(z) = a_0 + a_1 z + a_2 z^2$ being a polynomial of degree 2 whose coefficients are fixed by the fusion rules:
$$ \Phi_{(111\ldots|121\ldots)}\, \Phi_{(121\ldots|111\ldots)} \to \sqrt{D_{(111\ldots|121\ldots),(121\ldots|111\ldots),(121\ldots|121\ldots)}}\; \Phi_{(121\ldots|121\ldots)}, \tag{3.23} $$
$$ \Phi_{(111\ldots|121\ldots)}\, \Phi_a \to \sqrt{D_{a,a,(111\ldots|121\ldots)}}\; \Phi_a, \tag{3.24} $$
$$ \Phi_{(121\ldots|121\ldots)}\, \Phi_a \to \sqrt{D_{a,a,(121\ldots|121\ldots)}}\; \Phi_a. \tag{3.25} $$
We find that
$$ a_0 = a_2 = \sqrt{D_{a,a,(111\ldots|121\ldots)}\, D_{a,a,(121\ldots|111\ldots)}}, \tag{3.26} $$
$$ P(1) = \sqrt{D_{(111\ldots|121\ldots),(121\ldots|111\ldots),(121\ldots|121\ldots)}\, D_{a,a,(121\ldots|121\ldots)}}. \tag{3.27} $$
An important point here is that for $W$ theories the modes $W^{(2n)}_{-1}$ are proportional to $L_{-1} = \partial$. So the only descendant at level 1 of any primary field $\Phi$ is just $\partial\Phi$. Thus we can write the next term in the fusion of $\Phi_{(111\ldots|121\ldots)}$ with $\Phi_a$:
$$ \Phi_{(111\ldots|121\ldots)}(z)\, \Phi_a(0) \to \frac{\sqrt{D_{a,a,(111\ldots|121\ldots)}}}{z^{\Delta_{(111\ldots|121\ldots)}}} \left( \Phi_a(0) + \beta_1 z\, \partial\Phi_a(0) + O(z^2) \right) \tag{3.28} $$
where $\beta_1$ is fixed by conformal invariance alone:
$$ \beta_1 = \frac{\Delta_{(111\ldots|121\ldots)}}{2\Delta_a}. \tag{3.29} $$
That way we have the additional relation:
$$ P(1) = a_0\, \frac{\Delta_{(111\ldots|121\ldots)}\, \Delta_{(121\ldots|111\ldots)}}{2\Delta_a} \tag{3.30} $$
which translates into
$$ \sqrt{D_{a,a,(121\ldots|121\ldots)}} = \frac{\Delta_{(111\ldots|121\ldots)}\, \Delta_{(121\ldots|111\ldots)}}{2\Delta_a}\, \frac{\sqrt{D_{a,a,(111\ldots|121\ldots)}\, D_{a,a,(121\ldots|111\ldots)}}}{\sqrt{D_{(111\ldots|121\ldots),(121\ldots|111\ldots),(121\ldots|121\ldots)}}}. \tag{3.31} $$
Going back to the Coulomb gas, this involves the following constants:
$$ C^{a}_{a\,(111\ldots|121\ldots)}, \qquad C^{a}_{a\,(121\ldots|111\ldots)} \tag{3.32} $$
which we know how to calculate (cf. Section 3.3) and the trivial $C^{(121\ldots|121\ldots)}_{(111\ldots|121\ldots)(121\ldots|111\ldots)} = 1$, since it involves no screening operator.
3.5. The normalization integrals
The only quantity which remains now is the normalization of the Coulomb gas vertex operators. Trying to evaluate (3.7) directly will encounter the same kind of problem we just had: too many screening operators are involved.
Alternatively, recalling the cyclic symmetry of $D_{abc}$, one finds:
$$ C^{c}_{ab} = N_a N_b N_c^{-1} D_{abc}, \tag{3.33} $$
$$ C^{b}_{ac} = N_a N_c N_b^{-1} D_{abc}. \tag{3.34} $$
This leads to the following identity:
$$ \left( \frac{N_b}{N_c} \right)^2 = \frac{C^{c}_{ab}}{C^{b}_{ac}}. \tag{3.35} $$
For instance $N^2_{(111\ldots|121\ldots)} = C^{(111\ldots|111\ldots)}_{(111\ldots|121\ldots)(111\ldots|121\ldots)}$ cannot be evaluated directly. But (3.35) gives us the following expression:
$$ \forall (a, c), \qquad \left( \frac{N_{(111\ldots|121\ldots)}}{N_c} \right)^2 = \frac{C^{c}_{a\,(111\ldots|121\ldots)}}{C^{(111\ldots|121\ldots)}_{a\,c}}. \tag{3.36} $$
Now, choosing the fields $a$ and $c$ carefully makes the calculation possible: for instance one can take $a = c = (111\ldots|211\ldots)$. The constraints, when choosing these fields, are the following:
- one has to be able to evaluate $C^{c}_{ab}$ and $C^{b}_{ac}$,
- one has to be able to calculate $N_c$,
- $C^{c}_{ab} \neq 0$.
We obtain finally:
(111...|211...)
2
N(111...|121...)
C(111...|211...)(111...|121...)
(111...|121...)
C(111...|211...)(111...|211...)
(111...|111...)
C(111...|211...)(111...|211...) .
All the constants appearing here are then evaluated as in Section 3.3. That way we find:
2(2r1)

2
1 + O() .
N(111...|121...) = r(2r + 1)

(3.37)
(3.38)
Generalizing this method allows one to evaluate all normalizations we need. For instance:
(211...|121...)
(211...|211...)

N(121...|121...) 2 C(211...|111...)(121...|121...) C(111...|211...)(211...|121...)
= (121...|121...)
(211...|121...)
N(111...|211...)
C(211...|111...)(211...|121...) C(111...|211...)(211...|211...)
(111...|211...)
C(211...|111...)(211...|211...)
(3.39)
(211...|211...)
C(211...|111...)(111...|211...)
which leads to
4(2r1)

(3.40)
1 + O() .
=

This way one obtain the square of the vertex operator normalizations.
One has to be careful

2
N(121...|121...)
when taking the square root, and make an analytic continuation of
N2 as a function of .
3.6. Results
Now we know all the $WB_r$ constants we need. We note that $\Phi_{(111\ldots|121\ldots)}$ is the only slightly relevant field of $WB_r^{(p)}$, with the following algebra constant:
$$ D_{(111\ldots|121\ldots)(111\ldots|121\ldots)(111\ldots|121\ldots)} = \frac{2(2r-1)}{\sqrt{r(2r+1)}}. \tag{3.41} $$
This implies that $WB_r^{(p)}$, being perturbed by the field $\Phi_{(111\ldots|121\ldots)}$, flows towards $WB_r^{(p-1)}$. This confirms the observation, made with the SU(2) cosets [12,13] and, more generally, with the cosets for the simply laced algebras [16], that the perturbation of a coset theory caused by an appropriate operator drives $p$ to $p - \Delta p$, $\Delta p$ being equal to the shift parameter of the coset.
Armed with the $WB_r$ constants we deduce the following results (at leading order in $\epsilon$):
- from (2.35), (2.36) we get the full decomposition (2.31) of the field $A$:
$$ a = b = \frac{1}{\sqrt{2}}, \qquad c = 0, \tag{3.42} $$
- and then the $Z_{2r+1}^{(2)}(p)$ constants we need:
$$ D_{AAA} = \frac{2r-1}{\sqrt{r(2r+1)}}, \tag{3.43} $$
$$ D_{SSA} = \sqrt{\frac{2r+1}{r}}. \tag{3.44} $$
(p)
4. Renormalization group flows for ZN
4.1. Beta functions

We have obtained the values of DAAA and DSSA at leading order in . The renormalization
group equations for the couplings g and h are then given by

2r + 1
dg
= 2(2r + 1)g 4
gh,
g =
(4.1)
d
r
357
Fig. 1. Fixed points of the renormalization group.

2r + 1 2
dh
(2r 1) 2
h 2
h =
= 2(2r 1)h 2
g .
d
r
r(2r + 1)
These are up to (including) the first non-trivial order of the perturbations in g and h.
These equations derive from a potential:
(4.2)
g = g V (g, h),
(4.3)
h = h V (g, h)
(4.4)
with
2r + 1 2
2 (2r 1) 3
h .
g h
(4.5)
r
3 r(2r + 1)
This potential plays a central role in the renormalization group flows. Let us consider the
function c(g, h) defined by c(g, h) = c0 V (g,h)
24 : this is the c-function introduced by Zamolodchikov, which decreases along the renormalization group flows, and coincide with the central
charge at any fixed point.
At this point we can directly analyze the presence of IR fixed points for the renormalization
group, and predict the corresponding central charges.
The phase diagram of constants g and h contains (Fig. 1):
V (g, h) = (2r + 1)g + (2r 1)h 2
2
the UV fixed point g0 = h0 = 0, which obviously corresponds to the theory ZN (p),

the IR fixed point on the h axis:

g1 , h1 = 0, r(2r + 1) ,
(2)
(4.6)
358
Fig. 2. Renormalization group flows. They are symmetrical with respect to g g.
two additional IR fixed points for non-vanishing values of the two couplings:

1
1
g2 , h2 =
r(2r 1),
r(2r + 1) ,
2
2

1
1
g3 , h3 =
r(2r 1),
r(2r + 1) .
2
2
(4.7)
To identify what conformal theory we have at these IR fixed points, we evaluate the central
charge using the potential (4.5). We find that the value of the central charge at the point (g1 , h1 ) =
(2)
(0, r(2r + 1) ) agrees with that of the theory ZN (p 2). This fixed point was to be expected.
On the other hand, the appearance of two extra fixed points, (g2 , h2 ) and (g3 , h3 ), is somewhat
surprising. By the value of the central charge, the two critical points correspond to the theory
(2)
ZN (p 1). (See Fig. 2.)
To check this statement, we evaluated the anomalous dimensions of some particular fields.
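The fixed points (4.6), (4.7) and their central charges can be checked numerically. The sketch below is an illustration under stated assumptions: it takes $\epsilon = 1/(p+2)$, uses $c = c_0 - 24V$ at the fixed points, and computes the exact central charges of $Z_N^{(2)}(p)$ from the coset (2.1) with the standard Sugawara formula (dimension $N(N-1)/2$ and dual Coxeter number $N - 2$ for SO($N$)). Since the flow equations are leading-order in $\epsilon$, agreement is only expected up to relative corrections of order $1/p$. All function names are hypothetical.

```python
import math
from fractions import Fraction


def sugawara(N, k):
    """Central charge of affine SO(N) at level k (N >= 5)."""
    return Fraction(k * N * (N - 1), 2 * (k + N - 2))


def c_Z(N, p):
    """Central charge of Z_N^(2)(p) from the coset (2.1), p = N - 2 + k."""
    k = p - (N - 2)
    return sugawara(N, k) + sugawara(N, 2) - sugawara(N, k + 2)


def constants(r):
    """Leading-order D_SSA and D_AAA, Eqs. (3.44) and (3.43)."""
    return math.sqrt((2 * r + 1) / r), (2 * r - 1) / math.sqrt(r * (2 * r + 1))


def beta(g, h, r, eps):
    """Right-hand sides of Eqs. (4.1) and (4.2)."""
    d_ssa, d_aaa = constants(r)
    beta_g = 2 * (2 * r + 1) * eps * g - 4 * d_ssa * g * h
    beta_h = 2 * (2 * r - 1) * eps * h - 2 * d_aaa * h**2 - 2 * d_ssa * g**2
    return beta_g, beta_h


def potential(g, h, r, eps):
    """Potential V(g, h) of Eq. (4.5)."""
    d_ssa, d_aaa = constants(r)
    return ((2 * r + 1) * eps * g**2 + (2 * r - 1) * eps * h**2
            - 2 * d_ssa * g**2 * h - (2.0 / 3.0) * d_aaa * h**3)


def fixed_points(r, eps):
    """IR fixed points of Eqs. (4.6) and (4.7)."""
    s_plus = math.sqrt(r * (2 * r + 1))
    s_minus = math.sqrt(r * (2 * r - 1))
    return [(0.0, s_plus * eps),
            (0.5 * s_minus * eps, 0.5 * s_plus * eps),
            (-0.5 * s_minus * eps, 0.5 * s_plus * eps)]
```

For large $p$ the drop $24V$ at the first point approaches $c(p) - c(p-2)$, and at the other two points it approaches $c(p) - c(p-1)$, which is the comparison made in the text.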
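4.2. Some gamma functions
Our classification of these fixed points has further been verified by calculating the critical dimensions at these points of the operators $\Phi_{(1,n,\ldots|1,n,\ldots)}$ and $\Phi_{(n,1,\ldots|n,1,\ldots)}$.
The gamma function giving the evolution of the dimension of a field of type $\Phi_{(\vec n|\vec n)}$ is [10,11]:
$$ \frac{d\, 2\Delta_{(\vec n|\vec n)}}{d\xi} = \gamma_{(\vec n|\vec n)} = 4h D_{A(\vec n|\vec n)(\vec n|\vec n)} + 4g D_{S(\vec n|\vec n)(\vec n|\vec n)}. \tag{4.8} $$
We use the same techniques to evaluate the $Z_N^{(2)}(p)$ constants $D_{A(\vec n|\vec n)(\vec n|\vec n)}$ and $D_{S(\vec n|\vec n)(\vec n|\vec n)}$. First we identify the field $\Phi_{(\vec n|\vec n)}$ in $WB_r^{(p)} \otimes WB_r^{(p+1)}$:
$$ \Phi^{Z_N^{(2)}(p)}_{(\vec n|\vec n)} = \Phi^{WB_r^{(p)}}_{(\vec n|\vec n)} \otimes \Phi^{WB_r^{(p+1)}}_{(\vec n|\vec n)}. \tag{4.9} $$
We note that (4.9) holds true only because $\Phi^{Z_N^{(2)}(p)}_{(\vec n|\vec n)}$ is always a neutral field.
We recall that
$$ S = S^{Z_N^{(2)}(p)}_{(111\ldots|311\ldots)} = \Phi^{WB_r^{(p)}}_{(111\ldots|211\ldots)} \otimes \Phi^{WB_r^{(p+1)}}_{(211\ldots|311\ldots)}. \tag{4.10} $$
We can already see that $D_{S(\vec n|\vec n)(\vec n|\vec n)} = 0$ since it involves some $WB_r$ constants with an odd number of fermionic screening operators:
$$ D_{S(\vec n|\vec n)(\vec n|\vec n)} = D_{(111\ldots|211\ldots)(\vec n|\vec n)(\vec n|\vec n)}\, D_{(211\ldots|311\ldots)(\vec n|\vec n)(\vec n|\vec n)} = 0. \tag{4.11} $$
Therefore $\gamma_{(\vec n|\vec n)}$ simplifies for singlets into
$$ \gamma_{(\vec n|\vec n)} = 4h D_{A(\vec n|\vec n)(\vec n|\vec n)}. \tag{4.12} $$
The problem now amounts to calculating $D_{A(\vec n|\vec n)(\vec n|\vec n)}$. We use the expression
$$ A = A^{-1}_{-\frac{2}{N}} D^{q=1}_{(111\ldots|121\ldots)} = \frac{1}{\sqrt{2}}\, \Phi^{(WB_r^{(p)})}_{(111\ldots|111\ldots)} \otimes \Phi^{(WB_r^{(p+1)})}_{(111\ldots|121\ldots)} + \frac{1}{\sqrt{2}}\, \Phi^{(WB_r^{(p)})}_{(111\ldots|121\ldots)} \otimes \Phi^{(WB_r^{(p+1)})}_{(121\ldots|121\ldots)} \tag{4.13} $$
to obtain
$$ D_{A(\vec n|\vec n)(\vec n|\vec n)} = 2 D^{(p+1)}_{(111\ldots|121\ldots)(\vec n|\vec n)(\vec n|\vec n)}. \tag{4.14} $$
The integral corresponding to $D_{(111\ldots|121\ldots)(\vec n|\vec n)(\vec n|\vec n)}$ has been estimated in the following cases:
- $\vec n = (n11\ldots)$: $D_{(111\ldots|121\ldots)(\vec n|\vec n)(\vec n|\vec n)} = \frac{(n-1)(2r+n-2)}{\sqrt{r(2r+1)}}\, \epsilon^2$,
- $\vec n = (1n1\ldots)$: $D_{(111\ldots|121\ldots)(\vec n|\vec n)(\vec n|\vec n)} = \frac{2(n-1)(2r+n-3)}{\sqrt{r(2r+1)}}\, \epsilon^2$.
So in these 2 cases the $\gamma$ function becomes:
$$ \gamma_{(n11\ldots|n11\ldots)} = 8h\epsilon^2\, \frac{(n-1)(2r+n-2)}{\sqrt{r(2r+1)}}, \qquad \gamma_{(1n1\ldots|1n1\ldots)} = 8h\epsilon^2\, \frac{2(n-1)(2r+n-3)}{\sqrt{r(2r+1)}}. $$
These values are in agreement with the statement that the field $\Phi^{(p)}_{(\vec n|\vec n)}$ flows towards $\Phi^{(p-k)}_{(\vec n|\vec n)}$:
$$ \Phi^{(p)}_{(\vec n|\vec n)} \to \Phi^{(p-k)}_{(\vec n|\vec n)} \tag{4.15} $$
with $k = 2$ at the fixed point 1 and $k = 1$ at fixed points 2 and 3.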
5. Discussion
In this paper, we have studied the effect of two slightly relevant perturbations for the second parafermionic theory $Z_N^{(2)}(p)$, and we have found three fixed points. We have identified the corresponding conformal theories by evaluating the value of the central charge and the anomalous dimensions of some fields at these points. One of them is described by the expected $Z_N^{(2)}(p-2)$ parafermionic theory. This confirms the observation, made with the SU(2) cosets [12,13] and, more generally, with the cosets for the simply laced algebras [16], that the perturbation of a coset theory caused by an appropriate operator drives $p$ to $p - \Delta p$, $\Delta p$ being equal to the shift parameter of the coset. In our case the shift parameter of the coset is equal to 2, Eq. (2.1). Note that the algebra $B_r \equiv SO(2r+1)$ is not a simply laced one.
On the other hand, the appearance of two extra fixed points, corresponding to the theory $Z_N^{(2)}(p-1)$, is somewhat surprising.
We observe that such additional fixed points do not appear in the parafermionic model $Z_3^{(2)}$: the second $Z_3$ parafermionic theory with $\Delta = 4/3$ [17]. This model can be realized with the SU(2) cosets, and its perturbations with two slightly relevant operators have been analyzed in [12,13].
These two additional fixed points would have remained unseen if we had perturbed with the field A alone.
We can compare these results with those obtained for the second parafermionic theory $Z_N^{(2)}(p)$ with N even. Such theories are symmetric cosets on simply laced Lie algebras. The perturbation of the parafermionic theory $Z_N^{(2)}(p)$ with N even by one particular relevant field has already been treated in [16]. The only fixed point obtained corresponds to $Z_N^{(2)}(p-2)$. On the basis of our present results, one could expect the presence of a second slightly relevant field and the existence of two additional fixed points corresponding to $Z_N^{(2)}(p-1)$. This will be examined in [18]. (See Fig. 2.)
Appendix A. A brief review of $Z_N^{(2)}(p)$
The details of the second parafermionic theories $Z_N^{(2)}(p)$ with N odd can be found in [6].
The chiral algebra is made of $N-1$ parafermionic currents $\Psi^k$ (with $k = 1, 2, \ldots, N-1$), and their operator product expansion is
$$\Psi^k \Psi^{k'} \sim \Psi^{k+k'}, \tag{A.1}$$
$$\Psi^k \Psi^{-k} \sim I. \tag{A.2}$$
Note that in the above equations the $Z_N$ charges $k$ are defined modulo $N$.
The currents have dimension
$$\Delta_k = \Delta_{N-k} = \frac{2k(N-k)}{N}. \tag{A.3}$$
This implies the value of the central charge:
$$c = (N-1)\left(1 - \frac{N(N-2)}{p(p+2)}\right). \tag{A.4}$$
In general, the parafermionic algebra primaries of the second $Z_N$ conformal theory are labeled by two vectors $(\vec n, \vec n')$ corresponding to the two ($\alpha_+$ and $\alpha_-$) lattices of the $B_r$ classical Lie algebra [6]:
$$\Phi_{(\vec n|\vec n')} = \Phi_{(n_1,n_2,\ldots|n'_1,n'_2,\ldots)}. \tag{A.5}$$
The first and second vectors of indices correspond respectively to the $\alpha_+$ and $\alpha_-$ $B_r$ lattices; $\alpha_+$ and $\alpha_-$ are the usual Coulomb gas type parameters.
The conformal dimension of primary operators takes the form
$$\Delta_{(\vec n|\vec n')} = \frac{(\vec n(p+2) - \vec n' p)^2 - 4\vec\rho^{\,2}}{4p(p+2)} + B_Q, \tag{A.6}$$
where
$$\vec n = (n_1, n_2, \ldots, n_r) = \sum_{i=1}^{r} n_i \vec\omega_i, \qquad \vec n' = (n'_1, n'_2, \ldots, n'_r) = \sum_{i=1}^{r} n'_i \vec\omega_i, \tag{A.7}$$
and $\vec\omega_i$, $i = 1, \ldots, r$, are the fundamental weights of the Lie algebra $B_r$.
$B_Q$ in Eq. (A.6) is the boundary term, which depends on the $Z_N$ charge $Q = 2q \bmod N$:
$$B_Q = \frac{Q(N-2Q)}{4N}, \qquad Q = 0, 1, 2, \ldots, \frac{N-1}{2}. \tag{A.8}$$
The Q charge of $Z_N^{(2)}$ takes values in $Z_N$, so that in the Kac table of this theory one finds the $Z_N$ neutral fields, of $Q = 0$, the $Q = 1, 2, \ldots, \frac{N-1}{2}$ doublets, and the $Z_2$ disorder fields, with boundary term $B_R = \frac{1}{16}\,\frac{N-1}{2}$.
Introducing $x_a = n_a - n'_a$ for $a = 1, 2, \ldots, r-1$ and $x_r = \frac{n_r - n'_r}{2}$, the doublet charge of the primary operator $\Phi_{(\vec n|\vec n')}$ is given by [8]
$$Q(\vec n - \vec n') = \sum_{a=1}^{r}\left(\sum_{b=a}^{r} x_b \bmod 2\right). \tag{A.9}$$
We note that $\Phi_{(\vec n|\vec n')}$ is a disorder operator if $Q(\vec n - \vec n')$ is not an integer.
Appendix B. Slightly relevant descendants of a doublet in the $Z_N^{(2)}(p)$ theory
By fundamental descendant we mean a field that is still Virasoro primary. The doublets
Any ZN funDQ=2q , Q = 0, 1, . . . , N1
2 , have a non-trivial boundary term in their dimension.
q
q
damental descendant A = x1 1 xn n D Q that satisfies the neutrality condition i 2qi + Q =

)
0 mod N necessarily has a gap i xi equal to the fundamental gap Q = Q(QN
mod [1]. The
2N
conformal dimension of such a descendant is
A
(
n|
n ) =
((p + 2)
n p
n )2 42
+ BQ + Q
4p(p + 2)
(B.1)
)
where Q is the fundamental gap Q = Q(QN
mod 1, and BQ is the boundary term BQ =
2N
Q(N 2Q)
.
4N
Since we want A
(
n|
n ) to be smaller than 1, Q must obey BQ + Q < 1. In that case one can
verify that BQ + Q = 3Q
4 mod 1.
Q
We will denote the doublet as D(1,1,...|1+n1 ,1+n2 ,...) . The dimension of A(1,1,...|1+n1 ,1+n2 ,...)
then reads:
n2
+ BQ + Q + O().
(B.2)
4
Since nr is even for a doublet, we will redefine nr 2nr for the sake of simplicity. We get:
r
2

2
n =
ni
i = n2r + (nr + nr1 )2 + (nr + nr1 + nr2 )2 + + (nr + + n1 )2 .
A
(1,1,...|1+n1 ,1+n2 ,...) =
i=1
(B.3)
The condition < 1, which implies
n2
4
< 1, has the following solutions:
n = (1, 0, 0, 0, . . . , 0) which corresponds to Q = 1,

n = (0, 1, 0, 0, . . . , 0) which is a Q = 2 doublet,
n = (0, 0, 1, 0, . . . , 0): a Q = 3 doublet.
This corresponds to the following admissible doublets:
a Q = 1 (q =
N1
2 )
Q=1
doublet: D(1,1,1,...|2,1,1,...) ,
Q=2
a Q = 2 (q = 1) doublet: D(1,1,1,...|1,2,1,...) ,
and a Q = 3 (q =
N3
2 )
Q=3
doublet: D(1,1,1,...|1,1,2,...) .
These fields are the fundamental doublets of charges Q = 1, 2, 3 [6]: they correspond to the
fields with the highest degenerate descendant. In that sense they have less descendants than the
general doublet with the corresponding charge. The analysis of the degeneracy conditions leads
to the following results:
Q=1
D(1,1,1,...|2,1,1,...) has no neutral descendant at level 1 =
N+1
2N ;
Q=2
D(1,1,1,...|1,2,1,...) has one single neutral descendant at level 2 =

Q=2
D(1,1,1,...|1,2,1,...)
= 1 2
N
2
N.
It is A = 12
Q=2
D(1,1,1,...|1,2,1,...) ;
(p)
(p+1)
Q=3
we conjecture, from the identification with WBr WBr

, that D(1,1,1,...|1,1,2,...) has no
fundamental neutral descendant either. It has been explicitly checked in the case N = 7.
Finally there is one single neutral descendant of a doublet, with a trivial + side, that is
slightly relevant:
A = 12 D(1,1,1,...|1,2,1,...) .
Q=2
(B.4)
Appendix C. Conventions for the Br Lie algebra

In this paper we have adopted the following normalization conventions for the roots and
weights.
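The simple roots $e_i$ define the Cartan matrix $A_{ij} = e_i \cdot e_j^\vee$, where $e_j^\vee = 2e_j/e_j^2$ is the coroot of $e_j$:
$$A = \begin{pmatrix}
2 & -1 & 0 & \cdots & 0 & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 & 0 \\
0 & -1 & 2 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 2 & -1 & 0 \\
0 & 0 & 0 & \cdots & -1 & 2 & -2 \\
0 & 0 & 0 & \cdots & 0 & -1 & 2
\end{pmatrix}. \tag{C.1}$$
$B_r$ is non-simply laced since there are $r-1$ long roots, $e_i^2 = 2$ for $i = 1, \ldots, r-1$, and one short root, $e_r^2 = 1$.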
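The fundamental weights $\vec\omega_i$ form the basis dual to the simple coroots: $\vec\omega_i \cdot e_j^\vee = \delta_{ij}$. The Cartan matrix is the transformation matrix relating the two bases $\{e_i\}$ and $\{\vec\omega_j\}$:
$$e_i = A_{ij}\,\vec\omega_j. \tag{C.2}$$
The scalar product of weights can be expressed in terms of the symmetric quadratic form $\omega_{ij}$:
$$\omega_{ij} = \vec\omega_i \cdot \vec\omega_j, \tag{C.3}$$
$$\omega = \frac{1}{2}\begin{pmatrix}
2 & 2 & 2 & \cdots & 2 & 1 \\
2 & 4 & 4 & \cdots & 4 & 2 \\
2 & 4 & 6 & \cdots & 6 & 3 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
2 & 4 & 6 & \cdots & 2(r-1) & r-1 \\
1 & 2 & 3 & \cdots & r-1 & r/2
\end{pmatrix}, \tag{C.4}$$
which corresponds to
$$\omega_{ij} = i, \qquad i \le j < r; \tag{C.5}$$
$$\omega_{ir} = \frac{i}{2}, \qquad i < r; \tag{C.6}$$
$$\omega_{rr} = \frac{r}{4}. \tag{C.7}$$
We will also introduce the Weyl vector:
$$\vec\rho = \sum_i \vec\omega_i = (1, 1, \ldots, 1). \tag{C.8}$$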
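The conventions above can be cross-checked numerically. From $\vec\omega_i \cdot e_j^\vee = \delta_{ij}$ and $e_i = A_{ij}\vec\omega_j$ it follows that $\sum_j A_{ij}\,\omega_{jk} = \delta_{ik}\, e_i^2/2$. The following is a minimal sketch of that check in exact arithmetic; the function names are illustrative and not part of the paper.

```python
from fractions import Fraction

def cartan_matrix_B(r):
    """Cartan matrix A_ij = e_i . e_j^vee of B_r (long roots e_i^2 = 2, short root e_r^2 = 1)."""
    A = [[Fraction(0)] * r for _ in range(r)]
    for i in range(r):
        A[i][i] = Fraction(2)
        if i + 1 < r:
            A[i][i + 1] = Fraction(-1)
            A[i + 1][i] = Fraction(-1)
    if r >= 2:
        # long root e_{r-1} paired with the coroot of the short root e_r
        A[r - 2][r - 1] = Fraction(-2)
    return A

def weight_form_B(r):
    """omega_ij = min(i, j) for i, j < r; omega_ir = i/2; omega_rr = r/4 (1-based indices)."""
    w = [[Fraction(0)] * r for _ in range(r)]
    for i in range(1, r + 1):
        for j in range(1, r + 1):
            if i < r and j < r:
                w[i - 1][j - 1] = Fraction(min(i, j))
            elif i == r and j == r:
                w[i - 1][j - 1] = Fraction(r, 4)
            else:
                w[i - 1][j - 1] = Fraction(min(i, j), 2)
    return w

def check_duality(r):
    """Return True if sum_j A_ij omega_jk equals delta_ik * e_i^2 / 2."""
    A, w = cartan_matrix_B(r), weight_form_B(r)
    half_norms = [Fraction(1)] * (r - 1) + [Fraction(1, 2)]
    for i in range(r):
        for k in range(r):
            s = sum(A[i][j] * w[j][k] for j in range(r))
            expected = half_norms[i] if i == k else Fraction(0)
            if s != expected:
                return False
    return True
```

Calling `check_duality(r)` for a few ranks confirms that the quadratic form (C.5)-(C.7) is consistent with the Cartan matrix of $B_r$ in this normalization.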
Appendix D. The WBr Coulomb gas

The WBr theories have been defined in [15] through their Coulomb gas. It is made of r bosonic
fields i , quantized with a background charge and the Ising model fields: (free fermion) and
(spin operator).
The screening operators are given by

ea
V(a) (z, z ) = :exp i . (z,
(D.1)
z ) :, a = 1, . . . , r 1,
2

e
(r)
z) exp i a . (z,
z ) :,
V (z, z ) = :(z)(
(D.2)
2

p+1
,
+ =
(D.3)
p

p
.
=
(D.4)
p+1
The normalization adopted for the bosonic fields is i (z, z )j (z , z ) = 2i,j log(|z z |2 ).
ea are the simple roots of Br (cf. Appendix C).
The vertex operators representing primary fields:

V (z, z ) = :exp i (n|n ) . (z,
(D.5)
z ) : if nr n r is even,
(
n|
n)
V
(
n|
n )

(z, z ) = (z, z ):exp i (n|n ) . (z,
z ) : if nr n r is odd,
(D.6)

1n i
i
with weight (n|n ) = ri=1 ( 1n
i .
i are the fundamental weights of Br (cf.
2 + + 2 ) 2
Appendix C).
The background charge is
+ +
2
0 =
(D.7)
2

with = i
i is the Weyl vector.
We can check the value of the central charge

1
2r(2r 1)
02 + 1/2 = r +
1
c = cbosons + cising = r 24
(D.8)
2
p(p + 1)
obtained with the stress-energy tensor
1
1
. z (z):

+ i 0 . z2 (z)
+ ::.
T (z) = :z (z)
4
2
The dimension of the V (z, z ) is
= 2 2 . 0 = ( 0 )2 02 = 20
(D.9)
(D.10)
so that a primary field has two representations: V and V = V20 .

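Using (D.6) and (D.10), we find the Kac formula of $WB_r^{(p)}$:
$$\Delta^{(p)}_{(\vec n|\vec n')} = \frac{((p+1)\vec n - p\,\vec n')^2 - \vec\rho^{\,2}}{2p(p+1)} \quad \text{if } n_r - n'_r \text{ is even (Neveu–Schwarz field)}, \tag{D.11}$$
$$\Delta^{(p)}_{(\vec n|\vec n')} = \frac{((p+1)\vec n - p\,\vec n')^2 - \vec\rho^{\,2}}{2p(p+1)} + \frac{1}{16} \quad \text{if } n_r - n'_r \text{ is odd (Ramond field)}. \tag{D.12}$$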
References
[1] V.A. Fateev, A.B. Zamolodchikov, Sov. Phys. JETP 62 (1985) 215.
[2] N. Read, E. Rezayi, Phys. Rev. B 59 (1999) 8084.
[3] H. Saleur, Commun. Math. Phys. 132 (1990) 657;
H. Saleur, Nucl. Phys. B 360 (1991) 219.
[4] D. Gepner, Nucl. Phys. B 296 (1988) 757.
[5] Vl.S. Dotsenko, J.L. Jacobsen, R. Santachiara, Nucl. Phys. B 656 (2003) 259.
[7] Vl.S. Dotsenko, J.L. Jacobsen, R. Santachiara, Phys. Lett. B 584 (2004) 186.
[9] Vl.S. Dotsenko, B. Estienne, Phys. Lett. B 643 (2006) 362.
[10] A.B. Zamolodchikov, Sov. Phys. JETP Lett. 43 (1986) 730;
A.B. Zamolodchikov, Sov. J. Nucl. Phys. 46 (1987) 1090.
[11] A.W. Ludwig, J.L. Cardy, Nucl. Phys. B 285 (1987) 687.
[12] C. Crnkovic, G.M. Sotkov, M. Stanishkov, Phys. Lett. B 226 (1989) 297.
[13] C. Crnkovic, R. Paunov, G.M. Sotkov, M. Stanishkov, Nucl. Phys. B 336 (1990) 637.
[14] P. Goddard, A. Schwimmer, Phys. Lett. B 206 (1988) 62.
[15] V.A. Fateev, S.I. Lukyanov, Sov. Sci. Rev. A Phys. 15 (1990) 1.
[16] V.A. Fateev, Phys. Lett. B 324 (1994) 45.
[17] V.A. Fateev, A.B. Zamolodchikov, Theor. Math. Phys. 71 (1987) 451.
[18] B. Estienne, in preparation.
[19] Vl.S. Dotsenko, V.A. Fateev, Nucl. Phys. B 251 (1984) 691–734.
Anomalous Abelian solitons

Matthias Schmid , Mikhail Shaposhnikov
Institut de Théorie des Phénomènes Physiques, École Polytechnique Fédérale de Lausanne,
CH-1015 Lausanne, Switzerland
Received 14 February 2007; received in revised form 4 March 2007; accepted 7 March 2007
Abstract
The chiral Abelian Higgs model contains an interesting class of solitons found by Rubakov and Tavkhelidze. These objects carry non-zero fermion number $N_F$ (or Chern–Simons number $N_{CS}$, which is the same because of the chiral anomaly) and are stable for sufficiently large $N_F$. In this paper we study the properties of these anomalous solitons. We find that their energy-versus-fermion-number relation is given by $E \sim N_{CS}^{3/4}$ or $E \sim N_{CS}^{2/3}$, depending on the structure of the scalar potential. For the former case we demonstrate that there is a lower bound on the soliton energy, which reads $E \ge c\,N_{CS}^{3/4}$, where $c$ is some parameter expressed through the masses and coupling constants of the theory. We construct the anomalous solitons numerically, accounting for both Higgs and gauge dynamics, and show that they are not spherically symmetric. The thin wall approximation, valid for macroscopic solutions with $N_{CS} \gg 1$, is discussed as well.
Keywords: Gauge theory; Chiral anomaly; Solitons
1. Introduction
Solitons, i.e. stable localized solutions to the classical equations of motion of a non-linear field theory, represent an interesting class of particle-like states in quantum field theory. Well-known examples include topological solitons such as the kink in 1 + 1 dimensions [1,2], the vortex in 2 + 1 dimensions [3,4], and the monopole [5,6] and the skyrmion [7] in 3 + 1 dimensions. The stability of these solutions is ensured by topology. Another class of solutions, non-topological solitons or Q-balls [8,9], are stable because of the conservation of some global Abelian [9] or non-Abelian [10] charge. The above-mentioned objects exist in purely bosonic theories. Yet another type of soliton, electroweak bags [11,12], relies on the fact that fermions can be trapped inside a spherical cavity created by an (otherwise unstable) space-dependent scalar field.
E-mail addresses: matthias.schmid@epfl.ch (M. Schmid), mikhail.shaposhnikov@epfl.ch (M. Shaposhnikov).
doi:10.1016/j.nuclphysb.2007.03.012
M. Schmid, M. Shaposhnikov / Nuclear Physics B 775 [FS] (2007) 365–389
In [13], referred to as RT in the following, a new class of solitons was found, for which the presence of chiral fermions and Abelian gauge symmetry plays the key role. In very general terms, the construction is based on the following observation. Consider an Abelian Higgs model with chiral fermions and arrange the Yukawa couplings in such a way that all fermions get a mass $m_F$ from the Higgs condensate. Then, the energy of $N_F$ well separated fermions is $E_F \simeq m_F N_F$. Due to the chiral anomaly these fermions can be converted to a gauge configuration carrying the Chern–Simons number $N_{CS} = N_F$, and the energy $E$ of the gauge–Higgs system with non-zero $N_{CS}$ may turn out to be smaller than $E_F$. If so, one gets a stable soliton characterized by $N_{CS}$. Indeed, it was demonstrated in RT using a variational principle that the energy $E$ of the system with $N_{CS} \neq 0$ is bounded from above by $E_0 N_{CS}^{3/4}$, where $E_0$ is some constant. Thus the bosonic configuration is always energetically more favorable than the collection of $N_F$ fermions at sufficiently large $N_{CS}$. The stable bosonic configuration will be referred to as an anomalous Abelian soliton in the following. In fact, anomalous Abelian solitons have some features in common with Hopf solitons, studied in [14].
The Abelian character of the gauge group is essential for the absolute stability of anomalous solitons [15,16]. Indeed, if the gauge group is non-Abelian, fermions can always be converted into a gauge vacuum configuration [17–19], which may carry arbitrary integer Chern–Simons number. In other words, non-Abelian anomalous solitons, if they exist, can only be metastable.
Interestingly, anomalous solitons may potentially exist [15,16] in the Standard Model, since
it contains an Abelian U(1) gauge group and chiral fermions. This issue, however, has not been
clarified yet.
This paper is devoted to the detailed study of anomalous Abelian solitons. We demonstrate, with the use of different inequalities from functional analysis, that for a certain class of scalar potentials a lower bound on the soliton mass, $E \ge c\,N_{CS}^{3/4}$, can also be established, where $c$ is some parameter related to the masses and coupling constants of the underlying theory. We construct the anomalous solitons numerically, taking into account the dynamics of both the gauge and the Higgs field, and find that the solitons are not spherically symmetric. We also discuss the thin wall approximation valid in the limit of large Chern–Simons number, which will allow us to remove the Higgs dynamics from consideration.
The organisation of the article is as follows. In the next section, we review the Rubakov–Tavkhelidze construction [13,15] of anomalous solitons, in order to make the paper self-contained. In Section 3 we derive a lower bound on the energy of the solitons and discuss their general properties. In Section 4 we present the numerical solutions of the field equations as a function of the Chern–Simons number and the coupling constants of the theory. Section 5 is devoted to the solution in the thin wall approximation. Finally, we summarize our main results in Section 6.
2. The Rubakov–Tavkhelidze soliton
2.1. The model
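Consider an Abelian Higgs model with the complex scalar field $\phi$, a pair of left-handed fermions with opposite charges, $\Psi_L^1$ and $\Psi_L^2$, a pair of neutral right-handed fermions, $\Psi_R^1$ and $\Psi_R^2$, and a chiral interaction between the fermions and the U(1) gauge field. The Lagrangian is
$$\begin{aligned}
\mathcal L = {} & -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + |D_\mu\phi|^2 - V(|\phi|) + i\bar\Psi_L^1\gamma^\mu(\partial_\mu - igA_\mu)\Psi_L^1 + i\bar\Psi_R^1\gamma^\mu\partial_\mu\Psi_R^1 \\
& + i\bar\Psi_L^2\gamma^\mu(\partial_\mu + igA_\mu)\Psi_L^2 + i\bar\Psi_R^2\gamma^\mu\partial_\mu\Psi_R^2 - \left(\lambda_1\bar\Psi_L^1\Psi_R^1\phi + \text{h.c.}\right) - \left(\lambda_2\bar\Psi_L^2\Psi_R^2\phi^* + \text{h.c.}\right)
\end{aligned} \tag{1}$$
with the potential
$$V(|\phi|) = \lambda\left(|\phi|^2 - v^2\right)^2 \tag{2}$$
and $D_\mu = \partial_\mu - igA_\mu$. The vacuum expectation value (VEV) of the scalar field is equal to $v$, and the gauge and Higgs bosons obtain the masses $m_V = \sqrt{2}\,gv$ and $m_H = 2\sqrt{\lambda}\,v$, respectively. For simplicity we choose the two Yukawa couplings to be equal to each other, $\lambda_F \equiv \lambda_1 = \lambda_2$, such that all fermions have equal masses $m_F = \lambda_F v$. Note that we are forced to introduce at least two fermions in order to make the theory free from gauge anomalies.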
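The total fermionic current
$$j_F^\mu = \sum_{i=1}^{2}\left(\bar\Psi_L^i\gamma^\mu\Psi_L^i + \bar\Psi_R^i\gamma^\mu\Psi_R^i\right)$$
is anomalous in this model,
$$\partial_\mu j_F^\mu = f\,\frac{g^2}{32\pi^2}F_{\mu\nu}\tilde F^{\mu\nu}, \tag{3}$$
where $f = 2$ is the number of left-handed fermions.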
2.2. Instability of fermionic matter

In this subsection we follow [13,15] in order to explain why fermionic matter becomes unstable at sufficiently high fermionic density.
At zero fermionic density, the ground state of the boson fields is $\mathbf A = 0$ and $|\phi| = v$, which is called the normal state, following RT. Consider now a homogeneous system of infinite volume containing fermions with number density $n_F$. The system is supposed to be neutral with respect to the gauge charge. We only consider the weak-coupling limit ($\lambda \ll 1$ and $g^2 \ll 1$), where we can treat the fields $A_\mu$ and $\phi$ as classical condensates. The neutrality of the system implies $A_0 = 0$ and that the fields $\mathbf A$ and $\phi$ are time-independent. The fermions can be characterized by the chemical potential $\mu_F$, which is, at zero temperature, the energy up to which the Fermi levels are filled. The fermion number density $n_F$ is related to the Fermi energy $\mu_F$ by
$$n_F = f\,\frac{\mu_F^3}{3\pi^2}.$$
We will use the unitary gauge $\mathrm{Im}(\phi) = 0$ and denote $\phi = \mathrm{Re}(\phi)$ in the following. The fermions can be integrated out [20] and, up to corrections of order $\mu_F^{-1}$, the static energy functional of the bosonic fields is given by
$$E_B[\mathbf A, \phi] = \int\left[\frac{\mathbf B^2}{2} + (\nabla\phi)^2 + g^2\mathbf A^2\phi^2 + \lambda\left(\phi^2 - v^2\right)^2 - \frac{f\mu_F\,g^2}{32\pi^2}\,\epsilon^{ijk}F_{ij}A_k\right]d^3x, \tag{4}$$
where $\mathbf B = \nabla\times\mathbf A$ is the magnetic field. The first four terms in (4) are the classical energy density of bosons, while the last (Chern–Simons) term is due to the interaction with fermions.
The physical reason for the appearance of the Chern–Simons term is as follows. As the gauge field $\mathbf A$ increases, some fermionic energy levels cross the zero-energy line and the number of real fermions decreases by $f N_{CS}$, where
$$N_{CS}[\mathbf A] = \frac{g^2}{32\pi^2}\int\epsilon^{ijk}F_{ij}A_k\,d^3x = \frac{g^2}{16\pi^2}\int\mathbf A\cdot\mathbf B\,d^3x \tag{5}$$
is the Chern–Simons number of the gauge field.
The quadratic part of the static energy functional (4) has a negative mode for
$$f\mu_F > \mu_{crit},$$
where
$$\mu_{crit} = \frac{16\pi^2}{g^2}\,m_V. \tag{6}$$
Thus the normal ground state of fermionic matter is absolutely unstable at
$$n_F > n_{crit} = \frac{\mu_{crit}^3}{3\pi^2 f^2}. \tag{7}$$
The negative mode is given by
$$\mathbf e(\mathbf x) = \mathbf e_1\cos(\mathbf k\cdot\mathbf x) + \mathbf e_2\sin(\mathbf k\cdot\mathbf x), \tag{8}$$
where $|\mathbf k| = m_V$ and $\mathbf e_{1,2}$ are real polarization vectors orthogonal to each other and to $\mathbf k$. Note that a perturbation $\mathbf A(\mathbf x) = a\,\mathbf e(\mathbf x)$, where $a$ is a small amplitude, has the Chern–Simons density
$$n_{CS} = \frac{g^2}{16\pi^2}\,\mathbf A\cdot\mathbf B = \frac{g^2}{16\pi^2}\,k a^2.$$
As the amplitude of the unstable mode grows, the term $g^2\mathbf A^2\phi^2$ in the energy acts as a positive mass term for the Higgs field, which leads to the disappearance of the Higgs field condensate. The system undergoes a transition to a state with $\phi = 0$ containing a negligible number of real fermions and a gauge field condensate $\mathbf A \neq 0$ with non-zero Chern–Simons number. We will call this state the abnormal state, as in RT.
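The instability threshold can be illustrated with a few lines of code. The sketch below assumes a helical mode with $\nabla\times\mathbf A = k\mathbf A$ and amplitude $a$, for which the quadratic part of the energy density per $a^2$ is $k^2/2 + m_V^2/2 - 2ck$, with $c = f\mu_F g^2/(32\pi^2)$ the coefficient of the Chern–Simons term. The function names and sample values are illustrative only.

```python
import math

def mu_crit(g, m_V):
    """Critical value of f * mu_F from Eq. (6)."""
    return 16.0 * math.pi**2 * m_V / g**2

def cs_coefficient(f_mu, g):
    """Assumed coefficient c of the Chern-Simons term, c = f mu_F g^2 / (32 pi^2)."""
    return f_mu * g**2 / (32.0 * math.pi**2)

def mode_energy_per_a2(k, f_mu, g, m_V):
    """Quadratic energy density divided by a^2 for a helical mode with curl A = k A."""
    c = cs_coefficient(f_mu, g)
    return 0.5 * k**2 + 0.5 * m_V**2 - 2.0 * c * k

def most_unstable_mode(f_mu, g, m_V):
    """Wave number minimizing the quadratic form, and the value of the form there."""
    k_star = 2.0 * cs_coefficient(f_mu, g)
    return k_star, mode_energy_per_a2(k_star, f_mu, g, m_V)
```

Evaluating `most_unstable_mode` slightly above and slightly below `mu_crit(g, m_V)` shows the sign change of the quadratic form, and at threshold the wave number equals $m_V$, in line with Eqs. (6) and (8).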
2.3. Domain of abnormal matter
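Still following [13,15], we show in this subsection that a finite domain of the abnormal state with sufficiently large Chern–Simons number has a lower energy than a system containing $N_F = f N_{CS}$ real fermions. The stable configuration of gauge and scalar field condensate is the minimum of the static energy functional for the bosonic fields
$$E[\mathbf A, \phi] = \int\left[\frac{\mathbf B^2}{2} + (\nabla\phi)^2 + g^2\mathbf A^2\phi^2 + \lambda\left(\phi^2 - v^2\right)^2\right]d^3x, \tag{9}$$
under the constraint of a constant Chern–Simons number (5). Varying the functional
$$E[\mathbf A, \phi] - \mu\,\frac{16\pi^2}{g^2}\,N_{CS}[\mathbf A] \tag{10}$$
with respect to $\mathbf A$ and $\phi$, where $\mu$ is a Lagrange multiplier, we get the following field equations:
$$\nabla^2\phi - g^2\mathbf A^2\phi - 2\lambda\left(\phi^2 - v^2\right)\phi = 0, \tag{11a}$$
$$\nabla\times(\nabla\times\mathbf A) - 2\mu\,(\nabla\times\mathbf A) + 2g^2\phi^2\mathbf A = 0. \tag{11b}$$
Solutions of Eqs. (11) with finite energy have to satisfy the boundary conditions
$$\phi \to v \quad\text{and}\quad \mathbf A \to 0 \quad\text{for } |\mathbf x| \to \infty.$$
For a static solution $\{\mathbf A_{cl}(\mathbf x), \phi_{cl}(\mathbf x)\}$ of Eqs. (11) we obtain
$$E[\mathbf A_{cl}, \phi_{cl}] = \frac{16\pi^2}{g^2}\,\mu N_{CS} + \int\left[(\nabla\phi_{cl})^2 + \lambda\left(\phi_{cl}^2 - v^2\right)^2\right]d^3x, \tag{12}$$
where the first term in Eq. (9) has been integrated by parts and (11b) has been used.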
Let us consider a compact domain of finite volume $V$ in the abnormal state, embedded into the normal vacuum. We suppose that the size $R \sim V^{1/3}$ of the domain is much larger than the two characteristic length scales in the model, the thickness of the domain wall $\sim 1/m_H$ and the penetration length $\sim 1/m_V$ of the magnetic field. Then, the contribution of the scalar field to the energy (the second term of Eq. (12)) can be written as
$$E_{scal} = \int\left[(\nabla\phi_{cl})^2 + \lambda\left(\phi_{cl}^2 - v^2\right)^2\right]d^3x \simeq \lambda v^4 R^3 I_V, \tag{13}$$
where $I_V = V/R^3$ is the shape factor for the domain $V$. The surface energy, which is of order $R^2 v^3$, can be neglected. Since we want to minimize the total energy,
$$E = \frac{16\pi^2}{g^2}\,\mu N_{CS} + \lambda v^4 R^3 I_V, \tag{14}$$
with respect to $R$, we have to find how the Lagrange multiplier $\mu$ scales with $R$. Since $\phi = 0$ inside the domain of the abnormal state, it follows from Eq. (11b) that $\mu \sim 1/R$. Thus we can write $\mu = \alpha/(2R)$, where $\alpha$ is a numerical coefficient. The total energy as a function of $R$ becomes
$$E(R) = 8\pi^2\alpha\,\frac{N_{CS}}{g^2 R} + \lambda v^4 R^3 I_V, \tag{15}$$
and minimizing $E(R)$ with respect to $R$ leads to
$$R = \left(\frac{64\pi^2\alpha}{3 I_V}\right)^{1/4}\frac{1}{m_V}\left(\frac{N_{CS}}{\kappa}\right)^{1/4}, \tag{16}$$
$$E = \sqrt{2}\left(\frac{16\pi^2\alpha}{3}\right)^{3/4}(\kappa I_V)^{1/4}\,\frac{m_V}{g^2}\,N_{CS}^{3/4}, \tag{17}$$
where we expressed $R$ and $E$ in terms of the vector boson mass $m_V$ and the parameter $\kappa$ defined by
$$\kappa = \frac{m_H^2}{m_V^2} = \frac{2\lambda}{g^2}. \tag{18}$$
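The minimization leading to Eqs. (16) and (17) can be checked numerically. The sketch below is only an illustration: it evaluates $E(R)$ of Eq. (15), minimizes it with a simple ternary search, and compares the result with the closed forms, using $m_V = \sqrt{2}\,gv$ and $\kappa = 2\lambda/g^2$. Function names and parameter values are arbitrary choices made here.

```python
import math

def energy_volume(R, n_cs, alpha, i_v, g, lam, v):
    """E(R) of Eq. (15): Chern-Simons term plus volume energy of the scalar field."""
    return 8.0 * math.pi**2 * alpha * n_cs / (g**2 * R) + lam * v**4 * R**3 * i_v

def closed_form_volume(n_cs, alpha, i_v, g, lam, v):
    """R and E at the minimum, written as in Eqs. (16) and (17)."""
    m_v = math.sqrt(2.0) * g * v
    kappa = 2.0 * lam / g**2
    R = (64.0 * math.pi**2 * alpha / (3.0 * i_v))**0.25 * (n_cs / kappa)**0.25 / m_v
    E = (math.sqrt(2.0) * (16.0 * math.pi**2 * alpha / 3.0)**0.75
         * (kappa * i_v)**0.25 * m_v / g**2 * n_cs**0.75)
    return R, E

def minimize_unimodal(func, lo, hi, iterations=200):
    """Ternary search for the minimum of a unimodal function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

Because $E(R)$ is the sum of a $1/R$ and an $R^3$ term with positive coefficients, it has a single minimum on $R > 0$, so the ternary search is adequate here.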
Both the size of the domain $R$ and the total energy $E$ grow slowly with increasing $N_{CS}$. If $N_{CS} > N_{crit}$, the energy (17) is smaller than the energy of the normal state with $N_F = f N_{CS}$ real fermions, for which $E_F \simeq m_F N_F$. Parametrically, $N_{crit} \sim \left(\frac{m_V}{g^2 m_F}\right)^4$. Therefore, the domain of abnormal matter is stable. Since $R \sim v^{-1} N_{CS}^{1/4}$, we get $R \gg v^{-1}$ for large $N_{CS}$, and the surface energy $\sim R^2 v^3 \sim v N_{CS}^{1/2}$ indeed becomes negligible at large $N_{CS}$, as we assumed above.
In [15] the domain was considered to be a sphere of radius $R$ and Eq. (11b) was solved analytically (remember that $\phi \approx 0$ inside the domain). The solution can be expressed in terms of Bessel functions and one gets
$$\mu = \frac{\alpha_0}{2R},$$
where $\alpha_0$ is the first node of the Bessel function $J_{3/2}$. Inserting $\alpha = \alpha_0$ and $I_V = 4\pi/3$ into Eqs. (16) and (17) we find exactly Eqs. (6.7) and (6.8) of [15].
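For orientation, the first positive node of $J_{3/2}$ is easy to evaluate. Since $J_{3/2}(x) = \sqrt{2/(\pi x)}\,(\sin x/x - \cos x)$, its positive zeros are the positive roots of $\sin x - x\cos x = 0$, i.e. of $\tan x = x$. A minimal bisection sketch, with illustrative function names and a bracketing interval chosen by hand:

```python
import math

def j32_numerator(x):
    """sin(x) - x cos(x); shares its positive zeros with the Bessel function J_{3/2}."""
    return math.sin(x) - x * math.cos(x)

def first_node_j32(lo=4.0, hi=5.0, tol=1e-12):
    """Bisection for the first positive zero, which lies between 4 and 5."""
    f_lo = j32_numerator(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = j32_numerator(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

The function $\sin x - x\cos x$ is positive on $(0, \pi]$ because its derivative $x\sin x$ is positive there, so the root bracketed in $[4, 5]$ is indeed the first one.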
This completes our review of the main results in RT, who proved the existence of anomalous Abelian solitons. The remainder of the present paper is devoted to the detailed study of the structure of these solitons. In particular, we would like to answer the following questions:
- What is the structure of the gauge and Higgs field condensates of anomalous solitons?
- What is their exact shape? Are they spherically symmetric?
- Is the exponent 3/4 in the power-law dependence of the energy on $N_{CS}$ universal for anomalous solitons?
- Does the structure of the solitons depend on the choice of the scalar potential?
- What happens at small $N_{CS}$, when the dynamics of the scalar field has to be taken into account?
3. Properties of the soliton
3.1. Lower bound on the energy
In the following we derive a lower bound on the energy of anomalous solitons, which reads
$$E \ge c\,N_{CS}^{3/4}, \tag{19}$$
where $c$ is a constant depending on the VEV of the Higgs field and the coupling constants. Such a bound on the energy is rather unusual because of the fractional power of the topological charge. A bound of this type was first obtained by Vakulenko and Kapitansky [21] for solitons of the non-linear $\sigma$-model, for which the topological charge is the Hopf number. The Vakulenko bound was later improved (i.e. the constant $c$ was increased) by Kundu and Rybakov [22] and Ward [23].
The derivation of (19) is based on the use of some inequalities of functional analysis, which we review in Appendix A. Let us start with an estimate of the Chern–Simons number (5),
$$N_{CS} = g^2 c_1\int\mathbf A\cdot\mathbf B\,d^3x \le g^2 c_1\left(\int|\mathbf B|^{6/5}d^3x\right)^{5/6}\left(\int|\mathbf A|^6 d^3x\right)^{1/6}, \tag{20}$$
where $c_1 = 1/(16\pi^2)$ and we have used the Hölder inequality (A.2) for $p = 6/5$ and $q = 6$. The first factor on the r.h.s. of (20) is estimated using (A.3),
$$\left(\int|\mathbf B|^{6/5}d^3x\right)^{5/6} \le \left(\int|\mathbf B|^2 d^3x\right)^{1/6}\left(\int|\mathbf B|\,d^3x\right)^{2/3}, \tag{21}$$
and the second factor with the use of the Gagliardo–Nirenberg–Sobolev inequality (A.5),
$$\left(\int|\mathbf A|^6 d^3x\right)^{1/6} \le c_2\left(\int\big|\nabla|\mathbf A|\big|^2 d^3x\right)^{1/2}. \tag{22}$$
Rosen [24] found the smallest possible value for the constant $c_2$ in the previous inequality to be
$$c_2 = \frac{1}{\sqrt{3}}\left(\frac{2}{\pi}\right)^{2/3}.$$
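The integrand of (22) is further bounded from above,
$$\big|\nabla|\mathbf A|\big|^2 = \frac{1}{|\mathbf A|^2}\sum_{i=1}^{3}(\mathbf A\cdot\partial_i\mathbf A)^2 \le \sum_{i=1}^{3}(\partial_i\mathbf A)^2 = |\nabla\times\mathbf A|^2 + \sum_{i,j=1}^{3}\partial_i A_j\,\partial_j A_i. \tag{23}$$
Imposing the Coulomb gauge $\nabla\cdot\mathbf A = 0$, the last term in Eq. (23) vanishes through integration by parts and we obtain
$$\int\big|\nabla|\mathbf A|\big|^2 d^3x \le \int|\mathbf B|^2 d^3x. \tag{24}$$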
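Inserting (21), (22) and (24) into (20) yields
$$N_{CS} \le g^2 c_1 c_2\left(\int|\mathbf B|^2 d^3x\right)^{2/3}\left(\int|\mathbf B|\,d^3x\right)^{2/3}, \tag{25}$$
or equivalently,
$$N_{CS}^{3/2} \le g^3 C\int|\mathbf B|^2 d^3x\int|\mathbf B|\,d^3x, \tag{26}$$
where
$$C = (c_1 c_2)^{3/2} = \frac{1}{32\pi^4\,3^{3/4}}. \tag{27}$$
At this point, it is worth mentioning [23], where it was claimed that the constant $C$ can even be reduced to $C = 1/(256\pi^4)$. If we define an average magnetic field by
$$\langle B\rangle = \frac{\int|\mathbf B|^2 d^3x}{\int|\mathbf B|\,d^3x}, \tag{28}$$
we can write down an inequality for the magnetic energy
$$E_{magn} = \int\frac{|\mathbf B|^2}{2}\,d^3x \ge \frac{1}{2}\left(\frac{\langle B\rangle}{C}\right)^{1/2}\left(\frac{N_{CS}}{g^2}\right)^{3/4}. \tag{29}$$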
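The numerical value of the constant in Eq. (27) follows directly from $c_1 = 1/(16\pi^2)$ and Rosen's constant $c_2$. The short sketch below checks the closed form and evaluates the right-hand side of (29) for given inputs; the function names are illustrative, and the sample arguments in any call are arbitrary.

```python
import math

C1 = 1.0 / (16.0 * math.pi**2)                               # prefactor of the Chern-Simons number
C2 = (1.0 / math.sqrt(3.0)) * (2.0 / math.pi)**(2.0 / 3.0)   # Rosen's sharp Sobolev constant
C = (C1 * C2)**1.5                                           # constant entering Eq. (26)
C_CLOSED_FORM = 1.0 / (32.0 * math.pi**4 * 3.0**0.75)        # right-hand side of Eq. (27)

def magnetic_energy_lower_bound(b_avg, n_cs, g):
    """Right-hand side of Eq. (29) for a given average field <B>, N_CS and coupling g."""
    return 0.5 * math.sqrt(b_avg / C) * (n_cs / g**2)**0.75
```

Both expressions for $C$ agree to machine precision, with $C \approx 1.4\times 10^{-4}$; the bound scales as $N_{CS}^{3/4}$ at fixed $\langle B\rangle$.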
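In order to obtain an inequality for the total energy rather than for $E_{magn}$, we consider the scale transformation
$$\phi_\Lambda(\mathbf x) = \phi_{cl}(\Lambda\mathbf x), \qquad \mathbf A_\Lambda(\mathbf x) = \Lambda\,\mathbf A_{cl}(\Lambda\mathbf x),$$
where $\{\phi_{cl}(\mathbf x), \mathbf A_{cl}(\mathbf x)\}$ is a static solution of the field equations with finite energy and $\Lambda$ is the scale parameter. This transformation leaves the Chern–Simons number unchanged. The static energy
$$E(\Lambda) = E\left[\mathbf A_\Lambda(\mathbf x), \phi_\Lambda(\mathbf x)\right]$$
must have a minimum at $\Lambda = 1$. We get for the energy
$$E(\Lambda) = \Lambda\int\frac{\mathbf B_{cl}^2}{2}\,d^3y + \frac{1}{\Lambda}\int\left[(\nabla\phi_{cl})^2 + g^2\mathbf A_{cl}^2\phi_{cl}^2\right]d^3y + \frac{1}{\Lambda^3}\int V(\phi_{cl})\,d^3y,$$
where $\mathbf y = \Lambda\mathbf x$. Requiring that
$$\left.\frac{\partial E}{\partial\Lambda}\right|_{\Lambda=1} = 0$$
leads to the relation
$$\int\frac{\mathbf B_{cl}^2}{2}\,d^3y = \int\left[(\nabla\phi_{cl})^2 + g^2\mathbf A_{cl}^2\phi_{cl}^2 + 3V(\phi_{cl})\right]d^3y, \tag{30}$$
which is exact and has to be satisfied by any static solution of the field equations. From (30) we deduce
$$E = \frac{4}{3}E_{magn} + \frac{2}{3}\int\left[(\nabla\phi_{cl})^2 + g^2\mathbf A_{cl}^2\phi_{cl}^2\right]d^3x \tag{31}$$
and using (29) we get
$$E \ge \frac{4}{3}E_{magn} \ge \frac{2}{3}\left(\frac{\langle B\rangle}{C}\right)^{1/2}\left(\frac{N_{CS}}{g^2}\right)^{3/4}, \tag{32}$$
which is a lower bound for the energy of anomalous solitons, provided the average magnetic field $\langle B\rangle$ is bounded from below. We will derive such a bound in the next subsection using known results from the Ginzburg–Landau theory of superconductors.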
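The scaling argument can be illustrated with plain numbers. Writing the three integrals as $a$ (magnetic), $b$ (gradient plus $g^2\mathbf A^2\phi^2$) and $c$ (potential), the rescaled energy is $a\Lambda + b/\Lambda + c/\Lambda^3$ with $\Lambda$ the scale parameter; stationarity at $\Lambda = 1$ gives $a = b + 3c$, and then $E = a + b + c = \tfrac{4}{3}a + \tfrac{2}{3}b$. The sketch below, with hypothetical names and arbitrary sample values, verifies this.

```python
def scaled_energy(scale, a, b, c):
    """Energy of the rescaled configuration: a*scale + b/scale + c/scale**3."""
    return scale * a + b / scale + c / scale**3

def virial_magnetic_energy(b, c):
    """Magnetic energy a required by stationarity at scale = 1, as in Eq. (30)."""
    return b + 3.0 * c

def total_energy_from_virial(a, b):
    """Total energy expressed through a and b only, as in Eq. (31)."""
    return 4.0 * a / 3.0 + 2.0 * b / 3.0
```

For positive $b$ and $c$ the second derivative at the stationary point, $2b + 12c$, is positive, so the stationary point is a minimum with respect to rescaling.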
3.2. Anomalous solitons and superconductors
The static energy (9) is just the relativistic version of the Ginzburg–Landau free energy for a superconductor (see e.g. [25]). The state with $\phi = v$ corresponds to the superconducting state, while $\phi = 0$ coincides with the normal-conducting state. It is well known from superconductivity theory that the superconducting state is destroyed by large external magnetic fields and that the way in which this breaking of superconductivity occurs depends on the ratio $\kappa$ of the squared Higgs mass to the squared gauge boson mass (cf. Eq. (18)).
In superconductors of the first kind ($\kappa < 1$) the superconducting state can persist only up to a critical magnetic field (see e.g. [26]), which is given by $B_c = (2\lambda)^{1/2}v^2$. At $B > B_c$ an intermediate state is formed, in which normal-conducting domains (with $\phi = 0$ and $B \simeq B_c$) emerge within the superconducting state ($\phi = v$, $B = 0$). Thus for type I superconductors the lower bound on the average magnetic field is provided by the critical magnetic field, i.e.
$$\langle B\rangle \ge B_c = \frac{\sqrt{\kappa}}{2g}\,m_V^2,$$
and we finally get the lower bound
$$E \ge \frac{1}{3}\left(\frac{2}{C}\right)^{1/2}\kappa^{1/4}\,\frac{m_V}{g^2}\,N_{CS}^{3/4} \quad\text{for } \kappa < 1. \tag{33}$$
On the other hand, for superconductors of the second kind ($\kappa > 1$) the magnetic field penetrates the superconductor along vortex filaments with quantized magnetic flux (Abrikosov vortices [3]). For $B_{c1} < B < B_{c2}$ the superconducting sample is in a mixed state forming a lattice of Abrikosov vortices. The lower critical field $B_{c1}$ marks the point at which the first vortex filament appears and is given by
$$B_{c1} = \frac{\epsilon}{\Phi_0},$$
where $\epsilon$ is the energy per unit length of one vortex filament and $\Phi_0$ is the unit flux quantum. For $\kappa \gg 1$ the vortex energy can be calculated analytically and the lower critical field becomes [25]
Bc
Bc1 = ln
(34)
.
2
2
If we suppose, that for 1 the state with the lowest energy is a lattice of vortices of total
length L, we can write the average magnetic field as

|B|2 d 3 x
2L
= 2Bc1 ,
B =
(35)
3
L0
|B| d x
and we obtain for the lower bound

1/2
2 1/2
mV 3/4
N
E
C
ln
3
2
g 2 CS
for 1.
(36)
3.3. Shape of the solitons

From (macroscopic) superconductivity theory we can also gain an insight on the shape of
anomalous solitons at large NCS . Consider a surface separating a normal-conducting and a superconducting domain within a type I superconductor in the intermediate state. Because in the
superconducting domain B = 0, the continuity of the orthogonal component B across the surface tells us, that the magnetic field in the normal-conducting domain has to be tangential on
this surface. Furthermore the magnetic field has to be equal to the critical field Bc . Therefore
we expect, at least for < 1, the magnetic field to be tangential and equal to Bc on the boundary of the solitons. However, it is not possible to construct a constant tangent vector field on a
sphere. More generally, on a closed surface of genus 0 there is no continuous tangent vector field,
which is non-zero everywhere on that surface (PoincarHopf theorem). The simplest manifold
on which such a vector field can exist, is a torus (the surface has to be of genus 1). Consequently,
the highest possible symmetry of anomalous solitons is axial symmetry.
3.4. Dependence on $V(\phi)$
Up to this point we have discussed anomalous solitons for the potential (2). In this subsection we consider solitons for different scalar potentials. We shall see that the 3/4 power law of the soliton energy, $E \sim N_{CS}^{3/4}$, is not universal, but depends on the scalar potential.
Let us consider the three classes of potentials shown in Fig. 1. For potentials with $V(\phi = 0) = V_0 > 0$, such as the potential (2) (see Fig. 1(a)), there is a non-zero critical magnetic field, which is given by
$$\frac{B_c^2}{2} = V_0 > 0. \tag{37}$$
The magnetic energy density at the critical field thus equals the energy density of the scalar field in the abnormal state $\phi = 0$. On the boundary between the two phases, $\phi = v$ and $\phi = 0$, there is a balance between the pressure of the magnetic field and the vacuum energy density. The surface contribution to the energy of the scalar field is negligible.
Fig. 1. Three choices for the scalar potential $V(\phi)$: (a) Potential with $V(\phi = 0) = V_0 > 0$ (non-zero critical magnetic field). (b) Potential with $V(\phi = 0) = 0$ (zero critical magnetic field). (c) Potential with $V(\phi = 0) < 0$ (the ground state $\phi = v$ is metastable).
On the other hand, for a potential with $V_0 = 0$ (Fig. 1(b)), the critical magnetic field vanishes. In this case the energy of the Higgs field is dominated by a surface term and is given by
$$E_{scal} = R^2 v^3 I_A, \tag{38}$$
where $I_A$ is a numerical coefficient. Instead of Eq. (15) we obtain
$$E(R) = 8\pi^2\alpha\,\frac{N_{CS}}{g^2 R} + I_A v^3 R^2. \tag{39}$$
Minimization with respect to $R$ shows that
$$R = \left(128\pi^4\right)^{1/6}\left(\frac{\alpha g}{I_A}\right)^{1/3}\frac{1}{m_V}\,N_{CS}^{1/3}, \tag{40}$$
$$E = 3\left(32\pi^8\right)^{1/6}\left(\frac{I_A\alpha^2}{g}\right)^{1/3}\frac{m_V}{g^2}\,N_{CS}^{2/3}. \tag{41}$$
In the next section, we will numerically verify this result for a specific potential of the type of Fig. 1(b).
Finally, for a potential with $V_0 < 0$ (see Fig. 1(c)), there is no critical magnetic field. The normal ground state $\phi = v$ is metastable and anomalous solitons do not exist in this case.
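The $N_{CS}^{2/3}$ scaling for a surface-dominated scalar energy can be checked in the same way as the volume case. The sketch below minimizes $E(R)$ of Eq. (39) numerically and compares with the closed forms (40) and (41), using $m_V = \sqrt{2}\,gv$; names and sample values are illustrative assumptions.

```python
import math

def energy_surface(R, n_cs, alpha, i_a, g, v):
    """E(R) of Eq. (39): Chern-Simons term plus surface energy of the scalar field."""
    return 8.0 * math.pi**2 * alpha * n_cs / (g**2 * R) + i_a * v**3 * R**2

def closed_form_surface(n_cs, alpha, i_a, g, v):
    """R and E at the minimum, written as in Eqs. (40) and (41)."""
    m_v = math.sqrt(2.0) * g * v
    R = ((128.0 * math.pi**4)**(1.0 / 6.0) * (alpha * g / i_a)**(1.0 / 3.0)
         * n_cs**(1.0 / 3.0) / m_v)
    E = (3.0 * (32.0 * math.pi**8)**(1.0 / 6.0) * (i_a * alpha**2 / g)**(1.0 / 3.0)
         * m_v / g**2 * n_cs**(2.0 / 3.0))
    return R, E

def minimize_unimodal(func, lo, hi, iterations=200):
    """Ternary search for the minimum of a unimodal function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

Multiplying $N_{CS}$ by 8 multiplies the minimal energy by 4, which is the $2/3$ power law, to be contrasted with the $3/4$ law of the volume-dominated case.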
4. Structure of anomalous solitons
In this section we construct anomalous solitons numerically. We present numerical solutions to the field equations as a function of the Chern–Simons number $N_{CS}$ and the parameter $\kappa$. The structure of the Higgs and gauge field condensates is analyzed in detail. Finally, we discuss the dependence of the solutions on the scalar potential.
4.1. Numerical construction
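In numerical calculations it is convenient to work in dimensionless coordinates $\hat{x} = m_V x$
and with dimensionless fields $\hat\phi = \phi/v$ and $\hat{A} = A/v$. In terms of these quantities
the field equations (11) are

$$\hat\nabla^2 \hat\phi - \frac{1}{2}\hat{A}^2 \hat\phi - \frac{\kappa}{2}\left(\hat\phi^2 - 1\right)\hat\phi = 0, \tag{42a}$$

$$\hat\nabla \times (\hat\nabla \times \hat{A}) - 2\hat\mu\, \hat\nabla \times \hat{A} + \hat\phi^2 \hat{A} = 0, \tag{42b}$$

where $\hat\mu$ is the (dimensionless) Lagrange multiplier. We want to solve Eqs. (42) subject to the
constraint of constant Chern-Simons number,

$$N_{CS}[\hat{A}] = \frac{1}{32\pi^2} \int \hat{A} \cdot \hat{B}\, d^3\hat{x}, \tag{43}$$

which determines the Lagrange multiplier $\hat\mu$.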
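It followed from the discussion in Section 3.3 that the highest possible symmetry for the
solitons is axial symmetry. Therefore we restrict ourselves to axially symmetric solutions and
use cylindrical coordinates $(r, \varphi, z)$, for which Eqs. (42) reduce to a system of non-linear
PDEs in the rz-plane:

$$\frac{1}{r}\partial_r (r \partial_r \phi) + \partial_z^2 \phi - \frac{1}{2}\left(A_r^2 + A_\varphi^2 + A_z^2\right)\phi - \frac{\kappa}{2}\left(\phi^2 - 1\right)\phi = 0, \tag{44a}$$

$$\partial_z (\partial_z A_r - \partial_r A_z) - 2\mu\, \partial_z A_\varphi - \phi^2 A_r = 0, \tag{44b}$$

$$\partial_r\!\left[\frac{1}{r}\partial_r (r A_\varphi)\right] + \partial_z^2 A_\varphi + 2\mu (\partial_z A_r - \partial_r A_z) - \phi^2 A_\varphi = 0, \tag{44c}$$

$$\frac{1}{r}\partial_r\!\left[r(\partial_r A_z - \partial_z A_r)\right] + \frac{2\mu}{r}\partial_r (r A_\varphi) - \phi^2 A_z = 0. \tag{44d}$$

For axial symmetry we have

$$N_{CS} = \frac{1}{8\pi} \int r A_\varphi (\partial_z A_r - \partial_r A_z)\, dr\, dz \tag{45}$$

after integration by parts and over the angle $\varphi$. Note that we are omitting hats from now on.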
Solutions of Eqs. (44) are calculated by discretizing the fields $\phi(r, z)$ and $A(r, z)$ on a
rectangular box $[0, L] \times [-Z, Z]$ in the rz-plane and then solving the corresponding system of
non-linear equations using Newton's algorithm. The boundary conditions on the edges of the
box are

$$\phi|_{r=L} = 1 \quad\text{and}\quad \phi|_{z=Z} = \phi|_{z=-Z} = 1 \tag{46a}$$

for the Higgs field and

$$A|_{r=L} = 0 \quad\text{and}\quad A|_{z=Z} = A|_{z=-Z} = 0 \tag{46b}$$

for the gauge field. On the axis of symmetry we imposed the regularity conditions

$$\partial_r \phi|_{r=0} = \partial_r A_z|_{r=0} = 0 \quad\text{and}\quad A_r|_{r=0} = A_\varphi|_{r=0} = 0. \tag{46c}$$

For a detailed description of the numerical procedure we used to solve the PDEs, we refer the
reader to Appendix B.
Fig. 2. Logarithmic plot of the soliton energy as a function of $N_{CS}$ for $\kappa = 1$. The dots
show the energies calculated from the numerical solutions for $N_{CS} = 30$ to $N_{CS} = 12\,000$. The
solid line is a fit of the form $E = a N_{CS}^{3/4}$ with $a \approx 118.826\, m_V/g^2$. The dashed line
corresponds to the lower bound (33).
Fig. 3. Logarithmic plot of the energy as a function of $\kappa$. The dots represent the energies of
the numerical solutions for $\kappa$ between 0.4 and 16 (for $N_{CS} = 1\,600$). The solid line is the
fit $E = b\kappa^p$, giving $p \approx 0.23$ and $b \approx 3.007 \times 10^4\, m_V/g^2$.
The dashed line corresponds to the lower bound (33).
4.2. Dependence of the soliton energy on $N_{CS}$ and $\kappa$

We calculated solutions for $1 < N_{CS} < 1.2 \times 10^4$ and $0.4 < \kappa < 16$. In Fig. 2 we show the
energies obtained from the numerical solutions as a function of $N_{CS}$ for $\kappa = 1$. The energy
satisfies $E = a N_{CS}^{3/4}$, with $a \approx 118.83\, m_V/g^2$, which confirms the 3/4 power-law
dependence on $N_{CS}$. The numerical solution gives an energy about three times larger than the lower
bound (33), which is $39.7\,(m_V/g^2)\, N_{CS}^{3/4}$ for $\kappa = 1$.
Fig. 3 shows the soliton energy as a function of $\kappa$ at $N_{CS} = 1\,600$. We find
$E \approx 3.01 \times 10^4\, (m_V/g^2)\, \kappa^p$, with $p \approx 0.23$. The $\kappa^{1/4}$ dependence is
expected from Eq. (17) (or from (33))
for 1.
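The exponent of such a power law is obtained from a straight-line fit in log-log coordinates. The sketch below illustrates this on synthetic data generated from the fitted law quoted above; the data points are not the paper's numerical results, and the function name is hypothetical.

```python
import math

def fit_power_law(ns, es):
    """Least-squares fit of log E = log a + p * log N; returns (a, p)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in es]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    p = s_xy / s_xx                      # slope = exponent
    a = math.exp(y_mean - p * x_mean)    # intercept = log of prefactor
    return a, p

# Synthetic sample: energies in units of m_V/g^2, generated from E = 118.83 * N^(3/4).
n_values = [30, 100, 400, 1600, 6000, 12000]
e_values = [118.83 * n ** 0.75 for n in n_values]

a_fit, p_fit = fit_power_law(n_values, e_values)
```

On real data the fitted exponent would scatter around 3/4, and the deviation gives a rough measure of finite-size corrections at small N_CS.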
All the solutions we found consist of a single domain with $\phi \ll v$ and $|B| > B_c$, surrounded
by normal vacuum. For small Chern-Simons number $N_{CS} < 20$ the soliton has a size of the order
of the thickness of the domain wall ($\sim 1/m_H$), but becomes larger when $N_{CS}$ is increased.
This is the behavior we expect for 1 (cf. intermediate state of type I superconductors). For
$\kappa > 1$ one could imagine solutions containing a network of closed vortices, similar to Abrikosov
vortices in type II superconductors, which could have a lower energy than our solutions.
We therefore point out that our solutions for $\kappa > 1$ might be metastable with respect to
decay into a vortex-type solution of equal Chern-Simons number. However, we did not find
any solutions of this type.
4.3. Structure of the Higgs and the gauge fields
Let us now look at the scalar and gauge field configurations of the solitons. We will concentrate
on the specific example with $\kappa = 1$ and $N_{CS} = 12\,000$.
Figs. 4 and 5 show a 3D plot of the scalar field $\phi$ in the rz-plane and a 2D plot of $\phi$ along
the axis of symmetry $r = 0$, respectively. The thickness of the domain wall ($\sim 1/m_H$), which
separates the two phases $\phi \ll v$ and $\phi = v$, is small compared to the size of the soliton
(thin wall approximation).
Moreover, Fig. 4 shows that the soliton is not spherically symmetric. We can determine the
shape of the soliton numerically. To this end, we define the contour of constant scalar field
$\phi = 0.5v$ as the boundary of the soliton and fit a 2-dimensional surface to that boundary. This is
done in Fig. 6, in which we show the contour $\phi = 0.5v$ in the rz-plane. The contour follows very
precisely a circle of radius $r_s \approx 27.132/m_V$ centered at $(18.163/m_V, 0)$. The contour deviates
from a perfect circle only in a narrow region around the axis of symmetry, whose size is of the
order of a few $1/m_V$. Only in this small region do the fields behave microscopically. Motivated
by the numerical result of Fig. 6, we conjecture that asymptotically, for large $N_{CS}$,
the soliton has the shape of a so-called spindle torus. A spindle torus is the surface of revolution
obtained by rotating a circular arc around an axis that the arc intersects. However,
we were not able to prove this result analytically.
Fig. 4. The scalar field $\phi$ in the rz-plane for $N_{CS} = 12\,000$ and $\kappa = 1$. Note that $\phi$
is shown in units of $v$ and the coordinates $r$ and $z$ are in units of $1/m_V$. The rectangular box in
the rz-plane used to obtain the solution in this case was $[0, 52] \times [-40, 40]$.
Fig. 5. The scalar field $\phi$ along the axis of symmetry $r = 0$ (again for $N_{CS} = 12\,000$ and
$\kappa = 1$). The dots show the values of $\phi$ on the grid points of the numerical solution. The
resolution is $0.8/m_V$.
Fig. 6. The dotted line shows the contour of constant scalar field $\phi = 0.5$ in the rz-plane. The
solid line is the best fitting circle to the dotted line, which is the circle around $(18.163, 0)$
with radius $r_s \approx 27.132$ (in units of $1/m_V$).
We now turn to the gauge field. As depicted in Fig. 7, the magnetic field is confined to the region
where the scalar field is vanishing or small. Inside this domain the amplitude of the magnetic
field $|B|$ is larger than the critical magnetic field $B_c$.
On the boundary $|B| = B_c$ and $B$ is tangential to the domain boundary (see Fig. 8). Recall that
this behavior is expected from (type I) superconductors in the intermediate state: on the
boundaries of normal-conducting regions the magnetic field is tangential and equal to $B_c$.
Outside this boundary, where the scalar field starts to deviate from zero, the magnetic field decays
exponentially, because in the broken phase the gauge boson becomes massive.
Fig. 7. 3D-plot of the magnetic field |B| in the rz-plane in units of the critical magnetic field Bc .
Fig. 8. The dots show the tangential component B of the magnetic field along the boundary of the domain as a function
of the distance r from the axis of symmetry (units for B : Bc = 1). Sufficiently far away from the axis of symmetry the
magnetic field is constant and equal to the critical field Bc , which is represented by the solid line.
To get an idea of how the magnetic field lines behave, we show in Fig. 9 a plot of the toroidal
component $B_\varphi$ and in Fig. 10 the poloidal component $B_p = (B_r, B_z)$ of the magnetic field.
The toroidal component of the magnetic field is maximal at $(20.8/m_V, 0)$ and vanishes on the
boundary of the spindle torus, where the field is completely poloidal. The poloidal field is
maximal at the origin and zero at $(27.4/m_V, 0)$. As was already shown in Fig. 8, $|B_p| = B_c$ along
the boundary of the spindle torus. In short, the poloidal component of the magnetic field
wraps around the toroidal component, giving rise to the non-zero helicity of the magnetic field
(which is the same as the Chern-Simons number for Abelian fields).
Finally we show in Fig. 11 the energy density of the soliton ($N_{CS} = 12\,000$ and $\kappa = 1$). The
different parts of the energy are

$$E_{\rm magn} = \int \frac{B^2}{2}\, d^3x \approx 1.00158 \times 10^5\, \frac{m_V}{g^2}, \tag{47}$$

$$E_D = \int \left[(\nabla\phi)^2 + g^2 A^2 \phi^2\right] d^3x \approx 3.985 \times 10^3\, \frac{m_V}{g^2}, \tag{48}$$

$$E_V = \lambda \int \left(\phi^2 - v^2\right)^2 d^3x \approx 3.2123 \times 10^4\, \frac{m_V}{g^2}. \tag{49}$$

The virial theorem (30) is very well satisfied ($E_{\rm magn} = 3E_V + E_D$ to an accuracy better than 1%).
The part $E_D$ of the energy is a surface term and is negligible for sufficiently large Chern-Simons
number.

Fig. 9. Contour plot of the toroidal component $B_\varphi$ of the magnetic field (in units of the
critical magnetic field $B_c$). The contours shown run from 0 to 2.4 with a step size of 0.4 between
the contours. The maximal value of $B_\varphi$ is located at $(r, z) \approx (20.8, 0)$ and is equal
to $2.4 B_c$.
Fig. 10. The poloidal component $B_p$ of the magnetic field. The solid lines are projections of the
field lines onto the rz-plane. The colors indicate the modulus of $B_p$ (in units of $B_c$). The
poloidal field is maximal at the origin and vanishes at $(27.4, 0)$. On the boundary of the spindle
torus $|B_p| = B_c$.
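The virial balance can be checked directly from the three quoted energy contributions. The snippet below is a small arithmetic sketch using only the numbers given in Eqs. (47)-(49) and the fitted prefactor of Fig. 2; the variable and function names are hypothetical.

```python
# Energy contributions in units of m_V/g^2, as quoted in Eqs. (47)-(49).
E_MAGN = 1.00158e5
E_D = 3.985e3
E_V = 3.2123e4

def virial_residual(e_magn, e_d, e_v):
    """Relative mismatch of the virial relation E_magn = 3*E_V + E_D."""
    return abs(e_magn - (3.0 * e_v + e_d)) / e_magn

residual = virial_residual(E_MAGN, E_D, E_V)

# Cross-check of the total energy against the fit E = 118.826 * N_CS^(3/4).
total_energy = E_MAGN + E_D + E_V
fit_energy = 118.826 * 12000 ** 0.75
relative_gap = abs(total_energy - fit_energy) / fit_energy
```

Both the virial residual and the gap between the summed energy and the power-law fit come out well below one percent.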
Fig. 11. 3D plot of the energy density for the soliton with $N_{CS} = 12\,000$ and $\kappa = 1$ (in
units of the vacuum energy density $\lambda v^4$). The maximum of the energy density is at the origin
and is equal to $\mathcal{E}_{\max} \approx 12.56\, \lambda v^4 = 6.28\, B_c^2$.
4.4. Solitons for different scalar potentials

In Section 3.4 we have seen that for a scalar potential like the one shown in Fig. 1(b), the
energy of anomalous solitons is $E \propto N_{CS}^{2/3}$.
In order to verify this conjecture, we also solved the field equations for the potential

$$V(\phi) = \lambda (\phi - v)^2 \phi^2. \tag{50}$$

If we again use dimensionless fields and coordinates as before, the field equations (42) for the
potential (50) read

$$\nabla^2 \phi - \frac{1}{2} A^2 \phi - \frac{\kappa}{4}\left[\phi(\phi - 1)^2 + \phi^2(\phi - 1)\right] = 0, \tag{51a}$$

$$\nabla \times (\nabla \times A) - 2\mu\, \nabla \times A + \phi^2 A = 0, \tag{51b}$$

with $\kappa = 2\lambda/g^2$. The constraint equation (43) remains unchanged. We solved Eqs. (51) for
$10 < N_{CS} < 700$ and $\kappa = 1$. In Fig. 12 we show the energy of the obtained solutions as a
function of $N_{CS}$. We find

$$E \approx 94.866\, \frac{m_V}{g^2}\, N_{CS}^{2/3}, \tag{52}$$

which confirms the claim of Section 3.4. Fig. 13 shows the energy density of the soliton with
$N_{CS} = 500$ and $\kappa = 1$. The main contribution still comes from the magnetic field, but one can
clearly see that there is a contribution located on the surface, which is the energy of the scalar
field. In this case, too, the solitons are not spherical.
5. Thin wall approximation
In this section we construct the anomalous solitons in the thin wall approximation valid for
large $N_{CS}$. In this limit the solitons are described by a compact domain $V$, inside which
$\phi = 0$. Outside the domain we have $\phi = v$ and $A = 0$. In order to determine $A$ inside $V$ we
have to solve Eq. (11b) for $\phi = 0$,

$$\nabla \times (\nabla \times A) = 2\mu\, (\nabla \times A), \tag{53}$$

where $A = 0$ on the boundary $\partial V$. In principle, the surface $\partial V$ also has to be
determined by minimization. But because we have found numerically in the previous section that the
solitons have the shape of a spindle torus for large $N_{CS}$, we take $V$ to be a spindle torus of
fixed size described by two parameters $R_s$ and $r_s > R_s$ (see Fig. 14). We solve Eq. (53) inside
the spindle torus as a function of $R_s$ and $r_s$. Then we minimize the soliton energy with respect to
these two parameters. We use dimensionless cylindrical coordinates $(\rho, \varphi, \zeta)$ defined by

$$x = R_s \rho \cos\varphi, \qquad y = R_s \rho \sin\varphi, \qquad z = R_s \zeta.$$

Fig. 12. Logarithmic plot of the energy as a function of $N_{CS}$ for anomalous solitons obtained
with the potential (50). The dots show the values of our numerical solutions and the solid line is
the best fit of the form $E = a N_{CS}^{2/3}$, which gives $a \approx 94.866\, m_V/g^2$.
Fig. 13. The energy density of the soliton with $N_{CS} = 500$ and $\kappa = 1$ in units of
$\lambda v^4/16$ (which is the potential barrier between the two minima of the potential).
Fig. 14. Cross-section of a spindle torus in the rz-plane with radii Rs and rs . The spindle torus is obtained by rotating
the circle centered at (Rs , 0) with radius rs > Rs around the z axis.
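In these coordinates (53) simplifies to a system of linear partial differential equations for the
potentials $A_\rho$, $A_\varphi$ and $A_\zeta$ in the $\rho\zeta$-plane,

$$\partial_\zeta (\partial_\zeta A_\rho - \partial_\rho A_\zeta) = \alpha\, \partial_\zeta A_\varphi, \tag{54a}$$

$$-\partial_\zeta^2 A_\varphi - \partial_\rho\!\left[\frac{1}{\rho}\partial_\rho (\rho A_\varphi)\right] = \alpha\, (\partial_\zeta A_\rho - \partial_\rho A_\zeta), \tag{54b}$$

$$\frac{1}{\rho}\partial_\rho\!\left[\rho (\partial_\zeta A_\rho - \partial_\rho A_\zeta)\right] = \alpha\, \frac{1}{\rho}\partial_\rho (\rho A_\varphi), \tag{54c}$$

with $\alpha = 2\mu R_s$.
Integration of (54a) and (54c) gives

$$\alpha A_\varphi = \partial_\zeta A_\rho - \partial_\rho A_\zeta = R_s B_\varphi, \tag{55}$$

and using (55) in (54b) leads to

$$\partial_\rho\!\left[\frac{1}{\rho}\partial_\rho (\rho A_\varphi)\right] + \partial_\zeta^2 A_\varphi + \alpha^2 A_\varphi = 0. \tag{56}$$

The Chern-Simons number becomes

$$N_{CS} = \frac{g^2}{16\pi^2} \int A \cdot B\, d^3x
= \frac{g^2}{8\pi^2} \int A_\varphi B_\varphi\, d^3x
= \frac{g^2 \alpha}{8\pi^2 R_s} \int A_\varphi^2\, d^3x, \tag{57}$$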
where the second expression is found by partial integration and Eq. (55) has been used in the third.
We solved Eq. (56) numerically with $A_\varphi = 0$ on the boundary of the spindle torus and under the
constraint (57). Because Eq. (56) is a linear eigenvalue problem, the solution which minimizes
the energy is the eigenfunction to the lowest positive eigenvalue $\alpha^2$, normalized with Eq. (57).
It turns out that the eigenvalue $\alpha$ is a function of the ratio $\rho_s = r_s/R_s$, which is shown
in Fig. 15. The total energy of the solution is given by Eq. (15) for $R = R_s$. After minimization
with respect to $R_s$, the size of the spindle torus $R_s$ and the energy are given by (cf. Eqs. (16)
and (17))

$$R_s(\rho_s) = \left(\frac{64\pi^2 \alpha(\rho_s)}{3 I_V(\rho_s)}\right)^{1/4} \frac{1}{m_V} \left(\frac{N_{CS}}{\kappa}\right)^{1/4}, \tag{58}$$

$$E(\rho_s) = \sqrt{2} \left(\frac{16\pi^2 \alpha(\rho_s)}{3}\right)^{3/4} \left[\kappa\, I_V(\rho_s)\right]^{1/4} \frac{m_V}{g^2}\, N_{CS}^{3/4}, \tag{59}$$

where the shape factor for the spindle torus is

$$I_V(\rho_s) = \pi^2 \rho_s^2 + 2\pi \sqrt{\rho_s^2 - 1} + \frac{4\pi}{3}\left(\rho_s^2 - 1\right)^{3/2} + 2\pi \rho_s^2 \arcsin\frac{1}{\rho_s}. \tag{60}$$

In Fig. 16 we show the function $E(\rho_s)$, which can be minimized numerically with respect to
$\rho_s$. There is a minimum at $\rho_s \approx 1.495$, for which we get

$$R_s \approx 1.742\, \frac{1}{m_V} \left(\frac{N_{CS}}{\kappa}\right)^{1/4}, \tag{61}$$

$$E \approx 119.065\, \kappa^{1/4}\, \frac{m_V}{g^2}\, N_{CS}^{3/4}. \tag{62}$$

Fig. 15. The eigenvalue $\alpha$ as a function of the parameter $\rho_s$.
Fig. 16. The total energy $E$ as a function of the parameter $\rho_s$. The minimal energy is at
$\rho_s \approx 1.495$ and is $E_{\min} \approx 119.06\, \kappa^{1/4} (m_V/g^2) N_{CS}^{3/4}$.
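The shape factor of Eq. (60) is the volume of a spindle torus in units of R_s^3, which can be verified numerically. The sketch below compares the closed form with a direct midpoint-rule integration, and then checks the quoted minimum using the relation E = 4 E_V = I_V R_s^3 / 2 (in units of kappa^{1/4} (m_V/g^2) N_CS^{3/4}). That relation is our own rearrangement of the stationarity condition, not a formula from the text, and all function names are hypothetical.

```python
import math

def spindle_volume_factor(s):
    """Closed-form volume of a spindle torus with R_s = 1 and r_s = s > 1."""
    root = math.sqrt(s * s - 1.0)
    return (math.pi ** 2 * s * s
            + 2.0 * math.pi * root
            + 4.0 * math.pi / 3.0 * root ** 3
            + 2.0 * math.pi * s * s * math.asin(1.0 / s))

def spindle_volume_numeric(s, steps=200000):
    """Midpoint-rule volume: 2*pi*rho times height 2*sqrt(s^2 - (rho-1)^2), 0 <= rho <= 1+s."""
    h = (1.0 + s) / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h
        total += rho * math.sqrt(max(s * s - (rho - 1.0) ** 2, 0.0))
    return 4.0 * math.pi * total * h

ratio = 1.495                       # location of the minimum quoted in the text
i_v = spindle_volume_factor(ratio)
i_v_check = spindle_volume_numeric(ratio)

# Assumed consistency relation at the minimum: E = 4 * E_V = I_V * R_s^3 / 2,
# with R_s = 1.742 in units of (N_CS/kappa)^(1/4) / m_V.
energy_prefactor = 0.5 * i_v * 1.742 ** 3
```

With the rounded values from the text, the prefactor reproduces the quoted 119.065 to within about 0.1%.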
Fig. 17. The amplitude of the magnetic field $|B|$ in the plane $z = 0$ of the soliton with
$N_{CS} = 12\,000$ and $\kappa = 1$. The dotted line shows the solution of the previous section, the
solid line is the solution in the thin wall (macroscopic) approximation discussed in the present
section. The units are $B_c = 1$.
These results are in excellent agreement with the results of the previous section, where we obtained

$$R_s \approx 1.726\, \frac{1}{m_V} \left(\frac{N_{CS}}{\kappa}\right)^{1/4}, \tag{63}$$

$$E \approx 118.826\, \kappa^{1/4}\, \frac{m_V}{g^2}\, N_{CS}^{3/4}. \tag{64}$$

The macroscopic solutions discussed in the present section coincide with the solutions of the
previous section for large $N_{CS}$. To illustrate this, we compare in Fig. 17 the magnetic fields of
the macroscopic solution and the solution of the previous section.
6. Conclusions
In this article we studied anomalous Abelian solitons, a class of solitons of the chiral Abelian
Higgs model found by Rubakov and Tavkhelidze. These solitons carry a non-zero Chern-Simons
number $N_{CS}$ and they are stable for large $N_{CS}$. We showed that their energy-versus-fermion-number
relation is given by $E \propto N_{CS}^{3/4}$ or $E \propto N_{CS}^{2/3}$, depending on the structure of
the scalar potential. For the former case we derived a lower bound for the soliton energy,
$E \ge c N_{CS}^{3/4}$, where $c$ is a constant depending on the coupling constants and the VEV of the
Higgs field.
We discussed the structure of anomalous solitons by solving numerically the field equations
for both the gauge and the Higgs field. It turned out that the solitons are not spherically
symmetric. For large Chern-Simons number they have the shape of a spindle torus. In this limit we
also constructed the solutions in the thin wall (macroscopic) approximation.
At the moment it is not clear whether (metastable) anomalous solitons exist in the Standard Model.
It will be interesting to extend the present study to the full electroweak theory with
$SU(2) \times U(1)$ gauge group.
Acknowledgements
This work was supported by the Swiss National Science Foundation. We thank Andrei Gruzinov for helpful discussions.
Appendix A. Inequalities
In this appendix we review the inequalities from functional analysis which have been used in
Section 3.1 to derive the lower bound on the energy of the anomalous solitons.
A.1. Hölder inequality
Let $\Omega$ be an open subset of $\mathbb{R}^d$ and $p, q \ge 1$ such that

$$\frac{1}{p} + \frac{1}{q} = 1.$$

Then, for all functions $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$, the product $fg \in L^1(\Omega)$
satisfies the Hölder inequality,

$$\|fg\|_{L^1} \le \|f\|_{L^p} \|g\|_{L^q}, \tag{A.1}$$

where the norm on $L^p(\Omega)$ is defined by

$$\|f\|_{L^p} := \left(\int_\Omega |f|^p\, dx\right)^{1/p}.$$

Note that for $p = q = 2$ the Hölder inequality reduces to the Cauchy-Schwarz inequality. For
vector fields $u(x): \Omega \to \mathbb{R}^d$ the $L^p$-norm is defined by

$$\|u\|_{L^p} = \left(\int_\Omega (u, u)^{p/2}\, dx\right)^{1/p},$$

where $(\cdot, \cdot)$ denotes the scalar product of vectors in $\mathbb{R}^d$. Thus for vector fields
the Hölder inequality reads

$$\int_\Omega (u, v)\, dx \le \|u\|_{L^p} \|v\|_{L^q}. \tag{A.2}$$

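A.2. Modified Hölder inequality

As a consequence of (A.1) we have for $1 < s < 2$ that

$$\|f\|_{L^s}^{s} \le \|f\|_{L^1}^{2-s}\, \|f\|_{L^2}^{2s-2}, \qquad \forall f \in L^s(\Omega). \tag{A.3}$$

The proof of (A.3) is very simple: defining the functions $g$ and $h$ by

$$g = |f|^{2-s}, \qquad h = |f|^{2s-2},$$

we get

$$\|f\|_{L^s}^{s} = \|gh\|_{L^1}
\le \|g\|_{L^{1/(2-s)}}\, \|h\|_{L^{1/(s-1)}}
= \left(\int_\Omega |g|^{1/(2-s)}\, dx\right)^{2-s} \left(\int_\Omega |h|^{1/(s-1)}\, dx\right)^{s-1}
= \left(\int_\Omega |f|\, dx\right)^{2-s} \left(\int_\Omega |f|^2\, dx\right)^{s-1}
= \|f\|_{L^1}^{2-s}\, \|f\|_{L^2}^{2s-2},$$

where (A.1) with $p = 1/(2-s)$ and $q = 1/(s-1)$ has been used in the second step.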
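A.3. Gagliardo-Nirenberg-Sobolev inequality
The Sobolev spaces $W^{n,p}(\Omega)$, where $p \ge 1$ and $n \in \mathbb{N}$, are the normed spaces of
functions defined by

$$W^{n,p}(\Omega) = \left\{ f \in L^p(\Omega) \;\middle|\; \forall \alpha \in \mathbb{N}^d,\ |\alpha| \le n:\ \partial_x^\alpha f \in L^p(\Omega) \right\}.$$

For $1 \le p < d$ we define the Sobolev conjugate of $p$ by

$$p^* = \frac{dp}{d - p}.$$

Then there is a constant $c(d, p)$ such that

$$\|f\|_{L^{p^*}} \le c(d, p)\, \|Df\|_{L^p}, \qquad \forall f \in W^{1,p}(\Omega), \tag{A.4}$$

where $D$ is the weak derivative. Eq. (A.4) is known as the Gagliardo-Nirenberg-Sobolev (GNS)
inequality. For $\Omega = \mathbb{R}^d$ the best possible constant in (A.4) is

$$c(d, p) = \begin{cases}
\dfrac{1}{d} \left(\dfrac{d}{|S^{d-1}|}\right)^{1/d}, & p = 1, \\[2ex]
\dfrac{p-1}{d-p} \left(\dfrac{d-p}{d(p-1)}\right)^{1/p} \left(\dfrac{\Gamma(d+1)}{\Gamma(d/p)\, \Gamma(d+1-d/p)\, |S^{d-1}|}\right)^{1/d}, & p > 1.
\end{cases}$$

Here, $|S^{d-1}|$ denotes the surface area of the $(d-1)$-dimensional unit sphere,

$$|S^d| = \frac{2^d \pi^{d/2}\, \Gamma(d/2)}{\Gamma(d)}.$$

In Section 3.1 we have used (A.4) for continuously differentiable functions with compact support
and $(d, p) = (3, 2)$. In this case the weak derivative is just the ordinary gradient and the GNS
inequality reads

$$\|f\|_{L^6} \le c(3, 2)\, \|\nabla f\|_{L^2} = \frac{1}{\sqrt{3}} \left(\frac{2}{\pi}\right)^{2/3} \|\nabla f\|_{L^2}, \qquad \forall f \in C_c^1(\mathbb{R}^3). \tag{A.5}$$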
Appendix B. Numerical procedure
In this appendix we explain the numerical procedure which was used to obtain numerical
solutions of the PDEs (44) subject to the constraint (45) and boundary conditions (46). Solutions
are calculated on the rectangular box $[0, L] \times [-Z, Z]$ in the rz-plane. The dimensions of the
box $L$ and $Z$ have to be at least a few times $N_{CS}^{1/4}$, since the expected size of the soliton is
$R \sim N_{CS}^{1/4}$. The box is divided into a grid of $N_r \times N_z$ points $(r_i, z_j)$ given by

$$r_i = h_r (i - 1), \quad\text{for } i = 1, \ldots, N_r,$$

$$z_j = \frac{h_z}{2}\, (2j - 1 - N_z), \quad\text{for } j = 1, \ldots, N_z,$$

where $h_r = L/(N_r - 1)$ and $h_z = 2Z/(N_z - 1)$. Because the coordinates $(r, z)$ are in units of
$1/m_V$ and we expect the fields to vary on scales of order $1/m_H$ and $1/m_V$, $h_r$ and $h_z$ have
to be smaller than $\kappa^{-1/2}$ to ensure a sufficient resolution of the solutions. Therefore, the
number of grid points has to be increased going either to larger $N_{CS}$ or to lower $\kappa$. As an
example, for $N_{CS} = 12\,000$ and $\kappa = 1$ we used the rectangular box $[0, 52] \times [-40, 40]$
with a resolution of $0.8/m_V$, which led to a grid with 6 666 points.
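The grid definition above can be sketched in a few lines. The function name and the way the number of points is derived from a target spacing are assumptions made for illustration; only the formulas for r_i, z_j, h_r and h_z come from the text.

```python
def make_grid(box_l, box_z, spacing):
    """Grid on [0, L] x [-Z, Z] with (approximately) the requested spacing.

    Returns the lists of r_i and z_j; Python indices are 0-based,
    so r[i - 1] corresponds to r_i in the text.
    """
    n_r = round(box_l / spacing) + 1
    n_z = round(2.0 * box_z / spacing) + 1
    h_r = box_l / (n_r - 1)
    h_z = 2.0 * box_z / (n_z - 1)
    r = [h_r * (i - 1) for i in range(1, n_r + 1)]
    z = [0.5 * h_z * (2 * j - 1 - n_z) for j in range(1, n_z + 1)]
    return r, z

# Example quoted in the text: box [0, 52] x [-40, 40], resolution 0.8 (units of 1/m_V).
r_points, z_points = make_grid(52.0, 40.0, 0.8)
n_points = len(r_points) * len(z_points)
```

For the quoted box this gives 66 points in r and 101 points in z, i.e. the 6 666 grid points mentioned above.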
The derivatives in the PDEs (44) are approximated by finite differences in a straightforward
way,

$$f(r_i, z_j) \to f_{i,j},$$

$$(\partial_r f)(r_i, z_j) \to \frac{1}{2h_r}\, (f_{i+1,j} - f_{i-1,j}),$$

$$(\partial_z f)(r_i, z_j) \to \frac{1}{2h_z}\, (f_{i,j+1} - f_{i,j-1}),$$

$$(\partial_r^2 f)(r_i, z_j) \to \frac{1}{h_r^2}\, (f_{i+1,j} - 2f_{i,j} + f_{i-1,j}),$$

$$(\partial_r \partial_z f)(r_i, z_j) \to \frac{1}{4h_r h_z}\, (f_{i+1,j+1} - f_{i+1,j-1} - f_{i-1,j+1} + f_{i-1,j-1}),$$

$$(\partial_z^2 f)(r_i, z_j) \to \frac{1}{h_z^2}\, (f_{i,j+1} - 2f_{i,j} + f_{i,j-1}),$$

where $f(r_i, z_j)$ stands for one of the fields $A_r$, $A_\varphi$, $A_z$ or $\phi$ at the position
$(i, j)$ on the grid.
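These central-difference stencils translate directly into code. The following is a minimal sketch for a field stored as a nested list f[i][j]; the function names are hypothetical and the indices are 0-based, unlike the 1-based indices in the text.

```python
def d_r(f, i, j, h_r):
    """Central difference for the first derivative in r."""
    return (f[i + 1][j] - f[i - 1][j]) / (2.0 * h_r)

def d_z(f, i, j, h_z):
    """Central difference for the first derivative in z."""
    return (f[i][j + 1] - f[i][j - 1]) / (2.0 * h_z)

def d_rr(f, i, j, h_r):
    """Three-point stencil for the second derivative in r."""
    return (f[i + 1][j] - 2.0 * f[i][j] + f[i - 1][j]) / h_r ** 2

def d_zz(f, i, j, h_z):
    """Three-point stencil for the second derivative in z."""
    return (f[i][j + 1] - 2.0 * f[i][j] + f[i][j - 1]) / h_z ** 2

def d_rz(f, i, j, h_r, h_z):
    """Four-point stencil for the mixed derivative."""
    return (f[i + 1][j + 1] - f[i + 1][j - 1]
            - f[i - 1][j + 1] + f[i - 1][j - 1]) / (4.0 * h_r * h_z)

# Test field f = r^2 * z + z^2, for which all stencils are exact up to rounding.
H_R, H_Z = 0.1, 0.2
r_vals = [H_R * i for i in range(8)]
z_vals = [-1.0 + H_Z * j for j in range(11)]
field = [[r * r * z + z * z for z in z_vals] for r in r_vals]
```

Because the test field is at most quadratic in each variable, the stencils reproduce its derivatives exactly, which makes it a convenient regression check before applying them to the full equations.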
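Using these replacement rules the PDEs reduce to a system of non-linear equations on the internal
points of the grid (i.e. $i = 2, \ldots, N_r - 1$ and $j = 2, \ldots, N_z - 1$). On the axis of symmetry
$r = 0$ the finite difference approximation for the PDEs requires special care. The regularity
conditions (46c) at $r = 0$ for the fields $A_r$ and $A_\varphi$ imply

$$A_{r,1,j} = 0, \tag{B.1}$$

$$A_{\varphi,1,j} = 0, \tag{B.2}$$

for $j = 2, \ldots, N_z - 1$. For discretizing Eqs. (44a) and (44d) at $r = 0$ we have to use the
replacements

$$\left.\frac{1}{r}\partial_r (r \partial_r \phi)\right|_{r=0} = \left. 2\partial_r^2 \phi \right|_{r=0} \to \frac{4}{h_r^2}\, (\phi_{2,j} - \phi_{1,j}),$$

$$\left.\frac{1}{r}\partial_r (r \partial_r A_z)\right|_{r=0} = \left. 2\partial_r^2 A_z \right|_{r=0} \to \frac{4}{h_r^2}\, (A_{z,2,j} - A_{z,1,j}),$$

$$\left.\frac{1}{r}\partial_r (r \partial_z A_r)\right|_{r=0} = \left. 2\partial_r \partial_z A_r \right|_{r=0} \to \frac{1}{h_r h_z}\, (A_{r,2,j+1} - A_{r,2,j-1}),$$

$$\left.\frac{1}{r}\partial_r (r A_\varphi)\right|_{r=0} = \left. 2\partial_r A_\varphi \right|_{r=0} \to \frac{2}{h_r}\, A_{\varphi,2,j}.$$

The system of non-linear equations is completed by the boundary conditions at $r = L$ and
$z = \pm Z$, which read

$$\phi_{i,1} = \phi_{i,N_z} = 1 \quad\text{and}\quad A_{i,1} = A_{i,N_z} = 0 \quad\text{for } i = 1, \ldots, N_r, \tag{B.3a}$$

$$\phi_{N_r,j} = 1 \quad\text{and}\quad A_{N_r,j} = 0 \quad\text{for } j = 1, \ldots, N_z. \tag{B.3b}$$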
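Finally, the constraint equation (45) is given by

$$N_{CS} = \frac{h_r h_z}{8\pi} \sum_{i=2}^{N_r - 1} \sum_{j=2}^{N_z - 1} h_r (i - 1)\, A_{\varphi,i,j} \left[\frac{A_{r,i,j+1} - A_{r,i,j-1}}{2h_z} - \frac{A_{z,i+1,j} - A_{z,i-1,j}}{2h_r}\right]. \tag{B.4}$$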
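The double sum in (B.4) can be sketched as a plain loop over the interior grid points. The function name and the nested-list storage are assumptions for illustration; the test field below is an arbitrary smooth configuration chosen so that the continuum integral (45) is known in closed form, not a soliton solution.

```python
import math

def chern_simons_number(a_r, a_phi, a_z, h_r, h_z):
    """Discretized N_CS of Eq. (B.4); arrays are indexed [i][j] with 0-based indices."""
    n_r = len(a_phi)
    n_z = len(a_phi[0])
    total = 0.0
    for i in range(1, n_r - 1):          # interior points, i = 2..N_r-1 in the text
        r = h_r * i                      # r_i = h_r * (i - 1) in 1-based notation
        for j in range(1, n_z - 1):      # interior points, j = 2..N_z-1 in the text
            b_phi = ((a_r[i][j + 1] - a_r[i][j - 1]) / (2.0 * h_z)
                     - (a_z[i + 1][j] - a_z[i - 1][j]) / (2.0 * h_r))
            total += r * a_phi[i][j] * b_phi
    return h_r * h_z * total / (8.0 * math.pi)

# Illustrative field on [0, 1] x [-1, 1]: A_r = z, A_phi = r, A_z = 0, so B_phi = 1
# and the continuum value is (1/(8*pi)) * (1/3) * 2.
N_R, N_Z = 101, 101
H_R, H_Z = 1.0 / (N_R - 1), 2.0 / (N_Z - 1)
r_vals = [H_R * i for i in range(N_R)]
z_vals = [-1.0 + H_Z * j for j in range(N_Z)]
field_r = [[z for z in z_vals] for _ in r_vals]
field_phi = [[r for _ in z_vals] for r in r_vals]
field_z = [[0.0 for _ in z_vals] for _ in r_vals]

n_cs = chern_simons_number(field_r, field_phi, field_z, H_R, H_Z)
n_cs_exact = 2.0 / (3.0 * 8.0 * math.pi)
```

The discrete sum omits the boundary points, so it undershoots the continuum value by a few percent on this coarse grid; the gap shrinks as the spacing is reduced.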
Including the constraint equation (B.4) we obtain a system of $4N_r N_z + 1$ non-linear equations for
the unknowns $\phi_{i,j}$, $A_{r,i,j}$, $A_{\varphi,i,j}$, $A_{z,i,j}$ and the Lagrange multiplier $\mu$.
This system is solved using a standard Newton algorithm for non-linear equations. In every step of
the Newton iteration, one has to solve a system of $4N_r N_z + 1$ linear equations.
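The structure of such a constrained Newton solve can be illustrated on a toy problem with two unknowns and one Lagrange multiplier. This is only a sketch of the technique, not the solver used for the soliton equations: the toy system (minimize x^2 + y^2 subject to xy = 1), the dense Gaussian elimination and all names are assumptions chosen to keep the example self-contained.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = m[k][n] - sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = s / m[k][k]
    return x

def residual(u):
    """Stationarity equations plus the constraint x*y = 1 (last component)."""
    x, y, mu = u
    return [2.0 * x - mu * y, 2.0 * y - mu * x, x * y - 1.0]

def jacobian(u):
    """Analytic Jacobian of residual() with respect to (x, y, mu)."""
    x, y, mu = u
    return [[2.0, -mu, -y], [-mu, 2.0, -x], [y, x, 0.0]]

def newton(u0, tol=1e-12, max_iter=50):
    """Newton iteration: solve J * du = -F and update until the residual is small."""
    u = list(u0)
    for _ in range(max_iter):
        f = residual(u)
        if max(abs(v) for v in f) < tol:
            break
        du = solve_linear(jacobian(u), [-v for v in f])
        u = [p + q for p, q in zip(u, du)]
    return u

solution = newton([1.5, 0.7, 1.0])
```

As in the full problem, the multiplier is treated as one more unknown and the constraint as one more equation, so each Newton step solves a single linear system of size (number of field unknowns) + 1. For the soliton grid a sparse linear solver would be the natural replacement for the dense elimination used here.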
References
[1] D. Finkelstein, C.W. Misner, Ann. Phys. 6 (1959) 230.
[2] D. Finkelstein, J. Math. Phys. 7 (1966) 1218.
[3] A.A. Abrikosov, Sov. Phys. JETP 5 (1957) 1174.
[4] H.B. Nielsen, P. Olesen, Nucl. Phys. B 61 (1973) 45.
[5] A.M. Polyakov, JETP Lett. 20 (1974) 194.
[6] G. 't Hooft, Nucl. Phys. B 79 (1974) 276.
[7] T.H.R. Skyrme, Nucl. Phys. 31 (1962) 556.
[8] R. Friedberg, T.D. Lee, A. Sirlin, Phys. Rev. D 13 (1976) 2739.
[9] S.R. Coleman, Nucl. Phys. B 262 (1985) 263.
[10] A.M. Safian, S.R. Coleman, M. Axenides, Nucl. Phys. B 297 (1988) 498.
[11] S.Y. Khlebnikov, M.E. Shaposhnikov, Phys. Lett. B 180 (1986) 93.
[12] E.J. Copeland, E.W. Kolb, K.M. Lee, Nucl. Phys. B 319 (1989) 501.
[13] V.A. Rubakov, A.N. Tavkhelidze, Phys. Lett. B 165 (1985) 109.
[14] L.D. Faddeev, A.J. Niemi, Nature 387 (1997) 58, hep-th/9610193.
[15] V.A. Rubakov, Prog. Theor. Phys. 75 (1986) 366.
[16] V.A. Matveev, et al., Nucl. Phys. B 282 (1987) 700.
[17] G. 't Hooft, Phys. Rev. Lett. 37 (1976) 8.
[18] R. Jackiw, C. Rebbi, Phys. Rev. Lett. 37 (1976) 172.
[19] C.G. Callan Jr, R.F. Dashen, D.J. Gross, Phys. Lett. B 63 (1976) 334.
[20] A.N. Redlich, L.C.R. Wijewardhana, Phys. Rev. Lett. 54 (1985) 970.
[21] A.F. Vakulenko, L.V. Kapitansky, Sov. Phys. Dokl. 24 (1979) 433.
[22] A. Kundu, Y.P. Rybakov, J. Phys. A 15 (1982) 269.
[23] R.S. Ward, Nonlinearity 12 (1999) 241, hep-th/9811176.
[24] G. Rosen, SIAM J. Appl. Math. 21 (1971) 30.
[25] L.D. Landau, L.P. Pitaevskii, Statistical Physics: Part 2, second ed., Pergamon Press, Oxford, 1981.
[26] L.D. Landau, E.M. Lifshitz, Electrodynamics of Continuous Media, first ed., Pergamon Press, Oxford, 1960.
