
Diagnostic Accuracy of Neonatal Assessment for Gestational Age Determination: A Systematic Review
Anne CC Lee, MD, MPH,a,b Pratik Panchal, MD, MPH,c,d Lian Folger, BA,a Hilary Whelan, MD,e Rachel Whelan, MPH, BA,f Bernard Rosner, PhD,b,g Hannah Blencowe, MRCPCH, MBChB, MSc,h,i Joy E. Lawn, MBBS, PhD,h,i

CONTEXT: An estimated 15 million neonates are born preterm annually. However, in low- and
middle-income countries, the dating of pregnancy is frequently unreliable or unknown.
OBJECTIVE: To conduct a systematic literature review and meta-analysis to determine the
diagnostic accuracy of neonatal assessments to estimate gestational age (GA).
DATA SOURCES: PubMed, Embase, Cochrane, Web of Science, POPLINE, and World Health
Organization library databases.
STUDY SELECTION: Studies of live-born infants in which researchers compared neonatal signs or
assessments for GA estimation with a reference standard.
DATA EXTRACTION: Two independent reviewers extracted data on study population, design, bias,
reference standard, test methods, accuracy, agreement, validity, correlation, and interrater
reliability.
RESULTS: Four thousand nine hundred and fifty-six studies were screened and 78 included.
We identified 18 newborn assessments for GA estimation (ranging 4 to 23 signs). Compared
with ultrasound, the Dubowitz score dated 95% of pregnancies within ±2.6 weeks (n =
7 studies), while the Ballard score overestimated GA (0.4 weeks) and dated pregnancies
within ±3.8 weeks (n = 9). Compared with last menstrual period, the Dubowitz score
dated 95% of pregnancies within ±2.9 weeks (n = 6 studies) and the Ballard score, ±4.2
weeks (n = 5). Assessments with fewer signs tended to be less accurate. A few studies
showed a tendency for newborn assessments to overestimate GA in preterm infants and
underestimate GA in growth-restricted infants.
LIMITATIONS: Poor study quality and few studies with early ultrasound-based reference.
CONCLUSIONS: Efforts in low- and middle-income countries should focus on improving dating
in pregnancy through ultrasound and improving validity in growth-restricted populations.
Where ultrasound is not possible, increased efforts are needed to develop simpler yet
specific approaches for newborn assessment through new combinations of existing
parameters, new signs, or technology.

aDepartment of Pediatric Newborn Medicine, and Departments of gChanning Division of Network Medicine, Medicine, Brigham and Women’s Hospital, Boston, Massachusetts; bHarvard

Medical School, Harvard University, Boston, Massachusetts; cDepartment of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, Massachusetts; dDepartment
of Clinical Research, OpenBiome, Somerville, Massachusetts; eDepartment of Pediatrics, University of Rochester Medical Center, Rochester, New York; fDepartment of Research, Community

To cite: Lee AC, Panchal P, Folger L, et al. Diagnostic Accuracy of Neonatal Assessment for Gestational Age Determination: A Systematic Review. Pediatrics. 2017;140(6):e20171423

Downloaded from www.aappublications.org/news by guest on July 23, 2019


PEDIATRICS Volume 140, number 6, December 2017:e20171423 Review Article
Of the estimated 14.9 million annual preterm births, 13.6 million (91%) occur in low- and middle-income countries (LMIC).1,2 Preterm birth is the leading cause of mortality in children less than 5 years of age globally, accounting for 1 million neonatal deaths annually, almost all of which are in LMIC.3 In these settings, early recognition of the preterm infant may facilitate the timely delivery of life-saving interventions, such as continuous positive airway pressure or kangaroo mother care.

Ultrasound dating in early pregnancy is the most accurate method currently available to assess gestational age (GA) and is a standard of care in high-income countries. In LMIC, pregnancy dating is challenging, and GA of the infant is frequently unknown or inaccurate. Maternal recall of last menstrual period (LMP) is often unavailable or unreliable, particularly in populations with high rates of maternal illiteracy.4,5 The shortage of health care providers in LMIC, currently estimated at 7.9 million,6 contributes to poor coverage of antenatal care. In sub-Saharan Africa and Southeast Asia, fewer than one-third of mothers in households in the poorest quintile receive at least 1 antenatal care visit.7 Furthermore, the timing of the first visit for antenatal care is late, occurring typically late in the second trimester.8,9 Moreover, access to ultrasonography is low, with <7% of pregnant women having access to ultrasound in rural sub-Saharan Africa.4 Traditional sonography in late pregnancy is notably inaccurate for determining GA (±4 weeks).10,11

Clinical assessment of newborn maturity has long been used as a proxy to estimate GA after birth (Table 1). In 1966, Farr et al12 defined a classification for the development of external physical characteristics in the newborn. In 1968, Amiel-Tison13 described the assessment of neonatal neurologic maturation. Dubowitz et al14 developed a score for GA based on a combination of neurologic and physical signs, which dated pregnancies within 5 days of LMP in their original study. Since then, several simplified clinical assessments have been described in the literature.15–18 The Ballard score19 is one of the most commonly used and was revised to the New Ballard score in 1991 to improve accuracy for early preterm infants.20 Newborn assessment for GA dating has become less relevant in high-income settings, where ultrasound coverage is high and uncertainty of antenatal pregnancy dating is less common than in LMIC. In LMIC settings without widespread access to early ultrasound dating and where accuracy of LMP recall is highly variable, clinical assessment of the newborn remains the commonest available tool to evaluate GA. Accurate GA is necessary to identify preterm and small-for-gestational-age (SGA) babies and provide them with effective interventions.

The Every Newborn Action Plan was launched in 2014 with the aim to end preventable neonatal deaths and stillbirths by 2030.34 GA measurement was identified as a priority area35 for improving (1) the epidemiology of preterm birth and SGA and (2) the comparability of neonatal mortality estimates through stratification by GA and birth weight.

In this systematic review, we aim to (1) identify individual neonatal signs and combined clinical scores or assessments that have been used to ascertain GA of newborns; and (2) assess the diagnostic accuracy and reliability of these methods for estimating GA, compared with dating by a reference standard (ie, ultrasound or LMP).

Methods

Search Strategy

We conducted a systematic review of the published and gray literature, initially done in March 2015 and updated in June 2016 (Fig 1). Databases we searched included PubMed, Embase, Cochrane, Web of Science, POPLINE, and the World Health Organization Global Health Libraries and regional databases (Latin American and Caribbean Health Sciences, Index Medicus for the Eastern Mediterranean Region, African Index Medicus). The review was registered with the International Prospective Register of Systematic Reviews (PROSPERO registration number: CRD42015020499). The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement, review protocol, and detailed search terms are available in the Supplemental Information.

Inclusion Criteria

There were no language restrictions. Abstracts of non-English articles were translated via Google Translate, and if eligible, the full text was translated to English by fluent speakers. Articles were considered for inclusion if the study met the following criteria: (1) included live-born neonates; (2) compared at least 2 methods of GA estimation, 1 of which was a neonatal clinical assessment, score or individual clinical sign(s); and (3) reported at least 1 statistic assessing correlation, agreement, or validity of GA estimation. Prenatal assessments (eg, symphysis fundal height, ultrasound) and neonatal anthropometrics (eg, foot length) were reviewed separately and will be reported elsewhere.

Exclusion Criteria

We excluded studies in which researchers did not provide data describing the correlation, agreement, or validity of neonatal clinical assessment compared with a reference method of pregnancy dating (ie, ultrasound or LMP). We excluded studies from specialized subpopulations (eg, infants of

TABLE 1 Neonatal Assessments/Scoring Systems by Level of Complexity
Clinical Scoring System or Name | No. of Criteria | Physical Criteria | Neuromuscular Criteria | Other Criteria | Reference Standard | Original Reported Accuracy or Correlation With GA | Study Setting and Location | Sample Size | Year
Amiel-Tison et al21 | 23 | Skin color, skin opacity, skin texture, edema, lanugo, skull hardness, ear form, ear firmness, genitals, breast size, nipple formation, plantar creases | Return to flexion of forearms, scarf sign, popliteal angle, foot dorsiflexion, righting reaction, raise to sit, back to lying, finger grasp and response to traction, nonnutritive sucking, crossed extension, vision fix and track | — | BOE | Correlation of individual signs in manuscript | Port-Royal-Baudelocque Hospital; Paris, France | 397 | 1999
Feresu et al22 | 22 | Edema, skin texture, skin color, skin opacity, lanugo, plantar creases, nipple formation, breast size, ear form, ear firmness, genitals | Posture, square window, dorsiflexion of foot, arm recoil, leg recoil, popliteal angle, heel-to-ear, scarf sign, head lag, ventral suspension | Birth weight (BW) | LMP | Correlation of individual signs in manuscript | Maternity unit, Harare Central Hospital; Harare, Zimbabwe | 364 | 2002
Dubowitz et al14 | 21 | Edema, skin texture, skin color, skin opacity, lanugo, plantar creases, nipple formation, breast size, ear form, ear firmness, genitals | Posture, square window, ankle dorsiflexion, arm recoil, leg recoil, popliteal angle, heel to ear, scarf sign, head lag, ventral suspension | — | LMP | 95% CI: ±2.0 wk | NICU, Jessop Hospital for Women; Sheffield, England | 167 | 1970
Dubowitz and Farr (from Nicolopoulos et al23) | 17 | Skin texture, skin color, skin opacity, lanugo, plantar creases, nipple formation, breast size, ear form, ear firmness | Posture, square window, dorsiflexion of foot, popliteal angle, heel-to-ear, scarf sign, head lag, ventral suspension | — | LMP | r = 0.878 | Alexandra Maternity Hospital; Athens, Greece | 710 | 1976
Finnström24 | 12 | Breast size, nipple formation, skin opacity, scalp hair, hair-forehead border, eyebrows, ear cartilage, fingernails, xiphoid process, external genitalia, plantar skin creases, pupillary membrane | — | — | LMP | r = 0.84 for 5 external characteristics | University Hospital; Umea, Sweden | 174 | 1972
Ballard et al19 | 12 | Skin color, lanugo, plantar creases, breast size, ear firmness, genitals | Posture, square window (wrist), arm recoil, popliteal angle, scarf sign, heel-to-ear | — | LMP and clinical data | r = 0.852 | NICU, Cincinnati General Hospital; Cincinnati, Ohio | 252 | 1979
Ballard et al (New Ballard score)20 | 12 | Skin, lanugo, plantar crease, breast maturity, eye and/or ear, genitals | Posture, square window (wrist), arm recoil, popliteal angle, scarf sign, heel-to-ear | — | BOE | r = 0.97 | NICUs and nurseries; Cincinnati, Ohio | 530 | 1991
Farr25 | 10 | — | Spontaneous motor activity, reaction of pupils to light, rate of sucking, closure of mouth when sucking, stripping action of the tongue, resistance against passive movement, recoil of forearms, plantar grasp, pitch of cry, intensity of cry | — | LMP | Accurate ±1 wk: 61% | Aberdeen, Scotland | 82 | 1968
Tunçer et al26 | 8 | Skin texture, ear form, firmness, breast size and nipple formation, plantar creases, facial appearance | Posture, arm recoil, scarf sign | — | LMP | r = 0.945 | Hacettepe University, NICU; Ankara, Turkey | 100 | 1981
Eregie17 | 8 | Skin texture, ear form, breast size, genitalia | Posture, scarf sign | Head circumference, mid-arm circumference | Dubowitz | Accurate ±2 wk: 92% | University teaching hospitals; Benin, Nigeria | 262 | 1991
Capurro et al15 | 7 | Skin texture, nipple formation, ear form, breast size, plantar creases | Scarf sign, head lag | — | LMP | r = 0.9 | Montevideo, Uruguay | 115 | 1978
Kollée et al27 | 7 | Skin color, skin texture, plantar creases, breast size, ear firmness, nail length | — | AVCL | NS | 95% CI: ±19.9 d | Catholic University; Nijmegen, Netherlands | 229 | 1985
Klimek et al28 | 6 | Lanugo, plantar creases, breast size | Posture, angle forearm to arm, pulling an elbow to the body | — | Ballard | r = 0.72 | Tertiary care hospitals; Poland | 800 | 2000
Simplified Dubowitz (from Allan et al29) | 6 | Breast size, skin texture, ear bending (substituted from ear firmness because some Aboriginal babies have less ear cartilage) | Square window, popliteal angle, scarf sign | — | Ultrasound | Mean difference: 0.4 wk (95% LOA: −2.8 to 1.9) | Private Hospitals; Northern Territory, Australia | 98 | 2009
Narayanan et al30 | 5 | Skin color, ear form, plantar skin crease, breast formation, skin texture | — | AVCL | LMP | 95% CI: ±11 d | Kalawati Saran Children’s Hospital; New Delhi, India | 356 | 1982
Robinson31 (from Serfontein and Jaroszewicz32) | 5 | — | Pupil reaction, traction, glabellar tap, neck righting, head turning | — | Dubowitz | 95% CI: ±1 wk; r = 0.85 | South Africa | 73 | 1966
Parkin et al16 | 4 | Skin texture, breast size, edema, plantar skin creases, nail length, nail texture, ear firmness, skull hardness, lanugo hair, genitalia | — | — | LMP | 95% CI: ±18.1 d | University hospital; Newcastle, England | 392 | 1976

diabetic mothers), editorials or reviews without original data, individual case reports, and duplicate studies.

Data Extraction

All articles were reviewed independently by 2 researchers and extracted into a standard Excel file. Differences were resolved by a third independent reviewer. The study characteristics extracted are listed in Supplemental Information 2.

Study Quality Assessment

Two independent reviewers graded the methodological quality of the studies of diagnostic accuracy using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS–2)36 tool, modified for the context of this review (Supplemental Information, “Study Quality Assessment” section). Individual studies were evaluated for limitations and biases in the following domains: patient selection, test method, reference standard, and patient flow and timing. Studies with a reference standard GA of ultrasonography or best obstetric estimate (BOE) (including ultrasound confirmation of dating) were graded as highest quality. Though LMP may be considered gold standard in high-resource settings (where rates of literacy and early antenatal care are high), in LMIC, LMP recall is considered less reliable because of low literacy rates and late presentation to antenatal care.11,37 Additionally, we assessed the generalizability of study results to LMIC.

Statistical Analysis

Stata 13 (StataCorp, College Station, TX) and R (R Foundation for Statistical Computing, Vienna, Austria) were used for analyses. The definition of preterm birth was a live birth <37 weeks’ gestation. Studies were grouped by method of newborn assessment and reference standard. Simple descriptive statistics were used to report ranges and medians. The mean individual-level differences between 2 methods of GA assessment were pooled using the Stata metan command, which provided the pooled mean-difference estimate and 95% confidence interval (CI). The variance and SD around the pooled estimate were calculated using the following formula38:

$$\mathrm{Variance}_{\mathrm{pooled}} = \frac{\sum_{i=1}^{k} (n_i - 1)\, S_i^{2}}{\sum_{i=1}^{k} (n_i - 1)}$$

For studies in which researchers reported the percent of test measures within ±1 to 2 weeks of a reference, percentages were logit transformed and SEs were calculated. Meta-analysis was conducted with a random effects model. The Higgins I² statistic was calculated to assess heterogeneity. For reports of diagnostic accuracy, forest plots were generated in R to summarize diagnostic accuracy across studies.

Because pooling of sensitivity and specificity separately fails to account for the interrelatedness of the measures, hierarchical bivariate models are recommended for meta-analysis.39 These were analyzed by using MetaDisc 1.4 and RStudio (mada package). Hierarchical summary receiver operating characteristic curves were generated.

Subgroup analyses were conducted by assessment method, reference standard type, and country income level. Correlation coefficients were not pooled, given that in many studies the type of coefficient (ie, Spearman or Pearson) was not indicated, and furthermore, methods for pooling Spearman correlation coefficients have not been well described.38

Results

Neonatal Clinical Assessments

We identified 3862 titles, and 66 articles were included, some

TABLE 1  Continued
Clinical Scoring System or Name | No. of Criteria | Physical Criteria | Neuromuscular Criteria | Other Criteria | Reference Standard | Original Reported Accuracy or Correlation With GA | Study Setting and Location | Sample Size | Year
Bindusha et al18 (from Bhagwat et al33) |  | Skin texture, breast size, ear firmness, genitalia |  |  | LMP | Mean difference: −0.58 wk; r = 0.91 | Medical College; Thiruvananthapuram, Kerala, India | 1000; GA 28–37 wk | 2014
LOA, limits of agreement; NS, not stated; —, not applicable.
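The pooled-variance formula given in the Statistical Analysis section can be sketched in a few lines of Python. This is an illustration only, not the authors' Stata code: here n_i is taken to be the sample size and S_i the SD of the GA difference in study i, the input values are hypothetical, and the 95% range assumes roughly normal differences with a 1.96 multiplier.

```python
import math


def pooled_sd(sample_sizes, sds):
    """Pooled SD: sqrt( sum((n_i - 1) * S_i^2) / sum(n_i - 1) )."""
    if len(sample_sizes) != len(sds) or not sample_sizes:
        raise ValueError("need one SD per study and at least one study")
    numerator = sum((n - 1) * s ** 2 for n, s in zip(sample_sizes, sds))
    denominator = sum(n - 1 for n in sample_sizes)
    return math.sqrt(numerator / denominator)


def range_95(mean_difference, sd, z=1.96):
    """Interval expected to contain about 95% of individual differences
    (assumes approximately normal differences)."""
    half_width = z * sd
    return (mean_difference - half_width, mean_difference + half_width)


# Hypothetical per-study sample sizes and SDs of the GA difference (weeks);
# these are NOT values taken from the review.
study_n = [100, 250, 400]
study_sd = [1.0, 1.2, 1.5]

sd = pooled_sd(study_n, study_sd)
low, high = range_95(0.0, sd)
print(f"pooled SD = {sd:.2f} wk; 95% of differences within {low:.1f} to {high:.1f} wk")
```

Larger studies dominate the pooled SD because each study is weighted by n_i − 1, which is why a single large study with a wide spread can widen the reported ± range.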

reporting on more than one scoring system (22 articles reported on the Dubowitz score, 31 on the Original and/or New Ballard score, and 25 on other clinical scores) (Fig 1). Basic study characteristics of all included studies are in Supplemental Table 10. The studies were published between 1968 and 2016, with fewer than half from LMIC. Most studies (n = 62) were conducted in health facilities, with 19 conducted in NICUs on preterm and/or low birth weight (LBW) populations. For the reference standard, there were 31 studies in which researchers had ultrasound-based dating, 42 in which they used LMP, and 3 in which researchers used dating based on another neonatal assessment.

The overall QUADAS–2 summary is in Supplemental Fig 6. In general, the quality of the studies was relatively low. In over half of the studies, there was a high risk of bias related to patient selection, test method, or reference standard.

FIGURE 1
Neonatal clinical assessment: flow diagram. Diagram of the screening process to identify studies for inclusion in neonatal assessment review; adapted from the PRISMA (Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009 Jul 21;6(7):e1000097). *Note: Several papers reported on >1 score.

Neonatal Clinical Assessments or Scores

We identified 18 different neonatal assessments or scoring systems (combining >1 individual clinical sign) for GA determination (Table 1). Twelve were developed in high-income countries (HICs) and 7 in LMIC (4 in Africa, 2 in Asia, 1 in Turkey). The reference standard from which the scores were derived was ultrasound/BOE in only 2 studies. The most complex score, Amiel-Tison,21 has 23 criteria, including a large number of neurologic signs. The simplest score, the Parkin,16 includes only 4 external physical criteria. One simplified score was developed in Nigeria (Eregie17) and includes physical anthropometrics (head circumference and midarm circumference).

Individual External Physical Criteria and Signs

Table 2 shows 12 studies in which researchers reported the correlation of individual external physical criteria with GA. Correlation coefficients were generally higher for comparisons with an LMP reference, for which median correlation coefficients ranged from 0.60 to 0.75 for most signs. Three studies used an ultrasound or BOE GA reference, and lower correlations were reported in 2 of these studies, neither of which included early preterm infants.21,40 The physical characteristics with the highest median correlation were breast size, plantar skin creases, ear firmness, and skin texture.

Individual Neuromuscular Signs

In 10 studies, researchers reported the correlation of individual neuromuscular criteria with GA (Table 2). The median correlation

TABLE 2 Correlation of Individual Physical or Neuromuscular Criteria With GA
Criterion | Amiel-Tison et al21 | Lee et al40 | Ballard et al (New Ballard)20 | Parkin et al16 | Dubowitz and Farr (Nicolopoulos et al23) | Raghu et al41 | Feresu et al22 | Dubowitz and Farr (Sunjoh et al42) | Finnström24 | Ballard et al19 | Tunçer et al26 | Narayanan et al30 | Summary Across All Studies, Median (Minimum, Maximum)
N (sample size) | 397 | 710 | 530 | 392 | 710 | 160 | 364 | 358 | 174 | 252 | 220 | 356 |
Study Setting and Location | Tertiary Hospital; Paris, France | Community; Sylhet, Bangladesh | NICUs and nurseries; Cincinnati, Ohio | University Hospital; Newcastle, England | Maternity Hospital; Athens, Greece | University hospital; Lusaka, Zambia | Maternity unit; Harare, Zimbabwe | Tertiary Hospital; Yaounde, Cameroon | University hospital; Umea, Sweden | NICU and nursery; Cincinnati, Ohio | NICU; Ankara, Turkey | Children’s Hospital; New Delhi, India | —
GA range included | 37–41 wk | 34–42 wk | 20–44 wk | 25.2–45.2 wk | 28–44 wk | NS | 24–45 wk | 25–44 wk | 32.1–34 wk | 26–44 wk, 760–5460 g | 27–41 wk | 26–44.4 wk | —
Reference standard | BOE | Ultrasound | BOE | LMP | LMP | LMP | LMP | LMP | LMP | LMP | LMP | LMP | —
Physical criteria
Skin color | 0.19 | 0.05 | — | 0.78 | 0.76 | 0.52 | 0.45 | 0.8 | 0.48 | 0.84 | — | 0.74 | 0.63 (0.05, 0.84)
Ear form | 0.11 | 0.02 | 0.73 | — | 0.76 | 0.64 | 0.57 | 0.72 | 0.41 | 0.84 | 0.62 | — | 0.63 (0.02, 0.84)
Ear firmness | 0.18 | 0.03 | — | 0.78 | 0.76 | 0.65 | 0.53 | 0.72 | — | — | — | 0.85 | 0.69 (0.03, 0.85)
Plantar skin creases | 0.34 | 0.02 | 0.72 | 0.76 | 0.77 | 0.56 | 0.64 | 0.76 | 0.65 | 0.79 | 0.64 | 0.87 | 0.69 (0.02, 0.87)
Breast size | 0.25 | — | 0.8 | 0.75 | 0.76 | 0.66 | 0.57 | 0.76 | 0.62 | 0.89 | 0.66 | 0.81 | 0.75 (0.25, 0.89)
Nipple formation | 0.19 | 0.14 | — | — | 0.72 | 0.62 | 0.55 | 0.75 | 0.68 | — | — | — | 0.62 (0.14, 0.75)
Skin texture | 0.28 | 0.14 | 0.75 | 0.72 | 0.77 | 0.59 | 0.57 | 0.8 | — | — | 0.65 | 0.77 | 0.69 (0.14, 0.80)
Genitalia | 0.17 | 0.02 | 0.82 | 0.66 | 0.65 | 0.36 | 0.62 | 0.63 | 0.43 | 0.67 | — | — | 0.63 (0.02, 0.82)
Lanugo hair | 0.2 | −0.01 | 0.81 | 0.62 | 0.73 | 0.55 | 0.49 | 0.71 | — | 0.77 | — | — | 0.62 (−0.01, 0.81)
Edema | 0.16 | — | — | 0.59 | 0.64 | 0.67 | 0.22 | 0.41 | — | — | — | — | 0.50 (0.16, 0.67)
Skin opacity | 0.09 | 0.02 | — | — | 0.72 | 0.22 | 0.35 | 0.7 | 0.48 | — | — | — | 0.35 (0.02, 0.72)
Nail texture | — | — | — | 0.57 | — | — | — | — | — | — | — | — | 0.57 (0.57, 0.57)
Nail length | — | — | — | 0.51 | — | — | — | — | — | — | — | — | 0.51 (0.51, 0.51)
Facial appearance | — | — | — | — | — | — | — | — | — | — | 0.77 | — | 0.77 (0.77, 0.77)
Skull hardness | 0.15 | — | — | — | — | — | — | — | — | — | — | — | 0.15 (0.15, 0.15)
Neuromuscular criteria
Posture | — | 0.12 | 0.82 | 0.75 | 0.72 | 0.31 | 0.65 | 0.76 | — | 0.69 | 0.48 | — | 0.69 (0.12, 0.82)
Square window | — | — | 0.79 | 0.21 | 0.73 | 0.58 | 0.64 | 0.69 | — | 0.7 | — |  | 0.69 (0.21, 0.79)
Scarf sign | 0.23 | 0.08 | 0.82 | 0.67 | 0.72 | 0.51 | 0.63 | 0.72 | — | 0.71 | 0.41 | — | 0.65 (0.08, 0.82)
Popliteal angle | 0.23 | 0.05 | 0.74 | 0.48 | 0.76 | 0.39 | 0.63 | 0.7 | — | 0.77 | — | — | 0.63 (0.05, 0.77)
Arm recoil | 0.19 | 0.07 | 0.71 | 0.62 | 0.65 | 0.29 | 0.55 | 0.56 | — | 0.61 | 0.36 | — | 0.56 (0.07, 0.71)
Heel to ear | — | 0.04 | 0.81 | 0.51 | 0.76 | 0.5 | 0.59 | 0.66 | — | 0.72 | — | — | 0.63 (0.04, 0.81)
Leg recoil | — | — | — | 0.59 | 0.55 | 0.3 | 0.47 | 0.52 | — | — | — | — | 0.52 (0.30, 0.59)
Ventral suspension | — | — | — | 0.59 | 0.72 | 0.42 | 0.7 | 0.71 | — | — | — | — | 0.70 (0.42, 0.72)
Head lag | — | — | — | 0.47 | 0.71 | 0.36 | 0.59 | 0.65 | — | — | — | — | 0.59 (0.36, 0.71)
Ankle dorsiflexion | 0.21 | — | — | 0.37 | 0.74 | 0.47 | 0.59 | 0.66 | — | — | — | — | 0.53 (0.21, 0.74)
Nonnutritive sucking reflex | 0.24 | — | — | — | — | — | — | — | — | — | — | — | 0.24 (0.24, 0.24)
Crossed extension | 0.16 | — | — | — | — | — | — | — | — | — | — | — | 0.16 (0.16, 0.16)
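The summary column of Table 2 reports a median with the minimum and maximum across the studies that assessed each sign. A minimal Python sketch of that summary follows, using the breast size and skin color rows as input; treating unreported cells as missing and rounding the median to 2 decimals are assumptions made for this illustration, since the source does not describe how the column was computed.

```python
from statistics import median


def summarize(correlations):
    """Return (median, minimum, maximum) over reported correlations.

    Unreported cells (shown as a dash in the table) are passed as None
    and skipped.
    """
    reported = [r for r in correlations if r is not None]
    if not reported:
        raise ValueError("no study reported this sign")
    return (round(median(reported), 2), min(reported), max(reported))


# Values copied from the breast size and skin color rows of Table 2,
# in the table's column order (None marks a dash).
breast_size = [0.25, None, 0.8, 0.75, 0.76, 0.66, 0.57, 0.76, 0.62, 0.89, 0.66, 0.81]
skin_color = [0.19, 0.05, None, 0.78, 0.76, 0.52, 0.45, 0.8, 0.48, 0.84, None, 0.74]

print("Breast size:", summarize(breast_size))
print("Skin color:", summarize(skin_color))
```

Because the median ignores how far the ultrasound- and BOE-referenced studies fall below the rest, a sign can keep a high median even when its minimum is close to zero, which is the pattern seen across most rows.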
coefficients ranged from 0.52 to 0.70 in the studies using an LMP reference standard GA. Of the 3 studies that used an ultrasound-based reference standard GA, correlation coefficients were again lower in the same 2 studies as they were for physical criteria.21,40 The signs with the highest median correlation coefficients were ventral suspension, square window, and posture.

Validity of Neonatal Clinical Scores of GA

Studies in which researchers reported on the validity or agreement of neonatal assessments with a reference standard are shown in Table 3 (Dubowitz), Table 4 (Ballard), and Supplemental Table 12 (other assessments).

Dubowitz Score

There were 26 studies in which researchers validated the Dubowitz score (11 ultrasound/BOE; 19 LMP reference). Ten studies were from LMIC. In most studies, the neonatal assessment was performed by physicians or nurses.

Ultrasound or BOE Reference Standard

In 2 studies, researchers reported the correlation of GA dating by Dubowitz score and BOE (r = 0.73 and 0.90, respectively). In 7 studies, researchers reported a mean difference in GA between Dubowitz and ultrasound-based dating, ranging from −2.2 weeks (underestimation) to +0.7 weeks (overestimation). The pooled mean difference was not statistically different from the null hypothesis (ie, difference = 0), indicating no evidence of overall systematic bias (Table 5, Supplemental Fig 7). The precision of the estimate is reflected in the SD of the mean difference, which, at the individual study level, ranged from 0.52 to 1.94 weeks. The pooled SD across the studies was 1.3 weeks, indicating that 95% of the differences in GA (Dubowitz score–ultrasound dating) fell within ±2.6 weeks (n = 7 studies). In the studies in which researchers reported on the percent agreement within weeks (n = 3), the Dubowitz GA fell within 1 week of ultrasound dates in 53% of infants (pooled estimate, 95% CI: 47% to 71%), and within 2 weeks in 59% of newborns (pooled estimate, 95% CI: 41% to 74%). Researchers in 1 study reported on the diagnostic accuracy of the Dubowitz score to identify preterm infants compared to ultrasound-based dating (sensitivity 61%, specificity 99%).50 Among studies done in LMIC, there was no significant bias compared with ultrasound dating, and the precision of GA dating by the Dubowitz score was similar to HICs (Supplemental Table 11).

In 4 studies, there was evidence of greater bias of Dubowitz scoring among preterm infants (Supplemental Table 12). In 4 studies, researchers reported that the Dubowitz score systematically overestimated GA in preterm infants by up to 2.6 weeks48–50 and more so among early preterm infants.46,48–50

LMP Reference Standard

The correlation of GA determined by Dubowitz scoring and LMP GA was reported in 14 studies and was generally high, ranging from 0.41 to 0.94 (median = 0.89). The pooled mean difference was 0.65 weeks (n = 6, 95% CI: 0.01 to 1.30), indicating a systematic overestimation compared with LMP-based GA (Table 5, Supplemental Fig 7). Ninety-five percent of the differences fell within ±2.9 weeks of the mean. The GA determined by Dubowitz assessment fell within 1 week of LMP dates in 59% of newborns (n = 4, 95% CI: 41% to 74%) and within 2 weeks in 87% (n = 6, 95% CI: 71% to 95%). Researchers in 1 study reported on the diagnostic accuracy of the Dubowitz score to identify preterm infants (sensitivity 81.5%, specificity 98.6%).41 Among LMIC studies (n = 2), there was a

TABLE 2  Continued
Additional neuromuscular criteria reported only by Amiel-Tison et al21 (each r between 0.03 and 0.15): righting reaction, raise to sit, back to lying, finger grasp and response to traction, vision (fix and track).
NS, not stated; —, not applicable.
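The pooled "percent within 1 week" and "percent within 2 weeks" estimates quoted above come from logit-transformed percentages combined in a random effects model, with Higgins I² as the heterogeneity measure. The sketch below shows one common way to do this in Python. It assumes the DerSimonian-Laird estimator of between-study variance, which the source does not name, and the study counts are hypothetical, so the output is not expected to reproduce any pooled figure in the review.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(events, totals):
    """Random-effects pooling of proportions on the logit scale.

    Assumption: DerSimonian-Laird between-study variance.
    """
    if len(events) != len(totals) or len(events) < 2:
        raise ValueError("need matching lists with at least 2 studies")
    effects, variances = [], []
    for e, n in zip(events, totals):
        if not 0 < e < n:
            raise ValueError("each study needs 0 < events < total")
        effects.append(logit(e / n))
        variances.append(1.0 / e + 1.0 / (n - e))  # approximate variance of a logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights, pooled logit and its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Higgins I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": inv_logit(pooled),
        "ci_low": inv_logit(pooled - 1.96 * se),
        "ci_high": inv_logit(pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical studies: infants dated within 1 week of the reference / infants assessed.
result = pool_proportions(events=[60, 45, 150], totals=[100, 90, 250])
print({name: round(value, 3) for name, value in result.items()})
```

Working on the logit scale keeps the back-transformed confidence limits inside 0% to 100%, which is why pooled intervals such as "71% to 95%" are asymmetric around the point estimate.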

TABLE 3 Agreement and Validity of the Dubowitz Score
Author Year Study Setting and Location GA of Cohort Sample Assessment Agreement Validity
Size Version Correlation Mean SD of the Bland- Percent Percent Sensitivity Specificity <37 <37
(Total, Coefficient Difference Mean Altman Within 1 Within 2 Preterm Preterm wk wk
Physical or (R) With (wk) Difference 95% LOA wk wk <37 wk <37 wk PPV NPV
External, Reference (wk) ±1.96 SD (%) (95% (%) (95%
Neurologic) GA (LL, UL) CI) CI)
(wk)
Ultrasound
  HICs
  Allan et al‍29 2009 Tertiary Hospitals; 29.6–41.7 wk 98 Total — 0.10 1.10 (−2.3, 2.0) — — — — — —
Northern Territory,
Australia
  Roberts 1979 University Hospital; Cardiff, NS 118 Total — — — — 68.6 89.8 — — — —

PEDIATRICS Volume 140, number 6, December 2017


et al‍43 Wales
  Vik et al‍44 1997 Trondheim and Bergen, All GAs 970 Total — −0.20 1.12 (−2.3, 2.1) — — — — — —
Norway
  Awoust and 1982 University Hospital; NS 130 Total — 0.50 1.04 — — — — — — —
Levi‍45 Brussels, Belgium
  Sanders 1991 NICU; Baltimore, MD <1500 g, >20 110 Total 0.73 3.00 — — 18.2 39.1 — — — —
et al‍46 wk
  Wariyar 1997 Newcastle, UK 32–42 wk 347 Total — 0.71 1.17 (−1.57, 3.0) — — — — — —
et al‍47 <30 wk 105 Total — 2.86 2.48 (−2.0, 7.71)
  Robillard 1992 NICU; Guadalupe, French <2500 g 384 Total — 0.64 1.94 — 61.0 82.0 — — — —
et al‍48 West Indies
  Shukla 1987 University hospitals; New Preterm <38 25 Total 0.90 — — — — 48.0 — — — —
et al‍49 York, NY wk, AGA
  LMIC
  Moore 2015 Refugee clinics; Thai- All GAs 250 Total — 2.57a 1.04a (0.49, — — 61 99 — —
et al‍50 Myanmar border 4.65)a
  Rosenberg 2009 Special Care Nursery; ≤33 wk 355 Total — 0.56 0.52 (−1.57, — — — — — —
et al‍51 Dhaka, Bangladesh 0.47)
  Karunasekera 2002 Teaching Hospital; Ragama, 35–42 wk 200 Total — −2.18 1.43 — — — — — — —
et al‍52 Sri Lanka External — −0.45 2.39 — — — — — — —
LMP

Downloaded from www.aappublications.org/news by guest on July 23, 2019


  HICs
  Ballard et 1979 NICU and nursery; NS 224 Total 0.85 — — — — — — — — —
al‍19 Cincinnati, OH
  Capurro 1978 Tertiary care center; NS 115 Total 0.91 — — — — — — — — —
et al‍15 Montevideo, Uruguay
  Mitchell‍53 1979 Newborn nursery,​; London, NS 20 Total 0.41 — — — — — — — — —
England
  Nicolopoulos 1976 Maternity Hospital and 28–44 wk 710 Total 0.91 — — — — — — — — —
et al‍23 clinics; Athens, Greece External 0.88 — — — — — — — — —
Corrected 0.85 — — — — — — — — —
neurologic
  Roberts 1979 University Hospital; Cardiff, NS 118 Total — — — — 67.8 79.6 — — — —
et al‍43 Wales

9
10
TABLE 3  Continued
Author Year Study Setting and Location GA of Cohort Sample Assessment Agreement Validity
Size Version Correlation Mean SD of the Bland- Percent Percent Sensitivity Specificity <37 <37
(Total, Coefficient Difference Mean Altman Within 1 Within 2 Preterm Preterm wk wk
Physical or (R) With (wk) Difference 95% LOA wk wk <37 wk <37 wk PPV NPV
External, Reference (wk) ±1.96 SD (%) (95% (%) (95%
Neurologic) GA (LL, UL) CI) CI)
(wk)
  Vogt et al‍54 1981 Tertiary care center; All GAs 242 Total — — — — — 90b — — — —
Norway
  Vik et al‍44 1997 Trondheim and Bergen, All GAs 970 Total — −0.40 1.43 (−3.2, 2.4) — — — — — —
Norway
  Latis et al‍55 1981 Neonatal unit; Milano, Italy 27–42 wk 92 Total — 0.44 1.62 — — 80.7 — — — —
  Dubowitz 1970 Newborn and Special Care All GAs 167 Total 0.93 — — — — 95.0 — — — —
et al‍14 Nurseries; Sheffield, External 0.91 — — — — — — — — —
England Neurologic 0.89 — — — — — — — — —
  Allan et al‍29 2009 Tertiary Hospitals; 29.6–41.7 wk 56 Total — 0.30 0.92 (−1.5, 2.1) — — — — — —
Northern Territory,
Australia
  Hertz 1978 General Hospital; Cleveland, All GAs 126 Total 0.86 — — — — — — — — —
et al‍56 OH
  Sanders 1991 NICU; Baltimore, MD <1500 g, >20 110 Total 0.68 2.80 2.1 — 23.6 46.3 — — — —
et al‍46 wk
  LMIC
  Feresu 2002 Maternity unit; Harare, All GAs 364 Total 0.81 — — — — — — — — —
et al‍22 Zimbabwe External 0.77 — — — — — — — — —
Neurologic 0.79 — — — — — — — — —
  Sunjoh 2004 Tertiary Hospitals; 25–44 wk 358 Total 0.94 0.50 1.31 — — 93.0 — — — —
et al‍42 Yaounde, Cameroon
  Tunçer 1981 University Hospital; Ankara, 27–41 wk 120 Total 0.88 — — — — — — — — —
et al‍26 Turkey
  Cevit et al‍57 1998 Tertiary care center; Sivas, 28–38 wk; 91 Total 0.85 0.30 — — 60.4 98.9 — — — —
Turkey <2500 g
  Jaroszewicz 1973 Tertiary Hospital; Cape NS 100 Total 0.9 — — — — — — — — —
and Boyd‍58 Town, South Africa
  Dawodu 1977 Maternity units; Ibadan, 29–43 wk 100 Total 0.90 0.38 1.41 (−2.39, 74.0 94.0 81.5 98.6 95.7 93.5
et al59 Nigeria 3.15)
  Raghu 1981 Premature unit, University NS 160 Total 0.90 — — — — — — — — —
et al‍41 Hospital; Lusaka, Zambia External 0.82
Neurologic 0.80
LL, lower limit; LOA, limits of agreement; NPV, negative predictive value; NS, not stated; PPV, positive predictive value; UL, upper limit; —, indicates that the data were not available for that paper.
a For a 34-wk newborn with a weight-for-age z score of 0. There was evidence of a significant trend across GA; mean bias decreased by 0.35 wk per week increase in newborn GA.
b Percent within ±3 wk of LMP (reference) GA.
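The Bland-Altman 95% limits of agreement reported in this table follow from the mean difference and its SD (mean ± 1.96 SD). A minimal sketch, assuming approximately normally distributed differences; the function name is illustrative and not taken from the source, and the inputs are the Vik et al row of Table 3:

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return the (lower, upper) Bland-Altman 95% limits of agreement, in weeks."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Vik et al (Table 3): mean difference -0.40 wk, SD of the difference 1.43 wk
lower, upper = limits_of_agreement(-0.40, 1.43)
print(f"LOA: ({lower:.1f}, {upper:.1f}) wk")  # prints LOA: (-3.2, 2.4) wk
```

The result matches the limits listed for that row (−3.2, 2.4). The same arithmetic underlies the pooled statements in the text, for example a pooled SD of 1.9 weeks corresponding to roughly ±3.8 weeks.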

TABLE 4 Agreement and Validity of the Ballard Score
Study Year Study Setting GA of Sample Assessment Agreement Validity
and Location Cohort Size Version Correlation Mean SD of Mean Bland- Percent Percent Sensitivity Specificity <37 <37 wk
(Original or Coefficient Difference Difference Altman Within 1 Within 2 Preterm Preterm wk NPV
New Ballard ) (R) with (wk) (wk) 95% LOA wk wk <37 wk (%) <37 wk (%) PPV
Reference ±1.96 SD (95% CI) (95% CI)
GA (LL, UL)
(wk)
Ultrasound
  HICs
  Scher and 1987 NICU; 23–30 wk 24 Original — 1.35 2.62 (−3.79, 56.5 69.6 — — — —
Barmada‍60 Pittsburgh, by LMPa Ballard 6.49)
PA
  Alexander 1992 University 28–44 4193 Ballard 0.79 — — — — — 72.2 97.1 83.2 94.6

PEDIATRICS Volume 140, number 6, December 2017


et al‍61 Hospital; wk by
Charleston, Ballard
SC
  Sanders 1991 NICU; <1500 g; 110 Ballard 0.69 2.70 — — 22.7 45.4 — — — —
et al‍46 Baltimore, <37 wk
MD
  Smith et al‍62 1999 University <2500 g; 85% 82 Ballard 0.86 1.40 1.15 — — 85 — — — —
Hospital; preterm
Houston, TX
  Dombrowski 1992 Women's 24–43 wk 38 318 Ballard — — — — — 85.4 — — — —
et al‍63 Hospital;
Detroit, MI
  Gagliardi 1992 NICUs, tertiary <37 wk; 227 Ballard — −0.21 1.76 — 20.5 40.4 — — — —
et al‍64 hospitals; <2500 g
Milano,
Italy
  Wariyar et al‍47 1997 Newcastle, UK 32–42 wk 347 Ballard — 0.57 1.31 (−2.0, 3.14) — — — — — —
<30 wk 105 Ballard — 3.43 1.97 (−0.43,
7.29)
<30 wk 105 New Ballard — 1.57 1.75 (−1.86, 5.0)
  Ballard et al‍20 1991 NICUs and All GAs; 530 New Ballard 0.97 0.15 1.46 — — — — — — —



nurseries; 20−44
Cincinnati, wk
OH
  Amato et al‍65 1991 NICU; Bern, All preterm, 38 Ballard — — — — — — — — — —
Switzerland LBW (physical)
  LMIC
  Karl et al‍66 2015 Health 25.5–43.7 623 Ballard 0.35 0.86 2.41 (−3.86, — — 39.0 92.0 21.0 97.0
facilities; wk; 5.57)
Madang, 900–
Papua New 4250 g
Guinea
668 (External) 0.33 — — — — — 58.0 81.0 14.0 97.0

668 (Neurologic) 0.39 — — (−3.57, — — 23.0 93.0 14.0 96.0
6.57)
  Rosenberg 2009 Special Care Preterm, 355 Ballard — −0.41 1.08 (−0.7, 1.51) — — — — — —
et al‍51 Nursery; all <33
Dhaka, wk
Bangladesh
  Lee et al‍40 2016 Community; 33–45 wk 710 Ballard 0.12 −0.40 2.22 (−4.7, 4.0) 32.0 64 15.0 87.0 9.0 92.0
Sylhet,
Bangladesh
  Moraes and 2000 Maternity NS 116 New Ballard — — — — — — 57.0 (41.0 97.0 (90.0 — —
   Reichenheim‍67 unit; Rio to 73.0) to 99.0)
(translated) de Janeiro,
Brazil
  Sreekumar 2013 NICU and 24–41.2 wk 284 New Ballard — −0.04 — — — — — — — —
et al‍68 postnatal
wards;
Bengaluru,
India
  Wylie et al‍69 2013 Ndirande All GAs 177 New Ballard — 0.80 2.19 (−3.5, 5.1) — — — — — —
Clinic;
Blantyre,
Malawi
  Taylor et al‍70 2010 Community; All GAs 80 Ballard — −2.23 1.56 (−5.3, 0.82) — — — — — —
Keneba, (external)
The Gambia
  Thi et al‍71 2015 General 30–42 wk by 391 New Ballard 0.90 — — — — — — — — —
Hospital; ultrasound
Hoa Binh,



Vietnam
LMP
  HICs
  Baumann et al‍72 1993 University 27–35 wk 60 Ballard (total) 0.91 — — — — — — — — —
(translated) Hospital; AGA
Bern,
Switzerland
28–36 wk 29 0.66 — — — — — — — — —
SGA
27–35 wk 60 Ballard 0.83 — — — — — — — — —
AGA (external)

28–36 wk 29 0.66 — — — — — — — — —
SGA
27–35 wk 60 Ballard 0.65 — — — — — — — — —
AGA (Neurologic)
28–36 wk 29 0.66 — — — — — — — — —
SGA



  Constantine 1987 AK, NY, MA, All GAs 1246 Ballard 0.81 0.60 2.18 — — — 85 81 89 75
et al‍73 FL, PA, TX,
WA, CN
(Physical) 0.83 −0.1 2.14 — — — 92 74 87 87
(Neurologic) 0.71 1.4 2.72 — — — 70 84 89 60
  Scher and 1987 NICU; Pittsburgh, 23–30 wk 24 Ballard — 1.42 2.32 (−3.13, 45.8 62.5 — — — —
Barmada‍60 PA by LMP 5.96)
  Mackanjee 1996 NICU; London, 23–33 wk; 47 Ballard 0.87 1.50 1.50 — — — — — — —
et al‍74 Ontario, <1500 g
Canada
  Dombrowski 1992 Women's 24–46 wk 38 818 Ballard — — — — — 69.9 — — — —
et al‍63 Hospital;
Detroit, MI
  Alexander 1990 University 20–45 wk 10 794 Ballard 0.76 0.48 — — 52.7 80.3 — — — —
et al‍75 Hospital;
Charleston,
SC
  Ballard et al‍19 1979 NICU and 26–44 wk, 224 Ballard 0.85 — — — — — — — — —
nursery; 760–
Cincinnati OH 5460 g
  Alexander 1992 University 28–44 wk; 3480 Ballard 0.82 0.53 — — — 68.2 — — — —
et al‍76 Hospital; all African



Charleston, American
SC population
28–44 wk; 2091 Ballard 0.86 0.17 — — — 70.6 — — — —
all white
population
  Ballard et al‍20 1991 NICUs and 20–44 wk 578 New Ballard 0.96 — — — — 88.0 — — — —
nurseries;
Cincinnati,
OH

  Ahn‍77 2008 Neonatal units, All GA, 213 New Ballardb 0.85 0.46c — — — — — — — —
university 773–
hospital; 4870 g
Incheon,
South Korea
  Sanders 1991 NICU; Baltimore, <1500 g; 110 Ballard 0.66 2.60 2.2 — 28.2 51.0 — — — —
et al‍46 MD <37 wk
  LMIC
  Cevit et al‍57 1998 Tertiary care Preterm 91 Ballard — 0.10 — — 59.3 98.9 — — — —
center; Sivas, 28–38 wk;
Turkey <2500 g
  Feresu et al‍22 2002 Maternity unit; 24–45 wk 364 Ballard 0.80 — — — — — — — — —
Harare,
Zimbabwe
(Physical)d 0.75 — — — — — — — — —
(Neurologic)d 0.74 — — — — — — — — —
  Sunjoh et al‍42 2004 Tertiary Hospitals; 25–44 wk 358 New Ballard 0.93 0.34 1.52 — — 86.0 — — — —
Yaounde,
Cameroon
  Bindusha 2014 Tertiary hospital; 28–37 wk 1000 New Ballard 0.92 0.31 — — — — <36 wk: <36 wk: <36 wk: <36 wk:
et al33 Kerala, India 85.6 94.6 98.0 53.6
  Sasidharan 2009 NICU; Northern 29–35 wk 129 New Ballard — — — — — 100.0 — — — —
et al‍78 India
  Moraes and 2000 Maternity NS 140 New Ballard — — — — — — 68.0 (49.0 92.0 (85.0 — —
Reichenheim‍67 unit; Rio de to 82.0) to 96.0)
(translated) Janeiro, Brazil

  Thi et al71 2015 General Hospital; 30–43 wk 282 New Ballard 0.81 — — — — — — — — —
Hoa Binh, by LMP
Vietnam
  Taylor et al‍70 2010 Community; All GAs 76 Ballard — −2.2 3.3 (−8.67, — — — — — —
Keneba, The (external) 4.27)
Gambia
  Verhoeff 1997 Tertiary Hospitals; All GAs; 76 Ballard — 0.87 — — — — — — — —
et al‍79 Chikwawa literate (external)
District, Malawi mothers
LL, lower limit; LOA, limits of agreement; NPV, negative predictive value; NS, not stated; PPV, positive predictive value; UL, upper limit; —, indicates that the data were not available for that paper.
a All infants in this study died; all deaths occurred after the assessments.
b This study used an “Extended New Ballard” scoring system to estimate GA (simply the standard New Ballard score extended to be used to estimate a greater GA range, which was calculated mathematically).
c For infants <39 wk GA. Mean difference = −0.58 wk for infants >39 wk GA.

d This study used a “revised” version of the physical and neurologic portions of the Ballard assessment.
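The PPV and NPV columns depend on the prevalence of preterm birth in each cohort as well as on sensitivity and specificity. A minimal sketch of that relationship follows; the function name is illustrative, and the 8% prevalence is an assumed value chosen for illustration, not a figure reported in the source. Sensitivity and specificity are taken from the Lee et al (2016) row of Table 4.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as fractions, given sensitivity, specificity, prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Lee et al 2016 (Table 4): sensitivity 15%, specificity 87%.
# Prevalence of 8% is a hypothetical input, not taken from the paper.
ppv, npv = predictive_values(0.15, 0.87, 0.08)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under this assumed prevalence the sketch gives a PPV near 9% and an NPV near 92%, in line with the values listed for that row, which suggests that a low PPV can arise from low prevalence even when specificity is moderately high.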
tendency of the Dubowitz score to overestimate GA (0.48 weeks), although the precision of the GA estimates was similar to HIC studies (Supplemental Table 11).

In 2 studies, researchers showed evidence that Dubowitz scoring tended to overestimate GA in early preterm infants (Supplemental Table 12).42,54

TABLE 5 Pooled Data for Agreement and Validity of Neonatal Clinical Assessments
Assessment Type: Dubowitz; Capurro; Ballard; Parkin; Eregie
Reference Standard: Ultrasound or BOE; Ultrasound or BOE; Ultrasound or BOE; Ultrasound or BOE; LMP; LMP; LMP
No. of Identified Studies: 20; 14; 18; 9; 3; 2
Agreement, Mean Difference, N: 3; 1
Agreement, Mean Difference, Pooled Difference (95% CIs): −0.17 (−0.26 to −0.08); 0.02 (−0.51 to 0.55); 0.11 (−0.02 to 0.23); 0.65 (0.01 to 1.30); 0.40 (0.00 to 0.81); 1.25 (0.64 to 1.87)
Agreement, Pooled SD: 1.27; 1.45; 1.90; 2.10; 1.97; 1.96
Agreement, Percent Within 1 wk, N: 0; 0
Agreement, Percent Within 1 wk, Pooled % (95% CIs): 53.4 (46.6–71.3); 58.5 (40.9–74.2); 34.0 (21.8–44.6); 44.6 (24.9–66.2); 40.1 (34.7–45.8)
Agreement, Percent Within 2 wk, N: 0; 2
Agreement, Percent Within 2 wk, Pooled % (95% CIs): 74.8 (44.7–91.6); 87.0 (71.2–94.8); 72.2 (53.8–85.3); 75.8 (70.6–80.5); 93.4 (91.3–95.1); 79.2 (65.3–88.6)
Validity, Sensitivity, N: 3; —
Validity, Sensitivity: 81.5; 61
Validity, Pooled Sensitivity (%) (95% CIs): 64.1 (60.8 to 67.4); 84.1 (81.6 to 86.3); 42.7 (35.6 to 50.0)
Validity, Specificity: 98.6; 99
Validity, Pooled Specificity (%) (95% CIs): 95.1 (94.5 to 95.7); 83.5 (79.5 to 87.0); 96.7 (95.7 to 97.5)
—, not applicable.

Ballard and New Ballard Score

We identified 30 studies in which researchers assessed the validity of the Original Ballard score (n = 20), the New Ballard score (n = 9), or both (n = 1) (Table 4) (17 ultrasound/BOE, 20 LMP reference), with 14 from LMIC. The Original and New Ballard scores assess the same clinical signs, with the New Ballard score20 having additional scoring categories for early preterm infants. Studies in which researchers used the Ballard score (Original or New) were combined for this analysis. Ballard assessments were performed by medically trained health workers (physicians, nurses, or research assistants) in the majority of studies and by community health workers in 2 studies.

Ultrasound or BOE Reference Standard

The correlation coefficients comparing Ballard score GA versus ultrasound or BOE ranged from 0.12 to 0.97 (median = 0.85, n = 7 studies). The mean GA difference ranged from −0.41 weeks (underestimation) to +1.4 weeks (overestimation) in 9 studies. The pooled mean difference was 0.40 weeks (95% CI: 0.00 to 0.81) (Table 5, Supplemental Fig 8), indicating a trend towards overestimation of GA. The pooled SD across the studies was 1.9 weeks, indicating that 95% of the differences in GA by Ballard assessment versus ultrasound dates fell within ±3.8 weeks (n = 9 studies, Table 5) of the mean. For the studies in which researchers reported on agreement in weeks, Ballard score dates fell
within 1 week of ultrasound dates in 34% (n = 3; 95% CI: 22% to 44%) of infants and within 2 weeks in 72% (n = 5, 95% CI: 54% to 85%) of newborns. The Ballard score had a pooled sensitivity (n = 4) of 64% (95% CI: 61% to 67%) and specificity of 95% (95% CI: 95% to 96%) for identifying preterm newborns. Among LMIC studies, the trend of GA overestimation was similar to HIC studies. However, the imprecision of GA estimation was greater in LMIC compared with HIC studies (pooled SD of 2.12 vs 1.49 weeks) (Supplemental Table 11).

In several studies, researchers reported evidence of greater bias in Ballard scoring among smaller babies (Supplemental Table 12). In 3 studies, researchers reported that the Original Ballard systematically overestimated GA by up to 2 to 3 weeks, in particular among preterm infants,46,47,61 and generally, the trend was toward increasing bias in lower GAs. However, in a study in Papua New Guinea, Karl et al66 found the opposite trend. Wariyar et al47 reported that the New Ballard overestimated GA to a lesser degree than the Original Ballard in infants <30 weeks (1.6 vs 3.4 weeks, respectively). Among SGA infants, researchers in 2 studies showed that GA was underestimated by the original Ballard.40,61

LMP Reference Standard

The correlation coefficients of Ballard and LMP GA ranged from 0.66 to 0.96 (median = 0.85; n = 13). The mean difference in GA was reported in 6 studies, ranging from 0.34 to 2.6 weeks (overestimation). The pooled mean difference was 0.70 weeks (95% CI: 0.36 to 1.04), indicating systematic overestimation (Table 5, Supplemental Fig 8). Ninety five percent of mean differences fell within ±4.2 weeks (n = 5 studies) of the mean. Ballard GA fell within 1 week of LMP GA in 45% (n = 3, 95% CI: 25% to 66%) of newborns and within 2 weeks of LMP in 76% (n = 9, 95% CI: 71% to 81%) of newborns. The Ballard score had a pooled sensitivity (n = 2) of 84.1% (95% CI: 81.6% to 86.3%) and specificity of 83.5% (95% CI: 79.5% to 87.0%) for identifying preterm newborns (Fig 2). There were an inadequate number of studies to stratify analysis by LMIC versus HICs.

In 2 studies, researchers demonstrated overestimation of GA among preterm infants by the Original Ballard exam,73,79 but researchers in 1 study used the External Ballard only (Supplemental Table 12).79 In addition, researchers in 2 studies found that the Original Ballard performed differently among SGA infants: Baumann et al72 reported that the correlation of Ballard with GA was lower among SGA infants compared with those appropriate for gestational age. Constantine et al73 showed that for SGA babies, the bias for GA dating was 1 to 1.5 weeks lower than for non-SGA infants.

Other Clinical Assessments

Eighteen studies were identified in which researchers reported on the validity of other clinical methods of GA assessment (ie, Eregie et al,40,42,80 Capurro et al,15,40,81–84 Parkin et al,16,40,47,52,54,68 Bhagwat et al,18,33,40 Tunçer et al,26,57 Finnström,24 Narayanan et al,30 and Robinson32,47). These findings are reported in Supplemental Information 3 and Supplemental Table 13. In general, the majority of these exams were simplified assessments with fewer signs and were found to be less accurate than the Dubowitz or Ballard scores for GA dating (Supplemental Information 3; Supplemental Table 13; Table 5).

Interrater Agreement

In 10 studies, researchers reported upon the interrater agreement of GA estimates (Supplemental Table 14). The κ for the classification of preterm births ranged from 0.73 to 0.93 (good to excellent; n = 3).20,67,85 The GA estimates were also highly correlated (r = 0.71–0.95)20,86 and without significant differences between raters.49,62,64,78

Anterior Vascularity of Lens

The literature searches for examination of the anterior vascular capsule of the lens (AVCL) yielded a total of 344 unique manuscripts (Fig 3), of which 10 met inclusion criteria (Table 6). Three were from LMIC (2 from South Asia, 1 from Africa). The studies were generally of smaller sample size (N = 30–356), and the latest was published in 1993. In general, study quality was poor, with a high risk of bias related to patient selection and reference standard. The overall QUADAS–2 assessment is in Supplemental Fig 9.

Assessments were typically performed at <72 hours of life by physicians in tertiary health facilities, with most studies performed in NICU settings and including only preterm and/or LBW infants. An ultrasound/BOE-based date was available in only 2 studies. Pupil dilation was performed before the assessment in 3 studies.

Correlation of AVCL Grading With GA

Hittner et al87,89 reported that as the infant matures in gestation, the AVCL disappears in stages. In Grade 4 (27–28 weeks), the entire anterior surface of the lens is vascularized, reducing to no vasculature in Grade 0 (>34 weeks). Of note, the reference standard in the original Hittner study87 was the Dubowitz score.

In 2 studies, researchers presented data on the average GA determined by Hittner’s AVCL grading system (Table 6).46,91 The correlation of AVCL grade with GA ranged from −0.84 to −0.96 (median: −0.88, n = 7) for preterm and/or LBW populations. For the 2 studies in

FIGURE 2
Forest plots of the Ballard score sensitivity and specificity for identifying preterm births compared with ultrasound (A, B) and LMP (C, D).

which researchers analyzed all GA populations, correlation was lower (−0.64 to −0.45).24,30 Among SGA preterm newborns, the median correlation coefficient was −0.77 (range: −0.68 to −0.91, n = 3).72,87,89

Other Signs

The results of searches for intermammillary distance, skin impedance, and palmar creases are in Supplemental Information 4.

Discussion

Accurate GA determination is a public health priority to target and reduce preterm birth–related morbidity and mortality in LMIC. The Every Newborn Action Plan has prioritized GA measurement as a high-priority area to improve the epidemiology of preterm birth and SGA.34 In our systematic literature review, we identified 18 different newborn assessments that have been used for GA dating. The most commonly reported and validated scores in the literature were the Dubowitz and Ballard scores. The Dubowitz score dated 95% of newborns within ±2.6 weeks of ultrasound dating. The Ballard score tended to overestimate GA by 0.4 weeks compared with ultrasound and dated 95% of infants within ±3.8 weeks of this mean. Newborn clinical assessments tend to overestimate GA among preterm infants and therefore may misclassify preterm infants as term. They also tended to underestimate GA in growth-restricted babies. Simplified assessments were less accurate. Although researchers in several studies showed promise of the anterior vascularity of the lens to classify GA <34 weeks, few compared AVCL with an ultrasound-based reference standard.

Study quality was a major limitation of the studies identified in the review, with half of studies having high risk of bias. Many of the original validation studies were from the 1970s, when LMP was the gold standard for pregnancy dating and ultrasound was not widely available. Many hospital-based studies were performed in NICUs among LBW babies and thus were prone to selection and measurement biases (eg, lack of blinding). Fewer than half of the studies were in LMIC, and studies in HICs may not be generalizable to LMIC settings because of health worker availability and training, and differences in the prevalence of SGA and preterm birth.

The majority of individual physical and neurologic signs that have been used in different scoring systems had fair to moderate correlation with GA. Skin opacity was the most weakly correlated and is perhaps the most affected by the timing of the assessment after birth. Although neurologic signs may be more affected by neonatal morbidity (birth asphyxia, neonatal infection, maternal medications, etc), the correlation coefficients of most signs were in a similar range to the physical criteria. In 2 studies21,40 in which researchers excluded early to moderate preterm infants, the correlation of clinical signs with GA was lower, suggesting that the criteria may be more discriminating at lower GAs.

A critical consideration in LMIC is the validity of neonatal assessments in populations with high rates of SGA. Distinguishing whether a small baby is preterm, SGA, or both is a challenge in these settings. Most neonatal assessments were designed to measure infant maturity as opposed to gestational length. SGA infants may act less mature during a neonatal clinical assessment. Three studies have revealed that among SGA infants, neonatal clinical exams

tend to systematically underestimate GA.40,61,73 Improving the validity of the neonatal assessment in growth-restricted populations is a critical research need in LMIC.30,87,92

FIGURE 3
AVCL: flow diagram. Diagram of the screening process to identify studies for inclusion in AVCL review; adapted from the PRISMA (Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009 Jul 21;6(7):e1000097).

The disappearance of the AVCL, or pupillary membrane, was found to correlate well with GA, although overall study quality was poor, with few studies with ultrasound-based references. AVCL may show promise in LMIC with high rates of fetal growth restriction because the grading correlated relatively well with GA, even among growth-restricted or SGA infants.87 An important consideration is that the AVCL completely disappears after ∼34 weeks’ GA; thus, it may not help with GA dating >34 weeks. Furthermore, the AVCL exam requires specialized skills with an ophthalmoscope, which may limit the feasibility and scalability in LMIC.

Several factors should be considered in interpreting and generalizing the validity of neonatal GA assessments in different settings. Imprecision of the Ballard score was greater in LMIC studies compared with HIC studies (HICs: ±3.0 weeks; LMIC: ±4.2 weeks). The validity of a clinical assessment may vary with the level of medical training of the assessor.40,70

Most of the LMIC studies used physicians, nurses, or midwives, and there were few studies with frontline health workers. The validity of the newborn assessment has primarily been studied in the facility and/or hospital-based setting, and the few studies in home-based settings had poorer performance.40,70 Certain factors may improve the validity in the hospital setting, including the timing of assessment sooner after birth, being in a more controlled environment, and lighting. The development of some characteristics may vary by ethnicity. For example, plantar creases progress differently in African American populations93 and skin color may vary. Morbidities, such as gestational diabetes, are more common in specific populations94 and may affect the maturity assessment. Finally, the performance may also be affected by the GA ranges in which it is tested. The performance and validity of the assessments may vary in a general population with a larger representation of late preterm and near-term infants compared with a NICU.

Feasibility and scalability are critical factors to consider in LMIC. As shown in this review, there is a positive correlation between the number of parameters and accuracy of a GA assessment. Yet there is likely to be a negative correlation between the number of parameters (especially neurologic) and the feasibility of use. While the Dubowitz score had the best accuracy, the assessment is complex, may take 15 to 20 minutes to complete, and includes more difficult-to-train neurologic criteria. In South Asia and sub-Saharan Africa, approximately half of births occur outside of hospital facilities, and community-based health workers or traditional birth attendants may be the first point of contact for

TABLE 6 Correlation of AVCL With GA
Author Year Study Setting and Location Population Sample Size (N) Reference Standard Time of Assessment After Birth Correlation Coefficient (R) With Reference GA GA, A) Range or B) Mean (SD) (n): Grade 0a Grade 1 Grade 2 Grade 3 Grade 4
Finnström24 1972 University Hospital; Umea, Sweden All GAs 174 LMP From birth up to 60 h 0.45b — — — — —
Hittner et al87 1977 Tertiary Hospital; Houston, TX Preterm (27–34 wk) 100 LMP and Dubowitz Within 30 h −0.88 — — — — —
Subpopulation: preterm SGA 12 LMP and Dubowitz Within 30 h −0.91 — — — — —
Guillory et al88 1980 Tertiary Hospital; Houston, TX Preterm 43 LMP and Dubowitz “Soon after birth” −0.88 — — — — —
24 h after birth −0.86 — — — — —
Hittner et al89 1981 Tertiary Hospital; Houston, TX “Preterm SGA” 33 Dubowitz Within 24 h −0.77 — A) >33 wk (n = 24c) A) 31–34 wk (n = 7) A) 33 wk (n = 1) A) 28 wk (n = 1)
Krishnamohan et al90 1982 NICUs; Farmington and Hartford, CT Preterm (28–32 wk) 30 Ballard, within 2 wk of LMP Within 24 h −0.94 — — — — —
Narayanan et al30 1982 Children’s Hospital; New Delhi, India All newborns; all GAs 356 LMP, or OB estimate if available Within 48 h −0.64 — — — — —
Subpopulation: <35 wk GA 184 Same as above Within 48 h −0.96 — — — — —
Sasivimolkul et al91 1986 University Hospital; Bangkok, Thailand LBW 80 Ballard and LMP 24–48 h −0.839 B) 36.3 (1.86) (n = 43) B) 34.0 (2.1) (n = 13) B) 32.4 (1.4) (n = 12) B) 29.9 (0.4) (n = 7) B) 27.8 (0.8) (n = 5)
Subpopulation: LBW ≥34 wk 40 Ballard and LMP 24–48 h −0.88 — — — — —
Skapinker and Rothberg92 1987 University Hospital; Johannesburg, South Africa Preterm (<35 wk) 58 Ballard Within 36 h −0.84 — — — — —
Sanders et al46 1991 NICU; Baltimore, MD Preterm and birth weight <1500 g 89 BOE (ultrasound was available for 92% of women) Within 72 h — B) 32.4 B) 30.4 B) 29.8 B) 28.7 B) 26.7
Baumann et al72 (translated) 1993 University Hospital; Bern, Switzerland <34 wk AGA 60 Ultrasound NS −0.92 ± 0.04 (CI: 0.81–0.97) — — — — —
<34 wk SGA 29 Ultrasound NS −0.68 ± 0.09 (CI: 0.49–0.82) — — — — —
OB, obstetric; NS, not stated; —, indicates that the data were not available for that paper.
a The AVCL grading system is as described in Hittner et al.87
b Finnström24 used the Harnack and Oster (1958) grading system, a classification system with grades 1–3, in which 1 equals most vascularity and 3 equals no vascularity. Therefore, the correlation between disappearance of the AVCL and increasing GA is noted as positive under this classification system but would be negative by the Hittner grading system.
c N = 24 for both Grades 1 and 0 combined; the GA range stated (≥ 33 wk) comprises infants that scored either a 1 or 0.
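Footnote b turns on a general property of the correlation coefficient: reversing the direction of an ordinal scale flips the sign of r but leaves its magnitude unchanged. A minimal sketch with hypothetical data (the GA values and grades below are invented for illustration and are not from any included study; the reversed scale is a simple 4 minus grade mapping, not the actual Harnack and Oster system):

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical cohort: GA in weeks and a Hittner-style grade (4 = most vascular).
ga_weeks = [27, 28, 29, 30, 31, 32, 33, 34, 35, 36]
hittner_grade = [4, 4, 3, 3, 2, 2, 1, 1, 0, 0]
# Same observations on a reversed scale (higher = less vascular).
reversed_grade = [4 - g for g in hittner_grade]

r_hittner = pearson_r(ga_weeks, hittner_grade)
r_reversed = pearson_r(ga_weeks, reversed_grade)
print(f"Hittner-style: {r_hittner:.2f}, reversed scale: {r_reversed:.2f}")
```

The two coefficients have equal magnitude and opposite sign, which is why the positive value reported for Finnström is comparable in direction to the negative values in the other rows.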

newborns. These health workers may not have the medical training or the time required to perform the assessment. The duration of the assessment as well as the feasibility of training, standardization, and quality control are critical considerations for scalability in LMIC.

Finally, when evaluating methods of GA assessment, the clinical, research, and programmatic objectives should be weighed. For the clinician, the primary objective is to identify preterm infants requiring special care, and individual-level misclassification may result in missed intervention opportunities. A measurement tool with high sensitivity is desired to identify all preterm infants, perhaps at the expense of specificity. A very simple tool based on a single parameter (such as foot size or another anthropometric parameter) may be suitable to meet these needs. On the other hand, for research, a more precise and continuous measurement of GA is desirable and early pregnancy ultrasound should be used. At the population level, inaccuracy and imprecision in GA dating may result in biased estimates of preterm birth rates and epidemiologic associations with preterm birth.95 Determining the optimal precision (ie, a 95% CI of ±1 or 2 vs 3 weeks) and diagnostic accuracy is also critical to choosing an appropriate method of GA measurement for LMIC. Future research priorities for improving GA determination in LMIC are shown in Fig 4.

FIGURE 4
Research priorities to improve GA dating in LMIC.

Conclusions

As part of the Metrics Group of the Every Newborn Action Plan, we have conducted the first systematic review and meta-analysis assessing the diagnostic accuracy of neonatal GA assessments and scores. The most commonly used assessment, the Ballard score, tended to overestimate GA and had wide margins of error. The Dubowitz score had improved accuracy, although feasibility is a critical consideration in LMIC, and the complexity, training, and time to conduct the assessment are challenges to scale up. Additional high-quality studies are needed in LMIC to determine the accuracy of neonatal assessment compared with an early ultrasound reference, particularly in settings with SGA, as well as to explore the feasibility of implementation of complex GA assessments. This work also underlines the importance of future focus on increasing the maternal demand for knowledge of the GA of their pregnancy, improving coverage of early pregnancy ultrasound scans, and innovations to improve GA assessment in late pregnancy, such as novel ultrasound approaches. In settings where early ultrasound is not possible, increased efforts and innovation are urgently needed to develop simpler yet specific approaches for clinical GA assessment of the newborn, either through new combinations of existing parameters, new signs, or technology.

Acknowledgments

We acknowledge the students who were also part of the GA working group in the Brigham and Women’s Hospital global newborn health laboratory (Chelsea Clark). We also thank the Brigham and Women’s Hospital Department of Newborn Medicine and Dr Terrie Inder for their support of this work. Finally, we thank the following individuals for their assistance in translating foreign articles: Madeline Gilbert, Alison Leschen, Maria Dąbrowska, Susan Throckmorton, Felix Bergmann, and Lina Driouk.

Abbreviations

AVCL: anterior vascular capsule of the lens
BOE: best obstetric estimate
CI: confidence interval
GA: gestational age
HIC: high-income country
LBW: low birth weight
LMIC: low- and middle-income countries
LMP: last menstrual period
QUADAS–2: Quality Assessment of Diagnostic Accuracy Studies–2
SGA: small for gestational age

Partners International, Yangon, Myanmar; and hFaculty of Epidemiology and Population Health and iThe Centre for Maternal, Adolescent, Reproductive, and Child Health (MARCH),
London School of Hygiene and Tropical Medicine, London, United Kingdom

Dr Lee conceptualized and designed the study, coordinated and supervised data collection, completed secondary data extraction, and drafted, reviewed,
revised, and finalized the manuscript; Dr Panchal designed the database searches, carried out initial screening and data extraction for postnatal clinical exams,
conducted meta-analyses, and reviewed and revised the manuscript; Ms Folger screened and extracted data for anterior vascularity of the lens, helped write
sections of the manuscript, and formatted, reviewed, and revised the manuscript; Dr Whelan undertook initial screening and data extraction for postnatal clinical
exams and reviewed the manuscript; Ms Whelan coordinated and supervised data collection and data extraction, reviewed the extracted data, and reviewed
and revised the manuscript; Dr Rosner advised the statistical analysis of the data extracted, provided feedback on analyses, and reviewed and revised the
manuscript; Drs Blencowe and Lawn helped synthesize the data and data analysis and critically reviewed and revised the manuscript; and all authors approved
the final manuscript as submitted.
This systematic review was registered with the International Prospective Register of Systematic Reviews. PROSPERO registration number: CRD42015020499.
DOI: https://doi.org/10.1542/peds.2017-1423
Accepted for publication Jul 25, 2017
Address correspondence to Anne CC Lee, MD, MPH, Department of Pediatric Newborn Medicine, Brigham and Women’s Hospital, BB502A, 75 Francis St, Boston, MA
02115. E-mail: alee6@bwh.harvard.edu
PEDIATRICS (ISSN Numbers: Print, 0031-4005; Online, 1098-4275).
Copyright © 2017 by the American Academy of Pediatrics
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.
FUNDING: This work was supported by the Bill & Melinda Gates Foundation through grant OPP1130198.
POTENTIAL CONFLICT OF INTEREST: The authors have indicated they have no potential conflicts of interest to disclose.

References
1. Blencowe H, Cousens S, Oestergaard MZ, et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications. Lancet. 2012;379(9832):2162–2172
2. World Bank. World bank country and lending groups. 2017. Available at: https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups. Accessed May 21, 2017
3. Liu L, Johnson HL, Cousens S, et al; Child Health Epidemiology Reference Group of WHO and UNICEF. Global, regional, and national causes of child mortality: an updated systematic analysis for 2010 with time trends since 2000. Lancet. 2012;379(9832):2151–2161
4. Aliyu LD, Kurjak A, Wataganara T, et al. Ultrasound in Africa: what can really be done? J Perinat Med. 2016;44(2):119–123
5. Savitz DA, Terry JW Jr, Dole N, Thorp JM Jr, Siega-Riz AM, Herring AH. Comparison of pregnancy dating by last menstrual period, ultrasound scanning, and their combination. Am J Obstet Gynecol. 2002;187(6):1660–1666
6. World Health Organization. Global health workforce shortage to reach 12.9 million in coming decades. 2013. Available at: www.who.int/mediacentre/news/releases/2013/health-workforce-shortage/en/. Accessed May 28, 2017
7. UNICEF. Antenatal care: current status and progress. 2017. Available at: https://data.unicef.org/topic/maternal-health/antenatal-care/. Accessed April 17, 2017
8. Bucher S, Marete I, Tenge C, et al. A prospective observational description of frequency and timing of antenatal care attendance and coverage of selected interventions from sites in Argentina, Guatemala, India, Kenya, Pakistan and Zambia. Reprod Health. 2015;12(suppl 2):S12
9. Wang W, Alva S, Wang S, Fort A. Levels and Trends in the Use of Maternal Health Services in Developing Countries. Calverton, MD: USAID; 2011
10. Committee opinion no 611: method for estimating due date. Obstet Gynecol. 2014;124(4):863–866
11. Blencowe H, Cousens S, Chou D, et al; Born Too Soon Preterm Birth Action Group. Born too soon: the global epidemiology of 15 million preterm births. Reprod Health. 2013;10(suppl 1):S2
12. Farr V, Mitchell RG, Neligan GA, Parkin JM. The definition of some external characteristics used in the assessment of gestational age in the newborn infant. Dev Med Child Neurol. 1966;8(5):507–511
13. Amiel-Tison C. Neurological evaluation of the maturity of newborn infants. Arch Dis Child. 1968;43(227):89–93
14. Dubowitz LM, Dubowitz V, Goldberg C. Clinical assessment of gestational age in the newborn infant. J Pediatr. 1970;77(1):1–10
15. Capurro H, Konichezky S, Fonseca D, Caldeyro-Barcia R. A simplified method for diagnosis of gestational age in the newborn infant. J Pediatr. 1978;93(1):120–122
16. Parkin JM, Hey EN, Clowes JS. Rapid assessment of gestational age at birth. Arch Dis Child. 1976;51(4):259–263
17. Eregie CO. Assessment of gestational age: modification of a simplified method. Dev Med Child Neurol. 1991;33(7):596–600
18. Bhagwat VA, Dahat HB, Bapat NG. Determination of gestational age of newborns–a comparative study. Indian Pediatr. 1990;27(3):272–275
19. Ballard JL, Novak KK, Driver M. A simplified score for assessment of



PEDIATRICS Volume 140, number 6, December 2017 21
fetal maturation of newly born infants. J Pediatr. 1979;95(5, pt 1):769–774
20. Ballard JL, Khoury JC, Wedig K, Wang L, Eilers-Walsman BL, Lipp R. New Ballard score, expanded to include extremely premature infants. J Pediatr. 1991;119(3):417–423
21. Amiel-Tison C, Maillard F, Lebrun F, Breart G, Papiernik E. Neurological and physical maturation in normal growth singletons from 37 to 41 weeks’ gestation. Early Hum Dev. 1999;54:145–156
22. Feresu SA, Gillespie BW, Sowers MF, Johnson TR, Welch K, Harlow SD. Improving the assessment of gestational age in a Zimbabwean population. Int J Gynaecol Obstet. 2002;78(1):7–18
23. Nicolopoulos D, Perakis A, Papadakis M, Alexiou D, Aravantinos D. Estimation of gestational age in the neonate: a comparison of clinical methods. Am J Dis Child. 1976;130(5):477–480
24. Finnström O. Studies on maturity in newborn infants. II. External characteristics. Acta Paediatr Scand. 1972;61(1):24–32
25. Farr V. Estimation of gestational age by neurological assessment in first week of life. Arch Dis Child. 1968;43(229):353–357
26. Tunçer M, Yilgör E, Erdem G. A new, simple three-step method for determining gestational age. Turk J Pediatr. 1982;23(2):85–97
27. Kollée LA, Leusink J, Peer PG. Assessment of gestational age: a simplified scoring system. J Perinat Med. 1985;13(3):135–138
28. Klimek R, Klimek M, Rzepecka-Weglarz B. A new score for postnatal clinical assessment of fetal maturity in newborn infants. Int J Gynaecol Obstet. 2000;71(2):101–105
29. Allan RC, Sayers S, Powers J, Singh G. The development and evaluation of a simple method of gestational age estimation. J Paediatr Child Health. 2009;45(1–2):15–19
30. Narayanan I, Dua K, Gujral VV, Mehta DK, Mathew M, Prabhakar AK. A simple method of assessment of gestational age in newborn infants. Pediatrics. 1982;69(1):27–32
31. Robinson RJ. Assessment of gestational age by neurological examination. Arch Dis Child. 1966;41(218):437–447
32. Serfontein GL, Jaroszewicz AM. Estimation of gestational age at birth. Comparison of two methods. Arch Dis Child. 1978;53(6):509–511
33. Bindusha S, Rasalam CS, Sreedevi N. Gestational age assessment of newborn- clinical trial of a simplified method. Transworld Med J. 2014;2(1):24–28
34. World Health Organization (WHO); United Nations International Children’s Emergency Fund (UNICEF). Every Newborn: An Action Plan to End Preventable Deaths (ENAP). Geneva, Switzerland: World Health Organization; 2014
35. World Health Organization (WHO). Every Newborn Action Plan Metrics. In: WHO technical consultation on newborn health indicators; December 3–5, 2014; Ferney Voltaire, France
36. Whiting PF, Rutjes AW, Westwood ME, et al; QUADAS-2 Group. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–536
37. Kramer MS, McLean FH, Boyd ME, Usher RH. The validity of gestational age estimation by menstrual dating in term, preterm, and postterm gestations. JAMA. 1988;260(22):3306–3308
38. Rosner B. Fundamentals of Biostatistics. 8th ed. Boston, MA: Cengage Learning; 2016
39. Macaskill P, Gatsonis C, Deeks JJ, Harbord RM, Takwoingi Y. Analysing and Presenting Results. In: Deeks JJ, Bossuyt PM, Gatsonis C, eds. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1.0, 10. The Cochrane Collaboration; 2010. Available at: http://srdta.cochrane.org/
40. Lee AC, Mullany LC, Ladhani K, et al; Projahnmo Study Group. Validity of newborn clinical assessment to determine gestational age in Bangladesh. Pediatrics. 2016;138(1):e20153303
41. Raghu MB, Patel YS, Gupta K. Estimation of gestational age in Zambian newborn infants. Ann Trop Paediatr. 1981;1(4):245–247
42. Sunjoh F, Njamnshi AK, Tietche F, Kago I. Assessment of gestational age in the Cameroonian newborn infant: a comparison of four scoring methods. J Trop Pediatr. 2004;50(5):285–291
43. Roberts CJ, Hibbard BM, Evans DR, et al. Precision in estimating gestational age and its influence on sensitivity of alphafetoprotein screening. BMJ. 1979;1(6169):981–983
44. Vik T, Vatten L, Markestad T, Jacobsen G, Bakketeig LS. Dubowitz assessment of gestational age and agreement with prenatal methods. Am J Perinatol. 1997;14(6):369–373
45. Awoust J, Keuwez JJ, Levi S. Comparison between three methods for assessment of fetal age. J Foetal Med. 1982;2(1):11–15
46. Sanders M, Allen M, Alexander GR, et al. Gestational age assessment in preterm neonates weighing less than 1500 grams. Pediatrics. 1991;88(3):542–546
47. Wariyar U, Tin W, Hey E. Gestational assessment assessed. Arch Dis Child Fetal Neonatal Ed. 1997;77(3):F216–F220
48. Robillard PY, De Caunes F, Alexander GR, Sergent MP. Validity of postnatal assessments of gestational age in low birthweight infants from a Caribbean community. J Perinatol. 1992;12(2):115–119
49. Shukla H, Atakent YS, Ferrara A, Topsis J, Antoine C. Postnatal overestimation of gestational age in preterm infants. Am J Dis Child. 1987;141(10):1106–1107
50. Moore KA, Simpson JA, Thomas KH, et al. Estimating gestational age in late presenters to antenatal care in a resource-limited setting on the Thai-Myanmar border. PLoS One. 2015;10(6):e0131025
51. Rosenberg RE, Ahmed AS, Ahmed S, et al. Determining gestational age in a low-resource setting: validity of last menstrual period. J Health Popul Nutr. 2009;27(3):332–338
52. Karunasekera KA, Sirisena J, Jayasinghe JA, Perera GU. How accurate is the postnatal estimation of gestational age? J Trop Pediatr. 2002;48(5):270–272
53. Mitchell D. Accuracy of pre- and postnatal assessment of

gestational age. Arch Dis Child. 1979;54(11):896–897
54. Vogt H, Haneberg B, Finne PH, Stensberg A. Clinical assessment of gestational age in the newborn infant. An evaluation of two methods. Acta Paediatr Scand. 1981;70(5):669–672
55. Latis GO, Simionato L, Ferraris G. Clinical assessment of gestational age in the newborn infant. Comparison of two methods. Early Hum Dev. 1981;5(1):29–37
56. Hertz RH, Sokol RJ, Knoke JD, Rosen MG, Chik L, Hirsch VJ. Clinical estimation of gestational age: rules for avoiding preterm delivery. Am J Obstet Gynecol. 1978;131(4):395–402
57. Cevit O, Bayram B, Toksoy HB, Gültekin A, Gökalp A. Gestational age assessment in preterm neonates weighing less than 2500 grams. J Trop Pediatr. 1998;44(1):57–58
58. Jaroszewicz AM, Boyd IH. Clinical assessment of gestational age in the newborn. S Afr Med J. 1973;47(44):2123–2124
59. Dawodu A, Qureshi MM, Moustafa IA, Bayoumi RA. Epidemiology of clinical hyperbilirubinaemia in Al Ain, United Arab Emirates. Ann Trop Paediatr. 1998;18(2):93–99
60. Scher MS, Barmada MA. Estimation of gestational age by electrographic, clinical, and anatomic criteria. Pediatr Neurol. 1987;3(5):256–262
61. Alexander GR, de Caunes F, Hulsey TC, Tompkins ME, Allen M. Validity of postnatal assessments of gestational age: a comparison of the method of Ballard et al. and early ultrasonography. Am J Obstet Gynecol. 1992;166(3):891–895
62. Smith LN, Dayal VH, Monga M. Prior knowledge of obstetric gestational age and possible bias of Ballard score. Obstet Gynecol. 1999;93(5, pt 1):712–714
63. Dombrowski MP, Wolfe HM, Brans YW, Saleh AA, Sokol RJ. Neonatal morphometry. Relation to obstetric, pediatric, and menstrual estimates of gestational age. Am J Dis Child. 1992;146(7):852–856
64. Gagliardi L, Scimone F, DelPrete A, et al. Precision of gestational age assessment in the neonate. Acta Paediatr. 1992;81(2):95–99
65. Amato M, Hüppi P, Claus R. Rapid biometric assessment of gestational age in very low birth weight infants. J Perinat Med. 1991;19(5):367–371
66. Karl S, Li Wai Suen CS, Unger HW, et al. Preterm or not–an evaluation of estimates of gestational age in a cohort of women from Rural Papua New Guinea. PLoS One. 2015;10(5):e0124286
67. Moraes CL, Reichenheim ME. [Validity of neonatal clinical assessment for estimation of gestational age: comparison of new ++Ballard+ score with date of last menstrual period and ultrasonography]. Cad Saude Publica. 2000;16(1):83–94
68. Sreekumar K, d’Lima A, Nesargi S, Rao S, Bhat S. Comparison of New Ballards score and Parkins score for gestational age estimation. Indian Pediatr. 2013;50(8):771–773
69. Wylie BJ, Kalilani-Phiri L, Madanitsa M, et al. Gestational age assessment in malaria pregnancy cohorts: a prospective ultrasound demonstration project in Malawi. Malar J. 2013;12:183
70. Taylor RA, Denison FC, Beyai S, Owens S. The external Ballard examination does not accurately assess the gestational age of infants born at home in a rural community of The Gambia. Ann Trop Paediatr. 2010;30(3):197–204
71. Thi HN, Khanh DK, Thu HT, Thomas EG, Lee KJ, Russell FM. Foot length, chest circumference, and mid upper arm circumference are good predictors of low birth weight and prematurity in ethnic minority newborns in Vietnam: a hospital-based observational study. PLoS One. 2015;10(11):e0142420
72. Baumann C, Hüppi P, Amato M. [Prenatal and postnatal determination of gestational age of small newborn infants]. Z Geburtshilfe Perinatol. 1993;197(3):135–140
73. Constantine NA, Kraemer HC, Kendall-Tackett KA, Bennett FC, Tyson JE, Gross RT. Use of physical and neurologic observations in assessment of gestational age in low birth weight infants. J Pediatr. 1987;110(6):921–928
74. Mackanjee HR, Iliescu BM, Dawson WB. Assessment of postnatal gestational age using sonographic measurements of femur length. J Ultrasound Med. 1996;15(2):115–120
75. Alexander GR, Hulsey TC, Smeriglio VL, Comfort M, Levkoff A. Factors influencing the relationship between a newborn assessment of gestational maturity and the gestational age interval. Paediatr Perinat Epidemiol. 1990;4(2):133–146
76. Alexander GR, de Caunes F, Hulsey TC, Tompkins ME, Allen M. Ethnic variation in postnatal assessments of gestational age: a reappraisal. Paediatr Perinat Epidemiol. 1992;6(4):423–433
77. Ahn Y. Assessment of gestational age using an extended New Ballard examination in Korean newborns. J Trop Pediatr. 2008;54(4):278–281
78. Sasidharan K, Dutta S, Narang A. Validity of New Ballard score until 7th day of postnatal life in moderately preterm neonates. Arch Dis Child Fetal Neonatal Ed. 2009;94(1):F39–F44
79. Verhoeff FH, Milligan P, Brabin BJ, Mlanga S, Nakoma V. Gestational age assessment by nurses in a developing country using the Ballard method, external criteria only. Ann Trop Paediatr. 1997;17(4):333–342
80. Eregie CO. A new method for maturity determination in newborn infants. J Trop Pediatr. 2000;46(3):140–144
81. Oliveira S, Kimura AMR. Evaluation of the gestational age through prenatal and postnatal data. In: 4th World Congress of Perinatal Medicine; Buenos Aires, Argentina; 1999. 1091–1094
82. Pereira AP, Dias MA, Bastos MH, da Gama SG, Leal MC. Determining gestational age for public health care users in Brazil: comparison of methods and algorithm creation. BMC Res Notes. 2013;6:60
83. Laveriano WRV. [Reliability of the post natal gestational assessment: Capurro test compared with ultrasound at 10+0 to 14+2 weeks of gestation]. Rev Peru Ginecol Obstet. 2015;61(2):115–118
84. Neufeld LM, Haas JD, Grajéda R, Martorell R. Last menstrual period provides the best estimate of gestation length for women in rural Guatemala. Paediatr Perinat Epidemiol. 2006;20(4):290–298

85. Lee Anne CC, Uddin J, Shah R, et al. Validation of community health worker clinical assessment of gestational age in rural Bangladesh. In: Proceedings from the Pediatric Academic Societies Annual Meeting; May 2013; Washington, DC
86. Aslan Y, Yildiran A, Sen Y, Erduran E, Kasim S, Gedik Y. Assessment of gestational age in healthy neonates by auxiliary health personnel using a simple scoring system. Turk Klin J Med Res. 2000;18(3):121–125
87. Hittner HM, Hirsch NJ, Rudolph AJ. Assessment of gestational age by examination of the anterior vascular capsule of the lens. J Pediatr. 1977;91(3):455–458
88. Guillory C, Carsia-Prats JA, Hittner HM, Rudolph J. Effect of prenatal steroid administration on the anterior vascular capsule of the lens (AVCL) in preterm infants. Pediatr Res. 1980;14(4):456
89. Hittner HM, Gorman WA, Rudolph AJ. Examination of the anterior vascular capsule of the lens: II. Assessment of gestational age in infants small for gestational age. J Pediatr Ophthalmol Strabismus. 1981;18(2):52–54
90. Krishnamohan VK, Wheeler MB, Testa MA, Philipps AF. Correlation of postnatal regression of the anterior vascular capsule of the lens to gestational age. J Pediatr Ophthalmol Strabismus. 1982;19(1):28–32
91. Sasivimolkul W, Siripoonya P, Tejavej A. Gestational age assessment by examination of the anterior vascular capsule of the lens. J Med Assoc Thai. 1986;69(suppl 2):38–45
92. Skapinker R, Rothberg AD. Postnatal regression of the tunica vasculosa lentis. J Perinatol. 1987;7(4):279–281
93. Damoulaki-Sfakianski E, Robertson A, Gordero L. Skin creases on the sole of the foot as a physical index of maturity: comparison between Caucasian and Negro infants. Pediatrics. 1972;50(3):483–485
94. Fujimoto W, Samoa R, Wotring A. Gestational diabetes in high-risk populations. Clin Diabetes. 2013;31(2):90–94
95. Martin JA, Hamilton BE, Osterman MJ, Curtin SC, Matthews TJ. Births: final data for 2013. Natl Vital Stat Rep. 2015;64(1):1–65

Diagnostic Accuracy of Neonatal Assessment for Gestational Age Determination:
A Systematic Review
Anne CC Lee, Pratik Panchal, Lian Folger, Hilary Whelan, Rachel Whelan, Bernard
Rosner, Hannah Blencowe and Joy E. Lawn
Pediatrics 2017;140;
DOI: 10.1542/peds.2017-1423 originally published online November 17, 2017;

Updated Information & Services: including high resolution figures, can be found at:
http://pediatrics.aappublications.org/content/140/6/e20171423

References: This article cites 85 articles, 15 of which you can access for free at:
http://pediatrics.aappublications.org/content/140/6/e20171423#BIBL

Subspecialty Collections: This article, along with others on similar topics, appears in the following collection(s):
Fetus/Newborn Infant
http://www.aappublications.org/cgi/collection/fetus:newborn_infant_sub
Neonatology
http://www.aappublications.org/cgi/collection/neonatology_sub
International Child Health
http://www.aappublications.org/cgi/collection/international_child_health_sub

Permissions & Licensing: Information about reproducing this article in parts (figures, tables) or in its entirety can be found online at:
http://www.aappublications.org/site/misc/Permissions.xhtml

Reprints: Information about ordering reprints can be found online:
http://www.aappublications.org/site/misc/reprints.xhtml


The online version of this article, along with updated information and services, is
located on the World Wide Web at:
http://pediatrics.aappublications.org/content/140/6/e20171423

Data Supplement at:
http://pediatrics.aappublications.org/content/suppl/2017/11/15/peds.2017-1423.DCSupplemental

Pediatrics is the official journal of the American Academy of Pediatrics. A monthly publication, it
has been published continuously since 1948. Pediatrics is owned, published, and trademarked by
the American Academy of Pediatrics, 141 Northwest Point Boulevard, Elk Grove Village, Illinois,
60007. Copyright © 2017 by the American Academy of Pediatrics. All rights reserved. Print ISSN:
1073-0397.

