Mobile Application
Rating Scale (MARS)
A new tool for assessing the quality of health mobile applications
December 2014
Institute of Health and Biomedical Innovation (IHBI), School of Psychology and Counselling, Queensland
University of Technology (QUT), Brisbane, Australia
Science and Engineering Faculty (SEF), Queensland University of Technology (QUT), Brisbane, Australia
ISBN: 978-0-9925966-5-1
Suggested citation: Hides, L et al. 2014, Mobile Application Rating Scale (MARS): A new tool for assessing the
quality of health mobile applications, Young and Well Cooperative Research Centre, Melbourne.
Copies of this guide can be downloaded from the Young and Well CRC website youngandwellcrc.org.au
This project was funded by the Young and Well Cooperative Research Centre (Young and Well CRC). The Young
and Well CRC (youngandwellcrc.org.au) is an Australian-based, international research centre that unites young
people with researchers, practitioners, innovators and policy-makers from over 70 partner organisations.
Together, we explore the role of technology in young people’s lives, and how it can be used to improve the mental
health and wellbeing of young people aged 12 to 25. The Young and Well CRC is established under the
Australian Government’s Cooperative Research Centres Program.
We would like to acknowledge Associate Professor Susan Keys and Michael Gould for their assistance with the
development of the original version of the MARS.
Our gratitude goes out to Dimitrios Vagenas for his statistical advice.
Associate Professor Leanne Hides is supported by an Australian Research Council Future Fellowship.
Young and Well Cooperative Research Centre
The Young and Well Cooperative Research Centre is an Australian-based, international research centre that unites young people with researchers, practitioners, innovators and policy-makers from more than 70 partner organisations. Together, we explore the role of technology in young people’s lives, and how it can be used to improve the mental health and wellbeing of young people aged 12 to 25. The Young and Well CRC is established under the Australian Government’s Cooperative Research Centres Program.
youngandwellcrc.org.au

Queensland University of Technology (QUT)
QUT is a top Australian university with global connections and an applied emphasis in courses and research best suited to the needs of industry and the community. QUT has a reputation for quality undergraduate and postgraduate courses and has 42,000 students, including 6000 from overseas. Courses are in high demand and its graduate employment rate is well above the national average for Australian universities. The CRC for Young People, Technology and Wellbeing is QUT’s 10th CRC, and is a partnership between the Faculties of Health, Science and Engineering, Business and Creative Industries at QUT.
qut.edu.au
Objective
This study aimed to develop a reliable, multidimensional measure for trialling, classifying and rating the quality of
mobile health applications.
Methods
A literature search was conducted to identify articles containing explicit web or app quality rating criteria published
between January 2000 and January 2013. Existing criteria for the assessment of app quality were categorised by
an expert panel to develop the new Mobile App Rating Scale (MARS) subscales, items, descriptors and anchors.
Sixty wellbeing apps identified through an iTunes search were randomly selected for MARS rating. Ten were used to pilot the
rating procedure, and the remaining 50 provided data on inter-rater reliability.
Results
372 explicit criteria for assessing web or app quality were extracted from 25 published papers, conference
proceedings, and online resources. Five broad categories of criteria were identified: four objective quality
scales (engagement, functionality, aesthetics, and information quality) and one subjective quality scale. These
were refined into the 23-item MARS. The MARS demonstrated excellent internal consistency (α = .90) and
inter-rater reliability (ICC = .79).
Conclusions
The MARS is a simple, objective and reliable tool for classifying and assessing the quality of mobile health apps.
It can also be used to provide a checklist for the design and development of new high quality health apps.
Given the rapid proliferation of smartphone apps, it is increasingly difficult for users, health professionals, and
researchers to readily identify and assess high quality apps (Cummings, Borycki & Roehrer 2013). Little
information on the quality of apps is available, beyond the star-ratings published on retailers’ webpages, and app
reviews are subjective by nature and may come from suspicious sources (Kuehnhausen & Frost 2013). Selecting
apps on the basis of popularity yields little or no meaningful information on app quality (Girardello & Michahelles
2010).
Much of the published literature focuses on technical aspects of websites, presented mostly in the form of
checklists, which do not assess the quality of these features (Aladwani & Palvia 2002; Olsina & Rossi 2002;
Seethamraju 2004). Website quality can be described as a function of: i) content, ii) appearance and multimedia,
iii) navigation, iv) structure and design, and v) uniqueness (Moustakis et al. 2004). A synthesis of website
evaluation criteria conducted by Kim et al. (1999) shortlisted 165 evaluation criteria, grouped into 13 categories (for
example, design and aesthetics, ease of use). However, 33 criteria could not be grouped and were coded as
'miscellaneous', highlighting the complexity of the task. While many website criteria may be applicable to mobile
apps, there is a need to consider whether a specific quality rating scale may be needed for apps.
Attempts to develop mobile health (mHealth) evaluation criteria are often too general, complex, or specific to a
particular health domain. Handel (2011) reviewed 35 health and wellbeing mobile apps based on user ratings of:
i) ease of use, ii) reliability, iii) quality, iv) scope of information, and v) aesthetics. While these criteria may cover
important aspects of quality, no rationale for these specific criteria was provided. Khoja et al. (2013) described the
development of a matrix of evaluation criteria, divided into seven themes for each of the four stages of an app’s
lifecycle: i) development, ii) implementation, iii) integration, and iv) sustained operation. While this matrix provides
comprehensive criteria for rating app quality, the complex and time-consuming nature of the evaluation scheme
would be difficult to apply in routine practice and research. Furthermore, the matrix omits any evaluation of the
visual aesthetics of the app as a criterion.
Guidelines for evaluating the usability of mHealth apps were also compiled by the Health Care Information and
Management Systems Society (HIMSS) (2012). While the criteria were extensive, and included usability criteria
for rating efficiency, effectiveness, user satisfaction, and platform optimisation, no criteria for rating information
quality were included. This is problematic as it is important to evaluate the quality and quantity of health
information contained in mHealth apps to avoid potential harm to users (Su 2014). The HIMSS guidelines also
use a “Strongly agree” to “Strongly disagree” Likert scale to rate each criterion, which does not provide an
indication of their quality. Strong agreement that a criterion is met (that is, clarity in whether a feature is present) is
not necessarily equivalent to meeting the criterion to a high degree.
A multidimensional, reliable and objective instrument is needed to rate the degree that mHealth apps
satisfy quality criteria. This scale should be easy to understand and use, and ideally should be
applicable to the needs of app developers, researchers, and health professionals.
OBJECTIVES
To develop a reliable, multidimensional instrument for app developers, researchers, and health professionals to
trial, classify and rate the quality of mobile health apps.
Three key websites were searched for relevant information: the EU's Usability.net, the Nielsen Norman Group's user
experience (UX) criteria, and the Healthcare Information and Management Systems Society (HIMSS).
References of retrieved articles were also hand-searched. Professional research manuals, unpublished
manuscripts and conference proceedings were also explored for additional quality criteria. After initial screening of
title and abstract, only studies that reported quality assessment criteria for applications or web content were
included.
Website and app assessment criteria identified in previous research were extracted. Criteria irrelevant to mobile
content and duplicates were removed. An advisory team of psychologists, interaction and interface designers and
developers, and professionals involved in the development of mHealth apps worked together to classify
assessment criteria into categories and sub-categories, and develop the scale items and descriptors. Additional
items assessing the app’s description in the online store and its evidence base were added. Corrections were
made until agreement between all panel members was reached.
App inclusion criteria were: i) English language; ii) free of charge; iii) availability in the Australian iTunes store; iv)
from iTunes categories: “Health & Fitness”; “Lifestyle”; “Medical”; “Productivity”; “Music”; “Education”; “Utilities”.
The category inclusion criteria were based on careful scrutiny of the titles and types of applications present in
those categories.
Sixty apps were randomly selected using randomization.com. The first ten were used for training and piloting
purposes. Two expert raters (a research officer with a Research Masters in Psychology and two years' experience
in mobile app development, and a PhD candidate with a Masters degree in Applied Psychology and over nine
years' IT experience) trialled each of the first ten apps for a minimum of ten minutes and then independently
rated their quality using the MARS. The raters convened to compare ratings and address ambiguities in the scale
content until consensus was reached. The MARS was revised based on that experience, and the remaining 50
mental health and wellbeing related apps were trialled and independently rated.
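For reproducibility, the random selection and the 10/50 split described above can be scripted with any standard sampling routine. The following sketch (Python) uses a hypothetical list of candidate app identifiers; the study itself performed the draw with randomization.com, so this is an illustrative equivalent rather than the original procedure.

```python
import random

# Hypothetical identifiers for the candidate apps returned by the iTunes search
# (the real candidate pool and its size are not reproduced here).
candidate_apps = [f"app_{i:03d}" for i in range(1, 251)]

random.seed(2014)                              # fixed seed so the draw is reproducible
selected = random.sample(candidate_apps, 60)   # 60 apps, sampled without replacement

pilot_apps = selected[:10]        # first ten: rater training and piloting the procedure
reliability_apps = selected[10:]  # remaining fifty: inter-rater reliability ratings

print(len(pilot_apps), len(reliability_apps))  # -> 10 50
```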
Table 1: Number of criteria for evaluation of mHealth app quality identified in the literature search
Category | Number of criteria | Percentage of total
Information – quality, quantity, visual information, credibility, goals, description | 113 | 32%
The classification category collects descriptive information on the app (for example, price, platform, rating) as
well as its technical aspects (for example, login, password-protection, sharing capabilities). Additional sections
collect information on the target age group of the app (if relevant) as well as information on what aspects of health
(including physical health, mental health, wellbeing) the app targets. These domains may be adapted to
include/exclude specific content areas as needed.
The app quality criteria were clustered within the engagement, functionality, aesthetics, information quality and
subjective quality categories, to develop 23 subcategories from which the 23 individual MARS items were
developed. Each MARS item used a five-point scale (1-Inadequate, 2-Poor, 3-Acceptable, 4-Good, 5-Excellent):
descriptors for these rating anchors were written for each item. In cases where an item may not be applicable for
all apps, an option of ‘Not applicable’ was included. The MARS items and rating descriptor terminology were
scrutinised by the expert panel, to ensure appropriate and consistent language was used throughout the scale.
The MARS is scored by calculating the mean scores of the engagement, functionality, aesthetics and information
quality objective subscales, and an overall mean app quality total score. Mean rather than total scores are used
because of ‘not applicable’ (NA) items. Additionally, mean scores are used to provide quality ratings
corresponding to the familiar format of star ratings. The subjective quality items can be scored separately as
individual items, or a mean subjective quality score. The MARS app classification section is for descriptive
purposes only.
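As a concrete illustration of this scoring rule, the sketch below (Python) computes subscale means that skip 'not applicable' items and then an overall quality score. The item values are invented, and taking the overall score as the mean of the four objective subscale means is an assumption made here for illustration; averaging all rated items directly is an alternative reading of the same rule.

```python
from statistics import mean

def subscale_mean(ratings):
    """Mean of the item ratings in one subscale, ignoring 'not applicable' (None) items."""
    scored = [r for r in ratings if r is not None]
    return mean(scored) if scored else None

def score_mars(app_ratings):
    """app_ratings maps each objective subscale name to its list of 1-5 item ratings
    (None = not applicable). Returns the subscale means and an overall quality score."""
    subscale_scores = {name: subscale_mean(items) for name, items in app_ratings.items()}
    # Overall score taken here as the mean of the four objective subscale means.
    overall = mean(s for s in subscale_scores.values() if s is not None)
    return subscale_scores, overall

# Invented ratings for a single app, for illustration only.
example = {
    "engagement":    [3, 4, 2, 3, 4],
    "functionality": [4, 4, 5, 4],
    "aesthetics":    [3, 4, 3],
    "information":   [4, None, 3, 4, 3, None, 4],  # None marks 'not applicable' items
}
subscales, overall = score_mars(example)
print(subscales)
print(round(overall, 2))  # overall mean app quality score
```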
[App selection flow: Identification, 518 apps remaining after the iTunes search; Screening/Eligibility, 18 apps excluded (faulty or with irrelevant content) and randomly replaced; Included, 60 apps rated with the MARS.]
On attempting to rate the initial ten apps, it was found that one was faulty and could not be rated. MARS ratings of
the remaining nine apps indicated the scale had a high level of internal consistency (α = .78) and fair inter-rater
reliability (two-way mixed ICC = .57, 95 percent CI .41–.69). The ‘not applicable’ option was removed from items
within the engagement category, as this feature was considered to be an important and universal component of
all high-quality apps. The meaning of ‘visual information’ was clarified and the item rephrased. The stated or
inferred age target of ‘young people’ was defined as app users aged 16-25. The descriptor of ‘goals’ was clarified
to read: “Does the app have specific, measurable and achievable goals (specified in app store description or
within the app itself)?” to help distinguish it from the item ‘accuracy of app description’, which often relates to the
app’s goals. On the information subscale, raters found it difficult to determine when lack of information within an
app should be rated as ’not applicable’ or as a flaw: this item was therefore revised to require that information be
rated unless the apps were purely for entertainment. The final version of the MARS is provided in Appendix 3.
Independent ratings on the overall MARS total score of the remaining 50 mental health and wellbeing apps
demonstrated an excellent level of inter-rater reliability (two-way mixed ICC = .79; 95 percent CI .75-.83). The
MARS total score had excellent internal consistency (α = .90) and was highly correlated with the MARS star rating
item (#23), r(50) = .89, p < .001. Internal consistencies of the MARS subscales were also very high (α = .80-.89,
Median = .85), and their inter-rater reliabilities were fair to excellent (ICC = .50-.80; Median = .65). Detailed item
and subscale statistics are presented in Table 2.
Table 2: MARS item and subscale statistics
# | Subscale / Item | Corrected item-total correlation | Mean | SD
Engagement: α = 0.89, ICC = 0.80 (95% CI 0.73-0.85)
1 | Entertainment | .63 | 2.49 | 1.24
2 | Interest | .69 | 2.52 | 1.20
3 | Customisation | .60 | 2.27 | 1.15
4 | Interactivity | .65 | 2.70 | 1.22
5 | Target group | .61 | 3.41 | 0.93
Note a: Item 19 'Evidence base' was excluded from all calculations, as it currently contains no measurable data.
Note b: The subjective quality subscale was excluded from the total MARS ICC calculation.
Only 15 of the 50 mental health and wellbeing apps extracted from the iTunes app-store had received the five
user ratings required for a star rating to be displayed. These apps showed a moderate correlation between the
iTunes star rating and the total MARS score (r(15) = .55, p < 0.05).
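The reliability and correlation statistics reported in this section follow standard formulas and can be reproduced as sketched below (Python, using numpy and scipy). The ICC function implements the Shrout and Fleiss two-way mixed, single-rater, consistency form, ICC(3,1), which is assumed here to correspond to the 'two-way mixed ICC' reported above; all data values are invented for illustration.

```python
import numpy as np
from scipy.stats import pearsonr

def cronbach_alpha(X):
    """Cronbach's alpha. X: 2-D array, rows = apps, columns = scale items."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)      # variance of each item across apps
    total_var = X.sum(axis=1).var(ddof=1)  # variance of the summed scale score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def icc_3_1(Y):
    """Two-way mixed, single-rater, consistency ICC (Shrout & Fleiss ICC(3,1)).
    Y: 2-D array, rows = rated apps (targets), columns = raters."""
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    ss_rows = k * ((Y.mean(axis=1) - grand) ** 2).sum()  # between-app sum of squares
    ss_cols = n * ((Y.mean(axis=0) - grand) ** 2).sum()  # between-rater sum of squares
    ss_total = ((Y - grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Invented MARS total scores from two raters for five apps, plus matching store star ratings.
rater_scores = np.array([
    [3.2, 3.4],
    [2.8, 2.6],
    [4.1, 4.0],
    [3.6, 3.9],
    [2.4, 2.5],
])
store_stars = np.array([3.0, 2.5, 4.5, 4.0, 2.0])

print("ICC(3,1):", round(icc_3_1(rater_scores), 2))
r, p = pearsonr(rater_scores.mean(axis=1), store_stars)
print("Pearson r:", round(r, 2), "p =", round(p, 3))

# Invented item ratings (five apps x five engagement items) to illustrate alpha.
item_scores = np.array([
    [3, 4, 2, 3, 4],
    [2, 2, 3, 2, 3],
    [4, 5, 4, 4, 5],
    [3, 4, 3, 4, 4],
    [2, 3, 2, 2, 3],
])
print("Cronbach's alpha:", round(cronbach_alpha(item_scores), 2))
```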
The use of objective MARS item anchors and the high level of inter-rater reliability obtained in the current study
should allow health practitioners and researchers to use the scale with confidence. Both the app quality total
score and four app-quality subscales had high internal consistency, indicating that the MARS provides raters with
a reliable indicator of overall app quality as well as the quality of app engagement, functionality, aesthetics and
information quality. The exclusion of the subjective quality subscale from the overall mean app quality score, due
to its subjective nature, strengthens the objectivity of the MARS as a measure of app quality. Nevertheless, the
high correlation between the MARS quality total score and its overall star rating provides a further indication that it
is capturing perceived overall quality. It should be noted that the MARS overall star rating is likely to be influenced
by the prior completion of the 19 MARS app quality items. Nevertheless, the iTunes app store star ratings
available for 15 of the 50 mental health apps rated were only moderately correlated with the MARS total score.
This was unsurprising, given the variable criteria likely to be used by different raters, the subjective nature of
these ratings and the lack of reliability of the iTunes star ratings, as has been highlighted in previous research
(Kuehnhausen & Frost 2013). In addition, the MARS overall star rating score was only moderately correlated with
the iTunes app store star rating. The MARS star rating is likely to provide a more reliable measure of overall app
quality, as it is rated following completion of the entire MARS and is therefore informed by the preceding items.
If multiple MARS raters are utilised, it is recommended that raters develop a shared understanding of the target
group for the apps, clarify the meaning of any MARS items they find ambiguous and determine if all MARS items
and subscales are relevant to the specific health area of interest. App-quality ratings should be piloted and
reviewed until an appropriate level of inter-rater reliability or consensus ratings are reached.
Due to the generic nature of the mHealth app quality indicators included in the MARS, a number of "App Specific"
items have been added to the MARS to obtain information on the perceived impact of the app on the user's
knowledge, attitudes and intentions related to the target health behaviour (see the App Specific section of the
MARS).
For convenience, the MARS was piloted on iPhone rather than Android apps. Since initial testing, however, the
scale has been applied to multiple Android apps and no compatibility issues were encountered. Thus, preliminary
data indicates the MARS is applicable to both iPhone and Android apps.
Following training, trainees are encouraged to complete an app rating exercise. This requires trainees to
download an app and rate it using the MARS. This activity is designed to allow trainees to check their
understanding of the MARS classification, quality items and their anchor points. Consensus expert ratings of the
app are also provided for trainees to check the reliability of their MARS ratings.
The ‘MARS app user’ version was developed to collect user ratings without a need for professional expertise. This
scale does not require raters to undertake training or review the app store descriptions; however, it is
recommended that raters trial the app for at least ten minutes and check all components, buttons and functions of
the app for functionality and quality.
3.3 LIMITATIONS
While the original search strategy to identify the MARS app quality rating criteria was conducted using guidelines
for a systematic review, few peer-reviewed journal articles were identified. As a result, the search strategy was
expanded to include conference proceedings and online resources, which may not have been as extensively peer
reviewed. Suggested guidelines for scale development were followed (Oppenheim 1992) whereby a qualitative
analysis of existing research was conducted to extract app quality criteria and then develop app quality
categories, sub-categories, MARS items and their anchor ratings via a thematic review and expert panel ratings.
Despite these efforts, and the corrections made after piloting the scale, two MARS items on the functionality
subscale (‘ease of use’ and ‘navigation’) achieved only moderate levels of inter-rater reliability (ICC = 0.50). The
text 'menu labels/icons' was moved to the 'navigation' item, and 'ease of use' was clarified to rate the availability of
clear instructions. These items are currently being tested for future revisions of the scale.
Researchers are yet to determine the effectiveness of the mental health apps included in this study in randomised
controlled trials. As a result, the MARS item ‘evidence base’ was not applicable for any of the apps in the current
study and we were unable to test its performance. It is hoped that as the evidence base for health apps develops,
the applicability of this MARS item will be tested. This important component of health-related mobile apps should
in the future become a standard for all applications making health-related claims (Su 2014).
3.5 CONCLUSION
The MARS provides a multidimensional, reliable, and flexible app-quality rating scale for researchers, developers
and health-professionals. Current results suggest that the MARS is a reliable measure of health app quality,
provided raters are sufficiently and appropriately trained.
Aladwani, AM & Palvia, PC 2002, Developing and validating an instrument for measuring user-perceived web
quality. Information & Management, vol 39, no 6, pp 467-476.
Cisco 2014, Cisco visual networking index: global mobile data traffic forecast update, 2013–2018, accessed
from http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-
vni/white_paper_c11-520862.html.
Cummings, E, Borycki, E & Roehrer, E 2013, Issues and considerations for healthcare consumers using mobile
applications. Studies in Health Technology and Informatics, vol 182, pp 227-231.
Daum, A 2013, 11% quarterly growth in downloads for leading app stores, accessed from
http://www.canalys.com/newsroom/11-quarterly-growth-downloads-leading-app-stores.
Dredge, S 2013, Mobile apps revenues tipped to reach $26bn in 2013, accessed from
http://www.theguardian.com/technology/appsblog/2013/sep/19/gartner-mobile-apps-revenues-report.
Girardello, A & Michahelles, F 2010, AppAware: which mobile applications are hot? Proceedings of the 12th
international conference on Human computer interaction with mobile devices and services.
Hallgren, KA 2012, Computing inter-rater reliability for observational data: an overview and tutorial. Tutorials in
quantitative methods for psychology, vol 8, no 1, pp 23-34.
Handel, MJ 2011, mHealth (mobile health): using apps for health and wellness. Explore: The Journal of Science and
Healing, vol 7, no 4, pp 256-261.
Health Care Information and Management Systems Society. Selecting a mobile app: evaluating the usability of
medical applications. mHIMSS App Usability Work Group, 2012.
Khoja, S, Durrani, H, Scott, RE, Sajwani, A & Piryani U 2013, Conceptual framework for development of
comprehensive e-health evaluation tool, Telemed J E Health, vol 19, no 1, pp 48-53.
Kim, P, Eng, TR, Deering, MJ & Maxfield, A 1999, Published criteria for evaluating health related web sites:
review, British Medical Journal, vol 318, no 7184, pp 647-649.
Kuehnhausen, M & Frost, VS 2013, Trusting smartphone apps? To install or not to install, that is the question.
Proceedings of the International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and
Decision Support, pp 25-28, San Diego, United States.
Moher, D, Liberati, A, Tetzlaff, J & Altman, DG 2010, Preferred reporting items for systematic reviews and meta-
analyses: the PRISMA statement. International Journal of Surgery, vol 8, no 5, pp 336-41.
Moustakis, V, Litos, C, Dalivigas, A & Tsironis, L 2004, Website quality assessment criteria. Proceedings of the
9th international conference of information quality, Nov 5-7, Boston, United States.
Olsina, L & Rossi, G 2002, Measuring web application quality with WebQEM. Multimedia, IEEE, vol 9, no 4, pp
20-29.
Oppenheim, AN 1992, Questionnaire design, interviewing and attitude measurement, New edition, London: Pinter
Publishers.
Riley, WT, Rivera, DE, Atienza, AA, Nilsen, W, Allison, SM & Mermelstein, R 2011, Health behavior models in the
age of mobile interventions: are our theories up to the task? Translational Behavioral Medicine, vol 1, no 1, pp 53-
71.
Seethamraju, R 2004, Measurement of user perceived web quality. Proceedings of the European Conference on
Information Systems, Finland.
Shrout, PE & Fleiss, JL 1979, Intraclass correlations: uses in assessing rater reliability. Psychological Bulletin, vol
86, no 2, pp 420.
Su, W 2014, A preliminary survey of knowledge discovery on smartphone applications (apps): principles,
techniques and research directions for e-health. Proceedings of the International Conference on Complex Medical
Engineering, June 26-29; Taipei, Taiwan.
Zou, GY 2011, Sample size formulas for estimating intraclass correlation coefficients with precision and
assurance. Statistics in medicine, vol 31, no 29, pp 3972.
Author / Year | Title | Contains a scale

Publications
Aladwani AM, Palvia PC; 2001 [1] | Developing and validating an instrument for measuring user-perceived web quality | Yes
Doherty G, Coyle D, Matthews M; 2010 [2] | Design and evaluation guidelines for mental health technologies | No
Eng TR; 2002 [3] | eHealth research and evaluation: Challenges and opportunities | No
Finstad K; 2010 [4] | The usability metric for user experience | Yes
Handel MJ; 2011 [5] | mHealth (Mobile health) – Using apps for health and wellness | No
Ho B, Lee M, Armstrong AW; 2013 [6] | Evaluation criteria for mobile teledermatology applications and comparison to major mobile teledermatology applications | No
Kay-Lambkin FJ, White A, Baker AL; 2011 [7] | Assessment of function and clinical utility of alcohol and other drug web sites: An observational, qualitative study | Yes
Khoja S, Durrani H et al.; 2013 [8] | Conceptual framework for development of comprehensive e-Health evaluation tool | No
Kim P, Eng TR et al.; 1999 [9] | Published criteria for evaluating health related web sites: review | No
Lavie T, Tractinsky N; 2004 [10] | Assessing dimensions of perceived visual aesthetics of web sites | Yes
Moshagen M, Thielsch M; 2012 [11] | A short version of the visual aesthetics of websites inventory | Yes
Oinas-Kukkonen H, Harjumaa M; 2008 [12] | A systematic framework for designing and evaluating persuasive systems | No
Olsina L, Rossi G; 2002 [13] | Measuring web application quality with WebQEM | No
Schulze K, Krömker H; 2010 [14] | A framework to measure user experience of interactive online products | No
Tuch AN, Roth SP, et al.; 2012 [15] | Is beautiful really usable? Toward understanding the relation between usability, aesthetics, and affect in HCI | No
Law ELC, Roto V et al.; 2009 [16] | Understanding, scoping and defining user eXperience: A survey approach | No
Moustakis V, Litos C et al.; 2004 [17] | Website quality assessment criteria | Yes
Seethamraju R; 2006 [18] | Measurement of user-perceived web quality | Yes
Väätäjä H, Koponen T, Roto V; 2009 [19] | Developing practical tools for user experience evaluation – a case from mobile news journalism | Yes
Vermeeren APOS, Law ELC et al.; 2010 [20] | User experience evaluation methods: Current state and development needs | Yes

Manuscripts
mHIMSS App Usability Work Group; 2012 [21] | Selecting a mobile app: Evaluating the usability of medical applications | Yes
Naumann F, Rolker C; 2000 [22] | Assessment methods for information quality criteria | No
Studies in Health Technology and Informatics; 2013 [23] | Issues and considerations for healthcare consumers using mobile applications | No
Websites
1. Aladwani, A.M. and P.C. Palvia, Developing and validating an instrument for measuring user-perceived
web quality. Information & Management, 2002. 39(6): p. 467-476.
2. Doherty, G., D. Coyle, and M. Matthews, Design and evaluation guidelines for mental health
technologies. Interacting with computers, 2010. 22(4): p. 243-252.
3. Eng, T.R., eHealth research and evaluation: challenges and opportunities. Journal of health
communication, 2002. 7(4): p. 267-272.
4. Finstad, K., The usability metric for user experience. Interacting with Computers, 2010. 22(5): p. 323-
327.
5. Handel, M.J., mHealth (Mobile Health)—Using Apps for Health and Wellness. EXPLORE: The Journal of
Science and Healing, 2011. 7(4): p. 256-261.
6. Ho, B., M. Lee, and A.W. Armstrong, Evaluation Criteria for Mobile Teledermatology Applications and
Comparison of Major Mobile Teledermatology Applications. Telemedicine and e-Health, 2013.
7. Kay-Lambkin, F., et al., Assessment of function and clinical utility of alcohol and other drug web sites: An
observational, qualitative study. BMC public health, 2011. 11(1): p. 277.
8. Khoja, S., et al., Conceptual Framework for Development of Comprehensive e-Health Evaluation Tool.
TELEMEDICINE and e-HEALTH, 2013. 19(1): p. 48-53.
9. Kim, P., et al., Published criteria for evaluating health related web sites: review. Bmj, 1999. 318(7184): p.
647-649.
10. Lavie, T. and N. Tractinsky, Assessing dimensions of perceived visual aesthetics of web sites.
International journal of human-computer studies, 2004. 60(3): p. 269-298.
11. Moshagen, M. and M. Thielsch, A short version of the visual aesthetics of websites inventory. Behaviour
& Information Technology, 2013. 32(12): p. 1305-1311.
12. Oinas-Kukkonen, H. and M. Harjumaa, A systematic framework for designing and evaluating persuasive
systems, in Persuasive technology. 2008, Springer. p. 164-176.
13. Olsina, L. and G. Rossi, Measuring Web application quality with WebQEM. Multimedia, IEEE, 2002. 9(4):
p. 20-29.
Instructions for use: The app should be thoroughly trialled for at least 10 minutes.
Determine: 1) how easy it is to use; 2) how well it functions; 3) does it do what it purports to do?
Review: settings, developer information, external links, security features, etc.
Visit the app store, review the app description and identify information on the purpose, functionality, affiliations of
the app developers, how the app was developed and if it has been tested.
Conduct an online search using the app name to identify up-to-date information on the app. Google Scholar may
also need to be searched for evidence on the app in the scientific literature.
Developer: _______________________________________________________________________________
__________________________________________________________________________
Focus: what the app targets (select all that apply)
☐ Increase Happiness/Well-being
☐ Mindfulness/Meditation/Relaxation
☐ Reduce negative emotions
☐ Depression
☐ Anxiety/Stress
☐ Anger
☐ Behaviour Change
☐ Alcohol/Substance Use
☐ Goal Setting
☐ Entertainment
☐ Relationships
☐ Physical health
☐ Other _______________________________

Theoretical background/Strategies (select all that apply)
☐ Assessment
☐ Feedback
☐ Information/Education
☐ Monitoring/Tracking
☐ Goal setting
☐ Advice/Tips/Strategies/Skills training
☐ CBT - Behavioural (positive events)
☐ CBT - Cognitive (thought challenging)
☐ ACT - Acceptance commitment therapy
☐ Mindfulness/Meditation
☐ Relaxation
☐ Gratitude
☐ Strengths based
☐ Other ____________________________

Affiliations:
☐ Unknown ☐ Commercial ☐ Government ☐ NGO ☐ University
SECTION A
Engagement – fun, interesting, customisable, interactive (e.g. sends alerts, messages, reminders,
feedback, enables sharing), well-targeted to audience
1. Entertainment: Is the app fun/entertaining to use? Does it use any strategies to increase
engagement through entertainment (e.g. through gamification)?
1 Dull, not fun or entertaining at all
2 Mostly boring
3 OK, fun enough to entertain user for a brief time (< 5 minutes)
4 Moderately fun and entertaining, would entertain user for some time (5-10 minutes total)
5 Highly entertaining and fun, would stimulate repeat use
2. Interest: Is the app interesting to use? Does it use any strategies to increase engagement by
presenting its content in an interesting way?
1 Not interesting at all
2 Mostly uninteresting
3 OK, neither interesting nor uninteresting; would engage user for a brief time (< 5 minutes)
4 Moderately interesting; would engage user for some time (5-10 minutes total)
5 Very interesting, would engage user in repeat use
3. Customisation: Does it provide/retain all necessary settings/preferences for the app's features (e.g.
sound, content, notifications)?
1 Does not allow any customisation or requires settings to be input every time
2 Allows insufficient customisation limiting functions
3 Allows basic customisation to function adequately
4 Allows numerous options for customisation
5 Allows complete tailoring to the individual’s characteristics/preferences, retains all settings
4. Interactivity: Does it allow user input, provide feedback, contain prompts (reminders, sharing
options, notifications, etc.)? Note: these functions need to be customisable and not
overwhelming in order to be perfect.
1 No interactive features and/or no response to user interaction
2 Insufficient interactivity, or feedback, or user input options, limiting functions
3 Basic interactive features to function adequately
4 Offers a variety of interactive features/feedback/user input options
5 Very high level of responsiveness through interactive features/feedback/user input options
5. Target group: Is the app content (visual information, language, design) appropriate for your
target audience?
1 Completely inappropriate/unclear/confusing
2 Mostly inappropriate/unclear/confusing
3 Acceptable but not targeted. May be inappropriate/unclear/confusing
4 Well-targeted, with negligible issues
5 Perfectly targeted, no issues found
A. Engagement mean score = __________
SECTION B
7. Ease of use: How easy is it to learn how to use the app; how clear are the menu labels/icons and
instructions?
1 No/limited instructions; menu labels/icons are confusing; complicated
2 Useable after a lot of time/effort
3 Useable after some time/effort
4 Easy to learn how to use the app (or has clear instructions)
5 Able to use app immediately; intuitive; simple
SECTION C
Aesthetics – graphic design, overall visual appeal, colour scheme, and stylistic consistency
10. Layout: Is arrangement and size of buttons/icons/menus/content on the screen appropriate or
zoomable if needed?
1 Very bad design, cluttered, some options impossible to select/locate/see/read; device display not
optimised
2 Bad design, random, unclear, some options difficult to select/locate/see/read
11. Graphics: How high is the quality/resolution of graphics used for buttons/icons/menus/content?
1 Graphics appear amateur, very poor visual design - disproportionate, completely stylistically
inconsistent
2 Low quality/low resolution graphics; low quality visual design – disproportionate, stylistically
inconsistent
3 Moderate quality graphics and visual design (generally consistent in style)
4 High quality/resolution graphics and visual design – mostly proportionate, stylistically consistent
5 Very high quality/resolution graphics and visual design - proportionate, stylistically consistent
throughout
SECTION D
Information – Contains high quality information (e.g. text, feedback, measures, references) from a
credible source. Select N/A if the app component is irrelevant.
13. Accuracy of app description (in app store): Does app contain what is described?
1 Misleading. App does not contain the described components/functions. Or has no description
2 Inaccurate. App contains very few of the described components/functions
3 OK. App contains some of the described components/functions
4 Accurate. App contains most of the described components/functions
5 Highly accurate description of the app components/functions
14. Goals: Does app have specific, measurable and achievable goals (specified in app store
description or within the app itself)?
N/A Description does not list goals, or app goals are irrelevant to research goal (e.g. using a game
for educational purposes)
1 App has no chance of achieving its stated goals
2 Description lists some goals, but app has very little chance of achieving them
3 OK. App has clear goals, which may be achievable.
4 App has clearly specified goals, which are measurable and achievable
5 App has specific and measurable goals, which are highly likely to be achieved
15. Quality of information: Is app content correct, well written, and relevant to the goal/topic of the
app?
N/A There is no information within the app
1 Irrelevant/inappropriate/incoherent/incorrect
2 Poor. Barely relevant/appropriate/coherent/may be incorrect
3 Moderately relevant/appropriate/coherent/and appears correct
4 Relevant/appropriate/coherent/correct
16. Quantity of information: Is the extent of coverage within the scope of the app, and is it
comprehensive but concise?
N/A There is no information within the app
1 Minimal or overwhelming
2 Insufficient or possibly overwhelming
3 OK but not comprehensive or concise
4 Offers a broad range of information, has some gaps or unnecessary detail; or has no links to
more information and resources
5 Comprehensive and concise; contains links to more information and resources
18. Credibility: Does the app come from a legitimate source (specified in app store description or
within the app itself)?
1 Source identified but legitimacy/trustworthiness of source is questionable (e.g. commercial
business with vested interest)
2 Appears to come from a legitimate source, but it cannot be verified (e.g. has no webpage)
3 Developed by small NGO/institution (hospital/centre, etc.) /specialised commercial business,
funding body
4 Developed by government, university or as above but larger in scale
5 Developed using nationally competitive government or research funding (e.g. Australian
Research Council, NHMRC)
19. Evidence base: Has the app been trialled/tested; must be verified by evidence (in published
scientific literature)?
N/A The app has not been trialled/tested
1 The evidence suggests the app does not work
2 App has been trialled (e.g., acceptability, usability, satisfaction ratings) and has partially positive
outcomes in studies that are not randomised controlled trials (RCTs), or there is little or no
contradictory evidence.
3 App has been trialled (e.g., acceptability, usability, satisfaction ratings) and has positive
outcomes in studies that are not RCTs, and there is no contradictory evidence.
4 App has been trialled and outcome tested in 1-2 RCTs indicating positive results
5 App has been trialled and outcome tested in > 3 high quality RCTs indicating positive results
SECTION E
20. Would you recommend this app to people who might benefit from it?
1 Not at all. I would not recommend this app to anyone
2 There are very few people I would recommend this app to
3 Maybe. There are several people whom I would recommend it to
4 There are many people I would recommend this app to
5 Definitely. I would recommend this app to everyone
21. How many times do you think you would use this app in the next 12 months if it was relevant to
you?
1 None
2 1-2
3 3-10
4 10-50
5 >50
Scoring
App-specific
These added items can be adjusted and used to assess the perceived impact of the app on the user’s knowledge,
attitudes, intentions to change as well as the likelihood of actual change in the target health behaviour.
1. Awareness: This app is likely to increase awareness of the importance of addressing [insert
target health behaviour]
3. Attitudes: This app is likely to change attitudes toward improving [insert target health
behaviour]
5. Help seeking: Use of this app is likely to encourage further help seeking for [insert target
health behaviour] (if it’s required)
6. Behaviour change: Use of this app is likely to increase/decrease [insert target health behaviour]
Scoring
The App subjective quality scale can be reported as individual items or as a mean score, depending on the aims
of the research.
The App-specific items can be adjusted and used to obtain information on the perceived impact of the app on the
user’s knowledge, attitudes and intentions related to the target health behaviour.
Circle the number that most accurately represents the quality of the app you are rating. All items are rated on a 5-
point scale from “1.Inadequate” to “5.Excellent”. Select N/A if the app component is irrelevant.
SECTION A
Engagement – fun, interesting, customisable, interactive, has prompts (e.g. sends alerts, messages,
reminders, feedback, enables sharing)
1. Entertainment: Is the app fun/entertaining to use? Does it have components that make it
more fun than other similar apps?
1 Dull, not fun or entertaining at all
2 Mostly boring
3 OK, fun enough to entertain user for a brief time (< 5 minutes)
4 Moderately fun and entertaining, would entertain user for some time (5-10 minutes total)
5 Highly entertaining and fun, would stimulate repeat use
2. Interest: Is the app interesting to use? Does it present its information in an interesting way
compared to other similar apps?
1 Not interesting at all
2 Mostly uninteresting
3 OK, neither interesting nor uninteresting; would engage user for a brief time (< 5 minutes)
4 Moderately interesting; would engage user for some time (5-10 minutes total)
5 Very interesting, would engage user in repeat use
3. Customisation: Does it allow you to customise the settings and preferences that you would
like to (e.g. sound, content and notifications)?
1 Does not allow any customisation or requires settings to be input every time
2 Allows little customisation, which limits the app's functions
3 Allows basic customisation to function adequately
4 Allows numerous options for customisation
5 Allows complete tailoring to the user's characteristics/preferences, remembers all settings
4. Interactivity: Does it allow user input, provide feedback, contain prompts (reminders,
sharing options, notifications, etc.)?
1 No interactive features and/or no response to user input
2 Some, but not enough, interactive features, which limits the app's functions
3 Basic interactive features to function adequately
4 Offers a variety of interactive features, feedback and user input options
5 Very high level of responsiveness through interactive features, feedback and user input options
5. Target group: Is the app content (visuals, language, design) appropriate for your target
audience?
1 Completely inappropriate, unclear or confusing
2 Mostly inappropriate, unclear or confusing
3 Acceptable but not specifically designed for young people. May be
inappropriate/unclear/confusing at times
4 Designed for young people, with minor issues
SECTION B
7. Ease of use: How easy is it to learn how to use the app; how clear are the menu labels, icons
and instructions?
1 No/limited instructions; menu labels, icons are confusing; complicated
2 Takes a lot of time or effort
3 Takes some time or effort
4 Easy to learn (or has clear instructions)
5 Able to use app immediately; intuitive; simple (no instructions needed)
8. Navigation: Does moving between screens make sense; Does app have all necessary links
between screens?
1 No logical connection between screens at all; navigation is difficult
2 Understandable after a lot of time/effort
3 Understandable after some time/effort
4 Easy to understand/navigate
5 Perfectly logical, easy, clear and intuitive screen flow throughout, and/or has shortcuts
9. Gestural design: Do taps/swipes/pinches/scrolls make sense? Are they consistent across all
components/screens?
1 Completely inconsistent/confusing
2 Often inconsistent/confusing
3 OK with some inconsistencies/confusing elements
4 Mostly consistent/intuitive with negligible problems
5 Perfectly consistent and intuitive
SECTION C
Aesthetics – graphic design, overall visual appeal, colour scheme, and stylistic consistency
10. Layout: Is arrangement and size of buttons, icons, menus and content on the screen
appropriate?
1 Very bad design, cluttered, some options impossible to select, locate, see or read
2 Bad design, random, unclear, some options difficult to select/locate/see/read
3 Satisfactory, few problems with selecting/locating/seeing/reading items
4 Mostly clear, able to select/locate/see/read items
5 Professional, simple, clear, orderly, logically organised
11. Graphics: How high is the quality/resolution of graphics used for buttons, icons, menus and
content?
SECTION D
Information – Contains high quality information (e.g. text, feedback, measures, references) from a
credible source
13. Quality of information: Is app content correct, well written, and relevant to the goal/topic of
the app?
N/A There is no information within the app
1 Irrelevant/inappropriate/incoherent/incorrect
2 Poor. Barely relevant/appropriate/coherent/may be incorrect
3 Moderately relevant/appropriate/coherent/and appears correct
4 Relevant/appropriate/coherent/correct
5 Highly relevant, appropriate, coherent, and correct
14. Quantity of information: Is the information within the app comprehensive but concise?
N/A There is no information within the app
1 Minimal or overwhelming
2 Insufficient or possibly overwhelming
3 OK but not comprehensive or concise
4 Offers a broad range of information, has some gaps or unnecessary detail; or has no links to
more information and resources
5 Comprehensive and concise; contains links to more information and resources
16. Credibility of source: Does the information within the app seem to come from a credible
source?
N/A There is no information within the app
1 Suspicious source
2 Lacks credibility
3 Not suspicious but legitimacy of source is unclear
4 Possibly comes from a legitimate source
SECTION E
17. Would you recommend this app to people who might benefit from it?
1 Not at all. I would not recommend this app to anyone
2 There are very few people I would recommend this app to
3 Maybe. There are several people I would recommend this app to
4 There are many people I would recommend this app to
5 Definitely. I would recommend this app to everyone
18. How many times do you think you would use this app in the next 12 months if it was relevant
to you?
1 None
2 1-2
3 3-10
4 10-50
5 >50
App-specific
SECTION F
1. Awareness: This app has increased my awareness of the importance of addressing the health
behaviour
3. Attitudes: The app has changed my attitudes toward improving this health behaviour
4. Intention to change: The app has increased my intentions/motivation to address this health
behaviour
5. Help seeking: This app would encourage me to seek further help to address the health
behaviour (if I needed it)
6. Behaviour change: Use of this app will increase/decrease the health behaviour
THANK YOU!