Subscales allow easier computation and interpretation of results, but require
determining which items on the survey function similarly.
Statistical frameworks exist for assessing these challenges, but they typically
require large sample sizes and assume certain structures underlie the survey
design.
Example Survey
a) Red:Rainbow::July:____ (Month, Year, Hot, Cloud)
b) Soothing:Anodyne::____:Esoteric (Eccentric, School, Abstruse, Calming)
c) Pyrrhic:Victory::Potemkin:____ (Village, Battle, Hollow, Achilles)
d) Stegosaurus:Jurassic::Trilobite:____ (Triassic, Dinosaur, Mesozoic, Cambrian)
e) Mice:Men::Cabbages:____ (Women, Lettuce, Salad, Kings)
f) Fill in the following series: 1, 1/8, 1/27, 1/64, ___
g) Fill in the following series: ___, 25, 168, 1229, 9592
h) Fill in the following series: 3, ___, 4, 1, 5
Factor Analysis
Creation of new surveys requires internal and external validation, typically
done through factor analysis.
Exploratory factor analysis is used to cluster items measuring similar underlying
processes.
Confirmatory factor analysis can then be applied to validate those clusters, or
subscales, that were found in the exploratory analysis.
Cronbach’s alpha assesses the internal consistency of each subscale.
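Cronbach's alpha can be computed directly from item-level responses. The following is a minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the response data are hypothetical and not taken from the survey discussed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items on one candidate subscale
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items in a subscale vary together; the threshold for "acceptable" consistency is a judgment call that depends on the field.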
[Figure: example survey items grouped into two subscales. Math: f, g, h. Verbal: a, b, c, d, e.]
Potential Pitfalls in Psychometric
Validation with Factor Analysis
Two major problems challenge the assumptions of these methods and
necessitate the development of a new way to analyze and validate the
measure.
Time-wise or context-wise measurement can introduce non-independent,
non-hierarchical components into the model.
Examples: study habits across terms (longitudinal effects on measurement); identity across
social spheres (student perception of intellectual ability with friends, at work, and at school)
Factor analysis can be broadened to Bayesian networks and structural equation models, but
these methods come with their own assumptions about the underlying geometry and sample size.
Small sample sizes can create numerical instability in traditional algorithms for both
factor analysis and structural equation models (a common guideline suggests 5-10 participants per item).
If there are 90 items, at least 450 students would be needed to discover subscales, and
another 450 would be needed to validate these findings.
Cost and population size can be prohibitive to the study.
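The sample-size arithmetic above can be reproduced with a short calculation. This is a sketch only: the function name `required_participants` and its parameters are hypothetical, and the 5-10 participants-per-item range is the rule of thumb quoted in the text, not a formal power analysis.

```python
def required_participants(n_items, per_item_low=5, per_item_high=10, n_phases=2):
    """Participants needed per phase and in total under a per-item rule of thumb.

    n_phases=2 reflects one exploratory sample and one separate validation sample.
    """
    low = n_items * per_item_low
    high = n_items * per_item_high
    return {
        "per_phase": (low, high),
        "total": (low * n_phases, high * n_phases),
    }


plan = required_participants(90)
print(plan)  # {'per_phase': (450, 900), 'total': (900, 1800)}
```

For a 90-item survey, the lower bound alone calls for 900 students across the exploratory and confirmatory phases.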
Ex. Bridging constructs, or loosely connected concepts without a defined hierarchy,
typically run into both limitations and require a new method to validate their
surveys.
Many of these issues arise from the dependence on linear mapping from the
survey response space to a lower-dimensional space.
Moving from Euclidean-Based Statistics
to Topologically-Based Statistics
2D example
One can define connections between pieces of this space via algebra and examine
structural properties computationally:
Homotopy (shrinking connected paths to a point)
Homology (hole-counting to define topological classification of structure)
Hodge Theory
[Figure: 2D example with panels labeled Homotopy/Homology (regions 1, 2, 3) and Basins of Attraction (Morse Theory)]
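Hole-counting can be illustrated in its simplest setting, a graph of points and connections. The sketch below computes the first two Betti numbers: b0 counts connected pieces and b1 counts independent loops, using b1 = E - V + b0, which holds for a graph with no filled-in triangles. The function name `betti_numbers` and the example graphs are hypothetical.

```python
def betti_numbers(n_vertices, edges):
    """Return (b0, b1) for a graph given as a vertex count and a list of edges."""
    parent = list(range(n_vertices))

    def find(x):
        # Follow parent links to the component representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    b0 = len({find(v) for v in range(n_vertices)})  # connected components
    b1 = len(edges) - n_vertices + b0               # independent loops
    return b0, b1


# A square: one connected piece enclosing one hole
square = betti_numbers(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
# A triangle plus an isolated point: two pieces, one hole
triangle_and_point = betti_numbers(4, [(0, 1), (1, 2), (2, 0)])
print(square, triangle_and_point)  # (1, 1) (2, 1)
```

Two structures with different Betti numbers cannot be continuously deformed into one another, which is what makes these counts usable as a classification.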
Applied Homology: Filtrations and
Persistence
Filtration
A filtration iteratively changes the lens through which
the data are examined (height, neighbors…).
Topological features appear and disappear as
the lens changes.
This creates a nested sequence of features with
underlying algebraic objects, called a homology
sequence:
Hom1⊂Hom2⊂Hom3⊂Hom4
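The way features appear and disappear as the lens changes can be sketched for the simplest case, connected components of a point cloud under a growing distance threshold. This is an illustrative assumption about the lens (pairwise Euclidean distance), not necessarily the filtration used in this work; the function names and points are hypothetical.

```python
from itertools import combinations
from math import dist


def zero_dim_persistence(points):
    """Return the scales at which connected components merge (die) as the threshold grows."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every pair of points becomes connected once the threshold reaches their distance
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj   # two components merge, so one of them dies at scale d
            deaths.append(d)
    return deaths  # n - 1 values; the last surviving component never dies


def components_at(points, scale):
    """Number of connected components when points within `scale` are linked."""
    deaths = zero_dim_persistence(points)
    return len(points) - sum(1 for d in deaths if d <= scale)


# Two hypothetical tight clusters, far apart
cloud = [(0, 0), (0, 1), (5, 0), (5, 1)]
print(zero_dim_persistence(cloud))                          # [1.0, 1.0, 5.0]
print([components_at(cloud, s) for s in (0.5, 1.0, 5.0)])   # [4, 2, 1]
```

The long gap between the merges at scale 1 and the final merge at scale 5 is the persistent feature: two clusters that survive across a wide range of lenses.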
Heatmap
[Figure: heatmap of the example survey, with items grouped under Math and Verbal; color scale marked at 0.2, 0.4, 0.6, and 1]
Identity by Context Survey Heatmap
[Figure: heatmap of the Identity by Context Survey items. Axis labels follow the pattern ILLCa_<identity>_<context>, with identities age, beauty, gender, look, music, politics, race, religion, school_success, sexual_or, sport, status, tribe and contexts dating, family, freetime, group, neighborhood, religion, school]
Conclusion
This method offers a robust way to create survey subscales and validate
measures without needing a large sample or a pre-defined measure structure.
Flexible
Deeply rooted in mathematics
Statistically testable
Internal validity by pvclust’s statistical test of cluster hierarchy for cut-points
External validity by Hausdorff nonparametric test on bootstrapped samples
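The Hausdorff distance underlying the external-validity test measures how far apart two point sets are. The sketch below shows only the distance itself; how it is combined with bootstrapped samples into a nonparametric test is not specified here, so that step is left out. The function names and point sets are hypothetical.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical point sets, e.g. summaries from two resampled data sets
set_a = [(0, 0), (1, 0)]
set_b = [(0, 0), (1, 0), (4, 0)]
print(hausdorff(set_a, set_b))  # 3.0
```

Every point of `set_a` also lies in `set_b`, but the extra point (4, 0) sits 3 units from its nearest neighbor in `set_a`, so the symmetric distance is 3.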