
Chapter 13: Data processing - Data sets
First European Survey on Language Competences: Technical Report

13 Data processing - Data sets


This chapter details the contents of the ESLC data sets.
The ESLC international data sets consist of seven data files: four student-level files,
one teacher-level file and two school-level files.

13.1 The Student Questionnaire and performance data file


Filename: INT_stu.txt
For each student who participated in the assessment the following information is
available:
Identification variables for the educational system, school, target language
and student
The student responses on the questionnaire
[...] and Writing (only for the two skills out of three for which each student
was sampled)
[...] variance estimates

13.2 Language assessment items data files


13.2.1 Scored responses
Filename: INT_cogn_sco.txt
For each student who participated in the cognitive assessment the following
information is available:
Identification variables for the educational system, school, target language,
student and marker
[...] Writing
[...] booklet was marked by a central marker, this file contains the marked
responses from the central marker. If a booklet was marked by more than one
marker, but not a central marker, the file contains the marks by a randomly
selected marker
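
The selection rule for multiply marked Writing booklets can be sketched as follows. This is only an illustration of the rule as stated above (the central marker's marks take precedence, otherwise one marker is drawn at random); the field names `is_central` and `marks`, and the choice of the first central marker when several are present, are assumptions and are not taken from the data sets.

```python
import random

def select_marks(markings, rng=None):
    """Pick the marking that ends up in the scored-responses file.

    `markings` is a non-empty list of dicts, one per marker who marked the
    booklet. The keys 'is_central' and 'marks' are hypothetical names.
    """
    rng = rng or random.Random()
    # Marks from a central marker take precedence when available.
    central = [m for m in markings if m["is_central"]]
    if central:
        return central[0]
    # Otherwise draw one of the markers at random.
    return rng.choice(markings)
```

Passing a seeded `random.Random` instance makes the random draw reproducible.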


13.2.2 Raw responses


Filename: INT_cogn_raw.txt
For each student who participated in a Listening or Reading test, the following
information is available:
Identification variables for the educational system, school, target language
and student
The students' raw responses to Listening and Reading items

13.2.3 Multiple marking


Filename: INT_cogn_mm.txt
For each Writing booklet which was marked more than once the following information
is available:
Identification variables for the educational system, school, target language,
student and marker
Marked responses

13.3 Teacher Questionnaire data file


Filename: INT_tea.txt
For each teacher who filled out the questionnaire the following information is available:
Identification variables for the educational system, school, target language
and teacher
[...] from the original responses in the questionnaire
[...] variance estimates

13.4 School Questionnaire data files


File names: INT_sch_TL1.txt, INT_sch_TL2.txt
For each school that participated in the survey the following information is available:
Identification variables for the educational system, implicit and explicit strata,
school, target language and principal
[...] target language


School plausible values for Listening, Reading and Writing and standard
errors for the school plausible values
School weights
The school dataset is divided into separate files for the first target language and the
second target language. If a school participated for two target languages, the school is
present in both files. Since only one principal responded per school, the principal
responses and indices are replicated in both files as far as they are applicable to both
target languages.

13.5 Records in the data sets


Student level
All students who attended at least one questionnaire or test booklet session

Teacher level
All teachers who responded to the questionnaire

School level
All schools for which at least one student attended a questionnaire or test
booklet session

13.6 Records excluded from the datasets


The following records are excluded from the datasets:
Students that did not participate in any session, either because they were
ineligible, excluded or absent
Teachers that did not respond to the questionnaire
Schools for which no students attended a questionnaire or test booklet
session.

13.7 Weights in the datasets


All schools for which any student participated in the survey are in the datasets.
However, only students and schools that meet the formal criteria for participation have
a weight in the datasets.
A participating student is defined as one who has responded to the Student
Questionnaire (required of all students), and has done at least one of the two cognitive
tests assigned.


A participating school is defined as a school where at least 25% of the sampled
students have completed the questionnaire and at least one test booklet. Based on
this criterion, four schools (two in the first target language sample and two in the
second target language sample) did not get a weight because all questionnaires for
these schools were lost.
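
As an illustration, the two participation criteria above can be written as simple checks. This is a minimal sketch: the function names and the input layout (one tuple per sampled student) are assumptions, and the sketch takes all sampled students of a school as the denominator for the 25% rule.

```python
def is_participating_student(did_questionnaire, tests_completed):
    """Student Questionnaire answered and at least one cognitive test done."""
    return bool(did_questionnaire) and tests_completed >= 1

def is_participating_school(sampled_students):
    """At least 25% of the sampled students meet the student criterion.

    `sampled_students` is a list of (did_questionnaire, tests_completed)
    tuples, one per sampled student (hypothetical layout).
    """
    if not sampled_students:
        return False
    n_ok = sum(is_participating_student(q, t) for q, t in sampled_students)
    # Integer comparison avoids floating-point issues at exactly 25%.
    return 4 * n_ok >= len(sampled_students)
```

A school whose questionnaires were all lost has no student meeting the first condition, so it fails the school criterion and receives no weight.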
In Spain and the Flemish Community of Belgium, a number of schools took part that
were not part of the sample. These schools can be identified through the code [...]
[...] student respondents from these schools do not have weights.

13.8 Representing missing data


Missing responses were coded to distinguish between four types of missing data [36]:
Not applicable: 77 for closed questions and 7777 in open questions. This code
is used for items or options in the questionnaires that were not administered to
respondents, mainly due to the localisation (see Chapter 3).
Not applicable: 78 for closed questions and 7778 in open questions. This code
is used for items or options in the Principal Questionnaire that were not
applicable for the target language because the principal responded to the
other target language version of the questionnaire.
Invalid: 88 for closed questions and 8888 in open questions. This code is used
when a respondent gave an invalid answer, for example selected several
answers when only one answer was expected.
Missing: 99 for closed questions and 9999 for open questions. This code is
used when the respondent did not provide an answer to the questions.
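
When reading the files, these codes need to be separated from valid answers before any analysis. A minimal sketch of such a recoding step follows; the reason labels and the `is_open` flag are assumptions introduced for illustration (the codebook defines which variables are open or closed).

```python
# Codes as listed above: two-digit codes for closed questions,
# four-digit codes for open questions. The labels are hypothetical.
MISSING_CODES = {
    77: "not_administered", 7777: "not_administered",
    78: "other_target_language", 7778: "other_target_language",
    88: "invalid", 8888: "invalid",
    99: "missing", 9999: "missing",
}

def decode(value, is_open):
    """Return (valid_value, missing_reason); exactly one of them is None."""
    reason = MISSING_CODES.get(value)
    # A four-digit code only applies to open questions and a two-digit code
    # only to closed ones, so 77 remains a valid answer to an open question.
    if reason is not None and (value >= 1000) == is_open:
        return None, reason
    return value, None
```

For example, `decode(99, is_open=False)` yields `(None, "missing")`, while `decode(99, is_open=True)` keeps 99 as a valid open answer.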

13.9 Identification of respondents, schools and markers


The following identifiers were used:
Educational system identification variable named educational system_id. The
educational system codes used in ESLC are the educational system codes of
the European Commission
The school identification variable named school_id. This consists of the letters [...]
The respondent identification variable named respondent_id. Unique randomly
assigned number for identification of students, teachers and principals
The marker identification variable called marker id. This is a string consisting
of a three-letter educational system identification variable (ISO 3166, with
BGE, BFL, BFR for the German, Flemish and French Communities of Belgium

[36] Note that as far as the indices are concerned, each missing value is a true missing value.


the marker.
Full details of all identifiers and codes used can be found in the codebook
made available with the data sets.
Note: since some schools participated for two target languages, merging the student
files with the teacher or school files should always be done on two variables:
school_id and targetLanguage_id.
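
A two-key merge of this kind might look as follows. This is a sketch using plain Python dictionaries rather than any particular statistics package; the function name, the `sch_` prefix for school columns and the assumption that the rows are already loaded as dictionaries are all illustrative.

```python
KEYS = ("school_id", "targetLanguage_id")

def merge_students_with_schools(students, schools):
    """Inner join of student rows with school rows on both key variables."""
    # Index school rows by the (school_id, targetLanguage_id) pair.
    by_key = {tuple(row[k] for k in KEYS): row for row in schools}
    merged = []
    for stu in students:
        school = by_key.get(tuple(stu[k] for k in KEYS))
        if school is None:
            continue  # no matching school record for this target language
        # Prefix school columns so they cannot overwrite student columns
        # such as respondent_id.
        extra = {f"sch_{name}": val for name, val in school.items()
                 if name not in KEYS}
        merged.append({**stu, **extra})
    return merged
```

Joining on school_id alone would attach a student to the school record of the wrong target language whenever the school appears in both school files.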
