What is Validity?
The fundamental concept to keep in mind when creating any assessment is validity. Validity refers to whether a test measures what it aims to measure. For example, a valid driving test should include a practical driving component and not just a theoretical test of the rules of driving. Similarly, a valid language test for university entry should include tasks that are representative of at least some aspects of what actually happens in university settings, such as listening to lectures, giving presentations, engaging in tutorials, writing essays, and reading texts.
Validity has different elements, which we are now going to look at in turn.
Test Purpose – Why am I testing?
We can never really say that a test is valid or not valid. Instead, we can say that a test is valid for a particular purpose. There are several reasons why you might want to test your students. You could be trying to check their learning at the end of a unit, or trying to understand what they know and don't know. Or, you might want to use a test to place learners into groups based on their ability, or to provide test takers with a certificate of language proficiency. Each of these different reasons for testing represents a different test purpose.
The purpose of the test determines the type of test you're going to produce, which in turn affects the kinds of tasks you're going to choose, the number of test items, the length of the test, and so on. For example, a test certifying that doctors can practise in an English-speaking country would be different from a placement test which aims to place those doctors into language courses.
Test Takers – Who am I testing?
It’s also vital to keep in mind who is taking your test. Is it primary school children or teenagers or adults? Or is it airline pilots or doctors or engineers? This is an important question because the test has to be appropriate for the test takers it is aimed at. If your test takers are primary school children, for instance, you might want to give them more interactive tasks or games to test their language ability. Similarly, if you are testing listening skills, you might want to use role plays with doctors, but lectures or monologues with university students.
Test Construct – What am I testing?
Another key point is to consider what you want to test. Before designing a test, you need to identify the ability or skill that the test is designed to measure – in technical terms, the ‘test construct’. Some examples of constructs are: intelligence, personality, anxiety, English language ability, pronunciation. To take language assessment as an example, the test construct could be communicative language ability, or speaking ability, or perhaps even a construct as specific as pronunciation. The challenge is to define the construct and find ways to elicit it and measure it; for example, if we are testing the construct of fluency, we might consider features such as rate of speech, number of pauses/hesitations and the extent to which any pauses/hesitations cause strain for a listener.
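To make the fluency example concrete, here is a minimal sketch of how two of those features (rate of speech and number of pauses) could be turned into numbers. It assumes you already have a word count, the length of the recording and a list of pause lengths for one speaker. The function name, the field names and the 0.25-second cut-off for what counts as a pause are illustrative assumptions, not an established standard. The third feature, listener strain, needs a human judgement and is not captured here.

```python
from dataclasses import dataclass


@dataclass
class FluencyMeasures:
    words_per_minute: float
    pause_count: int
    mean_pause_seconds: float


def fluency_measures(word_count, duration_seconds, pause_lengths, min_pause=0.25):
    """Summarise simple fluency features for one spoken response.

    min_pause is an assumed cut-off (in seconds): silences shorter
    than this are not counted as pauses.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Keep only the silences long enough to count as pauses.
    pauses = [p for p in pause_lengths if p >= min_pause]
    words_per_minute = word_count / duration_seconds * 60
    mean_pause = sum(pauses) / len(pauses) if pauses else 0.0
    return FluencyMeasures(words_per_minute, len(pauses), mean_pause)


# Hypothetical response: 120 words in 60 seconds, with three silences.
sample = fluency_measures(120, 60.0, [0.1, 0.5, 1.5])
print(sample)
```

Numbers like these only describe part of the construct, so they would normally sit alongside a rater's judgement rather than replace it.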
Test Tasks – How am I testing?
Once you’ve defined what you want to test, you need to decide how you’re going to test it. The focus here is on selecting the right test tasks for the ability (i.e. construct) you're interested in testing. All task types have advantages and limitations, so it’s important to use a range of tasks in order to minimise their individual limitations and optimise the measurement of the ability you’re interested in. The tasks in a test are like a menu of options that are available to choose from, and you must be sure to choose the right task or the right range of tasks for the ability you're trying to measure.
Test Reliability – How am I scoring?
Next, it’s important to consider how to score your test. A test needs to be reliable and to produce accurate scores. So, you’ll need to make sure that the scores from a test reflect a learner's actual ability. In deciding how to score a test, you’ll need to consider whether the answers to the questions are going to be scored as correct or incorrect (this might be the case for multiple-choice tasks, for example) or whether you might use a range of marks and give partial credit, as, for example, in reading or listening comprehension questions. In speaking and writing, you’ll also have to decide what criteria to use (for example, grammar, vocabulary, pronunciation, essay organisation in writing, and so on). You’ll also need to make sure that the teachers involved in speaking or writing assessment have received some training, so that they are marking to (more or less) the same standard.
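The difference between correct/incorrect scoring and partial credit can be sketched in a few lines. The example below is only an illustration under assumed conditions: the item labels (q1, q2 and so on), the answer key and the maximum marks are all invented, and the function names are hypothetical rather than taken from any marking system.

```python
def score_dichotomous(responses, answer_key):
    """Give one mark for each answer that matches the key, otherwise zero."""
    return sum(1 for item, key in answer_key.items() if responses.get(item) == key)


def score_partial_credit(awarded, max_marks):
    """Add up the marks a rater awarded, checking each against the item's maximum."""
    total = 0
    for item, maximum in max_marks.items():
        marks = awarded.get(item, 0)  # an unanswered item scores zero
        if not 0 <= marks <= maximum:
            raise ValueError(f"{item}: marks must be between 0 and {maximum}")
        total += marks
    return total


# Invented multiple-choice section: three items, one mark each.
answer_key = {"q1": "B", "q2": "D", "q3": "A"}
responses = {"q1": "B", "q2": "C", "q3": "A"}
mc_score = score_dichotomous(responses, answer_key)

# Invented comprehension section: items worth up to 2 and 3 marks.
max_marks = {"q4": 2, "q5": 3}
awarded = {"q4": 1, "q5": 3}
pc_score = score_partial_credit(awarded, max_marks)

print(mc_score, pc_score)
```

With partial credit, the marks awarded still depend on a rater's judgement, which is why agreed criteria and rater training matter.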
Test Impact – How will my test help learners?
The final – and in many ways most important – question to ask yourself is how the test is benefitting learners. Good tests engage learners in situations similar to those they might face outside the classroom (i.e. authentic tasks), provide useful feedback, or help their language development by focusing on all four skills (reading, listening, writing, speaking). For example, if a test has a speaking component, this will encourage speaking practice in the classroom. And if that speaking test includes both language production (e.g. describe a picture) and interaction (e.g. discuss a topic with another student), then preparing for the test encourages the use of a wide range of speaking activities in the classroom and enhances learning.