Sue Hackett greeted everybody and welcomed them to the third day, which would be on the receptive skills: Reading and Listening.
As a warmer, Sue read a poem by Brian Patten, one of the Liverpudlian Beat poets of the 1960s.
Minister for Exams
When I was a child I sat an exam.
This test was so simple
There was no way I could fail.
Q1. Describe the taste of the Moon.
It tastes like Creation I wrote,
it has the flavour of starlight.
Q2. What colour is Love?
Love is the colour of the water a man
lost in the desert finds, I wrote.
Q3. Why do snowflakes melt?
I wrote, they melt because they fall
on to the warm tongue of God.
There were other questions.
They were as simple.
I described the grief of Adam
when he was expelled from Eden.
I wrote down the exact weight of
an elephant’s dream
Yet today, many years later,
For my living I sweep the streets
or clean out the toilets of the fat
hotels.
Why? Because constantly I failed
my exams.
Why? Well, let me set a test.
Q1. How large is a child’s
imagination?
Q2. How shallow is the soul of the
Minister for exams?
Brian Patten
Barry started his session on testing reading by talking about a model. He described the model he and his colleagues developed after reviewing many existing models, singling out the most comprehensive one, by Urquhart and Weir (1998).
Barry and his colleagues began with word recognition, then looked at lexical items and syntactic aspects; after examining the text at the propositional level, they dealt with the comprehension stage.
On the Council of Europe website, you can find many example texts recommended at particular levels, which show the progression. Barry and his colleagues analysed these texts and concluded that none of the C2-level exam tests they were shown were actually at C2 level.
Asking scanning and skimming questions in a reading test is not possible unless you time the test. Therefore, most tests are testing some form of careful reading.
Asking ‘guess the meaning from context’ questions is not a way of testing reading comprehension.
Then Barry showed some examples from A1 and A2 levels and asked the participants to notice that the A1 sample works at word/sentence level in a short piece that can hardly be called a text, while the A2 sample works more at paragraph level, asking learners to order jumbled sentences.
When learners reach the intermediate levels, selective deletion cloze tests can be used.
According to research conducted in the 1970s, cloze-type questions work because they test people’s comprehension skills.
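For readers unfamiliar with the term, a selective deletion cloze differs from a fixed-ratio (every nth word) cloze in that the test writer chooses which words to delete. A minimal sketch of the mechanics, with an invented example sentence and target words (not taken from the session's materials):

```python
import re

def selective_cloze(text, targets):
    """Blank out the chosen words (the 'selective deletion'),
    returning the gapped text and the answer key."""
    key = []

    def blank(match):
        word = match.group(0)
        if word.lower() in targets:
            key.append(word)
            return "____({})".format(len(key))
        return word

    gapped = re.sub(r"[A-Za-z']+", blank, text)
    return gapped, key

gapped, key = selective_cloze(
    "Snowflakes melt because they fall on the warm ground.",
    {"melt", "warm"},
)
# gapped → "Snowflakes ____(1) because they fall on the ____(2) ground."
# key    → ["melt", "warm"]
```

In a real test, of course, the deleted items are chosen by hand to match a specific testing focus (e.g. content words for comprehension, function words for grammar).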
Barry showed a sample from B2 level where learners have to deal with the reading text as a whole, indicating how the testing progresses as the level rises.
When it comes to the advanced levels, Barry suggested that two texts could be used, and that they need to be linked together. These types of test are really difficult, and they should be.
Barry advised the testers in the room to develop their own qualitative criteria when producing tests, and to use vocabulary profilers, which are based on lexical frequency, to check the lexical complexity of texts: http://www.lextutor.ca/vp/eng/
He showed a sample FCE text, profiled it, and explained how the percentage of unknown words affects learners’ understanding of a text.
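For readers who have not used a vocabulary profiler, the core idea can be sketched as follows. The band word lists below are tiny placeholders of my own, not the real 1,000- and 2,000-word frequency lists (or the Academic Word List) that tools like VocabProfile use:

```python
import re
from collections import Counter

# Placeholder frequency bands; a real profiler loads full frequency lists.
BANDS = {
    "K1": {"the", "a", "of", "and", "to", "is", "in", "on"},
    "K2": {"flavour", "desert", "melt"},
}

def profile(text):
    """Return the share of word tokens falling into each frequency band."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        band = next((name for name, words in BANDS.items() if tok in words),
                    "Off-list")
        counts[band] += 1
    return {band: n / len(tokens) for band, n in counts.items()}

shares = profile("The flavour of the desert")
# 'the', 'of', 'the' fall in K1 (3/5); 'flavour', 'desert' in K2 (2/5)
```

The "Off-list" share is what matters for test writers: it approximates the proportion of words a learner at a given level is unlikely to know.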
Students should know approximately 98% (between 95% and 98%) of the words in a text in order to understand it; in other words, they can cope with a text while not knowing up to 2% of its words.
That is, in a 750-word text, roughly 15 words can be unknown to them. Some testers or teachers insist on 100% word coverage, but this is not necessary and may even be best avoided.
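The arithmetic behind these figures is simple enough to sketch (the function name is my own, not from the session):

```python
def max_unknown_words(text_length, coverage=0.98):
    """Number of word tokens that may be unknown to the reader,
    given a target coverage level (share of known running words)."""
    return round(text_length * (1 - coverage))

print(max_unknown_words(750))                 # 15 unknown words in a 750-word text
print(max_unknown_words(750, coverage=0.95))  # at the 95% lower bound
```

Running the profiler on a candidate text and comparing its off-list count against this budget is a quick sanity check before using the text in a test.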
Assessing Listening the CEFR Way
The second session was given by Zeynep Urkun. To begin, Zeynep asked the participants what the hardest skill to test was.
She said she had been involved in testing for many years and had found listening the hardest skill to test.
She underlined the fact that test makers’ perception of assessment affects the way they approach the issue.
Students’ perception, she stated, is not usually positive: they feel anxious, threatened and pressurised, and failure is unbearable for them.
How about the tester?
Most teachers are not part of the testing process; it usually happens after learning occurs. Testers live in their own world; testing is not exactly fun, and it is hard to master because of the lack of training.
When all of this is considered, Zeynep said, the CEFR is helpful. If it is interpreted well, it can help testers as well as teachers and curriculum designers. Testers can make life easier for themselves, but they have to be careful, as the CEFR is not meant to fit the needs of every institution. Basically, if you use the CEFR exactly as it is, it probably will not work in your particular context. It is NOT prescriptive. It has to be flexible.
CEFR descriptors can be useful when designing test specifications that fit your context.
After this brief introduction to testing and the CEFR, Zeynep mentioned the discourse roles in listening (Rost, 1990):
– the listener is an interactive partner
– the listener is an addressee
– the listener is an overhearer
Zeynep listed some activities for Listening in a foreign language that a language learner needs to be engaged in, and she indicated the role of the listener in different activities.
Zeynep recommended that testers keep the following questions in mind:
– To what range of inputs will the learner need / be equipped / be required to listen?
– For what purposes will the learner listen to the input?
– In what mode of listening will the learner engage?
Following that, Zeynep described the Sabanci University model for adapting CEFR descriptors in their assessment and shared their practices.
They wrote their own descriptors for stakeholders using the CEFR descriptors; they also wrote sets of ‘can do’ statements to increase learners’ awareness of their own progress.
They basically used the CEFR at Sabanci as a tool to inform the assessment criteria, to communicate efficiently between curriculum, testing and faculty tutors, to share ideas, and to cooperate on an international basis.
Naturally, they faced several challenges while writing their institutional descriptors, some of which were:
– writing the descriptors themselves; for example, some ‘can do’s they needed were not in the CEFR
– a student-friendly version was almost impossible to write
– learner training takes time before speaking exams (students are recorded and asked to self-assess)
– instructor training
– ownership issues
Zeynep said that at Sabanci they designed the tasks to be as true to real life as possible; authenticity was achieved that way. For example, they created listening tasks in which a university student knocks on a lecturer’s door to ask for clarification, or asks a question to negotiate meaning.
To exemplify what she meant, Zeynep had the audience listen to a sample text and showed the questions for it from their context. Zeynep also shared how they conduct the second listening activity, which is note-taking: they give students headings for note-taking, since this has proven helpful to learners so far.
Having shared the actual tests, Zeynep went through their test specifications for these tests.
Writing open ended questions
Barry O’Sullivan
The last session of the day was mainly on preparing questions in a short answer format (SAF):
– It measures a student’s ability to reproduce specific information.
– Students complete a statement (fill-in-the-blank, completion items) or answer a direct question using a single word or a brief phrase. (Answers that can be copied and pasted from the input text are not acceptable.)
– It is far more cognitively challenging than T/F or multiple-choice questions.
– It is more obvious what the candidate is actually doing when creating the response, compared with a set of multiple-choice questions.
Barry then listed the advantages and disadvantages of SAF:
Advantages
– there is a higher level of cognitive ability required
– response can be modeled
– guessing is less likely
– good for ‘detail’ responses
– relatively easy to construct
Disadvantages
– can be difficult to score (spelling, grammar, etc.)
– limited to knowledge and comprehension questions; anything longer than three words is no longer a short answer (if your question can be answered with more than three words, it is not SAF and requires a different kind of rubric, so double-check your question)
When a member of the audience asked about summary-type questions, Barry said that summary questions also require an incredible amount of planning. Barry thinks students can be asked to summarise in their L1 in monolingual classes: they may read and understand a text but may not be able to express themselves in the L2.
Unless the summary task is constructed very carefully around the main text, students are disadvantaged, and you need a scoring system that is tightly scripted; otherwise it won’t work. He added that if he gave a text to the people in the room, everyone would come up with a different summary.
Another aspect of SAF is that the item analysis becomes very difficult.
Tips
– Make sure the instructions are clear so that students know what to do
– Word items so that students clearly understand exactly what information they are expected to supply
– Ensure that the required response is brief and specific
– Write the question so that there is either one answer or a limited number of answers possible
– If more than one answer is acceptable, ensure that they are all reflected in the key
– Do not use the same language in the item prompt as is used in the input
– Never allow students to copy from the text exactly
– Where possible, do not be tempted to allow partial credit
Finally, Barry shared some dos and don’ts for SAF, matching and T/F questions.
A few tips to note down for T/F questions are as follows:
– Where possible, do not use T/F questions
– avoid using exact language from the input
– make all the questions the same length
– avoid negatives, and especially double negatives
– ensure a roughly equal number of true and false answers (though students tend to go for True when guessing randomly)
– consider asking students to correct the false statements