Sue Hackett greeted everybody and welcomed them to the third day, which would be on the receptive skills: Reading and Listening.
As a warmer, Sue read a poem by Brian Patten, one of the Liverpudlian Beat poets of the 1960s.
Minister for Exams
When I was a child I sat an exam.
This test was so simple
There was no way I could fail.
Q1. Describe the taste of the Moon.
It tastes like Creation I wrote,
it has the flavour of starlight.
Q2. What colour is Love?
Love is the colour of the water a man
lost in the desert finds, I wrote.
Q3. Why do snowflakes melt?
I wrote, they melt because they fall
on to the warm tongue of God.
There were other questions.
They were as simple.
I described the grief of Adam
when he was expelled from Eden.
I wrote down the exact weight of
an elephant’s dream
Yet today, many years later,
For my living I sweep the streets
or clean out the toilets of the fat
hotels.
Why? Because constantly I failed
my exams.
Why? Well, let me set a test.
Q1. How large is a child’s
imagination?
Q2. How shallow is the soul of the
Minister for exams?
Brian Patten
Barry started his session on testing reading by talking about a model. He described the model he and his colleagues developed after reviewing many existing models, singling out the most comprehensive one, by Urquhart and Weir (1998).
Barry and his colleagues began with word recognition, then looked at lexical items and syntactic aspects; after examining the text at the propositional level, they dealt with the comprehension stage.
On the Council of Europe website, you can find many example texts recommended at particular levels, which show the progression. Barry and his colleagues analysed these texts and concluded that none of the C2-level exam tests they were shown were actually at C2 level.
Asking scanning and skimming questions in a reading test is not possible unless you time the test. Therefore, most tests are testing some form of careful reading.
Asking ‘guess the meaning from context’ questions is not a way of testing reading comprehension.
Then Barry showed some examples from A1 and A2 levels and asked the participants to notice that the A1 sample works at word/sentence level in a short piece that can hardly be called a text, while the A2 sample works more at paragraph level, asking learners to order jumbled sentences.
When learners reach the intermediate levels, selective deletion cloze tests can be used.
According to research conducted in the 1970s, cloze-type questions work because they test people’s comprehension skills.
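For readers unfamiliar with the term, a selective deletion cloze differs from a fixed-ratio (every nth word) cloze in that the test writer chooses which words to delete. A minimal sketch of the mechanics, with an invented example sentence and target words (not taken from the session's materials):

```python
import re

def selective_cloze(text, targets):
    """Blank out the chosen words (the 'selective deletion'),
    returning the gapped text and the answer key."""
    key = []

    def blank(match):
        word = match.group(0)
        if word.lower() in targets:
            key.append(word)
            return "____({})".format(len(key))
        return word

    gapped = re.sub(r"[A-Za-z']+", blank, text)
    return gapped, key

gapped, key = selective_cloze(
    "Snowflakes melt because they fall on the warm ground.",
    {"melt", "warm"},
)
# gapped → "Snowflakes ____(1) because they fall on the ____(2) ground."
# key    → ["melt", "warm"]
```

In a real test, of course, the deleted items are chosen by hand to match a specific testing focus (e.g. content words for comprehension, function words for grammar).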
Barry showed a sample from B2 level where learners have to deal with the reading text as a whole, indicating how the testing progresses as the level rises.
When it comes to the advanced levels, Barry suggested that two texts could be used, and that they need to be linked together. These types of test are really difficult, and they should be.
Barry advised the testers in the room to develop their own qualitative criteria when producing tests, and to use vocabulary profilers, which are based on lexical frequency, to check the lexical complexity of texts: http://www.lextutor.ca/vp/eng/
He showed a sample FCE text, profiled it, and explained how the percentage of unknown words affects learners’ understanding of a text.
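For readers who have not used a vocabulary profiler, the core idea can be sketched as follows. The band word lists below are tiny placeholders of my own, not the real 1,000- and 2,000-word frequency lists (or the Academic Word List) that tools like VocabProfile use:

```python
import re
from collections import Counter

# Placeholder frequency bands; a real profiler loads full frequency lists.
BANDS = {
    "K1": {"the", "a", "of", "and", "to", "is", "in", "on"},
    "K2": {"flavour", "desert", "melt"},
}

def profile(text):
    """Return the share of word tokens falling into each frequency band."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        band = next((name for name, words in BANDS.items() if tok in words),
                    "Off-list")
        counts[band] += 1
    return {band: n / len(tokens) for band, n in counts.items()}

shares = profile("The flavour of the desert")
# 'the', 'of', 'the' fall in K1 (3/5); 'flavour', 'desert' in K2 (2/5)
```

The "Off-list" share is what matters for test writers: it approximates the proportion of words a learner at a given level is unlikely to know.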
Students should know approximately 98% (between 95% and 98%) of the words in a text in order to understand it; in other words, they can cope with a text while not knowing up to 2% of its words.
That is, in a 750-word text, roughly 15 words can be unknown to them. Some testers or teachers insist on 100% word coverage, but this is not necessary and may even be best avoided.
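The arithmetic behind these figures is simple enough to sketch (the function name is my own, not from the session):

```python
def max_unknown_words(text_length, coverage=0.98):
    """Number of word tokens that may be unknown to the reader,
    given a target coverage level (share of known running words)."""
    return round(text_length * (1 - coverage))

print(max_unknown_words(750))                 # 15 unknown words in a 750-word text
print(max_unknown_words(750, coverage=0.95))  # at the 95% lower bound
```

Running the profiler on a candidate text and comparing its off-list count against this budget is a quick sanity check before using the text in a test.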
Assessing Listening the CEFR Way
The second session was given by Zeynep Urkun. To begin, Zeynep asked the participants what the hardest skill to test was.
She said she had been involved in testing for many years and had found listening the hardest skill to test.
She underlined the fact that test makers’ perception of assessment affects the way they approach the issue.
Students’ perception, she stated, is not usually positive: they feel anxious, threatened and pressurised, and failure is unbearable for them.
How about the tester?
Most teachers are not part of the testing process; it usually happens after learning occurs. Testers live in their own world; testing is not exactly fun, and it is hard to master because of the lack of training.
When all of this is considered, Zeynep said, the CEFR is helpful. If it is interpreted well, it can help testers as well as teachers and curriculum designers. Testers can make life easier for themselves, but they have to be careful, as the CEFR is not meant to fit the needs of every institution. Basically, if you use the CEFR exactly as it is, it probably will not work in your particular context. It is NOT prescriptive. It has to be flexible.
CEFR descriptors can be useful when designing test specifications that fit your context.
After this brief introduction to testing and the CEFR, Zeynep mentioned the discourse roles in listening (Rost, 1990):
– the listener is an interactive partner
– the listener is an addressee
– the listener is an overhearer
Zeynep listed some activities for Listening in a foreign language that a language learner needs to be engaged in, and she indicated the role of the listener in different activities.
Zeynep recommended that testers keep the following questions in mind:
– To what range of inputs will the learner need / be equipped / be required to listen?
– For what purposes will the learner listen to the input?
– In what mode of listening will the learner engage?
Following that, Zeynep described the Sabanci University model for adapting CEFR descriptors in their assessment and shared their practices.
They wrote their own descriptors for stakeholders using the CEFR descriptors; they also wrote sets of ‘can do’ statements to increase learners’ awareness of their own progress.
They basically used the CEFR at Sabanci as a tool to inform the assessment criteria, to communicate efficiently between curriculum, testing and faculty tutors, to share ideas, and to cooperate on an international basis.
Naturally, they faced several challenges while writing their institutional descriptors, some of which were:
– writing the descriptors themselves; for example, some ‘can do’s they needed were not in the CEFR
– a student-friendly version was almost impossible to write
– learner training takes time before speaking exams (students are recorded and asked to self-assess)
– instructor training
– ownership issues
Zeynep said that at Sabanci they designed the tasks to be as true to real life as possible; authenticity was achieved that way. For example, they created listening tasks in which a university student knocks on a lecturer’s door to ask for clarification, or asks a question to negotiate meaning.
To exemplify what she meant, Zeynep had the audience listen to a sample text and showed the questions for it from their context. Zeynep also shared how they conduct the second listening activity, which is note-taking: they give students headings for note-taking, since this has proven helpful to learners so far.
Having shared the actual tests, Zeynep went through their test specifications for these tests.
Writing open ended questions
Barry O’Sullivan
The last session of the day was mainly on preparing questions in a short answer format (SAF):
– It measures a student’s ability to reproduce specific information.
– Students complete a statement (fill-in-the-blank, completion items) or answer a direct question using a single word or a brief phrase. (Answers that can be copied and pasted from the input text are not acceptable.)
– It is far more cognitively challenging than T/F or multiple-choice questions.
– It is more obvious what the candidate is actually doing when creating the response, compared with a set of multiple-choice questions.
Barry then listed the advantages and disadvantages of SAF:
Advantages
– there is a higher level of cognitive ability required
– response can be modeled
– guessing is less likely
– good for ‘detail’ responses
– relatively easy to construct
Disadvantages
– can be difficult to score (spelling, grammar, etc.)
– limited to knowledge and comprehension questions; anything longer than three words is no longer a short answer (if your question can be answered with more than three words, it is not SAF and requires a different kind of rubric, so double-check your question)
When a member of the audience asked about summary-type questions, Barry said that summary questions also require an incredible amount of planning. Barry thinks students can be asked to summarise in their L1 in monolingual classes: they may read and understand a text but may not be able to express themselves in the L2.
Unless the summary task is constructed very carefully around the main text, students are disadvantaged, and you need a scoring system that is tightly scripted; otherwise it won’t work. He added that if he gave a text to the people in the room, everyone would come up with a different summary.
Another aspect of SAF is that the item analysis becomes very difficult.
Tips
– Make sure the instructions are clear so that students know what to do
– Word items so that students clearly understand exactly what information they are expected to supply
– Ensure that the required response is brief and specific
– Write the question so that there is either one answer or a limited number of answers possible
– If more than one answer is acceptable, ensure that they are all reflected in the key
– Do not use the same language in the item prompt as is used in the input
– Never allow students to copy from the text exactly
– Where possible, do not be tempted to allow partial credit
Finally, Barry shared some dos and don’ts for SAF, matching and T/F questions.
A few tips to note down for T/F questions are as follows:
– Where possible, do not use T/F questions
– avoid using exact language from the input
– make all the questions the same length
– avoid negatives, and especially double negatives
– ensure a roughly equal number of true and false answers (though students tend to go for True when guessing randomly)
– consider asking students to correct the false statements