LTRC 2020 Roundtable: Language test translation and adaptation

Tuesday, June 9, 2020, 9:00 am - 4:00 pm

Maximum number of participants: 35

Aim and rationale

There is worldwide interest in test adaptation and translation, not only in the case of language tests, but also in other fields. Issues of fairness and justice arise immediately, as do concerns about test equivalence not only across languages, but also across cultures. The aim of this roundtable is to sample various designs that have been proposed as solutions in the context of language assessment in Africa, though it will also include contributions from further afield. The roundtable will accommodate as wide a range of levels of language ability as possible, as well as a variety of cultural issues in language assessment. Similarly, it will engage with institutional as well as national issues related to its theme.


The roundtable will take the form of seven papers of 20 minutes each, each followed by 10 minutes of discussion. It will conclude with a plenary discussion of an hour. The roundtable takes place on 9 June 2020, beginning at 09:00 and concluding at 16:00, with tea and lunch breaks in between.

Organizers: Albert Weideman (University of the Free State) & Tobie van Dyk (North-West University)

Presenters and participants

Participation in the roundtable is open not only to presenters, but to all who are interested in language test translation and adaptation, or in the particular topics featured. Participants must be registered for the conference itself, and there may be a small additional fee for attendance at the roundtable. We look forward to a productive discussion.





Paper 1: Does one size fit all? Some considerations for test translation

Tobie van Dyk (North-West University), Piet Murre (Driestar Educatief) & Herculene Kotzé (NWU)




Paper 2: A taxonomy for multiple-language test development

Ramsey Cardwell (The University of North Carolina at Greensboro)




Paper 3: Mapping it all out: Using ‘Textmapping’ to facilitate parallel test development for multilingual administrations

Sanet Steyn (University of Cape Town)






Paper 4: Test translation and adaptation: from an intuitive to a systematic approach

Bjorn Norrbom (National Center for Assessment, Saudi Arabia)




Paper 5: Test translation in foreign languages: a study of multilingual item difficulty

Katharina Karges (University of Fribourg)






Paper 6: Developing assessment tools for a multilingual context in the early grades

Nangamso Mtsatse (Funda Wande)




Paper 7: The development of an Academic literacy Diagnostic Assessment and Placement Test (ADAPT) in two languages

Anneke Butler (North-West University)




Concluding roundtable discussion: Issues in language assessment and adaptation




Paper 1: Does one size fit all? Some considerations for test translation

Tobie van Dyk (NWU)
Piet Murre (Driestar Educatief & NWU)
Herculene Kotzé (NWU)

This paper acknowledges a number of variables that influence study success, among others underpreparedness for university education, difficulties with the transition from school to higher education, financial challenges, emotional well-being, motivation, study skills, self-efficacy, and educational background. The paper explores the use of tests as indicators of what students struggle with in terms of their language and literacy abilities, since the results of such tests are often used to inform support practices and curriculum design. Indeed, as is asserted in the literature, one of the purposes of language testing is to provide information for making decisions about possible subsequent courses of action. This is reiterated in discussions of test use: data generated by tests should benefit stakeholders. The pertinent question is whether it is responsible merely to use and reuse tests in different, albeit comparable, contexts. Although such a pragmatic approach seems justifiable, can it still be considered fair and valid, objectively measuring what it aims to test? Does one size fit all? This paper considers the translation of an Afrikaans academic literacy test, designed for South African universities, into Dutch for use in the Netherlands. Theoretical frameworks for academic literacy and test translation will be presented, as well as empirical data derived from statistical analyses (alpha values, t-tests, p-values, DIF) and qualitative feedback based on interpretations made after consultation with subject experts. The paper will conclude with recommendations on ensuring fair and unbiased practices, particularly in the field of test translation.
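One of the statistical checks mentioned above, differential item functioning (DIF), can be illustrated with a minimal sketch. The Mantel-Haenszel procedure below is one common DIF screening method, not necessarily the one used in this study; all data, group labels, and the eight-person sample are invented for illustration.

```python
# Minimal Mantel-Haenszel DIF sketch for one translated item.
# Test takers are stratified on total test score; within each stratum we
# compare the item's odds of success for the two language groups.
import math
from collections import defaultdict

def mh_dif(scores, groups, item):
    """scores: total scores; groups: 'ref'/'focal'; item: 0/1 responses."""
    strata = defaultdict(lambda: {"A": 0, "B": 0, "C": 0, "D": 0, "n": 0})
    for s, g, x in zip(scores, groups, item):
        cell = strata[s]
        cell["n"] += 1
        if g == "ref":
            cell["A" if x else "B"] += 1   # reference group right/wrong
        else:
            cell["C" if x else "D"] += 1   # focal group right/wrong
    num = sum(c["A"] * c["D"] / c["n"] for c in strata.values() if c["n"])
    den = sum(c["B"] * c["C"] / c["n"] for c in strata.values() if c["n"])
    alpha = num / den                      # common odds ratio across strata
    return -2.35 * math.log(alpha)        # ETS delta-DIF scale

# Invented toy data: 8 test takers who wrote two language versions of an item.
scores = [3, 3, 3, 3, 5, 5, 5, 5]
groups = ["ref", "ref", "focal", "focal"] * 2
item   = [1, 0, 0, 0, 1, 0, 1, 0]
print(round(mh_dif(scores, groups, item), 2))  # → -2.58
```

A delta value far from zero (here the negative value favours the reference group) would flag the translated item for expert review of the kind described in the abstract.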

Paper 2: A taxonomy for multiple-language test development

Ramsey Cardwell
The University of North Carolina at Greensboro

In multilingual regions, there is a need to compare examinees from different language (and cultural) groups on cognitive attributes for admissions, employment, and certification/licensure. In such cases, the goal is not only adaptation of a test to reflect the same construct in another population, but truly interchangeable scores across languages. When not all examinees share a common language, the only option is to test in multiple languages, as differences in language proficiency could introduce construct-irrelevant variance on a single-language exam (AERA, APA, & NCME, 2014). But when the items of a test are translated, they potentially constitute completely different items.

Given that the assumption of common items across groups is potentially violated when test content is translated, one solution is to reduce DIF as much as possible in the test development process. In this paper, I propose a taxonomy of multiple-language test development and equating approaches, with examples from educational achievement and certification testing in the North American context. The development approaches, named sequential, parallel, reciprocal, and simultaneous, differ in the extent and manner in which the content of the language forms influences the others. The possible equating approaches are categorized as monolingual groups, bilingual groups, and matched groups, based on the types of samples involved. Each choice of development and equating approach has implications for item functioning and for the validity of test score interpretation and use. Thus, the use of a taxonomy can help guide the evolution and application of best practices in multiple-language test development.
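The bilingual-groups equating approach named above can be sketched in a few lines. Mean-sigma linear equating, shown here, is one simple textbook method for placing scores on two language forms on a common scale using a single group that took both forms; the scores below are invented and the method is an illustrative assumption, not drawn from the paper.

```python
# Hedged sketch: mean-sigma linear equating with one bilingual group.
import statistics as st

def mean_sigma_equate(x_scores, y_scores):
    """Return a function mapping form-X scores onto the form-Y scale."""
    mx, sx = st.mean(x_scores), st.pstdev(x_scores)
    my, sy = st.mean(y_scores), st.pstdev(y_scores)
    return lambda x: my + (sy / sx) * (x - mx)

# Invented scores of one bilingual group on both language forms.
form_x = [10, 12, 14, 16, 18]   # e.g. the English form
form_y = [20, 23, 26, 29, 32]   # e.g. the French form
to_y = mean_sigma_equate(form_x, form_y)
print(to_y(15))  # a form-X score of 15 expressed on the form-Y scale (≈ 27.5)
```

The other sample types in the taxonomy (monolingual groups, matched groups) would change which examinees supply `form_x` and `form_y`, not the arithmetic itself.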

Paper 3: Mapping it all out: Using ‘Textmapping’ to facilitate parallel test development for multilingual administrations

Sanet Steyn
University of Cape Town

One of the key considerations in using parallel measurement instruments is the comparability of the test forms. Linking, scaling and equating methods, for example, generally require a qualitative judgement on the content and structure of the instruments before any statistical operations are employed. Such judgements are difficult enough within monolingual testing contexts; comparing test forms across multiple languages poses additional challenges. Designers and developers of tests in such settings should therefore endeavour to address this consideration in the early stages of the development process. Articulating the test item specifications in great detail not only provides a guide for the development of individual test forms, but also a framework for comparing parallel forms. In addition, however, one would need a strategy for providing evidence of alignment between the specifications and the instruments, as well as between the instruments themselves. As part of a South African project looking at multilingual language assessment in the grade 12 exit-level examinations (the Umalusi Home Languages Project), a variation on Green's (2017) 'textmapping' procedure was used to evaluate a set of parallel instruments, initially designed in English and Afrikaans, in terms of how well the instruments are aligned with the test construct and how well the test item specifications were followed. This paper will report on the preliminary findings of this experiment and comment on how the procedure may be employed both in the development of new tests and retrospectively in the evaluation of existing instruments.

Keywords: parallel testing; test development; textmapping; construct representation; fairness.


Green, R. (2017). Designing listening tests: A practical approach. Palgrave Macmillan UK.

Paper 4: Test translation and adaptation: from an intuitive to a systematic approach

Bjorn Norrbom
National Center for Assessment, Saudi Arabia

The National Center for Assessment (NCA) faces a large number of translation and adaptation challenges. These include, but are not limited to, full and partial translations of NCA Arabic tests into English; full and partial translation of instruments and test constructs from English into Arabic, which then inform test development in Arabic; and possible back translation of those into English. In addition, it handles consultations on translated texts considered for inclusion in international tests, along with full translation of related questionnaires conducted by the Center. The demand for tests translated into English is ever increasing. To deal with these important and diverse issues, the Center is developing a comprehensive local translation and adaptation framework, drawing on practices from international tests such as PISA, PIRLS, and TIMSS, in order to move from current intuitive and individual practices to more systematic and valid approaches. The presentation describes both theoretical and practical aspects of developing the framework. It also considers key psychometric aspects such as equating test forms across languages, and touches on resource optimization to achieve maximum validity. The contribution is likely to stimulate discussion among participants on the possibilities and challenges facing national and regional organizations that need to provide equitable tests in several languages and across cultures in response to increasing linguistic diversity in society, and on how such organizations may optimize their resources in doing so. It presents possible solutions for defensible assessment design within an institution facing multilingual challenges.

Paper 5: Test translation in foreign languages: a study of multilingual item difficulty

Katharina Karges
University of Fribourg, Switzerland

While test adaptation and translation have long been discussed in the social sciences and, to some extent, practised in international assessments such as PISA, they are almost unheard of in second and foreign language assessment. It is unclear whether this is due to incompatible test constructs across languages or whether it is simply not usually done. In this research project, I have the rare opportunity to study three foreign language tests which are translations of each other. The items and test results were part of a large-scale assessment that studied the reading and listening competences of sixth-graders (N ≈ 20,000) in their respective first foreign language studied at school: English, French or German (A1/A2 of the CEFR). The assessment was conducted in Switzerland, in three regions with different languages of instruction (German, French and Italian). In my analyses, a comparison of the relationship between certain task or item characteristics and empirically measured item difficulties across the languages will add evidence to the general discussion surrounding item difficulty in large-scale assessments. The different languages of instruction included in the study will also allow more detailed analyses of the impact of individual linguistic repertoires in foreign language assessment. Finally, the results from these analyses will yield more information on the extent to which foreign language assessments a) measure comparable constructs and b) can thus be translated. This paper will discuss some preliminary results and, more importantly, open up an exchange of ideas and experiences with researchers and practitioners from different contexts.
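The cross-language difficulty comparison described above can be illustrated at its simplest with classical item difficulty (proportion correct). The sketch below is a hedged illustration, not the study's actual analysis; the response data, languages, and flagging threshold logic are all invented.

```python
# Comparing classical item difficulty (p-values) for one translated item
# administered in three test languages. All 0/1 response data are invented.
def difficulty(responses):
    """Proportion correct for one item (0/1 scored responses)."""
    return sum(responses) / len(responses)

responses = {
    "English": [1, 1, 0, 1, 0, 1, 1, 0],
    "French":  [1, 0, 0, 1, 0, 1, 0, 0],
    "German":  [1, 1, 1, 1, 0, 1, 1, 0],
}
p_values = {lang: difficulty(r) for lang, r in responses.items()}
# A large spread in p-values across languages flags the translated item
# for closer inspection (e.g. a formal DIF analysis).
spread = max(p_values.values()) - min(p_values.values())
print(p_values, round(spread, 3))
```

In practice one would also control for differences in the ability of the three populations (for instance via IRT calibration) before attributing such a spread to the translation itself.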

Paper 6: Developing assessment tools for a multilingual context in the early grades

Nangamso Mtsatse
Funda Wande

South Africa’s education system is widely considered to be underperforming by regional and international standards. The low literacy outcomes on international tests can be addressed by focusing on reading with comprehension in the early grades of school. Evidence from both local and international studies has shown a high correlation between oral reading fluency (ORF) and reading comprehension. Although there has been extensive research on ORF benchmarks and norms for English, there have been limited studies on the mother-tongue languages spoken in South Africa. This means that the most commonly used ORF benchmarks and norms (such as those of DIBELS) have not been developed to cater for reading in African languages. Most African languages have transparent orthographies and are agglutinating, as opposed to English, which has an opaque orthography and is not agglutinating. In a previous study, I have shown how omitting considerations relating to the orthography and morphology of languages when developing reading literacy programmes leads to biased results. The current study aims to develop assessment tools for African languages in the early grades, and makes use of a sample of 6000 learners across the Foundation Phase. The significance of presenting tentative assessment tools will lie in their application to refine reading development curricula in a way that addresses the specific demands of African languages. For teachers, the assessment tools being developed in this study can be used to monitor reading progress and to design more effective reading instruction.

Paper 7: The development of an Academic literacy Diagnostic Assessment and Placement Test (ADAPT) in two languages

Anneke Butler
North-West University

A rich body of research reflects common concern about the inadequate academic literacy (AL) levels of many students who gain access to higher education in South Africa. One of the most comprehensive responses to this underpreparedness of first-year students was ICELDA’s development of AL placement instruments (TALL/TAG) in two languages of instruction. These provide decision-makers with information on the general AL levels of students. These placement instruments have, however, not been developed as diagnostic AL tests per se. Any diagnostic information yielded by these tests remains coarse-grained, at a macro-level, as we can only broadly diagnose risk areas using the results of specific sections or aspects of the tests. There is a clear need for a diagnostic AL assessment instrument, since no such instrument is available for university education in SA. Some would claim that in the field of language testing we pay lip service to the concept of a diagnostic test, since we are still figuring out what such a test should look like. This paper reports on a research project that aims to analyse critically the current construct of the TALL/TAG in order to create a taxonomy of AL abilities that could lead to a layered and weighted framework for more nuanced assessment: the identification of specific AL abilities with which individual students struggle in the two languages of instruction. Since these tests were initially designed on the basis of the same construct and specifications in both English and Afrikaans, they present an additional layer of complexity when used as diagnostic measures. Such a diagnostic instrument as the one we envisage would enable one to provide more fine-grained (specific) and individual feedback to students. Furthermore, it could also provide essential information for the design of appropriate academic literacy interventions.

Keywords: academic literacy (AL); test construct; placement test; TALL (Test of Academic Literacy Levels); TAG (Toets van Akademiese Geletterdheidsvlakke); diagnostic tests; diagnostic assessment.


