Dr. Wayne Camara, Horace Mann Research Chair, ACT
June Edition
Test Security is a Matter of Fairness
Wayne J. Camara is the Horace Mann Research Chair at ACT, where he provides technical expertise and research leadership for internal projects, collaborations, and consultation, and represents ACT with key stakeholders in measurement and assessment work. In this article, Camara discusses the relationship between fairness and test security, and the importance of incorporating the three pillars of security (prevention, deterrence, and detection/reaction) when pursuing fairness in the use and interpretation of test scores.
The Standards for Educational and Psychological Testing (AERA, APA, and NCME, 2014) define test security as both the protection of test content from unauthorized release or use and the protection of the integrity of scores. Fairness is a foundational aspect of measurement and is directly related to the validity of test scores. While we often consider subgroup differences in discussions of fairness, it is just as critical to examine test security when the goal is to maintain basic fairness in the use and interpretation of test scores. Nearly 20 separate standards in the Standards reference test security or related issues. Procedural fairness is the most salient example of where cheating or other malfeasant behaviors can threaten both the fairness of assessments and the validity of score interpretations. Standard 3.4 states “test takers should receive comparable treatment during the test administration and scoring process” (p. 65). This standard reinforces the importance of standardization. Cheating or other security breaches threaten the comparability of treatment and may provide an unfair advantage to one or more test takers, especially in a highly competitive environment.
A basic tenet of assessment has been to treat all test takers comparably in terms of the major factors that affect performance, such as administrative conditions, access to tools (e.g., calculators, spell-check), timing, prior exposure to test questions or materials, scoring, and reporting. Standardization is required to accomplish this goal, and where differentiation is permitted, test developers and users are required to provide evidence of score comparability. This requirement is briefly illustrated as it relates to the three pillars of security: prevention, deterrence, and detection/reaction.
Copyright© 2018 Caveon, LLC.
PREVENTION AND FAIRNESS
Maintaining secure test content is essential to procedural fairness. If a test taker has prior knowledge of the test questions, they would have an unfair advantage. There are many examples of how individuals and groups can collaborate to acquire test content before an event. Prevention can greatly reduce many of these threats. Here are a few ways testing programs can protect their exams:

A testing program should have appropriate policies and procedures regarding administrative issues such as item exposure and form use. Test content is often exposed through inadequate item pools or reuse policies. Many educational assessments delivered in K-12 schools have extended online testing windows, which may run 4-8 weeks, because schools lack the infrastructure to assess students in shorter periods. However, if a single form or only a few forms are used in that testing window, students can easily gain access to test content through word of mouth. When test content can be exposed without any purposeful effort to cheat, you know you have a problem.
Similarly, for programs that rotate test forms, it is essential that rotation plans eliminate or greatly reduce the chance that a test taker will be given the same form when retesting. Alternative solutions, such as spiraling content across forms, can mitigate the exposure risk, if not completely eliminate it. These are commonsense measures that maintain procedural fairness.
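
The rotation requirement above can be illustrated with a small Python sketch. It is only one possible approach, not a method described in the Standards: it assumes a program keeps a record of which forms each test taker has already seen, and the function name, form labels, and test-taker IDs are all hypothetical.

```python
import random


def assign_form(available_forms, forms_already_seen, rng=None):
    """Pick a form the test taker has not seen before.

    Returns None when every available form has already been exposed to
    this test taker, so the program can escalate (for example, delay the
    retest or build a new form) rather than silently reuse content.
    """
    if rng is None:
        rng = random.Random()
    unseen = [form for form in available_forms if form not in forms_already_seen]
    if not unseen:
        return None
    return rng.choice(unseen)


# Hypothetical data: three forms and two returning test takers.
forms = ["Form A", "Form B", "Form C"]
history = {
    "tt-001": {"Form A"},                      # has seen one form
    "tt-002": {"Form A", "Form B", "Form C"},  # has seen every form
}

for test_taker_id, seen in history.items():
    print(test_taker_id, assign_form(forms, seen))
```

Returning None instead of falling back to a previously seen form is a deliberate choice in this sketch: it makes an exhausted form pool visible as an administrative problem instead of hiding it.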

In a highly competitive situation, such as credentialing, licensing, or admissions testing, you can anticipate that retesting will occur, and ensuring that test takers have very limited opportunities to see the same items or forms is a basic responsibility of proper administrative design and program maintenance. Of course, even when rigorous exposure controls are in place, test developers and users need to take additional steps to minimize direct attempts to capture content. Equally important are prohibiting test takers from bringing electronic devices (e.g., cell phones) into a secure testing room, protecting item content (e.g., ensuring online access is not allowed until testing begins, locking up paper forms in secure locations), and ensuring that anyone with access to test content has no conflict of interest and is properly trained in prevention.
DETERRENCE AND FAIRNESS
Deterring individuals from cheating or engaging in inappropriate behaviors is another requirement for ensuring the fairness of test scores across all test takers. It is essential that all individuals, including test takers, are aware of both proper and improper behaviors, as well as the penalties and consequences associated with cheating, stealing content, and other unwanted behaviors. This information can be conveyed during registration for an assessment and at the beginning of a testing session. Many test sites also display posters or information reiterating these principles. Obtaining a signed agreement from test takers can both remind individuals of these issues and provide consent in the event that penalties are later imposed for a violation of these procedures. Similar signed statements are often used with test administrators, proctors, scorers, item writers, and staff involved in the development, administration, maintenance, and scoring of assessments.

Deterrence is also critical to ensure fairness for all test takers. For example, many high-stakes online assessments limit retesting opportunities within a year and impose stiff penalties for violations of approved practices, which may further restrict test takers from retesting or reporting scores. Some may argue that such policies reduce available income or may be viewed as overly ‘heavy handed’ and create a punitive view of the assessment program and test sponsor. On the other hand, if deterrence is minimal, then test takers and other personnel involved in the assessment may have little incentive to avoid such behavior. Hacking, copying, cheating, theft of content, impersonation of test takers, or a host of other negative behaviors will then provide an unfair advantage to certain test takers. When deterrence is minimal, both the fairness and the validity of assessment results are called into question.
DETECTION/REACTION AND FAIRNESS
Technology is largely responsible for many of the innovations across assessment today, but it also ensures that there will be new and innovative threats to security that didn’t exist in the past. During the past decade, state assessments have made headlines as educators systematically engaged in activities to raise students’ test scores. One response has been to institute erasure analyses, which identify patterns that signal collusion but will not usually detect irregularities by an individual test taker. A second response has been to move tests online, which may eliminate one threat but certainly raises other risk factors that have to be addressed. Many programs implement psychometric analyses to examine response patterns, item latency, or score changes with retesting. Developing a protocol for detection is essential to maintaining the fairness of a testing program because it can help identify new problems that have not previously emerged or been recognized by a program. Whereas prevention and deterrence require a theory of action regarding the threat or unwanted behaviors, detection doesn’t necessarily require advance knowledge of how cheating or irregularities will occur. In this way, detection can identify new risks and lead to improvements in the prevention and deterrence procedures used.

The manner in which testing programs respond to detection is a basic issue of fairness. Here it is important to remind test users that a detected anomaly should rarely result in an accusation of cheating; rather, it should be flagged as an irregularity. The actions taken in response to such irregularities will, in turn, determine whether the consequences deter such behaviors going forward.
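
As a rough illustration of screening score changes with retesting, the Python sketch below flags unusually large score gains for follow-up review. It is a simplified stand-in for the psychometric analyses programs actually use; the function name, the z-score rule, the 2.5 threshold, and the sample data are all assumptions made for illustration. In keeping with the point above, the output is a list of irregularities to review, not a finding of cheating.

```python
from statistics import mean, stdev


def flag_retest_gains(score_pairs, z_threshold=2.5):
    """Return IDs whose retest gain is unusually large relative to the group.

    score_pairs maps a test-taker ID to (first_score, retest_score).
    z_threshold is an arbitrary illustrative cutoff, not a recommended value.
    """
    gains = {tid: retest - first for tid, (first, retest) in score_pairs.items()}
    values = list(gains.values())
    if len(values) < 2:
        return []  # too few retesters to estimate spread
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # every gain is identical, so nothing stands out
    return sorted(tid for tid, gain in gains.items() if (gain - mu) / sigma > z_threshold)


# Hypothetical retest records: most gains are small, one is very large.
records = {
    "tt-01": (20, 21), "tt-02": (18, 20), "tt-03": (25, 25),
    "tt-04": (22, 23), "tt-05": (19, 21), "tt-06": (24, 25),
    "tt-07": (21, 21), "tt-08": (17, 19), "tt-09": (23, 24),
    "tt-10": (16, 28),
}

for test_taker_id in flag_retest_gains(records):
    print("Irregularity flagged for review:", test_taker_id)
```

A simple z-score is sensitive to the very outliers it is trying to find and behaves poorly in small groups, so an operational program would likely rely on more robust statistics and on corroborating evidence before taking any action.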
TESTING STANDARDS
There are a number of standards that specifically address test security and have implications for test fairness. For example, Standard 3.4 describes fairness as comparable treatment during administration and scoring. Irregularities or breaches in test security pose threats to such comparability. Test developers examine mode and device differences and conduct comparability studies to ensure scores are equivalent. A similar step is required to ensure fairness and comparability when unintended differences may arise from irregularities such as prior exposure to items, knowledge of which items are non-operational in a test, or collusion between test takers. Standards 8.11 and 13.8 address both the fairness of scores and the integrity of scores. For example, the former standard notes that any form of cheating or behavior that reduces validity or fairness should be investigated.

Test users have a responsibility to consider security when selecting among assessments and when implementing a program. Standard 3.20 calls out security as one of many factors that should be considered in comparing available assessments; Standard 7.9 requires documentation explaining how to protect test materials and prevent inappropriate exchange of information. Security procedures may include the tracking and storage of materials, encryption of electronic transmissions of content and scores, non-disclosure agreements, monitoring of examinees, and evaluation of item statistics designed to detect a breach or exposure (p. 83).
Finally, Chapters 8 and 9 of the Standards address security issues from several perspectives. For example, Chapter 8 calls for test takers to be made aware that having others test for them, disclosing test content, or engaging in other forms of cheating is unacceptable, and to be informed of the potential sanctions that may result from irregularities. Chapter 9 states explicitly that “Test users have the responsibility to protect the security of tests, including that of previous editions” (9.21), and they also “have a legal and ethical responsibility to protect the security of test content and privacy of test takers…” (9.0).

Today, technology and innovation together are pushing assessment into uncharted purposes and new methods. However, the fundamental promise of assessments is that scores should be valid for the intended purpose, and test takers should be on a level playing field. Standardization of processes is a requirement, and where deviations are purposefully designed into an assessment, we must provide evidence of score comparability and a lack of construct-irrelevant variance. Cheating and irregularities are sources of construct-irrelevant variance and can only be dealt with through a comprehensive approach of prevention, deterrence, and detection. When such procedures and practices are lax, the basic fairness and score comparability across test takers can be challenged.