Testing & Evaluation
"Education must be increasingly concerned about the fullest development of all children and youth, and it will be the responsibility of the schools to seek learning conditions which will enable each individual to reach the highest level of learning possible."
Benjamin Bloom
Assessment Types
Key Terms | Definition | Use |
Assessment | Any systematic method of obtaining evidence from posing questions to draw inferences about the knowledge, skills, attitudes, and other characteristics of people for a specific purpose. | |
Exam | A summative assessment used to measure a student’s knowledge or skills for the purpose of documenting their current level of knowledge or skill. Seeks to measure achievement. | Summative |
Test | A diagnostic assessment to measure a student’s knowledge or skills for the purpose of informing the students or their teacher of their current level of knowledge or skill. Seeks to provide guidance for learning and instruction. | Diagnostic |
Quiz | A formative assessment used to measure a student’s knowledge or skills for the purpose of providing feedback to inform the student of his or her current level of knowledge or skill. Seeks to provide feedback / status of learning and instruction. | Formative |
Survey | A diagnostic or reaction assessment to measure the knowledge, skills, and/or attitudes of a group for the purpose of determining needs required to fulfill a defined purpose. Seeks to capture key information. | Need, Reaction, Diagnostic |
Uses of Assessment 
Types | Purpose & Assessment Strategy |
Diagnostic | An assessment that is primarily used to identify the needs and prior knowledge of participants for the purpose of directing them to the most appropriate learning experience. |
Needs | An assessment used to determine the knowledge, skills, abilities and attitudes of a group to assist with gap analysis and courseware development. Gap analysis determines the variance between what a student knows and what they are required to know. |
Formative | An assessment whose primary objective is to give the student practice in search and retrieval from memory and to provide prescriptive feedback. This assessment might relate to the item, topic, and/or course level. |
Summative | An assessment, usually quantitative, whose primary purpose is to give a definitive grade and/or make a judgment about the participant's achievement. If this judgment verifies that the participant has met an established standard indicative of special expertise, the judgment may confer “certification.” |
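The gap analysis mentioned under Needs assessment can be illustrated with a short sketch. This is only one possible reading: the 0-5 proficiency scale, the skill names, and the function name `gap_analysis` are assumptions made for illustration and are not part of the source material.

```python
def gap_analysis(required, current):
    """Return the shortfall per skill: required level minus current level.

    Skills that already meet or exceed the requirement get a gap of 0.
    """
    gaps = {}
    for skill, target in required.items():
        have = current.get(skill, 0)  # assumption: unassessed skills count as 0
        gaps[skill] = max(target - have, 0)
    return gaps


# Hypothetical proficiency levels on a 0-5 scale (illustrative data only).
required = {"spreadsheets": 4, "reporting": 3, "safety": 5}
current = {"spreadsheets": 2, "reporting": 3}

gaps = gap_analysis(required, current)
# Largest gaps first, as one way to prioritise courseware development.
priorities = sorted(gaps, key=gaps.get, reverse=True)
print(gaps)
print(priorities)
```

Sorting by gap size is a design choice; a real needs assessment would probably also weigh how critical each skill is to the job.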
Kirkpatrick's Levels of Assessment 
In 1959 Donald Kirkpatrick developed a model of training evaluation. It is currently the most widely used evaluation approach. It is simple, flexible, and complete, using a four-level approach. More recently, Jack Phillips added a fifth level emphasizing return on training investment (ROI):
Reaction/ Satisfaction:
Level 1 (Kirkpatrick)
Did they like it?
|
Evaluate Reaction: Are people happy with the training inputs?
An assessment used to determine the satisfaction level with a learning or assessment experience. These assessments are often known as Level 1 evaluations based on Dr. Donald Kirkpatrick's model. Course satisfaction evaluations (sometimes referred to as smile or happy sheets) are completed at the end of a learning or certification experience.
Questionnaires are the most common collection tool. Obtain reaction to content, methods, media, trainer style, facilities, & course materials.
|
Learning
Level 2 (Kirkpatrick)
Did they learn?
|
Evaluate Learning: What do people remember from the training session?
For this measure to be meaningful, it should represent a summative evaluation that validates that learners have met the criterion objectives of the training program. Learning is a change in knowledge, skills, and attitude. It can be measured by interviews, surveys, tests (pre-/post-), observations, and combinations of these.
|
Behavior / Transfer of Learning
Level 3 (Kirkpatrick)
Did they use it?
|
Evaluate Behavior: Do people use what they know at work?
Behavior is a measure of the transfer of knowledge, skills and/or attitude to the real world. It is a measure of achievement of performance objectives. Behavior evaluation is the extent of applied learning back on the job. Observe the behavior; survey key people who observe the performer; use checklists, questionnaires, interviews, or a combination of these.
The benefits of conducting Level Three evaluations are: (1) an indication of the ‘time to job impact’; (2) an indication of the types of job impacts occurring (cost, quality, time, productivity).
|
Work Results
Level 4 (Kirkpatrick)
Did it impact the bottom line?
|
Evaluate Results: What are the outcomes of applications on the job over a period of time?
Results evaluation is the effect on the business or environment by the trainee. Measures must already be in place via normal management systems and reporting. The challenge is to relate the influence of the trainee(s) to these base measures. Assess the "bottom line" or final results. The concept of "results" depends upon the goal of the training program. Proof is concrete, evidence is soft. Use a control group; allow time for results to be realized; measure before and after the program; consider cost versus benefits.
The types of business impact data that can be measured include the following. Sales training: Measure change in sales volume, customer retention, length of sales cycle, and profitability on each sale after the training program has been implemented. Technical training: Measure reduction in calls to the help desk; reduced time to complete reports, forms, or tasks; or improved use of software or systems. Quality training: Measure a reduction in number of defects. Safety training: Measure reduction in number or severity of accidents. Management training: Measure increase in engagement levels of direct reports.
The advantages to a Level Four evaluation are as follows:
(1) determine bottom line impact of training;
(2) tie business objectives and goals to training
|
ROI
Level 5 (Phillips)
What is the return on training investment?
|
Evaluate Financial Value: What is the impact of training on the bottom-line financials?
Jack Phillips' Five Level ROI Model Source: "Measuring the Return on Investment in Training and Development Certification Materials", Jack J. Phillips, Ph.D (2002).
The methodology is a comprehensive approach to training measurement. It begins with planning the project (referred to by Dr. Phillips as an Impact Study). It moves into the tools and techniques to collect data, analyze the data and finally report the data. The end result is not only a Level 5 ROI but also measurements on the Kirkpatrick 4 Levels as well. This yields a balanced scorecard approach to the measurement exercise.
|
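The measurements described for Levels 2, 4, and 5 can be sketched as simple arithmetic. The ROI and benefit-cost calculations follow the commonly stated form (net program benefits divided by program costs, and benefits divided by costs). The function names and all figures are hypothetical, and the control-group adjustment is a simplified assumption, not a procedure taken from the cited materials.

```python
def learning_gain(pre_score, post_score):
    """Level 2: raw change between a pre-test and a post-test score."""
    return post_score - pre_score


def isolated_impact(trained_before, trained_after, control_before, control_after):
    """Level 4: change in the trained group minus change in a control group."""
    return (trained_after - trained_before) - (control_after - control_before)


def benefit_cost_ratio(benefits, costs):
    """Program benefits divided by program costs."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return benefits / costs


def roi_percent(benefits, costs):
    """Level 5: net program benefits divided by program costs, as a percentage."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs * 100


# Hypothetical figures for illustration only.
print(learning_gain(60, 85))                # test score change per learner
print(isolated_impact(100, 130, 100, 110))  # sales units attributable to training
print(benefit_cost_ratio(150_000, 100_000))
print(roi_percent(150_000, 100_000))
```

The hard part in practice is not this arithmetic but converting Level 4 results into a monetary benefit figure and isolating the effect of training from other influences.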
Question Types: Quiz, Test, Interactive Practice 
- Multiple-choice: choosing one answer from given alternatives (each response can be given a different score)
- Multiple response: choosing a number of answers from a list
- True/False: applied to a statement
- Yes/No: applied to a choice
- Likert scale: rating scale to survey preferences & opinions
- Fill-in-the-blank: text match can be used for simple text input
- Matching: selection/association, match items from two related lists
- Assertion/Reason: select correct reason for a particular assertion
- Multiple Hotspots: move labels to an appropriate place on an image
- Drag & Drop: manipulate graphics by placing them into a particular sequence or arrangement
- Short answer: text match used for multi-line text box
- Essay: text match used for large multi-line text box
- Games: complex programmatic interactivity
- Simulation: interactive practice of a task
- Custom question types: using audio-video enhancements, problem-solving, visualization or modeling
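A few of the question types above lend themselves to automated scoring. The sketch below shows one possible approach for multiple-choice, multiple-response, and fill-in-the-blank items. The function names are hypothetical, and the multiple-response penalty rule (one point off per wrong pick, floored at zero) is an assumption; authoring tools differ on this.

```python
def score_multiple_choice(selected, scores):
    """Each alternative carries its own score; unknown choices score 0."""
    return scores.get(selected, 0)


def score_multiple_response(selected, correct):
    """Fraction of correct options chosen, minus wrong picks, floored at 0."""
    selected, correct = set(selected), set(correct)
    hits = len(selected & correct)
    misses = len(selected - correct)
    return max(hits - misses, 0) / len(correct)


def score_fill_in_blank(answer, accepted):
    """Simple text match that ignores case and surrounding whitespace."""
    normalized = answer.strip().lower()
    return 1 if normalized in {a.strip().lower() for a in accepted} else 0


# Illustrative items only.
print(score_multiple_choice("b", {"a": 0, "b": 2, "c": 1}))
print(score_multiple_response(["a", "c", "d"], ["a", "c"]))
print(score_fill_in_blank("  Bloom ", ["bloom"]))
```

Short-answer and essay items usually need human review or a rubric; plain text matching becomes unreliable as responses get longer.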
Benefits of Web-based Testing 
- Time and cost savings in test administration & automated scoring
- Greater score precision
- Maximize student engagement: minimize student frustration in taking tests that are too difficult or boredom in taking tests that are too easy
- Improved test security
- New kinds of questions by using multimedia, simulations, visualizations, and other resources to assess active learning
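One way a web-based test can avoid questions that are too hard or too easy is to adjust difficulty as the learner answers. The sketch below uses a simple up/down staircase rule. It is a deliberately minimal assumption for illustration: the 1-5 difficulty scale and the function names are hypothetical, and production adaptive tests typically use more sophisticated statistical models.

```python
def next_difficulty(current, was_correct, lowest=1, highest=5):
    """Step difficulty up after a correct answer, down after an incorrect one."""
    step = 1 if was_correct else -1
    return min(max(current + step, lowest), highest)


def run_session(responses, start=3):
    """Return the difficulty level presented for each question in a session."""
    level = start
    presented = []
    for was_correct in responses:
        presented.append(level)
        level = next_difficulty(level, was_correct)
    return presented


# Illustrative answer pattern: correct, correct, incorrect, correct.
print(run_session([True, True, False, True]))
```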
Bloom's Taxonomy: Cognitive Domain 
An introduction to Bloom's Domains of Learning was presented in an earlier section "Types of Instruction by Domains of Learning: Knowledge, Attitude & Skills". The following is organized to emphasize the importance of this model of the Cognitive Domain for measuring performance and creating effective tests.

The major categories in the cognitive domain of educational objectives, edited by Benjamin Bloom (1956), are the following.
Knowledge: Remember and Recall; Observe and recognize information; Demonstrate knowledge of dates, events, places; Demonstrate knowledge of major ideas; Demonstrate mastery of basic subject matter.
Comprehension: Translate and Paraphrase; Interpret and Extrapolate information; Use information, methods, concepts, theories in new situations; Solve problems; Use required skills or knowledge; Ability to grasp meaning; Estimating future trends (predicting consequences or effects)
Application: Transfer knowledge to new settings; Use of Generalizations in Specific instances; Ability to use learned material in new and concrete situations (rules, principles, laws, theories)
Analysis: Differentiate component parts; Determine Relationships; Break down ideas or structure; Identification of parts and organizational principles; Determine arrangement, logic and semantics
Synthesis: Weave components into a whole; Create New Relationships; Use old ideas to create new ones; Generalize from given facts; Relate knowledge from several areas; Predict; Draw conclusions; The ability to put parts together to form a new whole
Evaluation: Judge the value of information; Exercise of learned judgment; Evaluate based on given criteria; The ability to judge the value of material (e.g., statement, novel, poem, research report) for a given purpose
Quick Overview: Key Verbs
Knowledge | List, Name, Identify, Show, Define, Recognize, Recall, State, Visualize, Tell, Describe, Label, Collect, Examine, Quote, Record |
Comprehension | Summarize, Explain, Interpret, Describe, Compare, Paraphrase, Differentiate, Demonstrate, Convert, Defend, Distinguish, Give Examples, Predict, Recognize |
Application | Solve, Illustrate, Calculate, Use, Interpret, Relate, Manipulate, Apply, Modify, Complete, Show, Examine, Experiment, Discover, Classify |
Analysis | Analyze, Organize, Deduce, Contrast, Compare, Distinguish, Discuss, Plan, Devise, Criticize, Diagram, Inspect, Examine, Categorize, Appraise, Differentiate |
Synthesis | Design, Hypothesize, Support, Schematize, Write, Report, Justify, Categorize, Combine, Compile, Compose, Create, Devise, Explain, Organize, Summarize |
Evaluation | Evaluate, Choose, Estimate, Judge, Defend, Criticize, Compare, Rate, Value, Assess, Measure, Select, Score, Revise |
Bloom, B.S. (Ed.). (1956). Taxonomy of Educational Objectives: Classification of Educational Goals. Handbook 1: Cognitive Domain. New York: Longman, Green & Co.
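The key verbs listed above can serve as a rough lookup when checking which cognitive level a learning objective or test item targets. The sketch below uses only a small subset of those verbs, and the function name `candidate_levels` is hypothetical. Because several verbs appear under more than one level, it returns every matching level rather than a single answer.

```python
# Abbreviated subset of the key verbs, for illustration only.
KEY_VERBS = {
    "Knowledge": {"list", "name", "identify", "define", "recall", "state", "label"},
    "Comprehension": {"summarize", "explain", "interpret", "paraphrase", "predict"},
    "Application": {"solve", "calculate", "apply", "modify", "classify"},
    "Analysis": {"analyze", "contrast", "categorize", "differentiate", "diagram"},
    "Synthesis": {"design", "hypothesize", "compose", "create", "categorize"},
    "Evaluation": {"evaluate", "judge", "rate", "assess", "revise"},
}


def candidate_levels(objective):
    """Return the levels whose key verbs match the first word of an objective."""
    words = objective.strip().split()
    if not words:
        return []
    verb = words[0].lower().strip(".,:;")
    return [level for level, verbs in KEY_VERBS.items() if verb in verbs]


print(candidate_levels("Define formative assessment"))
print(candidate_levels("Categorize the assessment types by purpose"))
```

A verb match is only a hint. The level an item actually measures depends on what the learner must do to answer it, so the result should be reviewed by the test author.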
Test Development: Question Stems and Verbs 
I. KNOWLEDGE (drawing out factual answers, testing recall and recognition) |
who | how | describe | which one |
what | label | define | what is the best one |
why | match | choose | how much |
when | select | omit | what does it mean |
where | reproduce | define | who invented |
state | list (number) | what did the book say about |
Activities: observe, locate, listen, research, identify, discover, match, ask |
Products: films, models, tapes, books, people, records, diagrams, magazines, newspapers, radio, tv, internet, events |

II. COMPREHENSION (translating, interpreting and extrapolating) |
state in your own words | classify | which are facts |
what does this mean | judge | is this the same as |
give an example | infer | select the best definition |
condense this paragraph | show | what would happen if |
state in one word | indicate | explain what is happening |
what part doesn't fit | tell | explain what is meant |
what expectations are there | translate | read the graph, table |
what are they saying | select | this represents |
what seems to be | match | is it valid that |
what seems likely | explain | show in a graph, table |
which statements support | represent | demonstrate |
what restrictions would you add | | |
Activities: drawing conclusion, analogy, causal relationships, summary, outline |
Products: skit, speech, story, photograph, statement, cartoon, poster, diagram, graph, drama, tape recording, collage |

III. APPLICATION (to situations that are new, unfamiliar or elicit a new perspective from the learner) |
predict what would happen if | explain | solve |
choose the best statements that apply | apply | illustrate |
judge the effects | select | calculate |
what would result | relate | use |
tell how, when, where, why | manipulate | interpret |
tell what would happen if | modify | complete |
identify the results of | examine | show |
tell how much change there would be | experiment | discover |
relate | classify | demonstrate |
If you know A & B, how could you determine C |
What other possible reasons... |
What might they do with... |
Activities: list, construct, teach, paint, sketch, manipulate, interview, experiment, record, report, stimulate |
Products: diary, collection, puzzle, diagram, photographs, sculpture, diorama, scrapbook, map, stitchery, mobile, model, illustration |

IV. ANALYSIS (breaking down into parts, forms) |
distinguish | what is the function of |
identify | what's fact, opinion |
what assumptions | what statement is relevant |
what motive is there | related to, extraneous to, not applicable |
what conclusions | what does author believe, assume |
make a distinction | state the point of view of |
what is the premise | state the point of view of |
what ideas apply | what ideas justify conclusion |
what's the relationship between | the least essential statements are |
what's the main idea, theme | what inconsistencies, fallacies |
what literary form is used | what persuasive technique |
implicit in the statement is | break down |
diagram | differentiate |
infer | outline |
illustrate | relate |
point out | select |
Activities: classify, categorize, separate, compare, dissect, contrast, advertise, survey |
Products: graph, survey, questionnaire, commercial, report, diagram, chart |

V. SYNTHESIS (combining elements into a pattern not clearly there before) |
create | how would you test | make up |
tell | propose an alternative | compose |
make | solve the following | formulate |
do | plan | how else would you |
choose | design | state a rule |
develop | categorize | compile |
devise | explain | generate |
modify | organize | rearrange |
relate | revise | rewrite |
summarize | tell | write |
Activities: combine, invent, compose, hypothesize, predict, estimate, role-play, produce, infer, imagine, write |
Products: story, poem, play, pantomime, song, cartoon, news article, TV show, radio show, magazine, advertisement, new game, structure, invention, recipe, puppet show, machine |

VI. EVALUATION (according to some set of criteria, and state why) |
appraise | what fallacies, consistencies, inconsistencies appear |
judge | which is more important, moral, better, logical, valid, appropriate |
criticize | find the errors |
defend | compare |
conclude | describe |
discriminate | explain |
summarize | support |
interpret | justify |
LORI -- Learning Object Review Instrument 
The Learning Object Review Instrument (LORI) is used to evaluate the quality of E-Learning resources. LORI is an online form consisting of rubrics, rating scales, and comment fields. The current version of LORI available from eLera is version 1.5. Here is a summary of the criteria used for evaluation:
Dimensions of Quality | Evaluation Criteria |
Content Quality |
Veracity, accuracy, balanced presentation of ideas, and appropriate level of detail |
Learning Goal Alignment |
Alignment among learning goals, activities, assessments, and learner characteristics |
Feedback and Adaptation |
Adaptive content or feedback driven by differential learner input or learner modeling |
Motivation |
Ability to motivate, and stimulate the interest or curiosity of, an identified population of learners |
Presentation Design |
Design of visual and auditory information for enhanced learning and efficient mental processing |
Interaction Usability |
Ease of navigation, predictability of the user interface, and the quality of UI help features |
Accessibility |
Support for learners with disabilities |
Reusability |
Ability to port between different courses or learning contexts without modification |
Standards Compliance |
Adherence to international standards and specifications |
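Because LORI combines rating scales across the nine dimensions above, reviews from several evaluators can be summarised per dimension. The sketch below is one hedged way to do that. The 1-5 numeric scale, the use of `None` for a dimension a reviewer did not rate, and the function name `average_ratings` are assumptions for illustration and are not taken from the LORI documentation.

```python
from statistics import mean

DIMENSIONS = [
    "Content Quality",
    "Learning Goal Alignment",
    "Feedback and Adaptation",
    "Motivation",
    "Presentation Design",
    "Interaction Usability",
    "Accessibility",
    "Reusability",
    "Standards Compliance",
]


def average_ratings(reviews):
    """Average each dimension over reviewers, skipping unrated dimensions."""
    summary = {}
    for dimension in DIMENSIONS:
        scores = [r[dimension] for r in reviews if r.get(dimension) is not None]
        summary[dimension] = mean(scores) if scores else None
    return summary


# Two hypothetical reviews on an assumed 1-5 scale (illustrative data only).
reviews = [
    {"Content Quality": 4, "Accessibility": None, "Reusability": 3},
    {"Content Quality": 5, "Reusability": 4},
]
summary = average_ratings(reviews)
print(summary["Content Quality"], summary["Reusability"], summary["Accessibility"])
```

Reporting each dimension separately, rather than a single overall score, keeps weak areas such as accessibility visible even when content quality is rated highly.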
E-Learning Program Evaluation: Brandon-Hall 
The following evaluation criteria were developed by Brandon Hall and Arjun Reddy in 1988 (MM&ITNL, Brandon-Hall) for evaluation of E-Learning programs:
Evaluation Categories | Evaluation Criteria |
1. Content
|
Right amount and quality of information
Content meets defined objectives
Content organization
|
2. Instructional Design |
Course objectives clearly defined & tangible
Chunking of information
Job aids / performance support
|
3. Interactivity |
User engaged through the opportunity for their input
Amount of interactivity
Quality of interactivity |
4. Navigation |
Users can determine their own path through the program
An exit option always available
A course map is always accessible
Appropriate use of icons and/or clear labels so that users don't have to read excessively to determine program options
Clarity of directions
Adequacy of navigation controls
Branching to other topics doesn't create a sense of being 'lost' |
5. Motivation |
Program engages the user through novelty, humor, game elements, testing, adventure, unique content, surprise elements, etc.
The course follows a metaphor (golf, Doom, etc.)
Metaphor effectively used to help the learning process
Progress indicator |
6. Use of Media |
Program effectively and appropriately employs: video, audio (voice, music, sound effects), animation, graphics, special visual effects |
7. Evaluation |
Mastery of a section required before proceeding to later sections
Section quizzes used
Final exam
Quality of the testing modules
Testing is relevant to the real world performance objectives
Appropriateness and timeliness of feedback |
8. Aesthetics |
Program attractive and appealing to eye and ear
Design of the interface is simple, uncluttered, symmetry of objects such as headings, menu bar, etc. Screens do not look busy. |
9. Record Keeping |
Student performance data recorded, such as time to complete, question analyses, and final score
Data forwarded to course manager automatically |
10. Tone |
Program designed for intended audience
Program avoids being condescending, trite, etc. |
In 2002 the following criteria were used by Brandon Hall for evaluation of E-Learning programs:
Five Star Evaluation: M. David Merrill 
The following principles have been proposed by M. David Merrill as a way of getting at the most important criteria for evaluating E-Learning effectiveness. A more detailed discussion has been presented in the section "5 Star Instruction."
5 Star Evaluation | Merrill's Evaluation Criteria |
1. Problem Principle |
Learning is facilitated when learners are engaged in solving real-world problems
The learner is engaged at the problem or task level, not just the operation or action level
The learner solves a progression of problems. This promotes skill development and allows meaningful feedback and reinforcement to occur.
The learner is guided to an explicit comparison of problems.
|
2. Activation Principle |
Learning is facilitated when existing knowledge is activated as a foundation for new knowledge
The learner is directed to recall, relate, describe, or apply knowledge from relevant past experience that can be used as a foundation for the new knowledge.
The learner is provided with relevant experience that can be used as a foundation for the new knowledge.
|
3. Demonstration Principle |
Learning is facilitated when new knowledge is demonstrated to the learner
The learner is shown rather than merely told.
The demonstration is consistent with the learning goal.
The learner is shown multiple representations.
The learner is directed to explicitly compare alternative representations.
The use of media plays a relevant instructional role. |
4. Application Principle |
Learning is facilitated when new knowledge is applied by the learner (e.g., guided and unguided practice)
The learner is required to use his or her new knowledge to solve problems.
The problem-solving activity is consistent with the learning goal.
The learner is shown how to detect and correct errors.
The learner is guided in problem-solving by appropriate coaching that is gradually withdrawn. |
5. Integration Principle |
Learning is facilitated when new knowledge is integrated into the learner's world
The learner is required to demonstrate his or her new knowledge or skill.
The learner can reflect on, discuss, and defend his or her new knowledge or skill.
The learner can create, invent, and explore new and personal ways to use his or her new knowledge or skill |
Conscious Competence Learning Matrix 
The following model was developed to describe some key observations about learning and performance, and about the assessment of performance. The learner of a new skill begins at stage 1, 'unconscious incompetence', and ends at stage 4, 'unconscious competence', having passed through stage 2, 'conscious incompetence', and stage 3, 'conscious competence'.

Trainers often make the mistake of assuming that a learner is at stage 2, and focus their effort toward achieving stage 3, when in fact the learner is still at stage 1. This is a fundamental reason for training failure: the learner simply has not recognized the need for new learning. Until the learner has achieved awareness of a weakness or a training need ('conscious incompetence'), the learner has no interest, attention or motivation for the learning process. Learners only respond to training or teaching when they are aware of their own need for it, and the personal benefits they will derive from it.
1 — Unconscious Incompetence
|
The learner is not aware of the existence or relevance of the skill area.
The learner is not aware that they have a particular deficiency in the area concerned
The learner might deny the relevance or usefulness of the new skill
The learner must become conscious of their incompetence before development of the new skill or learning can begin
The aim of the trainer or teacher is to move the learner into the 'conscious competence' stage, by demonstrating the skill or ability and the benefit that it will bring to the learner's effectiveness
|
2— Conscious Incompetence |
The learner becomes aware of the existence and relevance of a skill; he becomes aware that he cannot perform the skill
The learner is therefore also aware of their deficiency in this area, ideally by attempting or trying to use the skill
The learner realises that by improving their skill or ability in this area their effectiveness will improve
Ideally the learner has a measure of the extent of their deficiency in the relevant skill, and a measure of what level of skill is required for their own competence
The learner ideally makes a commitment to learn and practice the new skill, and to move to the 'conscious competence' stage
In the eastern philosophy of Zen there is a term called "beginner's mind" — it reflects a state of radical openness to learning — deeply felt humility and motivation for continued learning
|
3 — Conscious Competence |
The learner achieves 'conscious competence' in a skill when he can perform it reliably at will
The learner will need to concentrate and think in order to perform the skill
The learner can perform the skill without assistance
The learner will not reliably perform the skill unless thinking about it - the skill is not yet 'second nature' or 'automatic'
The learner should be able to demonstrate the skill to another, but is unlikely to be able to teach it well to another person
The learner should ideally continue to practise the new skill, and if appropriate commit to becoming 'unconsciously competent' at the new skill
Practice is the single most effective way to move from stage 3 to 4
|
4 — Unconscious Competence |
The skill becomes so practised that it enters the unconscious parts of the brain - it becomes 'second nature'
Common examples are driving, sports activities, typing, manual dexterity tasks, listening and communicating
It becomes possible for certain skills to be performed while doing something else, for example, knitting while reading a book
The person might now be able to teach others in the skill concerned, although after some time of being unconsciously competent the person might actually have difficulty in explaining exactly how they do it — the skill has become largely instinctual
This gives rise to the need for long-standing unconscious competence to be checked periodically against new standards
|
This model would be incomplete if it did not acknowledge the idea of a 5th level which has sometimes been called "reflective competence" or even "enlightened competence". The model illustrates how skills become so integrated that they become "unconscious" and instinctual. But if we stopped there, it would give the impression that this "unconsciousness" is the highest stage of learning — such as the artist, dancer, craftsman who practices their skill at the highest level but cannot articulate it or teach it to others. Of course, the level beyond this is the coach, the expert teacher, the mentor, the master craftsman, who can demonstrate a skill in practice and also articulate the fine details of the skill, art or craft.
5 — Reflective Competence |
The skill becomes so practised that it enters the unconscious parts of the brain and becomes 'second nature' (minimum effort is required for maximum quality output); however, the practitioner can also articulate the fine details of the skill to others.
Fluent, highly efficient and accurate performance can occur instinctively and reflexively — no longer requiring conscious, deliberate and careful execution — and is also accompanied by the capability to understand and articulate the dynamic flow and scientific/systems explanation of one's performance.
The highest level of performance (and assessment of that performance) involves exhibiting fully integrated metacognitive skills with the primary skill.
This level may be awkwardly described as "conscious competence of unconscious competence".
|
References 
Bloom, B.S. (Ed.). (1956). Taxonomy of Educational Objectives: Classification of Educational Goals. Handbook 1: Cognitive Domain. New York: Longman, Green & Co.
Kirkpatrick, Donald (1994). Evaluating Training Programs: The Four Levels. San Francisco, CA: Berrett-Koehler Publishers, Inc. (2nd ed., 1998).
Merrill, M. David. Five Star Instruction (2003); and First Principles of Instruction (2002).
Nantel, Richard and staff of Brandon Hall Research (2005). Testing Building Tools: A Comparison of 24 Products for Authoring Online Tests, Assessments and Evaluations. Brandon Hall Research, Sunnyvale, CA.
http://www.brandonhall.com/public/publications/testbuildingtools/index.htm
