
Testing & Evaluation

"Education must be increasingly concerned about the fullest development of all children and youth, and it will be the responsibility of the schools to seek learning conditions which will enable each individual to reach the highest level of learning possible."
  Benjamin Bloom

Assessment Types

Key Terms
Any systematic method of obtaining evidence from posing questions to draw inferences about the knowledge, skills, attitudes, and other characteristics of people for a specific purpose.
A summative assessment used to measure a student’s knowledge or skills for the purpose of documenting their current level of knowledge or skill. Seeks to measure achievement. (Type: Summative)
A diagnostic assessment to measure a student’s knowledge or skills for the purpose of informing the student or their teacher of their current level of knowledge or skill. Seeks to provide guidance for learning and instruction. (Type: Diagnostic)
A formative assessment used to measure a student’s knowledge or skills for the purpose of providing feedback to inform the student of his or her current level of knowledge or skill. Seeks to provide feedback / status of learning and instruction. (Type: Formative)
A diagnostic or reaction assessment to measure the knowledge, skills, and/or attitudes of a group for the purpose of determining needs required to fulfill a defined purpose. Seeks to capture key information.


Uses of Assessment  

Purpose & Assessment Strategy
Diagnostic: An assessment that is primarily used to identify the needs and prior knowledge of participants for the purpose of directing them to the most appropriate learning experience.

Needs: An assessment used to determine the knowledge, skills, abilities and attitudes of a group to assist with gap analysis and courseware development. Gap analysis determines the variance between what a student knows and what they are required to know.

Formative: An assessment whose primary objective is to give a student practice in search and retrieval from memory and to provide prescriptive feedback. This assessment might relate to topic and/or course levels.

Summative: An assessment, usually quantitative, whose primary purpose is to give a definitive grade and/or make a judgment about the participant's achievement. If this judgment verifies that the participant has met an established standard indicative of special expertise, the judgment may confer “certification.”
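
The gap analysis mentioned under the Needs assessment can be sketched as a per-skill comparison of required and measured levels. The following is a minimal Python sketch only; the skill names, the 0-5 rating scale, and the function name `skill_gaps` are illustrative assumptions, not part of any published instrument.

```python
def skill_gaps(required, measured):
    """Return the shortfall per skill (required level minus measured level).

    Skills missing from `measured` are treated as level 0 (an assumption).
    Only skills with a positive shortfall are reported.
    """
    gaps = {}
    for skill, needed in required.items():
        shortfall = needed - measured.get(skill, 0)
        if shortfall > 0:
            gaps[skill] = shortfall
    return gaps


# Hypothetical ratings on an assumed 0-5 scale.
required = {"spreadsheets": 4, "report writing": 3, "data privacy": 5}
measured = {"spreadsheets": 2, "report writing": 3}

print(skill_gaps(required, measured))
# {'spreadsheets': 2, 'data privacy': 5}
```

Skills with no shortfall are left out, so the result reads directly as a list of topics the courseware would need to cover.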


Kirkpatrick's Levels of Assessment

In 1959 Donald Kirkpatrick developed a model of training evaluation. It is currently the most widely used evaluation approach. It is simple, flexible and complete, using a four-level approach. More recently, Jack Phillips added a fifth level emphasizing return on training investment (ROI):

Reaction / Satisfaction

Level 1 (Kirkpatrick)

Did they like it?

Evaluate Reaction:  Are people happy with the training inputs?

An assessment used to determine the satisfaction level with a learning or assessment experience. These assessments are often known as Level 1 evaluations based on Dr. Donald Kirkpatrick's model. Course satisfaction evaluations (sometimes referred to as smile or happy sheets) are completed at the end of a learning or certification experience.

Questionnaires are the most common collection tool. Obtain reaction to content, methods, media, trainer style, facilities, & course materials.


Level 2 (Kirkpatrick)

Did they learn?

Evaluate Learning:  What do people remember from the training session?

For this measure to be meaningful, it should represent a summative evaluation that validates that learners have met the criterion objectives of the training program. Learning is a change in knowledge, skills and attitude. It can be measured by interviews, surveys, tests (pre-/post-), observations, and combinations of these.
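
A pre-/post-test comparison can be sketched as a simple per-learner gain. This is a minimal illustration, assuming both tests are scored on the same scale and that scores are listed in the same learner order; the function name and the sample scores are hypothetical.

```python
from statistics import mean


def learning_gains(pre_scores, post_scores):
    """Return (per-learner gains, mean gain) for paired pre/post scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post score lists must be the same length")
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return gains, mean(gains)


# Hypothetical percentage scores for four learners.
pre = [55, 60, 70, 40]
post = [75, 80, 70, 65]

gains, average_gain = learning_gains(pre, post)
print(gains)         # [20, 20, 0, 25]
print(average_gain)  # 16.25
```

A raw gain like this says nothing about whether the criterion objectives were met; it only shows change between the two measurements.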

Behavior / Transfer of Learning

Level 3 (Kirkpatrick)

Did they use it?

Evaluate Behavior:  Do people use what they know at work?

Behavior is a measure of the transfer of knowledge, skills and/or attitude to the real world. It is a measure of achievement of performance objectives. Behavior evaluation measures the extent to which learning is applied back on the job. Observe the behavior; survey key people who observe the performer; use checklists, questionnaires, interviews or a combination of these.

The benefits of conducting Level Three evaluations are: (1) an indication of the 'time to job impact'; and (2) an indication of the types of job impacts occurring (cost, quality, time, productivity).

Work Results

Level 4 (Kirkpatrick)

Did it impact the bottom line?

Evaluate Results:  What are the outcomes of applications on the job over a period of time?

Results evaluation is the effect on the business or environment by the trainee. Measures must already be in place via normal management systems and reporting. The challenge is to relate the influence of the trainee(s) to these base measures. Assess the "bottom line" or final results. The concept of "results" depends upon the goal of the training program. Proof is concrete, evidence is soft. Use a control group; allow time for results to be realized; measure before and after the program; consider cost versus benefits.
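
The advice to use a control group and to measure before and after the program can be combined into one comparison: the change in the trained group minus the change in the control group. The sketch below is illustrative only; the function name, the single-number measures, and the sample figures are assumptions, and a real study would also need to consider group size and variability.

```python
def training_effect(trained_before, trained_after, control_before, control_after):
    """Change in the trained group beyond the change seen in the control group."""
    trained_change = trained_after - trained_before
    control_change = control_after - control_before
    return trained_change - control_change


# Hypothetical monthly sales (units) before and after a sales course.
effect = training_effect(
    trained_before=100, trained_after=130,
    control_before=100, control_after=110,
)
print(effect)  # 20
```

Subtracting the control group's change removes improvement that would have happened anyway, such as a seasonal rise in sales.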

The types of business impact data that can be measured include the following:
  • Sales training: Measure change in sales volume, customer retention, length of sales cycle, and profitability on each sale after the training program has been implemented.
  • Technical training: Measure reduction in calls to the help desk; reduced time to complete reports, forms, or tasks; or improved use of software or systems.
  • Quality training: Measure a reduction in number of defects.
  • Safety training: Measure reduction in number or severity of accidents.
  • Management training: Measure increase in engagement levels of direct reports.

The advantages to a Level Four evaluation are as follows:
(1) determine bottom line impact of training;
(2) tie business objectives and goals to training


Level 5 (Phillips)

What is the return on training investment?

Evaluate Financial Value:  What is the impact of training on the bottom-line financials?

Jack Phillips' Five Level ROI Model. Source: "Measuring the Return on Investment in Training and Development Certification Materials", Jack J. Phillips, Ph.D. (2002).

The methodology is a comprehensive approach to training measurement. It begins with planning the project (referred to by Dr. Phillips as an Impact Study). It then moves into the tools and techniques used to collect the data, analyze the data and finally report the data. The end result is not only a Level 5 ROI figure but also measurements on Kirkpatrick's four levels. This yields a balanced scorecard approach to the measurement exercise.
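
The Level 5 figure is usually expressed with two related calculations: a benefit-cost ratio (program benefits divided by program costs) and an ROI percentage (net program benefits divided by program costs, times 100). The Python sketch below is a minimal illustration; the function names and the monetary figures are hypothetical, and converting results to monetary benefits is assumed to have been done already.

```python
def benefit_cost_ratio(benefits, costs):
    """Program benefits divided by program costs."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return benefits / costs


def roi_percent(benefits, costs):
    """Net program benefits as a percentage of program costs."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs * 100


# Hypothetical program: 150,000 in monetary benefits, 100,000 in fully loaded costs.
print(benefit_cost_ratio(150_000, 100_000))  # 1.5
print(roi_percent(150_000, 100_000))         # 50.0
```

An ROI of 50% here means each unit of currency spent was recovered and returned a further half unit in net benefit.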

Question Types: Quiz, Test, Interactive Practice

  • Multiple-choice: choosing one answer from given alternatives (each response can be given a different score)
  • Multiple response: choosing a number of answers from a list
  • True/False: applied to a statement
  • Yes/No: applied to a choice
  • Likert scale: rating scale to survey preferences & opinions
  • Fill-in-the-blank: text match can be used for simple text input
  • Matching: selection/association, match items from two related lists
  • Assertion/Reason: select correct reason for a particular assertion
  • Multiple Hotspots: move labels to an appropriate place on an image
  • Drag & Drop: manipulate graphics by placing them into a particular sequence or arrangement
  • Short answer: text match used for multi-line text box
  • Essay: text match used for large multi-line text box
  • Games: complex programmatic interactivity
  • Simulation: interactive practice of a task
  • Custom question types: using audio-video enhancements, problem-solving, visualization or modeling   
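
Three of the question types above lend themselves to automated scoring. The sketch below is a minimal Python illustration, not the behavior of any particular testing product: the function names, the all-or-nothing rule for multiple response, and the case-insensitive text match are all assumptions.

```python
def score_multiple_choice(selected, option_scores):
    """Each option carries its own score; unknown options score 0."""
    return option_scores.get(selected, 0)


def score_multiple_response(selected, correct):
    """All-or-nothing: 1 only when the selected set equals the correct set."""
    return 1 if set(selected) == set(correct) else 0


def score_fill_in_blank(answer, accepted):
    """Text match that ignores case and extra whitespace."""
    normalized = " ".join(answer.split()).lower()
    return 1 if normalized in {a.lower() for a in accepted} else 0


# Hypothetical items.
print(score_multiple_choice("b", {"a": 0, "b": 2, "c": 1}))        # 2
print(score_multiple_response(["a", "c"], ["c", "a"]))             # 1
print(score_fill_in_blank("  Benjamin   Bloom ", ["benjamin bloom"]))  # 1
```

Partial credit for multiple-response items is a common alternative; the all-or-nothing rule is used here only to keep the sketch short.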

Benefits of Web-based Testing

  • Time and cost savings in test administration & automated scoring  
  • Greater score precision
  • Maximize student engagement: minimize student frustration in taking tests that are too difficult or boredom in taking tests that are too easy
  • Improved test security
  • New kinds of questions by using multimedia, simulations, visualizations, and other resources to assess active learning

Bloom's Taxonomy: Cognitive Domain   

An introduction to Bloom's Domains of Learning was presented in an earlier section "Types of Instruction by Domains of Learning: Knowledge, Attitude & Skills". The following is organized to emphasize the importance of this model of the Cognitive Domain for measuring performance and creating effective tests.

The major categories in the cognitive domain of educational objectives, edited by Benjamin Bloom (1956), are the following.

Knowledge: Remember and Recall; Observe and recognize information; Demonstrate knowledge of dates, events, places; Demonstrate knowledge of major ideas; Demonstrate mastery of basic subject matter.

Comprehension: Translate and Paraphrase; Interpret and Extrapolate information; Ability to grasp meaning; Estimating future trends (predicting consequences or effects)

Application: Transfer knowledge to new settings; Use information, methods, concepts, and theories in new situations; Solve problems; Use required skills or knowledge; Use of Generalizations in Specific instances; Ability to use learned material in new and concrete situations (rules, principles, laws, theories)

Analysis: Differentiate component parts; Determine Relationships; Break down ideas or structure; Identification of parts and organizational principles; Determine arrangement, logic and semantics

Synthesis: Weave components into a whole; Create New Relationships; Use old ideas to create new ones; Generalize from given facts; Relate knowledge from several areas; Predict; Draw conclusions; The ability to put parts together to form a new whole

Evaluation: Judge the value of information; Exercise of learned judgment; Evaluate based on given criteria; The ability to judge the value of material (e.g., statement, novel, poem, research report) for a given purpose

Quick Overview: Key Verbs
Knowledge: Define, Recognize, Recall
Comprehension: Paraphrase, Differentiate, Demonstrate, Give Examples
Application: Illustrate, Calculate, Relate, Manipulate, Apply
Analysis: Analyze, Organize, Deduce, Contrast, Compare, Distinguish, Discuss


Bloom, B.S. (Ed.). (1956). Taxonomy of Educational Objectives: Classification of Educational Goals. Handbook 1: Cognitive Domain. New York: Longman, Green & Co.

Test Development: Question Stems and Verbs   

I. KNOWLEDGE (drawing out factual answers, testing recall and recognition)
Question stems and verbs: who; how; describe; which one; what; label; define; what is the best one; why; match; choose; how much; when; select; omit; what does it mean; where; reproduce; who invented; state; list (number); what did the book say about
Activities: observe, locate, listen, research, identify, discover, match, ask
Products: films, models, tapes, books, people, records, diagrams, magazines, newspapers, radio, tv, internet, events

II. COMPREHENSION (translating, interpreting and extrapolating)
Question stems and verbs: state in your own words; classify; which are facts; what does this mean; judge; is this the same as; give an example; infer; select the best definition; condense this paragraph; show; what would happen if; state in one word; indicate; explain what is happening; what part doesn't fit; tell; explain what is meant; what expectations are there; translate; read the graph, table; what are they saying; select; this represents; what seems to be; match; is it valid that; what seems likely; explain; show in a graph, table; which statements support; represent; demonstrate; what restrictions would you add
Activities: drawing conclusion, analogy, causal relationships, summary, outline
Products: skit, speech, story, photograph, statement, cartoon, poster, diagram, graph, drama, tape recording, collage

III. APPLICATION (to situations that are new, unfamiliar or elicit a new perspective from the learner)
Question stems and verbs: predict what would happen if; explain; solve; choose the best statements that apply; apply; illustrate; judge the effects; select; calculate; what would result; relate; use; tell how, when, where, why; manipulate; interpret; tell what would happen if; modify; complete; identify the results of; examine; show; tell how much change there would be; experiment; discover; classify; demonstrate; if you know A & B, how could you determine C; what other possible reasons...; what might they do with...
Activities: list, construct, teach, paint, sketch, manipulate, interview, experiment, record, report, stimulate
Products: diary, collection, puzzle, diagram, photographs, sculpture, diorama, scrapbook, map, stitchery, mobile, model, illustration

IV. ANALYSIS (breaking down into parts, forms)
Question stems and verbs: distinguish; what is the function of; identify; what's fact, opinion; what assumptions; what statement is relevant; what motive is there; related to, extraneous to, not applicable; what conclusions; what does author believe, assume; make a distinction; state the point of view of; what is the premise; what ideas apply; what ideas justify conclusion; what's the relationship between; the least essential statements are; what's the main idea, theme; what inconsistencies, fallacies; what literary form is used; what persuasive technique; implicit in the statement is; break down; diagram; differentiate; infer; outline; illustrate; relate; point out; select
Activities: classify, categorize, separate, compare, dissect, contrast, advertise, survey
Products: graph, survey, questionnaire, commercial, report, diagram, chart

V. SYNTHESIS (combining elements into a pattern not clearly there before)
Question stems and verbs: create; how would you test; make up; tell; propose an alternative; compose; make; solve the following; formulate; do; plan; how else would you; choose; design; state a rule; develop; categorize; compile; devise; explain; generate; modify; organize; rearrange; relate; revise; rewrite; summarize; write
Activities: combine, invent, compose, hypothesize, predict, estimate, role-play, produce, infer, imagine, write
Products: story, poem, play, pantomime, song, cartoon, news article, TV show, radio show, magazine, advertisement, new game, structure, invention, recipe, puppet show, machine

VI. EVALUATION (according to some set of criteria, and state why)
Question stems and verbs: appraise; what fallacies, consistencies, inconsistencies appear; judge; which is more important, moral, better, logical, valid, appropriate; criticize; find the errors; defend; compare; conclude; describe; discriminate; explain; summarize; support; interpret; justify
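
When reviewing a bank of test questions, the leading verb of each stem gives a rough first guess at its cognitive level. The Python sketch below is only an illustration: the lookup table holds one verb per level taken from the lists above, the function name is hypothetical, and because many verbs (for example "explain" or "select") appear under several levels, a lookup like this cannot replace a reviewer's judgment.

```python
# One sample verb per level, taken from the stem lists above (not exhaustive).
VERB_LEVELS = {
    "label": "Knowledge",
    "translate": "Comprehension",
    "calculate": "Application",
    "distinguish": "Analysis",
    "design": "Synthesis",
    "appraise": "Evaluation",
}


def guess_bloom_level(question):
    """Return the level suggested by the first word, or None if unknown."""
    words = question.lower().split()
    if not words:
        return None
    first_word = words[0].strip(".,:;?!")
    return VERB_LEVELS.get(first_word)


print(guess_bloom_level("Calculate the mean of these scores."))  # Application
print(guess_bloom_level("Who invented the telephone?"))          # None
```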

LORI -- Learning Object Review Instrument

The Learning Object Review Instrument (LORI) is used to evaluate the quality of E-Learning resources. LORI is an online form consisting of rubrics, rating scales, and comment fields. The current version of LORI available from eLera is version 1.5. Here is a summary of the criteria used for evaluation:

Dimensions of
Content Quality: Veracity, accuracy, balanced presentation of ideas, and appropriate level of detail
Learning Goal Alignment: Alignment among learning goals, activities, assessments, and learner characteristics
Feedback and Adaptation: Adaptive content or feedback driven by differential learner input or learner modeling
Motivation: Ability to motivate, and stimulate the interest or curiosity of, an identified population of learners
Presentation Design: Design of visual and auditory information for enhanced learning and efficient mental processing
Interaction Usability: Ease of navigation, predictability of the user interface, and the quality of UI help features
Accessibility: Support for learners with disabilities
Reusability: Ability to port between different courses or learning contexts without modification
Standards Compliance: Adherence to international standards and specifications
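
Ratings across these dimensions can be summarized with a simple average. The sketch below is an illustration only: it assumes each dimension is rated on a 1 to 5 scale and that a reviewer may mark a dimension as not applicable (represented here by `None`); the function name and the sample ratings are hypothetical.

```python
def mean_rating(ratings):
    """Average the ratings, skipping dimensions marked not applicable (None)."""
    scored = [value for value in ratings.values() if value is not None]
    if not scored:
        return None
    return sum(scored) / len(scored)


# Hypothetical review of one learning object on an assumed 1-5 scale.
review = {
    "Content Quality": 4,
    "Learning Goal Alignment": 5,
    "Accessibility": None,  # not applicable for this reviewer
    "Reusability": 3,
}

print(mean_rating(review))  # 4.0
```

Reporting the individual dimension ratings alongside the average keeps a weak dimension from being hidden by strong ones.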


E-Learning Program Evaluation: Brandon-Hall

The following evaluation criteria were developed by Brandon Hall and Arjun Reddy in 1988 (MM&ITNL, Brandon-Hall) for evaluation of E-Learning programs:

Evaluation Categories

1. Content
  Right amount and quality of information
  Content meets defined objectives
  Content organization
2. Instructional
  Course objectives clearly defined & tangible
  Chunking of information
  Job aids / performance support
3. Interactivity
  User engaged through the opportunity for their input
  Amount of interactivity
  Quality of interactivity
4. Navigation
  Users can determine their own path through the program
  An exit option always available
  A course map is always accessible
  Appropriate use of icons and/or clear labels so that users don't have to read excessively to determine program options
  Clarity of directions
  Adequacy of navigation controls
  Branching to other topics doesn't create a sense of being 'lost'
5. Motivation
  Program engages the user through novelty, humor, game elements, testing, adventure, unique content, surprise elements, etc.
  The course follows a metaphor (golf, Doom, etc.)
  Metaphor effectively used to help the learning process
  Progress indicator
6. Use of Media
  Program effectively and appropriately employs: video, audio (voice, music, sound effects), animation, graphics, special visual effects
7. Evaluation
  Mastery of a section required before proceeding to later sections
  Section quizzes used
  Final exam
  Quality of the testing modules
  Testing is relevant to the real world performance objectives
  Appropriateness and timeliness of feedback
8. Aesthetics
  Program attractive and appealing to eye and ear
  Interface design is simple and uncluttered, with symmetry of objects such as headings, menu bar, etc. Screens do not look busy.
9. Record
  Student performance data recorded, such as time to complete, question analyses, and final score
  Data forwarded to course manager automatically
10. Tone
  Program designed for intended audience
  Program avoids being condescending, trite, etc.

In 2002 the following criteria were used by Brandon Hall for evaluation of E-Learning programs:

Evaluation Categories
1. Business problem & results
  Matching the e-learning solution to the business/performance problem at hand
  Achieving intended results
2. Instructional design & integrity
  Structuring, relevance and quality of content
  Focus on real-world competencies
  Selecting the right strategies for the content and context
3. Evaluation & assessment
  Applying imagination and rigor to the design and implementation of evaluation or assessment
4. Interactivity
  Using creativity and expert design practices to achieve instructionally powerful interactions of all kinds
5. Usability & Interface
  Creating an effective, easy-to-use interface
6. Motivation & aesthetics
  Motivating learners to follow and successfully complete the training
  Hitting the right tone and aesthetic notes
7. Media & technology
  Smart selection and application of media, development tools and delivery technologies
8. Money & time
  Achieving excellence under constrained budgets and time lines

Five Star Evaluation: M. David Merrill    

The following principles have been proposed by M. David Merrill as a way of getting at the most important criteria for evaluating E-Learning effectiveness. A more detailed discussion has been presented in the section "5 Star Instruction."

5 Star Evaluation
Merrill's Evaluation Criteria
1. Problem Principle
Learning is facilitated when learners are engaged in solving real-world problems
  The learner is engaged at the problem or task level, not just the operation or action level.
  The learner solves a progression of problems. This promotes skill development and allows meaningful feedback and reinforcement to occur.
  The learner is guided to an explicit comparison of problems.

2. Activation Principle
Learning is facilitated when existing knowledge is activated as a foundation for new knowledge
  The learner is directed to recall, relate, describe, or apply knowledge from relevant past experience that can be used as a foundation for the new knowledge.
  The learner is provided with relevant experience that can be used as a foundation for the new knowledge.

3. Demonstration Principle
Learning is facilitated when new knowledge is demonstrated to the learner
  The learner is shown rather than merely told.
  The demonstration is consistent with the learning goal.
  The learner is shown multiple representations.
  The learner is directed to explicitly compare alternative representations.
  The use of media plays a relevant instructional role.

4. Application Principle
Learning is facilitated when new knowledge is applied by the learner (e.g., guided and unguided practice)
  The learner is required to use his or her new knowledge to solve problems.
  The problem-solving activity is consistent with the learning goal.
  The learner is shown how to detect and correct errors.
  The learner is guided in problem-solving by appropriate coaching that is gradually withdrawn.

5. Integration Principle
Learning is facilitated when new knowledge is integrated into the learner's world
  The learner is required to demonstrate his or her new knowledge or skill.
  The learner can reflect on, discuss, and defend his or her new knowledge or skill.
  The learner can create, invent, and explore new and personal ways to use his or her new knowledge or skill.


Conscious Competence Learning Matrix  

The following model was developed to describe some key observations about learning and performance, and the assessment of performance. The learner of a new skill begins at stage 1, 'unconscious incompetence', and ends at stage 4, 'unconscious competence', having passed through stage 2, 'conscious incompetence', and stage 3, 'conscious competence'.

Trainers often make the mistake of assuming that a learner is at stage 2, and focus their effort toward achieving stage 3, when in fact the learner is still at stage 1. This is a fundamental reason for training failure: the learner simply has not recognized the need for new learning. Until the learner has achieved awareness of a weakness or a training need ('conscious incompetence'), the learner has no interest, attention or motivation for the learning process. Learners only respond to training or teaching when they are aware of their own need for it, and the personal benefits they will derive from it.
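
The four stages form a two-by-two matrix of competence and consciousness, which can be sketched as a small lookup. This is an illustrative encoding only; the function name and the two boolean arguments are assumptions, and the sketch does not cover any stage beyond the four named above.

```python
STAGE_NAMES = {
    1: "unconscious incompetence",
    2: "conscious incompetence",
    3: "conscious competence",
    4: "unconscious competence",
}


def competence_stage(competent, conscious):
    """Map the two dimensions of the matrix to a stage number (1-4).

    competent: the learner can perform the skill reliably.
    conscious: the learner is aware of the deficiency (when not competent),
               or must concentrate to perform the skill (when competent).
    """
    if not competent:
        return 2 if conscious else 1
    return 3 if conscious else 4


# A learner who cannot yet perform the skill and does not see the need for it.
stage = competence_stage(competent=False, conscious=False)
print(stage, STAGE_NAMES[stage])  # 1 unconscious incompetence
```

Placing a learner on the matrix before training begins guards against the mistake described above, where stage 1 learners are treated as if they were at stage 2.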

1 — Unconscious Incompetence

The learner is not aware of the existence or relevance of the skill area.

 The learner is not aware that they have a particular deficiency in the area concerned
  The learner might deny the relevance or usefulness of the new skill
  The learner must become conscious of their incompetence before development of the new skill or learning can begin
 The aim of the trainer or teacher is to move the learner into the 'conscious competence' stage, by demonstrating the skill or ability and the benefit that it will bring to the learner's effectiveness

2 — Conscious Incompetence

The learner becomes aware of the existence and relevance of a skill; he becomes aware that he cannot perform the skill

 The learner is therefore also aware of their deficiency in this area, ideally by attempting or trying to use the skill
 The learner realises that by improving their skill or ability in this area their effectiveness will improve
 Ideally the learner has a measure of the extent of their deficiency in the relevant skill, and a measure of what level of skill is required for their own competence
 The learner ideally makes a commitment to learn and practice the new skill, and to move to the 'conscious competence' stage
 In the eastern philosophy of Zen there is a term called "beginner's mind" — it reflects a state of radical openness to learning — deeply felt humility and motivation for continued learning

3 — Conscious Competence

The learner achieves 'conscious competence' in a skill when he can perform it reliably at will

 The learner will need to concentrate and think in order to perform the skill
 The learner can perform the skill without assistance
 The learner will not reliably perform the skill unless thinking about it - the skill is not yet 'second nature' or 'automatic'
 The learner should be able to demonstrate the skill to another, but is unlikely to be able to teach it well to another person
 The learner should ideally continue to practise the new skill, and if appropriate commit to becoming 'unconsciously competent' at the new skill

Practice is the single most effective way to move from stage 3 to 4

4 — Unconscious Competence

The skill becomes so practised that it enters the unconscious parts of the brain - it becomes 'second nature'

 Common examples are driving, sports activities, typing, manual dexterity tasks, listening and communicating
  It becomes possible for certain skills to be performed while doing something else, for example, knitting while reading a book
  The person might now be able to teach others in the skill concerned, although after some time of being unconsciously competent the person might actually have difficulty in explaining exactly how they do it — the skill has become largely instinctual
 This gives rise to the need for long-standing unconscious competence to be checked periodically against new standards

This model would be incomplete if it did not acknowledge the idea of a 5th level which has sometimes been called "reflective competence" or even "enlightened competence". The model illustrates how skills become so integrated that they become "unconscious" and instinctual. But if we stopped there, it would give the impression that this "unconsciousness" is the highest stage of learning — such as the artist, dancer, craftsman who practices their skill at the highest level but cannot articulate it or teach it to others. Of course, the level beyond this is the coach, the expert teacher, the mentor, the master craftsman, who can demonstrate a skill in practice and also articulate the fine details of the skill, art or craft.

5 — Reflective Competence

The skill becomes so practised that it enters the unconscious parts of the brain and becomes 'second nature' (minimum effort is required for maximum quality output); however, the practitioner can also articulate the fine details of the skill to others.

 Fluent, highly efficient and accurate performance can occur instinctively and reflexively — no longer requiring conscious, deliberate and careful execution — and is also accompanied by the capability to understand and articulate the dynamic flow and scientific/systems explanation of one's performance.
  The highest level of performance (and assessment of that performance) involves exhibiting fully integrated metacognitive skills with the primary skill.
  This level may be awkwardly described as "conscious competence of unconscious competence".



Bloom, B.S. (Ed.). (1956). Taxonomy of Educational Objectives: Classification of Educational Goals. Handbook 1: Cognitive Domain. New York: Longman, Green & Co.

Kirkpatrick, Donald (1994). Evaluating Training Programs: The Four Levels. San Francisco, CA: Berrett-Koehler Publishers, Inc. 2nd ed., 1998.

Merrill, M. David. Five Star Instruction (2003); and First Principles of Instruction (2002).

Nantel, Richard and staff of Brandon Hall Research (2005). Testing Building Tools: A Comparison of 24 Products for Authoring Online Tests, Assessments and Evaluations. Brandon Hall Research, Sunnyvale, CA.


  ©2003 Cognitive Design Solutions, Inc.