
Testing & Evaluation

"Education must be increasingly concerned about the fullest development of all children and youth, and it will be the responsibility of the schools to seek learning conditions which will enable each individual to reach the highest level of learning possible."
Benjamin Bloom

Assessment Types

Key Terms	Definition	Use
Assessment	Any systematic method of obtaining evidence from posing questions to draw inferences about the knowledge, skills, attitudes, and other characteristics of people for a specific purpose.
Exam	A summative assessment used to measure a student’s knowledge or skills for the purpose of documenting their current level of knowledge or skill. Seeks to measure achievement.	Summative
Test	A diagnostic assessment to measure a student’s knowledge or skills for the purpose of informing the students or their teacher of their current level of knowledge or skill. Seeks to provide guidance for learning and instruction.	Diagnostic
Quiz	A formative assessment used to measure a student’s knowledge or skills for the purpose of providing feedback to inform the student of his or her current level of knowledge or skill. Seeks to provide feedback / status of learning and instruction.	Formative
Survey	A diagnostic or reaction assessment to measure the knowledge, skills, and/or attitudes of a group for the purpose of determining needs required to fulfill a defined purpose. Seeks to capture key information.	Need, Reaction, Diagnostic

Uses of Assessment

Types	Purpose & Assessment Strategy
Diagnostic	An assessment that is primarily used to identify the needs and prior knowledge of participants for the purpose of directing them to the most appropriate learning experience.
Needs	An assessment used to determine the knowledge, skills, abilities and attitudes of a group to assist with gap analysis and courseware development. Gap analysis determines the variance between what a student knows and what they are required to know.
Formative	An assessment that has a primary objective of providing practice for search and retrieval from memory for a student and to provide prescriptive feedback. This assessment might relate to: item, topic, and/or course levels.
Summative	An assessment, usually quantitative, whose primary purpose is to give a definitive grade and/or make a judgment about the participant's achievement. If this judgment verifies that the participant has met an established standard indicative of special expertise, the judgment may confer “certification.”
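
The gap analysis described under Needs assessment can be sketched in a few lines of Python. This is a minimal illustration only: the skill names, the numeric levels, and the function name gap_analysis are hypothetical and are not part of any assessment tool mentioned here.

```python
def gap_analysis(required, current):
    """Return each skill whose current level falls short of the required level."""
    gaps = {}
    for skill, needed in required.items():
        # A skill missing from the current profile is treated as level 0 (assumption).
        shortfall = needed - current.get(skill, 0)
        if shortfall > 0:
            gaps[skill] = shortfall
    return gaps


# Hypothetical levels on an arbitrary 0-5 scale.
required_levels = {"spreadsheets": 4, "report writing": 3, "safety procedures": 5}
current_levels = {"spreadsheets": 2, "report writing": 3}

gaps = gap_analysis(required_levels, current_levels)
print(gaps)
```

Skills that already meet the required level are left out of the result, so the output lists only the areas that courseware would need to address.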

Kirkpatrick's Levels of Assessment

In 1959 Donald Kirkpatrick developed a model of training evaluation that remains one of the most widely used evaluation approaches. It is simple, flexible and complete, using a four-level approach. More recently, Jack Phillips added a fifth level that emphasizes return on training investment (ROI):

Reaction/ Satisfaction:

Level 1 (Kirkpatrick)

Did they like it?

Evaluate Reaction: Are people happy with the training inputs?

An assessment used to determine the satisfaction level with a learning or assessment experience. These assessments are often known as Level 1 evaluations based on Dr. Donald Kirkpatrick's model. Course satisfaction evaluations (sometimes referred to as smile or happy sheets) are completed at the end of a learning or certification experience.

Questionnaires are the most common collection tool. Obtain reaction to content, methods, media, trainer style, facilities, & course materials.

Learning

Level 2 (Kirkpatrick)

Did they learn?

Evaluate Learning: What do people remember from the training session?

For this measure to be meaningful, it should be a summative evaluation that validates that learners have met the criterion objectives of the training program. Learning is a change in knowledge, skills and attitude. It can be measured by interviews, surveys, tests (pre-/post-), observations, and combinations of these.
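
One simple way to summarize pre-/post-test results is the average score gain per learner. The sketch below assumes paired scores on the same scale; the function name average_gain and the sample scores are hypothetical, and a real Level 2 evaluation would normally compare results against the criterion objectives as well.

```python
from statistics import mean


def average_gain(pre_scores, post_scores):
    """Average per-learner change between a pre-test and a post-test."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("each learner needs both a pre-test and a post-test score")
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))


# Hypothetical percentage scores for three learners.
pre = [50, 60, 70]
post = [70, 75, 95]
gain = average_gain(pre, post)
print(gain)
```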

Behavior / Transfer of Learning

Level 3 (Kirkpatrick)

Did they use it?

Evaluate Behavior: Do people use what they know at work?

Behavior is a measure of the transfer of knowledge, skills and/or attitude to the real world. It is a measure of achievement of performance objectives. Behavior evaluation is the extent of applied learning back on the job. Observe the behavior; survey key people who observe the performer; use checklists, questionnaires, interviews or a combination of these.

The benefits of conducting Level Three evaluations are: (1) an indication of the ‘time to job impact’; (2) an indication of the types of job impacts occurring (cost, quality, time, productivity).

Work Results

Level 4 (Kirkpatrick)

Did it impact the bottom line?

Evaluate Results: What are the outcomes of applications on the job over a period of time?

Results evaluation measures the effect the trainee has on the business or environment. Measures must already be in place via normal management systems and reporting. The challenge is to relate the influence of the trainee(s) to these base measures. Assess the "bottom line" or final results. The concept of "results" depends upon the goal of the training program. Proof is concrete, evidence is soft. Use a control group; allow time for results to be realized; measure before and after the program; consider cost versus benefits.

The following types of business impact data can be measured:
Sales training: Measure change in sales volume, customer retention, length of sales cycle, and profitability on each sale after the training program has been implemented.
Technical training: Measure reduction in calls to the help desk; reduced time to complete reports, forms, or tasks; or improved use of software or systems.
Quality training: Measure a reduction in the number of defects.
Safety training: Measure reduction in the number or severity of accidents.
Management training: Measure increase in engagement levels of direct reports.

The advantages to a Level Four evaluation are as follows:
(1) determine bottom line impact of training;
(2) tie business objectives and goals to training

ROI

Level 5 (Phillips)

What is the return on training investment?

Evaluate Financial Value: What is the impact of training on the bottom-line financials?

Jack Phillips' Five Level ROI Model. Source: "Measuring the Return on Investment in Training and Development Certification Materials", Jack J. Phillips, Ph.D. (2002).

The methodology is a comprehensive approach to training measurement. It begins with planning the project (referred to by Dr. Phillips as an Impact Study). It moves into the tools and techniques to collect data, analyze the data and finally report the data. The end result is not only a Level 5 ROI but also measurements on the Kirkpatrick 4 Levels as well. This yields a balanced scorecard approach to the measurement exercise.
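
The Level 5 calculation itself is commonly expressed as net program benefits divided by program costs, times 100, alongside a benefit/cost ratio. The sketch below shows that arithmetic only. The function names and the dollar figures are hypothetical, and it assumes benefits have already been isolated and converted to monetary values, which is the hard part of an impact study.

```python
def roi_percent(program_benefits, program_costs):
    """ROI (%) = (benefits - costs) / costs * 100."""
    if program_costs <= 0:
        raise ValueError("program_costs must be positive")
    net_benefits = program_benefits - program_costs
    return net_benefits / program_costs * 100


def benefit_cost_ratio(program_benefits, program_costs):
    """BCR = benefits / costs."""
    if program_costs <= 0:
        raise ValueError("program_costs must be positive")
    return program_benefits / program_costs


# Hypothetical figures: 150,000 in monetary benefits against 100,000 in fully loaded costs.
roi = roi_percent(150_000, 100_000)
bcr = benefit_cost_ratio(150_000, 100_000)
print(roi, bcr)
```

With these sample figures, every unit of currency spent returns 1.5 units in benefits, which is a 50 percent return after the costs are recovered.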

Question Types: Quiz, Test, Interactive Practice

Multiple-choice: choosing one answer from given alternatives (each response can be given a different score)
Multiple response: choosing a number of answers from a list
True/False: applied to a statement
Yes/No: applied to a choice
Likert scale: rating scale to survey preferences & opinions
Fill-in-the-blank: text match can be used for simple text input
Matching: selection/association, match items from two related lists
Assertion/Reason: select correct reason for a particular assertion
Multiple Hotspots: move labels to an appropriate place on an image
Drag & Drop: manipulate graphics by placing them into a particular sequence or arrangement
Short answer: text match used for multi-line text box
Essay: text match used for large multi-line text box
Games: complex programmatic interactivity
Simulation: interactive practice of a task
Custom question types: using audio-video enhancements, problem-solving, visualization or modeling
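
Automated scoring for a few of these question types can be sketched as follows. The function names, the all-or-nothing rule for multiple response, and the case-insensitive text match are assumptions chosen for illustration, not the behavior of any particular testing product.

```python
def score_multiple_choice(selected, option_scores):
    """Each alternative can carry its own score; unknown options score 0."""
    return option_scores.get(selected, 0)


def score_multiple_response(selected, correct):
    """All-or-nothing: full credit only when the chosen set matches exactly."""
    return 1 if set(selected) == set(correct) else 0


def score_fill_in_blank(response, accepted_answers):
    """Simple text match, ignoring case and surrounding whitespace."""
    normalized = response.strip().lower()
    accepted = {answer.strip().lower() for answer in accepted_answers}
    return 1 if normalized in accepted else 0


# Hypothetical items.
mc_score = score_multiple_choice("b", {"a": 0, "b": 2, "c": 1})
mr_score = score_multiple_response(["a", "c"], ["c", "a"])
fib_score = score_fill_in_blank("  Formative ", ["formative"])
print(mc_score, mr_score, fib_score)
```

Short answer and essay items rarely score well with plain text matching, so in practice they are often routed to a human grader.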

Benefits of Web-based Testing

Time and cost savings in test administration & automated scoring
Greater score precision
Maximize student engagement: minimize student frustration in taking tests that are too difficult or boredom in taking tests that are too easy
Improved test security
New kinds of questions by using multimedia, simulations, visualizations, and other resources to assess active learning

Bloom's Taxonomy: Cognitive Domain

An introduction to Bloom's Domains of Learning was presented in an earlier section "Types of Instruction by Domains of Learning: Knowledge, Attitude & Skills". The following is organized to emphasize the importance of this model of the Cognitive Domain for measuring performance and creating effective tests.

The major categories in the cognitive domain of educational objectives, edited by Benjamin Bloom (1956), are the following.

Knowledge: Remember and Recall; Observe and recognize information; Demonstrate knowledge of dates, events, places; Demonstrate knowledge of major ideas; Demonstrate mastery of basic subject matter.

Comprehension: Translate and Paraphrase; Interpret and Extrapolate information; Use information, methods, concepts, theories in new situations; Solve problems; Use required skills or knowledge; Ability to grasp meaning; Estimating future trends (predicting consequences or effects)

Application: Transfer knowledge to new settings; Use of Generalizations in Specific instances; Ability to use learned material in new and concrete situations (rules, principles, laws, theories)

Analysis: Differentiate component parts; Determine Relationships; Breakdown ideas or structure; Identification of parts and organizational principles; Determine arrangement, logic and semantics

Synthesis: Weave components into a whole; Create New Relationships; Use old ideas to create new ones; Generalize from given facts; Relate knowledge from several areas; Predict; Draw conclusions; The ability to put parts together to form a new whole

Evaluation: Judge the value of information; Exercise of learned judgment; Evaluate based on given criteria; The ability to judge the value of material (e.g., statement, novel, poem, research report) for a given purpose

Quick Overview: Key Verbs
Knowledge	Comprehension	Application	Analysis	Synthesis	Evaluation
List Name Identify Show Define Recognize Recall State Visualize Tell Describe Label Collect Examine Quote Record	Summarize Explain Interpret Describe Compare Paraphrase Differentiate Demonstrate Convert Defend Distinguish Give Examples Predict Recognize	Solve Illustrate Calculate Use Interpret Relate Manipulate Apply Modify Complete Show Examine Experiment Discover Classify	Analyze Organize Deduce Contrast Compare Distinguish Discuss Plan Devise Criticize Diagram Inspect Examine Categorize Appraise Differentiate	Design Hypothesize Support Schematize Write Report Justify Categorize Combine Compile Compose Create Devise Explain Organize Summarize	Evaluate Choose Estimate Judge Defend Criticize Compare Rate Value Assess Measure Select Score Revise

Bloom, B.S. (Ed.). (1956). Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. New York: Longmans, Green & Co.
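
A key-verb table like this can be used to check which cognitive level a draft test question is likely to target. The sketch below is a rough aid under stated assumptions: it uses only a small subset of the verbs above, looks only at the first word of the stem, and the names KEY_VERBS and levels_for_stem are hypothetical. Because some verbs appear under more than one level, the function returns every matching level rather than a single answer.

```python
# Subset of the key verbs listed above, keyed by cognitive level.
KEY_VERBS = {
    "Knowledge": {"list", "name", "identify", "define", "recall", "state", "label"},
    "Comprehension": {"summarize", "explain", "interpret", "paraphrase", "convert"},
    "Application": {"solve", "illustrate", "calculate", "use", "apply", "modify"},
    "Analysis": {"analyze", "organize", "deduce", "contrast", "diagram", "inspect"},
    "Synthesis": {"design", "hypothesize", "compose", "create", "combine", "explain"},
    "Evaluation": {"evaluate", "judge", "rate", "assess", "select", "revise"},
}


def levels_for_stem(stem):
    """Return the levels whose key verbs include the first word of the stem."""
    words = stem.split()
    if not words:
        return []
    first_word = words[0].lower().strip(".,:;?")
    return [level for level, verbs in KEY_VERBS.items() if first_word in verbs]


print(levels_for_stem("Calculate the mean score for the group."))
print(levels_for_stem("Explain why the sample failed."))
```

A verb match is only a hint. The level a question actually measures depends on what the learner must do to answer it, so an ambiguous result such as the one for "Explain" still needs a human judgment.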

Test Development: Question Stems and Verbs

I. KNOWLEDGE (drawing out factual answers, testing recall and recognition)
	who	how	describe	which one
	what	label	define	what is the best one
	why	match	choose	how much
	when	select	omit	what does it mean
	where	reproduce	define	who invented
	state	list (number)	what did the book say about
Activities: observe, locate, listen, research, identify, discover, match, ask
Products: films, models, tapes, books, people, records, diagrams, magazines, newspapers, radio, tv, internet, events

II. COMPREHENSION (translating, interpreting and extrapolating)
	state in your own words	classify	which are facts
	what does this mean	judge	is this the same as
	give an example	infer	select the best definition
	condense this paragraph	show	what would happen if
	state in one word	indicate	explain what is happening
	what part doesn't fit	tell	explain what is meant
	what expectations are there	translate	read the graph, table
	what are they saying	select	this represents
	what seems to be	match	is it valid that
	what seems likely	explain	show in a graph, table
	which statements support	represent	demonstrate
	what restrictions would you add
Activities: drawing conclusion, analogy, causal relationships, summary, outline
Products: skit, speech, story, photograph, statement, cartoon, poster, diagram, graph, drama, tape recording, collage

III. APPLICATION (to situations that are new, unfamiliar or elicit a new perspective from the learner)
	predict what would happen if	explain	solve
	choose the best statements that apply	apply	illustrate
	judge the effects	select	calculate
	what would result	relate	use
	tell how, when, where, why	manipulate	interpret
	tell what would happen if	modify	complete
	identify the results of	examine	show
	tell how much change there would be	experiment	discover
	relate	classify	demonstrate
	If you know A & B, how could you determine C	What other possible reasons...	What might they do with...
Activities: list, construct, teach, paint, sketch, manipulate, interview, experiment, record, report, stimulate
Products: diary, collection, puzzle, diagram, photographs, sculpture, diorama, scrapbook, map, stitchery, mobile, model, illustration

IV. ANALYSIS (breaking down into parts, forms)
	distinguish	what is the function of
	identify	what's fact, opinion
	what assumptions	what statement is relevant
	what motive is there	related to, extraneous to, not applicable
	what conclusions	what does author believe, assume
	make a distinction	state the point of view of
	what is the premise	state the point of view of
	what ideas apply	what ideas justify conclusion
	what's the relationship between	the least essential statements are
	what's the main idea, theme	what inconsistencies, fallacies
	what literary form is used	what persuasive technique
	implicit in the statement is	breakdown
	diagram	differentiate
	infer	outline
	illustrate	relate
	point out	select
Activities: classify, categorize, separate, compare, dissect, contrast, advertise, survey
Products: graph, survey, questionnaire, commercial, report, diagram, chart

V. SYNTHESIS (combining elements into a pattern not clearly there before)
	create	how would you test	make up
	tell	propose an alternative	compose
	make	solve the following	formulate
	do	plan	how else would you
	choose	design	state a rule
	develop	categorize	compile
	devise	explain	generate
	modify	organize	rearrange
	relate	revise	rewrite
	summarize	tell	write
Activities: combine, invent, compose, hypothesize, predict, estimate, role-play, produce, infer, imagine, write
Products: story, poem, play, pantomime, song, cartoon, news article, TV show, radio show, magazine, advertisement, new game, structure, invention, recipe, puppet show, machine

VI. EVALUATION (according to some set of criteria, and state why)
	appraise	what fallacies, consistencies, inconsistencies appear
	judge	which is more important, moral, better, logical, valid, appropriate
	criticize	find the errors
	defend	compare
	conclude	describe
	discriminate	explain
	summarize	support
	interpret	justify

LORI -- Learning Object Review Instrument

The Learning Object Review Instrument (LORI) is used to evaluate the quality of E-Learning resources. LORI is an online form consisting of rubrics, rating scales, and comment fields. The current version of LORI available from eLera is version 1.5. Here is a summary of the criteria used for evaluation:

Dimensions of Quality	Evaluation Criteria
Content Quality	Veracity, accuracy, balanced presentation of ideas, and appropriate level of detail
Learning Goal Alignment	Alignment among learning goals, activities, assessments, and learner characteristics
Feedback and Adaptation	Adaptive content or feedback driven by differential learner input or learner modeling
Motivation	Ability to motivate, and stimulate the interest or curiosity of, an identified population of learners
Presentation Design	Design of visual and auditory information for enhanced learning and efficient mental processing
Interaction Usability	Ease of navigation, predictability of the user interface, and the quality of UI help features
Accessibility	Support for learners with disabilities
Reusability	Ability to port between different courses or learning contexts without modification
Standards Compliance	Adherence to international standards and specifications
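
Ratings across these nine dimensions can be combined into a single summary figure. The sketch below assumes each dimension is rated on a 1-to-5 scale and may be marked not applicable (None here). The averaging rule, the function name mean_rating, and the sample ratings are illustrative assumptions, not part of the instrument as described above.

```python
def mean_rating(ratings):
    """Average the rated dimensions, skipping any marked not applicable (None)."""
    scored = [value for value in ratings.values() if value is not None]
    if not scored:
        return None
    return sum(scored) / len(scored)


# Hypothetical review of one learning object.
review = {
    "Content Quality": 4,
    "Learning Goal Alignment": 5,
    "Feedback and Adaptation": None,  # not applicable to this resource
    "Motivation": 4,
    "Presentation Design": 4,
    "Interaction Usability": 5,
    "Accessibility": 3,
    "Reusability": None,  # not rated
    "Standards Compliance": 3,
}

overall = mean_rating(review)
print(overall)
```

A single average hides weak dimensions, such as the low Accessibility rating in this sample, so the per-dimension ratings and comments are worth reporting alongside it.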

E-Learning Program Evaluation: Brandon-Hall

The following evaluation criteria were developed by Brandon Hall and Arjun Reddy in 1988 (MM&ITNL, Brandon-Hall) for evaluation of E-Learning programs:

Evaluation Categories	Evaluation Criteria
1. Content	Right amount and quality of information; content meets defined objectives; content organization
2. Instructional Design	Course objectives clearly defined & tangible; chunking of information; job aids / performance support
3. Interactivity	User engaged through the opportunity for their input; amount of interactivity; quality of interactivity
4. Navigation	Users can determine their own path through the program; an exit option always available; a course map is always accessible; appropriate use of icons and/or clear labels so that users don't have to read excessively to determine program options; clarity of directions; adequacy of navigation controls; branching to other topics doesn't create a sense of being 'lost'
5. Motivation	Program engages the user through novelty, humor, game elements, testing, adventure, unique content, surprise elements, etc.; the course follows a metaphor (golf, Doom, etc.); metaphor effectively used to help the learning process; progress indicator
6. Use of Media	Program effectively and appropriately employs: video, audio (voice, music, sound effects), animation, graphics, special visual effects
7. Evaluation	Mastery of a section required before proceeding to later sections; section quizzes used; final exam; quality of the testing modules; testing is relevant to the real-world performance objectives; appropriateness and timeliness of feedback
8. Aesthetics	Program attractive and appealing to eye and ear; design of the interface is simple and uncluttered, with symmetry of objects such as headings, menu bar, etc.; screens do not look busy
9. Record Keeping	Student performance data recorded, such as time to complete, question analyses, and final score; data forwarded to course manager automatically
10. Tone	Program designed for intended audience; program avoids being condescending, trite, etc.

In 2002 the following criteria were used by Brandon Hall for evaluation of E-Learning programs:

Evaluation Categories	Evaluation Criteria
1. Business problem & results	Matching the e-learning solution to the business/performance problem at hand; Achieving intended results.
2. Instructional design & integrity	Structuring, relevance and quality of content; Focus on real-world competencies; Selecting the right strategies for the content and context.
3. Evaluation & assessment	Applying imagination and rigor to the design and implementation of evaluation or assessment
4. Interactivity	Using creativity and expert design practices to achieve instructionally powerful interactions of all kinds
5. Usability & Interface	Creating an effective, easy-to-use interface
6. Motivation & aesthetics	Motivating learners to follow and successfully complete the training; Hitting the right tone and aesthetic notes
7. Media & technology	Smart selection and application of media, development tools and delivery technologies
8. Money & time	Achieving excellence under constrained budgets and time lines

Five Star Evaluation: M. David Merrill

The following principles have been proposed by M. David Merrill as a way of getting at the most important criteria for evaluating E-Learning effectiveness. A more detailed discussion has been presented in the section "5 Star Instruction."

5 Star Evaluation	Merrill's Evaluation Criteria
1. Problem Principle	Learning is facilitated when learners are engaged in solving real-world problems. The learner is engaged at the problem or task level, not just the operation or action level. The learner solves a progression of problems, which promotes skill development and allows meaningful feedback and reinforcement to occur. The learner is guided to an explicit comparison of problems.
2. Activation Principle	Learning is facilitated when existing knowledge is activated as a foundation for new knowledge. The learner is directed to recall, relate, describe, or apply knowledge from relevant past experience that can be used as a foundation for the new knowledge. The learner is provided with relevant experience that can be used as a foundation for the new knowledge.
3. Demonstration Principle	Learning is facilitated when new knowledge is demonstrated to the learner. The learner is shown rather than merely told. The demonstration is consistent with the learning goal. The learner is shown multiple representations. The learner is directed to explicitly compare alternative representations. The use of media plays a relevant instructional role.
4. Application Principle	Learning is facilitated when new knowledge is applied by the learner (e.g., guided and unguided practice). The learner is required to use his or her new knowledge to solve problems. The problem-solving activity is consistent with the learning goal. The learner is shown how to detect and correct errors. The learner is guided in problem-solving by appropriate coaching that is gradually withdrawn.
5. Integration Principle	Learning is facilitated when new knowledge is integrated into the learner's world. The learner is required to demonstrate his or her new knowledge or skill. The learner can reflect on, discuss, and defend his or her new knowledge or skill. The learner can create, invent, and explore new and personal ways to use his or her new knowledge or skill.

Conscious Competence Learning Matrix

The following model was developed to describe some key observations about learning and performance, and about the assessment of performance. The learner of a new skill begins at stage 1 - 'unconscious incompetence', and ends at stage 4 - 'unconscious competence', having passed through stage 2 - 'conscious incompetence' and stage 3 - 'conscious competence'.

Trainers often make the mistake of assuming that a learner is at stage 2, and focus their effort toward achieving stage 3, when in fact the learner is still at stage 1. This is a fundamental reason for training failure: the learner simply has not recognized the need for new learning. Until the learner has achieved awareness of a weakness or a training need ('conscious incompetence'), the learner has no interest, attention or motivation for the learning process. Learners only respond to training or teaching when they are aware of their own need for it, and the personal benefits they will derive from it.

1 — Unconscious Incompetence

The learner is not aware of the existence or relevance of the skill area.

The learner is not aware that they have a particular deficiency in the area concerned
The learner might deny the relevance or usefulness of the new skill
The learner must become conscious of their incompetence before development of the new skill or learning can begin
The aim of the trainer or teacher is to move the learner into the 'conscious competence' stage, by demonstrating the skill or ability and the benefit that it will bring to the learner's effectiveness

2— Conscious Incompetence

The learner becomes aware of the existence and relevance of a skill, and becomes aware that they cannot yet perform the skill

The learner is therefore also aware of their deficiency in this area, ideally by attempting or trying to use the skill
The learner realises that by improving their skill or ability in this area their effectiveness will improve
Ideally the learner has a measure of the extent of their deficiency in the relevant skill, and a measure of what level of skill is required for their own competence
The learner ideally makes a commitment to learn and practice the new skill, and to move to the 'conscious competence' stage
In the eastern philosophy of Zen there is a term called "beginner's mind" — it reflects a state of radical openness to learning — deeply felt humility and motivation for continued learning

3 — Conscious Competence

The learner achieves 'conscious competence' in a skill when they can perform it reliably at will

The learner will need to concentrate and think in order to perform the skill
The learner can perform the skill without assistance
The learner will not reliably perform the skill unless thinking about it - the skill is not yet 'second nature' or 'automatic'
The learner should be able to demonstrate the skill to another, but is unlikely to be able to teach it well to another person
The learner should ideally continue to practise the new skill, and if appropriate commit to becoming 'unconsciously competent' at the new skill

Practice is the single most effective way to move from stage 3 to 4

4 — Unconscious Competence

The skill becomes so practised that it enters the unconscious parts of the brain - it becomes 'second nature'

Common examples are driving, sports activities, typing, manual dexterity tasks, listening and communicating
It becomes possible for certain skills to be performed while doing something else, for example, knitting while reading a book
The person might now be able to teach others in the skill concerned, although after some time of being unconsciously competent the person might actually have difficulty in explaining exactly how they do it — the skill has become largely instinctual
This gives rise to the need for long-standing unconscious competence to be checked periodically against new standards

This model would be incomplete if it did not acknowledge the idea of a 5th level which has sometimes been called "reflective competence" or even "enlightened competence". The model illustrates how skills become so integrated that they become "unconscious" and instinctual. But if we stopped there, it would give the impression that this "unconsciousness" is the highest stage of learning — such as the artist, dancer, craftsman who practices their skill at the highest level but cannot articulate it or teach it to others. Of course, the level beyond this is the coach, the expert teacher, the mentor, the master craftsman, who can demonstrate a skill in practice and also articulate the fine details of the skill, art or craft.

5 — Reflective Competence

The skill becomes so practised that it enters the unconscious parts of the brain - it becomes 'second nature' (minimum effort is required for maximum quality output). However, the practitioner can also articulate the fine details of the skill to others.

Fluent, highly efficient and accurate performance can occur instinctively and reflexively — no longer requiring conscious, deliberate and careful execution — and is also accompanied by the capability to understand and articulate the dynamic flow and scientific/systems explanation of one's performance.
The highest level of performance (and assessment of that performance) involves exhibiting fully integrated metacognitive skills with the primary skill.
This level may be awkwardly described as "conscious competence of unconscious competence".

References

Bloom, B.S. (Ed.). (1956). Taxonomy of Educational Objectives: The Classification of Educational Goals. Handbook I: Cognitive Domain. New York: Longmans, Green & Co.

Kirkpatrick, D.L. (1998). Evaluating Training Programs: The Four Levels (2nd ed.; 1st ed. 1994). San Francisco, CA: Berrett-Koehler Publishers, Inc.

Merrill, M. David. Five Star Instruction (2003); First Principles of Instruction (2002).

Nantel, Richard and staff of Brandon Hall Research (2005). Testing Building Tools: A Comparison of 24 Products for Authoring Online Tests, Assessments and Evaluations. Brandon Hall Research, Sunnyvale, CA.
http://www.brandonhall.com/public/publications/testbuildingtools/index.htm
