[11] Evaluation - Testing

Testing is evaluation carried out outside the development team, with the participation of potential external users.

The basic rule of evaluation is: the earlier you assess the system, the less accurate the results, but the cheaper the issues are to fix. Therefore:

Evaluate Early and Often: Identify major errors early to prevent costly fixes later.
Prioritize Early Evaluations: Reserve later evaluations for minor issues or formal assessments.

What to evaluate: sketches, a mock-up on a PC, a Wizard of Oz prototype (a high-fidelity simulation operated by a human), a prototype, or the working system.

Testing software is vital because no theory or analytical technique can catch all problems, and issues found through inspection may not reflect what users actually experience. Input from real users gives realistic, demonstrable insights, although users vary significantly from one another. There are different categories of tests:

Formative Tests: Intended to uncover problems when suspected, aiming to gather evidence of their existence and location.
Summative Tests: Meant to confirm that identified problems have been resolved, providing evidence of correctness.

The outcomes of tests can be quantitative (hard data, demonstrating issue existence) or qualitative (subjective impressions, offering ideas for solutions). Tests can be conducted in two different phases: early during development to catch issues quickly and late to ensure all usability requirements are met.

Methodologies range from statistically rigorous studies (few variables, many subjects) to common-sense testing (anything can be tested with any number of subjects, but the results are easier to question).
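To make the link between subject numbers and statistical confidence concrete, here is a minimal sketch that puts a confidence interval around a task completion rate. It assumes the quantitative outcome is a simple pass/fail count per participant and uses the Wilson score interval; the function name `wilson_interval` and the sample counts are illustrative, not taken from these notes.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return centre - half_width, centre + half_width


# Same observed completion rate (80%), very different certainty.
# The counts below are hypothetical.
for successes, trials in [(4, 5), (80, 100)]:
    low, high = wilson_interval(successes, trials)
    print(f"{successes}/{trials} completed: interval {low:.2f} to {high:.2f}")
```

With a handful of subjects the interval is wide, which is why such results are easy to question; with many subjects it narrows enough to support a quantitative claim.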

Formative tests benefit the design team, while summative tests serve the client by showing that contractual obligations have been met.

Full (Traditional / Deluxe) Usability Testing:
- Formalized with specific protocols, roles, and infrastructure.
- Quantitative and statistically justified assessments.
- Requires a significant number of participants (20-100).
- Expensive (around 20,000 €) due to participant compensation, team costs, and suspended development.
- Primarily used for final or summative testing to verify compliance with requirements.

Discount (Guerrilla) Usability Testing:
- Informal setup with a team member interacting with participants.
- Intuitive results, indicative but not conclusive.
- Involves sequential testing with a small number of users (3-4).
- Cost-effective, no specialists required, and can run parallel to production.
- Useful for formative testing, identifying problems early.
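The small participant count in discount testing is often justified with a simple problem-discovery model: if each user independently reveals a given problem with probability p, then n users reveal a share 1 - (1 - p)^n of the problems. The sketch below assumes p = 0.31, the average reported by Nielsen and Landauer; that value is an assumption here, varies from project to project, and the function name `proportion_found` is hypothetical.

```python
def proportion_found(n_users: int, p_detect: float = 0.31) -> float:
    """Expected share of usability problems found by n_users.

    Assumes every user independently detects each problem with the same
    probability p_detect (0.31 is an assumed average, not a constant).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be between 0 and 1")
    return 1.0 - (1.0 - p_detect) ** n_users


# Compare discount-sized groups with traditional-sized ones.
for n in (1, 3, 4, 5, 10, 20):
    print(f"{n:>2} users: {proportion_found(n):.0%} of problems (under this model)")
```

Under these assumptions the curve flattens quickly, which is consistent with running several small sequential rounds rather than one large study when the goal is formative.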

Guerrilla Usability Testing

Introduced by Jakob Nielsen for low-budget projects such as websites.
A response to the high cost of traditional testing methods.
Components:
- Scenarios: Prototypes cut down to the functionality needed for a small task.
- Informal Thinking Aloud: Conducted without specialists or formal equipment.
- Heuristic Evaluations: Applying the usability heuristics of Nielsen and Molich.