Evaluating Computational Creativity

AI and CC November 11, 2011 / By Anna Jordanous

Can a computer be creative? And what can we learn from studying computational creativity? How should something as subjective as creativity be measured? This short paper offers guidance on how to approach the evaluation and improvement of computational creativity systems and highlights key components of creativity in general.


The primary resource we have for examining creative actions is ourselves: how humans demonstrate creativity. The more closely artificial creative systems match our perception of human creativity, the more successful they are generally deemed to be at demonstrating creativity. Hence we need to clarify what humans consider creative. As a concept, creativity has proved resistant to satisfactory and comprehensive definition, despite numerous attempts to provide one. Creativity is complex and multi-dimensional, encompassing many related aspects, abilities, properties and behaviours.

The words that we use in discussing a concept are interlinked with the meaning of that concept. On this premise, a number of distinct themes (components of creativity) have been identified using techniques from computational linguistics, through analysis of word usage in a cross-disciplinary range of academic viewpoints on the nature of creativity. These components, pictured in Figure 1, collectively contribute to a comprehensive, interdisciplinary definition of creativity [1]. They act as a set of building blocks that make creativity easier to understand and more tractable to study and evaluate.


SPECS (Standardised Procedure for Evaluating Creative Systems) is a standardised, systematic methodology for evaluating computational creativity [1]. It is flexible enough to be applied to different types of creative system and can be adapted to the specific demands of different types of creativity. In the three-stage process of evaluation, researchers are required to be specific about what creativity entails in their domain and about the standards by which they test a system’s creativity.

1. Identify a definition of creativity that your system should satisfy to be considered creative:

 (a) in a general context.

 (b) specific to the domain your system works in.

2. Using Step 1, clearly state what standards you use to evaluate the creativity of your system.

3. Test your creative system against the standards stated in Step 2 and report the results.
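
The three steps above can be sketched in code. This is a minimal illustration only, not part of SPECS itself: the names `Standard`, `define_creativity` and `evaluate`, the idea of a numeric rating between 0 and 1, and the placeholder ratings are all assumptions made for the example. SPECS as described here does not prescribe any particular scoring scheme.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Standard:
    """One evaluation standard derived from the chosen definition (Step 2)."""
    name: str
    # Hypothetical convention: a test returns a rating between 0 and 1.
    test: Callable[[object], float]


def define_creativity(general: List[str], domain_specific: List[str]) -> List[str]:
    """Step 1: combine general (1a) and domain-specific (1b) components."""
    # dict.fromkeys keeps first-seen order and drops duplicates.
    return list(dict.fromkeys(general + domain_specific))


def evaluate(system: object, standards: List[Standard]) -> Dict[str, float]:
    """Step 3: test the system against each stated standard and report results."""
    return {s.name: s.test(system) for s in standards}


# Illustrative usage with made-up component lists and placeholder ratings.
components = define_creativity(
    general=["originality", "value"],
    domain_specific=["value", "active involvement"],
)

# Step 2: one standard per component. The lambdas stand in for real tests
# (for example, judgements gathered from human evaluators).
placeholder_ratings = {"originality": 0.5, "value": 0.75, "active involvement": 0.25}
standards = [
    Standard(name=c, test=lambda system, c=c: placeholder_ratings[c])
    for c in components
]

report = evaluate(system="my_improviser", standards=standards)
print(report)
```

Keeping the definition (Step 1), the standards (Step 2) and the testing (Step 3) as separate pieces mirrors the requirement that researchers state their standards explicitly before reporting results.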

SPECS has been applied to case studies [1], demonstrating what can be learnt about systems’ strengths and weaknesses:

  • A case study for detailed comparisons of the creativity of four musical improvisation systems, identifying which systems are more creative than others and why.
  • A case study to capture initial impressions of the creativity of five systems, performing different creative tasks, that were presented at a 2011 computational creativity research event.


The aim of this work is to enable more detailed, cognitively based evaluation of our progress in computational creativity. This also gives some insight into the nature of human creativity. I argue that it is most productive to treat creativity as a collection of inter-related factors such as originality, value and active involvement, which are more amenable to measurement and/or observation. Towards this end, key factors of creativity have been derived from empirical studies examining a wide variety of our writings on creativity.

Evaluation gives researchers vital feedback for improving their systems’ creativity, informing further research progress. To assist researchers in evaluating their systems, this work provides a simple but necessary set of steps for evaluating computational creativity systems: the Standardised Procedure for Evaluating Creative Systems (SPECS). SPECS has been applied and evaluated [1].


[1] A. Jordanous. Evaluating Computational Creativity: A Standardised Procedure for Evaluating Creative Systems and its Application. PhD thesis, University of Sussex, Brighton, UK, September 2011.
