METADATING THE IMAGE
Human cultures have developed rich and precise systems to describe oral and written communication: phonetics, syntax, semantics, pragmatics, narrative
theory, rhetoric, and so on. Dictionaries and thesauruses help us create new texts, while search engines and the ever-present “find…” command on
our desktops help us locate particular texts already created, or parts of them.
Paradoxically, while the role of visual communication has dramatically increased over the last two centuries, no similar descriptive systems have been developed
for images, at least not on the same scale. So while the number of different types of images we routinely create today is extremely large, if not infinite
(and it has grown ever larger since computer tools made it possible to combine photographs, graphics and text more easily, and to apply operations
previously reserved for one medium to all the others: blurring text, etc.), the systems we have to describe these images are very
poor. For instance, stock photography collections divide millions of images into a couple of dozen categories, at best, with names such as “joy,”
“business,” and “achievement”; professional designers typically use an even more limited range of categories to describe their projects (“clean,” “futuristic,”
“corporate,” “conservative,” etc.).
As computerization dramatically increases the amount of media data that can be stored, accessed and manipulated, we are gradually shifting towards more
structured ways to organize and describe this data. For example, we are moving from HTML to XML (and next to the Semantic Web); from MPEG-2 to MPEG-7;
from “flat” lens-based images to “layered” image composites and discrete 3D computer-generated spaces. In all these cases the shift is from “low-level”
metadata (the fonts on the Web page, the resolution and compression settings of a moving image) to “high-level” metadata that describes the structure
of a media composition or even its semantics.
What about images? Computerization creates a promise (which may be only an illusion) that images, which traditionally resisted human attempts to
describe them with precision, will finally be conquered. After all, we can now easily find out that a particular digital image contains so many pixels and so
many colors; we can also easily store all kinds of metadata along with the image; and we can tease out some indications of image structure and semantics
(for instance, we can find all edges in a bit-mapped image). Yet visual search engines that can deal with queries such as “find all images which have a
picture of ” or “find all images similar in composition to this one” are still in their infancy. Similarly, the metadata provided by the image database software I
use to organize my digital photos tells me all kinds of technical details, such as what aperture my digital camera used to snap this or that image, but
nothing about the image content. In short, while computerization has made image acquisition, storage, manipulation, and transmission much more
efficient than before, it has not so far helped us deal with one of its side effects: how to describe and access more efficiently the vast quantities of digital
images being generated by digital cameras and scanners, by the endless “digital archives” and “digital libraries” projects around the world, by the sensors
and the museums…
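The gap between what a computer can easily extract from an image and what we would like to know about it can be illustrated with a minimal sketch. It assumes a tiny invented grayscale bitmap and two hypothetical helper functions (`low_level_description` and `find_edges`, names chosen here for illustration only); real image software uses far more elaborate techniques.

```python
# A toy "bit-mapped image": rows of grayscale values (0 = black, 255 = white).
# The pixel values are invented for illustration only.
IMAGE = [
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
    [0, 0, 0, 255, 255],
]


def low_level_description(image):
    """Return easily computed facts: dimensions, pixel count, distinct values."""
    height = len(image)
    width = len(image[0])
    distinct_values = {value for row in image for value in row}
    return {
        "width": width,
        "height": height,
        "pixels": width * height,
        "distinct_values": len(distinct_values),
    }


def find_edges(image, threshold=128):
    """List (x, y) positions where brightness jumps sharply between a pixel
    and its right-hand neighbor. The threshold is an arbitrary assumption."""
    edges = []
    for y, row in enumerate(image):
        for x in range(len(row) - 1):
            if abs(row[x + 1] - row[x]) >= threshold:
                edges.append((x, y))
    return edges


description = low_level_description(IMAGE)
edges = find_edges(IMAGE)
print(description)
print(edges)
```

Both results are "low-level" in the sense used above: they report how many pixels and values the image has and where brightness changes abruptly, but nothing about what the image depicts.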
The theoretical part of the Master class will develop in more detail the paradigm sketched here. We will discuss the key modern attempts (in cinema,
graphic design, art history, psychology, and other fields) to make images into a language, i.e., to develop formal techniques to describe images and to
predict their effects on the viewer. Against this background, we will look at the history, the current research, and the emerging trends in computer research
that pursues a similar project: visual search engines, the new hybrid forms of cinema that combine cinematography with a more structured way to
represent space borrowed from 3D computer graphics, the state of the art in computer vision applications, and so on. We will also look at the work of a
few new media artists that engage with the politics and poetics of image metadata (Joachim Sauter, George Legrady, and others).
Finally, we will also engage with some larger questions about the functioning of images in a global information society. For example, is it true that we live
in a predominantly visual culture, or does computerization in fact downplay the role of the image in favor of other representations such as text and 3D
space? Will our visual culture still be dominated by photograph-like images in the twenty-first century, or will other kinds of images eventually take their
place? While computers allow us to manipulate old media in new ways, creating new hybrids and new forms, do they also enable any completely new and
unprecedented types of visual representations?
The practical projects developed during the Master class can pursue one of two directions. A project can present an analysis of some existing (and socially
important) system for cataloging and describing images and their contents: for instance, the categories used by stock media collections, the categories
used to classify facial expressions of human emotions in computer research, or the categories used by graphic designers to talk about the styles of Web
design. If possible, these projects should address the following two questions: (1) Are there any conceptual shifts which can be observed in the logic of
image description systems as they become implemented in a computer, thus turning into software? (2) What are the relationships between image
description systems and the descriptions used by software for other types of media?
Alternatively, a participant can develop a conceptual proposal for a software interface to record, describe, access, or manipulate images in a new way. While
new media artists have extensively critiqued existing software interfaces in general and developed many particular alternatives, surprisingly little energy
has been spent so far on thinking about how we interface with images. And yet the computerization of visual culture opens all kinds of interesting possibilities
waiting to be explored. For instance, if it is already possible to record and store a practically unlimited number of still and moving images of one’s existence,
what kind of interface can we use to organize and navigate these images? Or, given that we can now use database software to classify, link, and retrieve
images and image sequences along with other media, how can a database structure be used to represent the life of a modern city, the history of a place,
etc.? In other words, behind the difficult problem of visual metadata, which has become more pressing in the computer age than ever before, there is also an
exciting promise: the promise to represent reality and human experience in new ways.
The projects created during the class will be featured on a Master class Web site and will be published in a new book by V2 (Rotterdam). Therefore,
regardless of whether a participant chooses to pursue an analytical or a practical project, the final files should be ready to be put on the Web and to be
published in the book. Specifically, the project should be presented as a single panel (similar in style to architectural proposals), available in Web-ready and
print-ready versions (for instance, an HTML file and an Illustrator file).