DSTC Pty Ltd, Uni. of Queensland, Qld, 4072, Australia. Phone +617 33654310, Fax +617 33654311
September, 1999
First we describe a Java application which enables the computer-assisted generation and editing of Dublin Core-based metadata descriptions for digital images and the annotation of regions within images. This application integrates an image display window with a graphical user interface and metadata input forms generated from a hierarchical Resource Description Framework (RDF) schema. The schema definition is also used to validate the input descriptions and control the format of the output. At "Save" time, the image is converted from GIF or JPEG to PNG format and the validated metadata which has been input is embedded within the image.
Secondly we describe an image-capable search engine developed through a simple code extension to DSTC's existing HotMeta web-page search engine. HotMeta crawls the WWW, extracting metatags from web pages and storing them in an indexed repository to enable searching. With the extension, whenever HotMeta encounters a PNG image, it opens the image, extracts the metadata and saves it in the indexed repository.
Finally we describe an improved image browsing method which exploits the metadata embedded in thumbnail PNG images. Clicking on a PNG thumbnail runs a CGI script which opens the thumbnail image, extracts the metadata and displays the full-scale JPEG image, with annotated image maps and other relevant embedded metadata, eliminating the need for backend databases or static web pages.
This prototype system has been developed using digital images of historical photographs from the State Library of Queensland's (SLQ) John Oxley Library. In particular, we have used historical photographs of Queensland from the William Boag Photographic Collection from the 1870s.
In this paper we describe an indexing and retrieval system for online images based on the ability of the PNG format to embed metadata within the image file.
PNG [HREF1] is the Portable Network Graphics format, a format for storing digital images. It was designed to be the successor to GIF after Unisys and CompuServe suddenly announced in January 1995 that programs implementing GIF would require royalties because of Unisys's patent on the LZW compression method used in GIF. Apart from its freedom from patent restrictions, PNG has a number of other advantages over GIF. These include: alpha channels (variable transparency), gamma correction (cross-platform control of image brightness), and two-dimensional interlacing (a method of progressive display). PNG also compresses better than GIF (by around 5-25%) and provides three main image types (truecolor, grayscale and palette-based (8-bit)). However, the major advantage of PNG, which we aim to exploit in the work described in this paper, is its ability to embed associated metadata within the image.
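As a rough illustration of what embedding metadata in a PNG file involves, the sketch below inserts a textual chunk into a PNG byte stream using only the Python standard library. It assumes the metadata is carried in a standard `tEXt` chunk (keyword, null byte, Latin-1 text); the paper does not say which chunk type Peggie writes, and the helper names `make_chunk`, `embed_text` and `tiny_png` are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Serialise one chunk: 4-byte length, 4-byte type, data, CRC-32 of type+data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def embed_text(png: bytes, keyword: str, text: str) -> bytes:
    """Return a copy of `png` with one tEXt chunk inserted just before IEND."""
    if not png.startswith(PNG_SIGNATURE) or png[-8:-4] != b"IEND":
        raise ValueError("not a complete PNG stream")
    if not 1 <= len(keyword) <= 79:
        raise ValueError("PNG keywords are 1 to 79 characters long")
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    # In a well-formed file, IEND occupies the final 12 bytes
    # (zero length field, chunk type, CRC).
    return png[:-12] + make_chunk(b"tEXt", payload) + png[-12:]


def tiny_png() -> bytes:
    """Build a 1x1 grayscale PNG so the example needs no input file."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND"))


# Assumption: a Dublin Core Title is stored under the PNG keyword "Title".
tagged = embed_text(tiny_png(), "Title", "A selector and his family")
```

Because the text travels in an ancillary chunk, decoders that do not understand it can still display the image unchanged.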
First we describe Peggie, an Image Metadata Generator and Editor application which enables the computer-assisted generation and editing of Dublin Core-based metadata descriptions for both digital images and regions within those images. When the images are saved in the PNG format, the associated metadata is saved within the image file.
This approach then enables existing Dublin Core-based search engines, such as DSTC's HotMeta, to search for images through a fairly simple extension. Whenever the extended HotMeta search engine encounters a PNG image, it opens the file, retrieves the metadata and stores this plus any additional information within its indexed repository enabling image searches.
Finally we describe an image browsing method which exploits the metadata embedded in thumbnail PNG images. Clicking on a PNG thumbnail runs a CGI script which opens the thumbnail image, extracts the metadata and displays the full-scale JPEG image with annotated image map and other associated metadata, eliminating the need for backend databases or static web pages.
This prototype system has been developed using digital images of historical photographs from the State Library of Queensland's (SLQ) John Oxley Library. In particular, we have used historical photographs of Queensland from the William Boag Photographic Collection from the 1870s [HREF2].
Through this work we hope to demonstrate that embedding standardized metadata within images provides the following advantages over existing online image databases:
In addition to demonstrating the advantages of the embedded metadata approach to image indexing, as listed above, the project had the following objectives:
Although we are aware of the problems with applying the Dublin Core element set and their qualifiers to resources such as images which often have multiple digital surrogates, we decided to use Dublin Core for the initial prototype for the following reasons:
Section 3.2 below examines some of the problems associated with applying the DC element set to image resources and specific Dublin Core element qualifiers necessary to satisfy our image description requirements.
DC.Date
Date is particularly problematic for images which exist in multiple formats e.g., the original photograph or physical object and multiple digital surrogates. Does it represent the date on which the photograph was taken, scanned or put online? For our application we chose the following representations:
DC.Creator, DC.Publisher, DC.Contributor
The DC Agent Working Group [HREF11] is attempting to resolve the confusion over what to put in each of these fields. With respect to images, is the Creator the creator of the photograph or the person who scanned it to create the digital surrogate or the person who put the digital surrogate online? In which DC term should the names of these contributors go? For our application, the Creator is the photographer, 'William Boag' and the Publisher is the 'State Library of Queensland'.
DC.Type
This defines the category of the resource. For the sake of interoperability, Type should be selected from a hierarchy of enumerated lists. The most recent report by the DC Type Working Group suggests an enumerated list for Type values which includes the image type [HREF12]. We suggest that the 'Image' type be further specified by another hierarchical enumerated list as shown:
DC.Format
This represents the data format of the resource and can be used to identify the software and possibly hardware that might be needed to display or operate the resource. For the sake of interoperability, the unqualified Format element should be selected from the IANA list of Internet Media Types [HREF13].
Format = IMT mime type e.g. image/gif, image/jpeg
For images, the top-level description may also have to provide image-specific information such as file size (Kb), image dimensions/spatial resolution (width and height in pixels), and color information for each version of the image.
DC.Relation
DC.Relation can be used to specify the location of different versions of an image resource. For example consider the thumbnail version (myimage.gif) of a full-sized image (myimage.jpg):
The Relation qualifiers isPartOf and hasPart can be used to define image items within a collection or to specify spatial regions within an image. For example:
DC.Coverage
The recommended and most used qualifiers for Coverage are PeriodName and PlaceName.
However for images, the Coverage qualifiers, Coverage.rect, Coverage.circle, Coverage.point, Coverage.poly can be used to describe the spatial locations and shapes of regions within an image. Given the outline of a region, annotations or descriptions can then be attached to these regions.
When displaying the image and its metadata through a Web browser, image maps can be created from the second-level region metadata. For example:
<MAP NAME="MyMap">
<AREA SHAPE="poly" HREF="Region1_metadata.html" COORDS="131,294,395,294,395,330,171,330">
<AREA SHAPE="circle" HREF="Region2_metadata.html" COORDS="234,349,15">
<AREA SHAPE="point" HREF="Region3_metadata.html" COORDS="234,349">
<AREA SHAPE="rect" HREF="Region4_metadata.html" COORDS="234,349,361,366">
</MAP>
<IMG USEMAP="#MyMap" SRC="MyImage.jpg">
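The mapping from Coverage qualifiers to image-map markup can be sketched as a small function. This is an illustration only, not the code used in the prototype: the function name `build_image_map` and the `(qualifier, coords, href)` tuple layout are assumptions. Only rect, circle and poly, the shapes defined for AREA in HTML 4, are handled; other qualifiers such as Coverage.point are skipped in this sketch.

```python
from html import escape

# Coverage qualifier -> HTML AREA shape name.
AREA_SHAPES = {
    "Coverage.rect": "rect",
    "Coverage.circle": "circle",
    "Coverage.poly": "poly",
}


def build_image_map(map_name, image_src, regions):
    """Build MAP/AREA/IMG markup from (qualifier, coords, href) tuples."""
    lines = [f'<MAP NAME="{escape(map_name)}">']
    for qualifier, coords, href in regions:
        shape = AREA_SHAPES.get(qualifier)
        if shape is None:
            continue  # qualifier with no matching AREA shape in this sketch
        lines.append(
            f'<AREA SHAPE="{shape}" HREF="{escape(href)}" COORDS="{escape(coords)}">'
        )
    lines.append("</MAP>")
    lines.append(f'<IMG USEMAP="#{escape(map_name)}" SRC="{escape(image_src)}">')
    return "\n".join(lines)


markup = build_image_map(
    "MyMap",
    "MyImage.jpg",
    [("Coverage.rect", "234,349,361,366", "Region4_metadata.html")],
)
```

Escaping every attribute value keeps metadata entered by a cataloguer from breaking the generated page.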
By top-level metadata, we are referring to the metadata description for the full-sized online master JPEG image. The top-level scheme consists of the 15 Dublin Core elements (with certain image-specific qualifiers). The second-level region metadata consists of a simple four-element subset of the DC element set. Figure 1 below illustrates the proposed data model for the structured image metadata.
[Figure 1: Proposed data model for the structured image metadata]
An image map has also been created which attaches metadata to a particular spatial region of the image. This has been defined as Region 1 and is a rectangular region around the head of the woman on the far left of the photograph. Moving the mouse over this region displays the associated region metadata.
Top Level Metadata Description for Complete Image
Title: A selector and his family, probably in the Beenleigh district, 1872
Creator: William Boag
Subject: Photograph collection - Queensland
Description: The difficulties faced by a family in the Queensland bush included poor roads, an unreliable mail service and dense, vine-matted scrub. For many years, a selector's staple diet was salted meat (salt horse) and pumpkins. For several months, a woman and her children might be alone in their stringy-bark hut while her husband went off to split shingles or to earn extra money on a cattle property.
Date.created: 1872
Date.recordCreated: 1996
Date.placedOnline: 1997
Publisher: State Library of Queensland
Type: image.photograph
Format: image/jpg
Format.fileSize: 50.6Kb
Format.dimensions: 672 x 512
Format.colorpalette: grayscale
Identifier: http://archive.dstc.edu.au/RDU/SLQ/boag/20248.jpg
Source: BOAG negative no. 906
Language: en
Relation.isPartOf: http://www.slq.qld.gov.au/jol/boag.htm
Relation.hasParts: Region1
Relation.hasFormat: http://archive.dstc.edu.au/RDU/SLQ/boag/20248.gif
Coverage: Beenleigh region, Queensland, 1872
Rights: http://www.slq.qld.gov.au/cright.htm
Secondary Level Metadata for Region1
Identifier: Region1
Title: Annie Dickson
Description: Wife of James Dickson and mother to their 13 children.
Coverage.rect: 495,207,546,263
Relation.isPartOf: http://archive.dstc.edu.au/RDU/SLQ/boag/20248.jpg
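To make the two-level structure concrete, the sketch below reads a record in the `Element.qualifier: value` layout used in the listing above into a nested dictionary. The line-oriented layout is simply how the record is printed here; the serialization actually stored inside the PNG file is not specified in this section, and `parse_record` is a hypothetical helper.

```python
def parse_record(text: str) -> dict:
    """Map each DC element to {qualifier: value}; '' marks the unqualified value."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and headings
        key, _, value = line.partition(":")  # split at the first colon only
        element, _, qualifier = key.strip().partition(".")
        # A repeated element/qualifier pair overwrites the earlier value.
        record.setdefault(element, {})[qualifier] = value.strip()
    return record


REGION1 = """\
Identifier: Region1
Title: Annie Dickson
Description: Wife of James Dickson and mother to their 13 children.
Coverage.rect: 495,207,546,263
Relation.isPartOf: http://archive.dstc.edu.au/RDU/SLQ/boag/20248.jpg
"""

region = parse_record(REGION1)
# Corner coordinates of the rectangular region, as integers.
corners = tuple(int(n) for n in region["Coverage"]["rect"].split(","))
```

Splitting at the first colon only keeps URL values such as the `Relation.isPartOf` target intact.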
The original idea behind the PNG Image Metadata Editor and Generator was to extend DSTC's Reggie application [HREF14], a metadata generator and editor for textual documents, to images. A key difference between the Reggie application and the "Peggie" application was the need for an integrated image display window. This enables the user to simultaneously view the image and input or edit the metadata descriptions.
Users can open existing GIF or JPEG images, enter the corresponding metadata and then save the image and metadata together as a PNG file. Alternatively, users can edit existing metadata by opening a previously saved PNG file. When a PNG file is opened, the image is displayed in the image panel and the embedded metadata is displayed in the neighbouring form fields. The input form is generated from an RDF schema which corresponds to the data model in Figure 1. The schema also constrains and validates the user input at Save time. Users can also define spatial regions within images and attach metadata to the specified region.
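Schema-driven validation at Save time might look roughly like the following. Peggie derives its rules from an RDF schema; the plain dictionary used here is a simplified stand-in, and the particular fields, patterns and allowed values in `SCHEMA` are illustrative assumptions rather than the project's actual constraints.

```python
import re

# Hypothetical constraints standing in for the RDF schema definition.
SCHEMA = {
    "Title": {"required": True},
    "Creator": {"required": True},
    "Date.created": {"pattern": r"\d{4}(-\d{2}(-\d{2})?)?"},
    "Format": {"required": True, "enum": {"image/png", "image/jpeg", "image/gif"}},
    "Coverage.rect": {"pattern": r"\d+,\d+,\d+,\d+"},
}


def validate(record: dict) -> list:
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []
    for field, rule in SCHEMA.items():
        value = record.get(field)
        if value is None or value == "":
            if rule.get("required"):
                errors.append(f"{field}: missing required value")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{field}: {value!r} is not an allowed value")
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{field}: {value!r} has the wrong format")
    for field in record:
        if field not in SCHEMA:
            errors.append(f"{field}: not defined in the schema")
    return errors


ok = validate({"Title": "Annie Dickson", "Creator": "William Boag",
               "Date.created": "1872", "Format": "image/png"})
bad = validate({"Title": "", "Creator": "William Boag", "Format": "text/html"})
```

Returning all errors at once, rather than stopping at the first, suits a form-based editor where the user corrects several fields before saving again.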
Figure 2 illustrates the user interface of the Peggie application.
[Figure 2: User interface of the Peggie application]
Existing Web-based image search and retrieval systems can be classified into four types:
DSTC's existing HotMeta Search Engine crawls over specified web sites, extracting and indexing metadata from embedded HTML Meta tags and saving it in a metadata repository. A simple extension has been made to HotMeta to enable searching for images. Whenever the search engine encounters a PNG image, it opens it, extracts the metadata (if it exists) and stores it in the indexed metadata repository.
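The extraction step performed by the crawler can be sketched in a few lines of standard-library Python. This is not HotMeta's code: it assumes the metadata sits in standard PNG `tEXt` chunks, and the function name `read_text_chunks` is hypothetical. An image without textual chunks simply yields an empty dictionary, matching the "if it exists" case.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_chunks(png: bytes) -> dict:
    """Walk the chunk list and collect keyword/text pairs from tEXt chunks."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    metadata = {}
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(png):
        # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", png[pos:pos + 8])
        end = pos + 12 + length
        if end > len(png):
            raise ValueError("truncated chunk")
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[end - 4:end])
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != crc:
            raise ValueError("corrupt chunk")
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            metadata[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos = end
    return metadata
```

The returned dictionary could then be handed to whatever routine already indexes HTML Meta tags, which is why the extension to an existing crawler can stay small.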
The advantage of the HotMeta PNG image search engine over the image search engines described above is that it makes detailed metadata, usually available only through site-specific proprietary database access, accessible to wide-scale search engines. This approach can give the large search engines much richer metadata than is currently available via the ALT attribute.
An online demonstration of the HotMeta PNG version is available [HREF23]. Figures 3 and 4 below illustrate the Search Interface and Query Results from a simple search for the string 'selector'.
[Figure 3: HotMeta search interface]
[Figure 4: Query results for a search on the string 'selector']
A browse interface was built which consists of a single web page containing the complete collection as PNG thumbnail images with embedded metadata [HREF24]. Clicking on a thumbnail PNG image runs a CGI script which opens the image, extracts the metadata and dynamically generates a web page displaying the full-sized JPEG image with image maps (if specified) and the associated metadata information.
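The page-generation half of such a script can be approximated as below. It is a sketch under stated assumptions: the metadata has already been extracted from the thumbnail into a flat dictionary, the `Identifier` value holds the URL of the full-sized JPEG (as in the example record for the Boag collection), and `render_page` is a hypothetical name. Image-map generation is left out for brevity.

```python
from html import escape


def render_page(metadata: dict) -> str:
    """Build an HTML page showing the full-sized image and its metadata."""
    title = escape(metadata.get("Title", "Untitled image"))
    image_url = escape(metadata["Identifier"])  # assumed to point at the JPEG
    rows = "\n".join(
        f"<DT>{escape(name)}</DT><DD>{escape(value)}</DD>"
        for name, value in metadata.items()
    )
    return (
        f"<HTML><HEAD><TITLE>{title}</TITLE></HEAD>\n"
        "<BODY>\n"
        f'<IMG SRC="{image_url}" ALT="{title}">\n'
        f"<DL>\n{rows}\n</DL>\n"
        "</BODY></HTML>"
    )


page = render_page({
    "Title": "A selector and his family",
    "Creator": "William Boag",
    "Identifier": "http://archive.dstc.edu.au/RDU/SLQ/boag/20248.jpg",
})
```

A CGI script would write a `Content-Type: text/html` header and a blank line before this page; since everything is derived from the thumbnail at request time, no per-image HTML file or database row has to be maintained.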
The advantages of this approach are two-fold:
Figure 5 is a screen dump of the browse interface for the Boag Photographic collection. Figure 6 shows the dynamically-generated full-size image and associated metadata which is displayed when the user clicks on the PNG thumbnail for Plate 103 of the collection. Plate 35 of the collection includes an example of a dynamically-generated image map.
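[Figure 5: Browse interface for the Boag Photographic Collection]
[Figure 6: Dynamically generated full-size image and associated metadata for Plate 103]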
We have developed an application which can be used by image librarians to quickly and easily create and store standardized embedded metadata descriptions within their image collections to improve their discovery and retrieval over the WWW. These descriptions can be used by Dublin Core-based Internet search engines to increase the discovery of these images or to dynamically generate webpages displaying full-screen master images and their associated detailed metadata descriptions.
We have demonstrated the following advantages of embedded standardized metadata for the resource discovery of images:
The major disadvantage of this approach is the difficulty associated with performing a batch modification of the metadata for a large collection of images. For example, if the address of the publisher of a large image collection were to change, it would be difficult to apply a blanket change across all of the images. By contrast, if the metadata were stored in a database, this would be a relatively simple procedure.
Future Work includes:
The authors wish to acknowledge the valuable contributions which discussions with Dan Brickley, Dave Beckett and Carl Lagoze have made to this work.
The authors also wish to acknowledge that this work was carried out within the Cooperative Research Centre for Research Data Networks established under the Australian Government's Cooperative Research Centre (CRC) Program and acknowledge the support of CITEC and the Distributed Systems Technology CRC under which the work described in this paper is administered.