
Chapter 11

Linking Data and Metadata: Packaging

11.1 Information Packaging Overview


OAIS describes packaging at a high level, as outlined in Sect. 6.3.4, where it is stressed that the package is a logical structure, i.e. it does not have to be a single file. Even so, it can be useful to package digital objects (let's say files) together in a single file, for example a ZIP [142] file. However, if one simply did that there would be no indication of the relationship between the files, so there must be some mechanism for specifying the relationship. In any practical system one needs to encode the links somehow.

If it is not practical to put everything into a single file, an alternative is to point to one or more of the digital objects using some kind of identifier system. As in the single-file case, one would need to specify the relationships somehow. There are many ways of implementing this kind of packaging and each has its own mechanism for specifying such relationships. Regarding the package as a digital object, another way of thinking about this is that one needs the appropriate Representation Information in order to use the package; however, it seems useful to have some special terminology in this case.

One can imagine that these mechanisms for specifying the relationships between the components of the package could include:

- Naming conventions for the components.
- Reliance on specific software to extract the components.
- Indirection, for example by means of an XML schema which provides the semantics to distinguish different components. Of course the schema would need its own Representation Information, in particular the semantics associated with the element names.
- General relationship techniques such as RDF. Again there would need to be additional Representation Information, since the meaning of the tags would have to be specified separately.

A number of techniques have been proposed, including IMS content packaging [143], SOAP [144], METS [145] and XFDU [146].
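The single-file case can be illustrated with a minimal sketch. It assumes a ZIP file that carries, alongside the components, a small JSON manifest stating which component is Representation Information for which. The file names, the manifest layout and the relation name `isRepresentationInformationFor` are all invented for this illustration and are not taken from OAIS or from any of the packaging standards cited above.

```python
import io
import json
import zipfile

# Hypothetical package contents: component name -> bytes.
components = {
    "data/observations.csv": b"time,value\n0,1.5\n1,1.7\n",
    "repinfo/observations-format.txt": b"Column 1: time in seconds. Column 2: value in volts.\n",
}

# The manifest states the relationships explicitly instead of leaving them
# to be guessed from file names. This vocabulary is an assumption of the sketch.
manifest = {
    "relationships": [
        {
            "subject": "repinfo/observations-format.txt",
            "relation": "isRepresentationInformationFor",
            "object": "data/observations.csv",
        }
    ]
}


def build_package(components, manifest):
    """Return the bytes of a ZIP file holding the components plus manifest.json."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for name, payload in components.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def representation_information_for(package_bytes, target):
    """List the components that the manifest declares as RepInfo for `target`."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        declared = json.loads(archive.read("manifest.json"))
    return [
        link["subject"]
        for link in declared["relationships"]
        if link["relation"] == "isRepresentationInformationFor"
        and link["object"] == target
    ]


package = build_package(components, manifest)
print(representation_information_for(package, "data/observations.csv"))
```

Note that the manifest itself then needs Representation Information: a future user must be told what `isRepresentationInformationFor` means, which is the point made above about schemas and RDF tags.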
D. Giaretta, Advanced Digital Preservation, DOI 10.1007/978-3-642-16809-3_11, © Springer-Verlag Berlin Heidelberg 2011


Of these, only XFDU has close connections to OAIS and, in particular, full support for all types of Representation Information. We therefore use XFDU in our examples, but this should not be taken to mean that it is the only way. OAIS describes several package variants, but only the Archival Information Package (AIP) has mandatory contents, and we look at the AIP in detail next.

11.2 Archival Information Packaging


The AIP is a critical element in OAIS. A distinction is made between an Archival Information Unit (AIU) and an Archival Information Collection (AIC), both of which are special types of AIP (Fig. 11.1). There is an analogy here with what were termed Simple Objects and Composite Objects in Sect. 4.1. OAIS defines:

Archival Information Collection (AIC): An Archival Information Package whose Content Information is an aggregation of other Archival Information Packages.

Archival Information Unit (AIU): An Archival Information Package where the archive chooses not to break down the Content Information into other Archival Information Packages. An AIU can consist of multiple digital objects (e.g., multiple files).

This shows that an AIC is a Composite Object, and the AIU could in some ways be described as a Simple Object, although clearly it has components. For further details of the useful terminology associated with AICs the reader should consult OAIS.
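The AIU/AIC distinction can be pictured as a composite data structure. The following is only a sketch: the class names follow the OAIS terms, but the `identifier` field, the `count_units` method and the example values are hypothetical and are not part of OAIS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArchivalInformationPackage:
    """Common base for both specialisations; `identifier` is a hypothetical field."""
    identifier: str

    def count_units(self) -> int:
        raise NotImplementedError


@dataclass
class ArchivalInformationUnit(ArchivalInformationPackage):
    # An AIU may hold several digital objects (e.g. several files), but the
    # archive chooses not to break it down into further AIPs.
    digital_objects: List[str] = field(default_factory=list)

    def count_units(self) -> int:
        return 1


@dataclass
class ArchivalInformationCollection(ArchivalInformationPackage):
    # The Content Information of an AIC is an aggregation of other AIPs,
    # which may themselves be AIUs or further AICs.
    members: List[ArchivalInformationPackage] = field(default_factory=list)

    def count_units(self) -> int:
        return sum(member.count_units() for member in self.members)


scene_1 = ArchivalInformationUnit("aiu-1", ["image.tif", "image.hdr"])
scene_2 = ArchivalInformationUnit("aiu-2", ["image.tif"])
sub_collection = ArchivalInformationCollection("aic-sub", [scene_2])
mission = ArchivalInformationCollection("aic-mission", [scene_1, sub_collection])
print(mission.count_units())
```

Here `scene_1` has two files yet counts as one unit, which mirrors the remark that an AIU has components but is not decomposed into other AIPs.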

Fig. 11.1 Specialisations of AIP


11.3 XFDU
Much of the packaging described in Part II uses the XFDU and, although this is not the only possible packaging technique, it is convenient to provide a little more detail here. XFDU has been standardized and well documented by CCSDS, with the idea of supporting OAIS terminology from its conception. One key feature is the flexibility it allows in terms of which things are pointed to and which are physically inside the XFDU encoding. It has been used in an operational environment by the European Space Agency (ESA) in the form of the Standard Archive Format for Europe (SAFE) [147], a packaging format fully compatible with XFDU. Developing XFDU solutions can be facilitated through existing open-source Java toolkits and APIs, which have been created by ESA and NASA, allowing the construction, editing and analysis of standardized XFDU Information Packages.

The Manifest document shown in Fig. 11.2 contains the information about the relationships between the information that is packaged together. XFDU uses an XML schema to describe this manifest file, which is split into five sections. The packageHeader documents information about the package itself, its versioning, its position in a sequence or volume, and PDI about its existence. The dataObjectSection holds the digital information to be preserved and the metadataSection holds its RepInfo or PDI; together they are used to relate the one to the other. Both data objects and metadata objects can be either connected by reference or encoded within the manifest itself (Fig. 11.3). Each object is assigned an XML identifier, which is used to link objects between the two sections. Objects in both sections can be given built-in classifications or associated with user-defined classification schemes.
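To make the five-section layout concrete, here is a minimal sketch that builds a simplified, XFDU-like manifest with Python's standard `xml.etree.ElementTree` module. The section names and the `REP`/`SYNTAX` labels come from the text and Fig. 11.3. Everything else (the `contentUnit` attribute names, the `version`, `inlineContent` and `reference` elements, the identifiers and the example URI) is an assumption made for illustration. The result has no namespace and has not been validated against the CCSDS schema, which should be consulted for the real element and attribute names.

```python
import xml.etree.ElementTree as ET


def build_manifest(data_uri, repinfo_text):
    """Build a simplified, XFDU-like manifest; NOT validated against the CCSDS schema."""
    root = ET.Element("xfdu")

    # packageHeader: information about the package itself.
    header = ET.SubElement(root, "packageHeader", {"ID": "header1"})
    ET.SubElement(header, "version").text = "1"  # hypothetical child element

    # informationPackageMap: a content unit that links the data object to its
    # RepInfo through XML identifiers (attribute names are assumptions).
    package_map = ET.SubElement(root, "informationPackageMap")
    ET.SubElement(package_map, "contentUnit", {"dataObjectID": "data1", "repID": "rep1"})

    # metadataSection: here the RepInfo is encoded within the manifest itself.
    metadata_section = ET.SubElement(root, "metadataSection")
    rep = ET.SubElement(
        metadata_section,
        "metadataObject",
        {"ID": "rep1", "category": "REP", "classification": "SYNTAX"},
    )
    ET.SubElement(rep, "inlineContent").text = repinfo_text  # hypothetical element

    # dataObjectSection: here the data is connected by reference (a URI).
    data_section = ET.SubElement(root, "dataObjectSection")
    data_object = ET.SubElement(data_section, "dataObject", {"ID": "data1"})
    ET.SubElement(data_object, "reference", {"href": data_uri})  # hypothetical element

    # behaviorSection: present but left empty in this sketch.
    ET.SubElement(root, "behaviorSection")
    return root


manifest = build_manifest("file:data/observations.csv", "CSV: time in seconds, value in volts")
print(ET.tostring(manifest, encoding="unicode"))
```

The sketch shows both options mentioned above: the metadata object is embedded in the manifest, while the data object is only pointed to.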

Fig. 11.2 Conceptual view of an XFDU


[Figure: the xfdu manifest comprises the packageHeader, the informationPackageMap (ContentUnit elements forming the structure map), the metadataSection (metadataObject elements), the dataObjectSection (dataObject elements) and the behaviorSection (behaviorObject elements). Objects carry XML IDs and may point to their content by URI; content units hold metadata category pointers (XML IDs). Metadata categories and classes: REP (DED, SYNTAX, OTHER); PDI (CONTEXT, PROVENANCE, REFERENCE, FIXITY, OTHER); DMD (DESCRIPTION, OTHER); OTHER (OTHER); ANY.]

Fig. 11.3 XFDU manifest logical view

The informationPackageMap records information about content units, which are used to associate data in the dataObjectSection with metadata in the metadataSection. The association is done via XML identifiers, and maps to the OAIS concept of the Content Information Object, the combination of a digital object and its RepInfo.

A diagram of the full XML schema of the XFDU is shown in Fig. 11.4. This schema keeps AIPs consistent and standard while allowing a flexible and adaptable implementation. By extending the XFDU schema to provide domain-specific AIPs it is possible to include additional information while maintaining the standardization and consistency that are two of the main advantages of using XFDU for preservation. ESA has demonstrated this by extending the XFDU schema into SAFE, which includes spacecraft mission-specific information embedded in the XFDU manifest. A toolkit for creating and reading XFDUs is available from the XFDU web site [148] and the GAEL XFDU web site [149].
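The identifier-based association can be sketched as a simple lookup. The manifest below is a simplified stand-in, not a schema-valid XFDU: the attribute names `dataObjectID` and `repID`, the space-separated identifier list and the identifiers themselves are assumptions for illustration. The function pairs each data object with the RepInfo objects its content unit points to, which roughly corresponds to assembling the Content Information.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical manifest; real XFDU manifests are namespaced and
# use the element and attribute names defined by the CCSDS schema.
MANIFEST = """\
<xfdu>
  <informationPackageMap>
    <contentUnit dataObjectID="data1" repID="rep1 rep2"/>
  </informationPackageMap>
  <metadataSection>
    <metadataObject ID="rep1" category="REP" classification="SYNTAX"/>
    <metadataObject ID="rep2" category="REP" classification="DED"/>
    <metadataObject ID="prov1" category="PDI" classification="PROVENANCE"/>
  </metadataSection>
  <dataObjectSection>
    <dataObject ID="data1"/>
  </dataObjectSection>
</xfdu>
"""


def content_information(manifest_xml):
    """Map each data object ID to the (ID, classification) of its RepInfo objects."""
    root = ET.fromstring(manifest_xml)
    metadata = {obj.get("ID"): obj for obj in root.iter("metadataObject")}
    data_ids = {obj.get("ID") for obj in root.iter("dataObject")}

    resolved = {}
    for unit in root.iter("contentUnit"):
        data_id = unit.get("dataObjectID")
        rep_ids = unit.get("repID", "").split()
        # A dangling identifier means the package cannot be interpreted reliably.
        if data_id not in data_ids:
            raise KeyError(f"unknown data object: {data_id}")
        missing = [rep_id for rep_id in rep_ids if rep_id not in metadata]
        if missing:
            raise KeyError(f"unknown metadata objects: {missing}")
        resolved[data_id] = [
            (rep_id, metadata[rep_id].get("classification")) for rep_id in rep_ids
        ]
    return resolved


print(content_information(MANIFEST))
```

Checking for dangling identifiers is a design choice of this sketch: a content unit that points to a missing metadata object would leave the data without the RepInfo needed to interpret it.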

11.3.1 XFDU and TDO


Because both embody packaging techniques, the XFDU structure implements many, perhaps all, of the concepts of the Trustworthy Digital Object (TDO) [8]. However, the latter seems to rely on emulation (see Sect. 7.9), and in particular the UVC (see Sect. 7.9.4.3), as its ultimate preservation technique.


Fig. 11.4 Full XFDU schema diagram


Emulation has its place in preservation but, as we point out in Sect. 7.9, it is limiting, not least because in essence one is restricted to what has been possible with the digital object in the past. Moreover, because the semantics of the digital object are not made explicit in the TDO, even if one could link the emulation to modern applications one would be limited in what new things could be done. The XFDU is not tied in any way to emulation, although an emulator can be one part of the Representation Information in the package. Therefore it is fair to say that the XFDU is a superset of the TDO technical concept.

11.4 Summary
Packaging is an important requirement with many possible solutions. This chapter has tried to elucidate the key considerations and describe in some detail one possible packaging mechanism.
