CA Makes a distinction between archival description of the record at hand and documentation of the context of its creation. Argues the importance of the latter in establishing the evidentiary value of records, and criticizes ISAD(G) for its failure to account for context. "(1) The subject of documentation is, first and foremost, the activity that generated the records, the organizations and individuals who used the records, and the purposes to which the records were put. (2) The content of the documentation must support requirements for the archival management of records, and the representations of data should support life cycle management of records. (3) The requirements of users of archives, especially their personal methods of inquiry, should determine the data values in documentation systems and guide archivists in presenting abstract models of their systems to users." (p. 45-46)
Phrases
<P1> [T]he ICA Principles rationalize existing practice -- which the author believes as a practical matter we cannot afford; which fail to provide direct access for most archives users; and which do not support the day-to-day information requirements of archivists themselves. These alternatives are also advanced because of three, more theoretical, differences with the ICA Principles: (1) In focusing on description rather than documentation, they overlook the most salient characteristic of archival records: their status as evidence. (2) In proposing specific content, they are informed by the bibliographic tradition rather than by concrete analysis of the way in which information is used in archives. (3) In promoting data value standardization without identifying criteria or principles by which to identify appropriate language or structural links between the objects represented by such terms, they fail adequately to recognize that the data representation rules they propose reflect only one particular, and a limiting, implementation. (p. 33-34) <P2> Archives are themselves documentation; hence I speak here of "documenting documentation" as a process the objective of which is to construct a value-added representation of archives, by means of strategic information capture and recording into carefully structured data and information access systems, as a mechanism to satisfy the information needs of users including archivists. Documentation principles lead to methods and practices which involve archivists at the point, and often at the time, of records creation. In contrast, archival description, as described in the ICA Principles[,] is "concerned with the formal process of description after the archival material has been arranged and the units or entities to be described have been determined." 
(1.7) I believe documentation principles will be more effective, more efficient and provide archivists with a higher stature in their organizations than the post accessioning description principles proposed by the ICA. <warrant> (p. 34) <P3> In the United States, in any case, there is still no truly theoretical formulation of archival description principles that enjoys a widespread adherence, in spite of the acceptance of rules for description in certain concrete application contexts. (p. 37) <P4> [T]he MARC-AMC format and library bibliographic practices did not adequately reflect the importance of information concerning the people, corporate bodies and functions that generated records, and the MARC Authority format did not support appropriate recording of such contexts and relations. <warrant> (p. 37) <P5> The United States National Archives, even though it had contributed to the data dictionary which led to the MARC content designation, all the data which it believed in 1983 that it would want to interchange, rejected the use of MARC two years later because it did not contain elements of information required by NARA for interchange within its own information systems. <warrant> (p. 37) <P6> [A]rchivists failed to understand then, just as the ISAD(G) standard fails to do now, that rules for content and data representation make sense in the context of the purposes of actual exchanges or implementation, not in the abstract, and that different rules or standards for end-products may derive from the same principles. (p. 
38) <P7> After the Committee on Archival Information Exchange of the Society of American Archivists was confronted with proposals to adopt many different vocabularies for a variety of different data elements, a group of archivists who were deeply involved in standards and description efforts within the SAA formed an Ad Hoc Working Group on Standards for Archival Description (WGSAD) to identify what types of standards were needed in order to promote better description practices. WGSAD concluded that existing standards were especially inadequate to guide practice in documenting contexts of creation. Since then, considerable progress has been made in developing frameworks for documentation, archival information systems architecture and user requirements analysis, which have been identified as the three legs on which the documenting documentation platform rests. <warrant> (p. 38) <P8> Documentation of organizational activity ought to begin long before records are transferred to archives, and may take place even before any records are created -- at the time when new functions are assigned to an organization. (p. 39) <P9> It is possible to identify records which will be created and their retention requirements before they are created, because their evidential value and informational content are essentially predetermined. (p. 39) <P10> Archivists can actively intervene through regulation and guidance to ensure that the data content and values depicting activities and functions are represented in such a way that will make them useful for subsequent management and retrieval of the records resulting from these activities. This information, together with systems documentation, defines the immediate information system context out of which the records were generated, in which they are stored, and from which they were retrieved during their active life. (p.
39) <P11> Documentation of the link between data content and the context of creation and use of the records is essential if records (archives or manuscripts) are to have value as evidence. (p. 39) <P12> [C]ontextual documentation capabilities can be dramatically improved by having records managers actively intervene in systems design and implementation.  The benefits of proactive documentation of the context of records creation, however, are not limited to electronic records; the National Archives of Canada has recently revised its methods of scheduling to ensure that such information about important records systems and contexts of records creation will be documented earlier. <warrant> (p. 39) <P13> Documentation of functions and of information systems can be conducted using information created by the organization in the course of its own activity, and can be used to ensure the transfer of records to archives and/or their destruction at appropriate times. It ensures that data about records which were destroyed as well as those which were preserved will be kept, and it takes advantage of the greater knowledge of records and the purposes and methods of day-to-day activity that exist closer to the events. (p. 40) <P14> The facts of processing, exhibiting, citing, publishing and otherwise managing records becomes significant for their meaning as records, which is not true of library materials. (p. 41) <P15> [C]ontent and data representation requirements ought to be derived from analysis of the uses to which such systems must be put, and should satisfy the day to day information requirements of archivists who are the primary users of archives, and of researchers using archives for primary evidential purposes. (p. 41) <P16> The ICA Commission proposes a principle by which archivists would select data content for archival descriptions, which is that "the structure and content of representations of archival material should facilitate information retrieval." 
(5.1) Unfortunately, it does not help us to understand how the Commission selected the twenty-five elements of information identified as its standard, or how we could apply the principle to the selection of additional data content. It does, however, serve as a prelude to the question of which principles should guide archivists in choosing data values in their representations. (p. 42) <P17> Libraries have found that subject access based on titles, tables of contents, abstracts, indexes and similar formal subject analysis by-products of publishing can support most bibliographic research, but the perspectives brought to materials by archival researchers are both more varied and likely to differ from those of the records creators. (p. 43) <P18> The user should not only be able to employ a terminology and a perspective which are natural, but also should be able to enter the system with a knowledge of the world being documented, without knowing about the world of documentation. (p. 44) <P19> Users need to be able to enter the system through the historical context of activity, construct relations in that context, and then seek avenues down into the documentation. This frees them from trying to imagine what records might have survived -- documentation assists the user to establish the non-existence of records as well as their existence -- or to fathom how archivists might have described records which did survive. (p. 44) <P20> When they departed from the practices of Brooks and Schellenberg in order to develop means for the construction of union catalogues of archival holdings, American archivists were not defining new principles, but inventing a simple experiment. After several years of experience with the new system, serious criticisms of it were being leveled by the very people who had first devised it. (p. 45)
Conclusions
RQ "In short, documentation of the three aspects of records creation contexts (activities, organizations and their functions, and information systems), together with representation of their relations, is essential to the concept of archives as evidence and is therefore a fundamental theoretical principle for documenting documentation. Documentation is a process that captures information about an activity which is relevant to locating evidence of that activity, and captures information about records that are useful to their ongoing management by the archival repository. The primary source of information is the functions and information systems giving rise to the records, and the principal activity of the archivist is the manipulation of data for reference files that create richly-linked structures among attributes of the records-generating context, and which point to the underlying evidence or record." (p. 46)
Type
Journal
Title
The Management of Digital Data: A metadata approach
CA "Super-metadata may well play a crucial role both in facilitating access to DDOs and in providing a means of selecting and managing the maintenance of these DDOs over time."
Phrases
<P1> The preservation of the intellectual content of DDOs brings into focus a major issue: "the integrity and authenticity of the information as originally recorded" (Graham, 1997). (p. 365) <P2> The emergence of dynamic and living DDOs is presenting challenges to the conventional understanding of the preservation of digital resources and is forcing many organizations to reevaluate their strategies in the light of these rapid advances in information sources. The use of appropriate metadata is recognized to be essential in ensuring continued access to dynamic and living DDOs, but the standards for such metadata are not yet fully understood or developed. (p. 369)
Conclusions
RQ How can we decide what to preserve? How can we assure long-term access? What will be the cost of electronic archiving? Which metadata schema will be in use 10 years from now, and how will migration be achieved?
Type
Electronic Journal
Title
A Spectrum of Interoperability: The Site for Science Prototype for the NSDL
"Currently, NSF is funding 64 projects, each making its own contribution to the library, with a total annual budget of about $24 million. Many projects are building collections; others are developing services; a few are carrying out targeted research. The NSDL is a broad program to build a digital library for education in science, mathematics, engineering and technology. It is funded by the National Science Foundation (NSF) Division of Undergraduate Education. . . . The Core Integration task is to ensure that the NSDL is a single coherent library, not simply a set of unrelated activities. In summer 2000, the NSF funded six Core Integration demonstration projects, each lasting a year. One of these grants was to Cornell University and our demonstration is known as Site for Science. It is at http://www.siteforscience.org/ [Site for Science]. In late 2001, the NSF consolidated the Core Integration funding into a single grant for the production release of the NSDL. This grant was made to a collaboration of the University Corporation for Atmospheric Research (UCAR), Columbia University and Cornell University. The technical approach being followed is based heavily on our experience with Site for Science. Therefore this article is both a description of the strategy for interoperability that was developed for Site for Science and an introduction to the architecture being used by the NSDL production team."
ISBN
1082-9873
Critical Arguments
CA "[T]his article is both a description of the strategy for interoperability that was developed for the [Cornell University's NSF-funded] Site for Science and an introduction to the architecture being used by the NSDL production team."
Phrases
<P1> The grand vision is that the NSDL become a comprehensive library of every digital resource that could conceivably be of value to any aspect of education in any branch of science and engineering, both defined very broadly. <P2> Interoperability among heterogeneous collections is a central theme of the Core Integration. The potential collections have a wide variety of data types, metadata standards, protocols, authentication schemes, and business models. <P3> The goal of interoperability is to build coherent services for users, from components that are technically different and managed by different organizations. This requires agreements to cooperate at three levels: technical, content and organizational. <P4> Much of the research of the authors of this paper aims at . . . looking for approaches to interoperability that have low cost of adoption, yet provide substantial functionality. One of these approaches is the metadata harvesting protocol of the Open Archives Initiative (OAI) . . . <P5> For Site for Science, we identified three levels of digital library interoperability: Federation; Harvesting; Gathering. In this list, the top level provides the strongest form of interoperability, but places the greatest burden on participants. The bottom level requires essentially no effort by the participants, but provides a poorer level of interoperability. The Site for Science demonstration concentrated on the harvesting and gathering, because other projects were exploring federation. <P6> In an ideal world all the collections and services that the NSDL wishes to encompass would support an agreed set of standard metadata. The real world is less simple. . . . However, the NSDL does have influence. We can attempt to persuade collections to move along the interoperability curve. <warrant> <P7> The Site for Science metadata strategy is based on two principles. The first is that metadata is too expensive for the Core Integration team to create much of it. 
Hence, the NSDL has to rely on existing metadata or metadata that can be generated automatically. The second is to make use of as much of the metadata available from collections as possible, knowing that it varies greatly from none to extensive. Based on these principles, Site for Science, and subsequently the entire NSDL, developed the following metadata strategy: Support eight standard formats; Collect all existing metadata in these formats; Provide crosswalks to Dublin Core; Assemble all metadata in a central metadata repository; Expose all metadata records in the repository for service providers to harvest; Concentrate limited human effort on collection-level metadata; Use automatic generation to augment item-level metadata. <P8> The strategy developed by Site for Science and now adopted by the NSDL is to accumulate metadata in the native formats provided by the collections . . . If a collection supports the protocols of the Open Archives Initiative, it must be able to supply unqualified Dublin Core (which is required by the OAI) as well as the native metadata format. <P9> From a computing viewpoint, the metadata repository is the key component of the Site for Science system. The repository can be thought of as a modern variant of the traditional library union catalog, a catalog that holds comprehensive catalog records from a group of libraries. . . . Metadata from all the collections is stored in the repository and made available to providers of NSDL service.
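The harvesting level of the Site for Science strategy rests on the OAI requirement that every participating collection expose unqualified Dublin Core. As a rough sketch of the consuming side of that agreement, the following standard-library Python parses a ListRecords response into (identifier, Dublin Core fields) pairs; the sample response and identifiers are invented for illustration, not taken from the NSDL:

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses carrying unqualified Dublin Core.
OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"

def parse_oai_dc(response_xml):
    """Extract (identifier, {dc_field: [values]}) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for rec in root.iter(f"{{{OAI}}}record"):
        header = rec.find(f"{{{OAI}}}header")
        identifier = header.findtext(f"{{{OAI}}}identifier")
        dc = {}
        for elem in rec.iter():
            # Any element in the DC namespace is a metadata field (title, subject, ...).
            if elem.tag.startswith(f"{{{DC}}}") and elem.text:
                dc.setdefault(elem.tag.split("}")[1], []).append(elem.text.strip())
        records.append((identifier, dc))
    return records

# Hypothetical sample response; a real harvester would fetch this over HTTP
# and follow resumptionTokens across multiple requests.
SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Plate Tectonics Primer</dc:title>
          <dc:subject>geology</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

records = parse_oai_dc(SAMPLE)
```

Records parsed this way could then be crosswalked and assembled into the central metadata repository the article describes.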
Conclusions
RQ 1 "Can a small team of librarians manage the collection development and metadata strategies for a very large library?" RQ 2 "Can the NSDL actually build services that are significantly more useful than the general web search services?"
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 1
The preservation of digital information for long periods of time is becoming feasible through the integration of archival storage technology from supercomputer centers, data grid technology from the computer science community, information models from the digital library community, and preservation models from the archivist's community. The supercomputer centers provide the technology needed to store the immense amounts of digital data that are being created, while the digital library community provides the mechanisms to define the context needed to interpret the data. The coordination of these technologies with preservation and management policies defines the infrastructure for a collection-based persistent archive. This paper defines an approach for maintaining digital data for hundreds of years through development of an environment that supports migration of collections onto new software systems.
ISBN
1082-9873
Critical Arguments
CA "Supercomputer centers, digital libraries, and archival storage communities have common persistent archival storage requirements. Each of these communities is building software infrastructure to organize and store large collections of data. An emerging common requirement is the ability to maintain data collections for long periods of time. The challenge is to maintain the ability to discover, access, and display digital objects that are stored within an archive, while the technology used to manage the archive evolves. We have implemented an approach based upon the storage of the digital objects that comprise the collection, augmented with the meta-data attributes needed to dynamically recreate the data collection. This approach builds upon the technology needed to support extensible database schema, which in turn enables the creation of data handling systems that interconnect legacy storage systems."
Phrases
<P1> The ultimate goal is to preserve not only the bits associated with the original data, but also the context that permits the data to be interpreted. <warrant> <P2> We rely on the use of collections to define the context to associate with digital data. The context is defined through the creation of semi-structured representations for both the digital objects and the associated data collection. <P3> A collection-based persistent archive is therefore one in which the organization of the collection is archived simultaneously with the digital objects that comprise the collection. <P4> The goal is to preserve digital information for at least 400 years. This paper examines the technical issues that must be addressed and presents a prototype implementation. <P5> Digital object representation. Every digital object has attributes that define its structure, physical context, and provenance, and annotations that describe features of interest within the object. Since the set of attributes (such as annotations) will vary across all objects within a collection, a semi-structured representation is needed. Not all digital objects will have the same set of associated attributes. <P6> If possible, a common information model should be used to reference the attributes associated with the digital objects, the collection organization, and the presentation interface. An emerging standard for a uniform data exchange model is the eXtended Markup Language (XML). <P7> A particular example of an information model is the XML Document Type Definition (DTD) which provides a description for the allowed nesting structure of XML elements. Richer information models are emerging such as XSchema (which provides data types, inheritance, and more powerful linking mechanisms) and XMI (which provides models for multiple levels of data abstraction).
<P8> Although XML DTDs were originally applied to documents only, they are now being applied to arbitrary digital objects, including the collections themselves. More generally, OSDs can be used to define the structure of digital objects, specify inheritance properties of digital objects, and define the collection organization and user interface structure. <P9> A persistent collection therefore needs the following components of an OSD to completely define the collection context: Data dictionary for collection semantics; Digital object structure; Collection structure; and User interface structure. <P10> The re-creation or instantiation of the data collection is done with a software program that uses the schema descriptions that define the digital object and collection structure to generate the collection. The goal is to build a generic program that works with any schema description. <P11> The information for which driver to use for access to a particular data set is maintained in the associated Meta-data Catalog (MCAT). The MCAT system is a database containing information about each data set that is stored in the data storage systems. <P12> The data handling infrastructure developed at SDSC has two components: the SDSC Storage Resource Broker (SRB) that provides federation and access to distributed and diverse storage resources in a heterogeneous computing environment, and the Meta-data Catalog (MCAT) that holds systemic and application or domain-dependent meta-data about the resources and data sets (and users) that are being brokered by the SRB. <P13> A client does not need to remember the physical mapping of a data set. It is stored as meta-data associated with the data set in the MCAT catalog. <P14> A characterization of a relational database requires a description of both the logical organization of attributes (the schema), and a description of the physical organization of attributes into tables. 
For the persistent archive prototype we used XML DTDs to describe the logical organization. <P15> A combination of the schema and physical organization can be used to define how queries can be decomposed across the multiple tables that are used to hold the meta-data attributes. <P16> By using an XML-based database, it is possible to avoid the need to map between semi-structured and relational organizations of the database attributes. This minimizes the amount of information needed to characterize a collection, and makes the re-creation of the database easier. <warrant> <P17> Digital object attributes are separated into two classes of information within the MCAT: System-level meta-data that provides operational information. These include information about resources (e.g., archival systems, database systems, etc., and their capabilities, protocols, etc.) and data objects (e.g., their formats or types, replication information, location, collection information, etc.); Application-dependent meta-data that provides information specific to particular data sets and their collections (e.g., Dublin Core values for text objects). <P18> Internally, MCAT keeps schema-level meta-data about all of the attributes that are defined. The schema-level attributes are used to define the context for a collection and enable the instantiation of the collection on new technology. <P19> The logical structure should not be confused with database schema and are more general than that. For example, we have implemented the Dublin Core database schema to organize attributes about digitized text. The attributes defined in the logical structure that is associated with the Dublin Core schema contains information about the subject, constraints, and presentation formats that are needed to display the schema along with information about its use and ownership. 
<P20> The MCAT system supports the publication of schemata associated with data collections, schema extension through the addition or deletion of new attributes, and the dynamic generation of the SQL that corresponds to joins across combinations of attributes. <P21> By adding routines to access the schema-level meta-data from an archive, it is possible to build a collection-based persistent archive. As technology evolves and the software infrastructure is replaced, the MCAT system can support the migration of the collection to the new technology.
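The idea in P10 and P21 of a generic program that re-creates a collection on new technology from its schema-level metadata can be sketched as follows. This is a minimal, hedged illustration: the schema dictionary, attribute names, and SQLite backend are assumptions for the example, not MCAT's actual formats or interfaces:

```python
import sqlite3

# Hypothetical schema-level metadata, of the kind MCAT is said to keep:
# a collection name plus its attribute names and types.
COLLECTION_SCHEMA = {
    "name": "dublin_core",
    "attributes": [
        ("identifier", "TEXT"),
        ("title", "TEXT"),
        ("creator", "TEXT"),
        ("created", "TEXT"),
    ],
}

def instantiate_collection(conn, schema, objects):
    """Generate DDL from the schema description, then load the digital objects.

    Because the DDL is derived from the stored schema rather than hard-coded,
    the same routine can re-create the collection on a new database system.
    """
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in schema["attributes"])
    conn.execute(f"CREATE TABLE {schema['name']} ({cols})")
    names = [name for name, _ in schema["attributes"]]
    placeholders = ", ".join("?" for _ in names)
    for obj in objects:
        conn.execute(
            f"INSERT INTO {schema['name']} VALUES ({placeholders})",
            [obj.get(name) for name in names],
        )
    return conn

conn = instantiate_collection(
    sqlite3.connect(":memory:"),
    COLLECTION_SCHEMA,
    [{"identifier": "rec-1", "title": "Charter", "creator": "Doe", "created": "1887"}],
)
rows = conn.execute("SELECT title FROM dublin_core").fetchall()
```

When the target database changes, only the connection and the type names in the schema description need revisiting; the digital objects themselves are untouched, which is the migration property the article argues for.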
Conclusions
RQ Collection-Based Persistent Digital Archives - Part 2
SOW
DC "The technology proposed by SDSC for implementing persistent archives builds upon interactions with many of these groups. Explicit interactions include collaborations with Federal planning groups, the Computational Grid, the digital library community, and individual federal agencies." ... "The data management technology has been developed through multiple federally sponsored projects, including the DARPA project F19628-95-C-0194 "Massive Data Analysis Systems," the DARPA/USPTO project F19628-96-C-0020 "Distributed Object Computation Testbed," the Data Intensive Computing thrust area of the NSF project ASC 96-19020 "National Partnership for Advanced Computational Infrastructure," the NASA Information Power Grid project, and the DOE ASCI/ASAP project "Data Visualization Corridor." Additional projects related to the NSF Digital Library Initiative Phase II and the California Digital Library at the University of California will also support the development of information management technology. This work was supported by a NARA extension to the DARPA/USPTO Distributed Object Computation Testbed, project F19628-96-C-0020."
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 2
"Collection-Based Persistent Digital Archives: Part 2" describes the creation of a one million message persistent E-mail collection. It discusses the four major components of a persistent archive system: support for ingestion, archival storage, information discovery, and presentation of the collection. The technology to support each of these processes is still rapidly evolving, and opportunities for further research are identified.
ISBN
1082-9873
Critical Arguments
CA "The multiple migration steps can be broadly classified into a definition phase and a loading phase. The definition phase is infrastructure independent, whereas the loading phase is geared towards materializing the processes needed for migrating the objects onto new technology. We illustrate these steps by providing a detailed description of the actual process used to ingest and load a million-record E-mail collection at the San Diego Supercomputer Center (SDSC). Note that the SDSC processes were written to use the available object-relational databases for organizing the meta-data. In the future, it may be possible to go directly to XML-based databases."
Phrases
<P1> The processes used to ingest a collection, transform it into an infrastructure independent form, and store the collection in an archive comprise the persistent storage steps of a persistent archive. The processes used to recreate the collection on new technology, optimize the database, and recreate the user interface comprise the retrieval steps of a persistent archive. <P2> In order to build a persistent collection, we consider a solution that "abstracts" all aspects of the data and its preservation. In this approach, data object and processes are codified by raising them above the machine/software dependent forms to an abstract format that can be used to recreate the object and the processes in any new desirable forms. <P3> The SDSC infrastructure uses object-relational databases to organize information. This makes data ingestion more complex by requiring the mapping of the XML DTD semi-structured representation onto a relational schema. <P4> The steps used to store the persistent archive were: (1) Define Digital Object: define meta-data, define object structure (OBJ-DTD) --- (A), define object DTD to object DDL mapping --- (B) (2) Define Collection: define meta-data, define collection structure (COLL-DTD) --- (C), define collection DTD structure to collection DDL mapping --- (D) (3) Define Containers: define packing format for encapsulating data and meta-data (examples are the AIP standard, Hierarchical Data Format, Document Type Definition) <P5> In the ingestion phase, the relational and semi-structured organization of the meta-data is defined. No database is actually created, only the mapping between the relational organization and the object DTD.
<P6> Note that the collection relational organization does not have to encompass all of the attributes that are associated with a digital object. Separate information models are used to describe the objects and the collections. It is possible to take the same set of digital objects and form a new collection with a new relational organization. <P7> Multiple communities across academia, the federal government, and standards groups are exploring strategies for managing very large archives. The persistent archive community needs to maintain interactions with these communities to track development of new strategies for data management and storage. <warrant>
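The definition/loading split described in the critical argument can be sketched in miniature for the E-mail collection: a definition-phase mapping from DTD elements to relational column names (infrastructure independent), and a load-phase routine that flattens one semi-structured record into a row. The element and column names here are hypothetical stand-ins, not SDSC's actual e-mail DTD:

```python
import xml.etree.ElementTree as ET

# Definition phase: map (assumed) e-mail DTD elements to relational columns.
# This mapping is the DTD-to-DDL step (B) in the quoted procedure, in spirit.
MAPPING = {
    "from": "sender",
    "to": "recipient",
    "subject": "subject",
    "date": "sent_date",
}

def load_message(xml_record, mapping=MAPPING):
    """Loading phase: flatten one semi-structured e-mail record into a row dict."""
    root = ET.fromstring(xml_record)
    return {column: root.findtext(element) for element, column in mapping.items()}

row = load_message(
    "<message>"
    "<from>ada@example.org</from>"
    "<to>curator@example.org</to>"
    "<subject>Transfer schedule</subject>"
    "<date>1999-03-02</date>"
    "</message>"
)
```

Because the mapping is data rather than code, migrating to a new database (or to an XML-native one, as the authors anticipate) means regenerating the loading step from the same definition, not rewriting the ingest pipeline.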
Conclusions
RQ "The four major components of the persistent archive system are support for ingestion, archival storage, information discovery, and presentation of the collection. The first two components focus on the ingestion of data into collections. The last two focus on access to the resulting collections. The technology to support each of these processes is still rapidly evolving. Hence consensus on standards has not been reached for many of the infrastructure components. At the same time, many of the components are active areas of research. To reach consensus on a feasible collection-based persistent archive, continued research and development is needed. Examples of the many related issues are listed below:
Type
Electronic Journal
Title
Buckets: A new digital technology for preserving NASA research
CA Buckets are information objects designed to reduce dependency on traditional archives and database systems, thereby making them more resilient to the transient nature of information systems.
Phrases
<P1> Another focus of aggregation was including the metadata with data. Through experiences NASA researchers found that metadata tended to "drift" over time, becoming decoupled from the data it described or locked in specific DL systems and hard to extract or share with other systems. (p. 377) <P2> Buckets are designed to imbue the information objects with certain responsibilities, such as display, dissemination, protection, and maintenance of its contents. As such, buckets should be able to work with many DL systems simultaneously, and minimize or eliminate the necessary modification of DL systems to work with buckets. Ideally, buckets should work with everything and break nothing. This philosophy is formalized in the SODA DL model. The objects become "smarter" at the expense of the archives (that become "dumber"), as functionalities generally associated with archives are moved into the data objects themselves. (p. 390)
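As a rough illustration of the SODA idea of "smart" objects and "dumb" archives, here is a minimal, hypothetical bucket in Python. The class, its method names, and the checksum-based protection are this summary's invention for illustration, not NASA's actual bucket API; the point is only that data, metadata, and responsibilities (display, protection) travel together in one object:

```python
import hashlib
import json

class Bucket:
    """A self-describing information object: the data and the metadata that
    describe it are packaged together, so the object can present and verify
    itself without depending on any one archive or DL system."""

    def __init__(self, metadata):
        self.metadata = dict(metadata)
        self.elements = {}

    def add(self, name, payload: bytes):
        # Protection responsibility: record a checksum at ingest so the
        # bucket can later detect corruption of its own contents.
        self.elements[name] = {
            "payload": payload,
            "sha256": hashlib.sha256(payload).hexdigest(),
        }

    def verify(self, name):
        element = self.elements[name]
        return hashlib.sha256(element["payload"]).hexdigest() == element["sha256"]

    def display(self):
        # Display responsibility: the object, not the archive, knows how to
        # present a view of its metadata and contents.
        return json.dumps(
            {"metadata": self.metadata, "contents": sorted(self.elements)},
            indent=2,
        )

bucket = Bucket({"title": "Wind-tunnel report", "creator": "NASA Langley"})
bucket.add("report.pdf", b"%PDF-1.4 ...")
ok = bucket.verify("report.pdf")
```

An archive holding such objects needs only to store and return them; the functionality that would otherwise live in archive software has moved into the object, which is the trade the SODA model describes.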
Conclusions
RQ The creation of high-quality tools for bucket creation and administration is absolutely necessary. The extension of authentication and security measures is key to supporting more technologies. Many applications of this sort of information-object independence remain to be explored.
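The "smart object, dumb archive" idea described above can be sketched in a few lines: the object carries its data, its metadata, and its own behaviors (display, dissemination), so the surrounding repository needs to know very little about it. This is a hypothetical illustration of the concept, not NASA's actual bucket implementation; the class and method names are invented for the sketch.

```python
# Hypothetical sketch of a "bucket": a self-describing information object
# that keeps its metadata coupled to its data and handles its own display
# and dissemination, per the SODA model's smart-object philosophy.
# Not NASA's implementation; names are illustrative only.

class Bucket:
    def __init__(self, identifier):
        self.identifier = identifier
        self.packages = {}   # named packages of content elements
        self.metadata = {}   # metadata travels inside the bucket, so it cannot "drift"

    def add_element(self, package, name, content):
        self.packages.setdefault(package, {})[name] = content

    def set_metadata(self, key, value):
        self.metadata[key] = value

    def display(self):
        # The bucket, not the archive, knows how to present itself.
        lines = [f"Bucket {self.identifier}"]
        for pkg, elements in self.packages.items():
            lines.append(f"  package {pkg}: {', '.join(elements)}")
        return "\n".join(lines)

    def disseminate(self, package, name):
        # Self-service access to content, independent of any one DL system.
        return self.packages[package][name]

bucket = Bucket("report-0001")
bucket.set_metadata("title", "Sample technical report")
bucket.add_element("report", "report.pdf", b"%PDF-")
print(bucket.display())
```

Because the functionality lives in the object, several different DL systems could invoke the same `display` and `disseminate` behaviors without modification, which is the point the annotation attributes to buckets.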
Type
Electronic Journal
Title
A Metadata Framework Developed at the Tsinghua University Library to Aid in the Preservation of Digital Resources
This article provides an overview of work completed at Tsinghua University Library in which a metadata framework was developed to aid in the preservation of digital resources. The metadata framework is used for the creation of metadata to describe resources, and includes an encoding standard used to store metadata and resource structures in information systems. The author points out that the Tsinghua University Library metadata framework provides a successful digital preservation solution that may be an appropriate solution for other organizations as well.
Notes
Well-laid-out diagrams show the structural layers of resources; encoding examples are also included.
ISBN
1082-9873
DOI
10.1045/november2002-niu
Critical Arguements
CA The author delineates the metadata schema implemented at Tsinghua University Library which allows for resource description and preservation.
These award-winning guidelines, released in August 2002, were developed by the RLG EAD Advisory Group to provide practical, community-wide advice for encoding finding aids. They are designed to: facilitate interoperability of resource discovery by imposing a basic degree of uniformity on the creation of valid EAD-encoded documents; encourage the inclusion of particular elements; and develop a set of core data elements.
Publisher
Research Libraries Group
Publication Location
Mountain View, CA, USA
Language
English
Critical Arguements
<CA> The objectives of the guidelines are: 1. To facilitate interoperability of resource discovery by imposing a basic degree of uniformity on the creation of valid EAD-encoded documents and to encourage the inclusion of elements most useful for retrieval in a union index and for display in an integrated (cross-institutional) setting; 2. To offer researchers the full benefits of XML in retrieval and display by developing a set of core data elements to improve resource discovery. It is hoped that by identifying core elements and by specifying "best practice" for those elements, these guidelines will be valuable to those who create finding aids, as well as to vendors and tool builders; 3. To contribute to the evolution of the EAD standard by articulating a set of best practice guidelines suitable for interinstitutional and international use. These guidelines can be applied to both retrospective conversion of legacy finding aids and the creation of new finding aids.  
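The "core elements" idea behind these guidelines can be made concrete with a minimal EAD-style finding aid, built here with Python's standard library. The element names (`ead`, `eadheader`, `eadid`, `archdesc`, `did`, `unittitle`, `unitid`, `physdesc`) follow the EAD tag library, but which elements are treated as "core" in this sketch is illustrative, not a statement of the RLG guidelines' actual element list.

```python
# A minimal EAD-style finding aid skeleton. Element names follow the EAD
# tag library; the selection of elements shown is illustrative only and
# does not reproduce the RLG guidelines' core-element specification.
import xml.etree.ElementTree as ET

ead = ET.Element("ead")
header = ET.SubElement(ead, "eadheader")
ET.SubElement(header, "eadid").text = "example-001"

archdesc = ET.SubElement(ead, "archdesc", level="collection")
did = ET.SubElement(archdesc, "did")
ET.SubElement(did, "unittitle").text = "Sample Papers, 1900-1950"
ET.SubElement(did, "unitid").text = "MS 001"
ET.SubElement(did, "physdesc").text = "2 linear feet"

xml_text = ET.tostring(ead, encoding="unicode")
print(xml_text)
```

Uniform use of a small element set like this is what allows finding aids from many repositories to be retrieved through one union index and displayed in a cross-institutional setting.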
Conclusions
<RQ>
SOW
<DC> "RLG organized the EAD working group as part of our continuing commitment to making archival collections more accessible on the Web. We offer RLG Archival Resources, a database of archival materials; institutions are encouraged to submit their finding aids to this database." ... "This set of guidelines, the second version promulgated by RLG, was developed between October 2001 and August 2002 by the RLG EAD Advisory Group. This group consisted of ten archivists and digital content managers experienced in creating and managing EAD-encoded finding aids at repositories in the United States and the United Kingdom."
Type
Web Page
Title
An Assessment of Options for Creating Enhanced Access to Canada's Audio-Visual Heritage
CA "This project was conducted by Paul Audley & Associates to investigate the feasibility of single window access to information about Canada's audio-visual heritage. The project follows on the recommendations of Fading Away, the 1995 report of the Task Force on the Preservation and Enhanced Use of Canada's Audio-Visual Heritage, and the subsequent 1997 report Search + Replay. Specific objectives of this project were to create a profile of selected major databases of audio-visual materials, identify information required to meet user needs, and suggest models for single-window access to audio-visual databases. Documentary research, some 35 interviews, and site visits to organizations in Vancouver, Toronto, Ottawa and Montreal provided the basis upon which the recommendations of this report were developed."
Type
Web Page
Title
Archiving The Avant Garde: Documenting And Preserving Variable Media Art.
Archiving the Avant Garde is a collaborative project to develop, document, and disseminate strategies for describing and preserving non-traditional, intermedia, and variable media art forms, such as performance, installation, conceptual, and digital art. This joint project builds on existing relationships and the previous work of its founding partners in this area. One example of such work is the Conceptual & Intermedia Arts Online (CIAO) Consortium, a collaboration founded by the BAM/PFA, the Walker Art Center, and Franklin Furnace, that includes 12 other international museums and arts organizations. CIAO develops standardized methods of documenting and providing access to conceptual and other ephemeral intermedia art forms. Another example of related work conducted by the project's partners is the Variable Media Initiative, organized by the Guggenheim Museum, which encourages artists to define their work independently from medium so that the work can be translated once its current medium is obsolete. Archiving the Avant Garde will take the ideas developed in previous efforts and develop them into community-wide working strategies by testing them on specific works of art in the practical working environments of museums and arts organizations. The final project report will outline a comprehensive strategy and model for documenting and preserving variable media works, based on case studies to illustrate practical examples, but always emphasizing the generalized strategy behind the rule. This report will be informed by specific and practical institutional practice, but we believe that the ultimate model developed by the project should be based on international standards independent of any one organization's practice, thus making it adaptable to many organizations. Dissemination of the report, discussed in detail below, will be ongoing and widespread.
Critical Arguements
CA "Works of variable media art, such as performance, installation, conceptual, and digital art, represent some of the most compelling and significant artistic creation of our time. These works are key to understanding contemporary art practice and scholarship, but because of their ephemeral, technical, multimedia, or otherwise variable natures, they also present significant obstacles to accurate documentation, access, and preservation. The works were in many cases created to challenge traditional methods of art description and preservation, but now, lacking such description, they often comprise the more obscure aspects of institutional collections, virtually inaccessible to present day researchers. Without strategies for cataloging and preservation, many of these vital works will eventually be lost to art history. Description of and access to art collections promote new scholarship and artistic production. By developing ways to catalog and preserve these collections, we will both provide current and future generations the opportunity to learn from and be inspired by the works and ensure the perpetuation and accuracy of art historical records. It is to achieve these goals that we are initiating the consortium project Archiving the Avant Garde: Documenting and Preserving Variable Media Art."
Conclusions
RQ "Archiving the Avant Garde will take a practical approach to solving problems in order to ensure the feasibility and success of the project. This project will focus on key issues previously identified by the partners and will leave other parts of the puzzle to be solved by other initiatives and projects in regular communication with this group. For instance, this project realizes that the arts community will need to develop software tools which enable collections care professionals to implement the necessary new description and metadata standards, but does not attempt to develop such tools in the context of this project. Rather, such tools are already being developed by a separate project under MOAC. Archiving the Avant Garde will share information with that project and benefit from that work. Similarly, the prospect of developing full-fledged software emulators is one best solved by a team of computer scientists, who will work closely with members of the proposed project to cross-fertilize methods and share results. Importantly, while this project is focused on immediate goals, the overall collaboration between the partner organizations and their various initiatives will be significant in bringing together the computer science, arts, standards, and museum communities in an open-source project model to maximize collective efforts and see that the benefits extend far and wide."
SOW
DC "We propose a collaborative project that will begin to establish such professional best practice. The collaboration, consisting of the Berkeley Art Museum and Pacific Film Archive (BAM/PFA), the Solomon R. Guggenheim Museum, Rhizome.org, the Franklin Furnace Archive, and the Cleveland Performance Art Festival and Archive, will have national impact due to the urgent and universal nature of the problem for contemporary art institutions, the practicality and adaptability of the model developed by this group, and the significant expertise that this nationwide consortium will bring to bear in the area of documenting and preserving variable media art." ... "We believe that a model informed by and tested in such diverse settings, with broad public and professional input (described below), will be highly adaptable." ..."Partners also represent a geographic and national spread, from East Coast to Midwest to West Coast. This coverage ensures that a wide segment of the professional community and public will have opportunities to participate in public forums, hosted at partner institutions during the course of the project, intended to gather an even broader cross-section of ideas and feedback than is represented by the partners." ... "The management plan for this project will be highly decentralized ensuring that no one person or institution will unduly influence the model strategy for preserving variable media art and thereby reduce its adaptability."
The creation and use of metadata is likely to become an important part of all digital preservation strategies whether they are based on hardware and software conservation, emulation or migration. The UK Cedars project aims to promote awareness of the importance of digital preservation, to produce strategic frameworks for digital collection management policies and to promote methods appropriate for long-term preservation - including the creation of appropriate metadata. Preservation metadata is a specialised form of administrative metadata that can be used as a means of storing the technical information that supports the preservation of digital objects. In addition, it can be used to record migration and emulation strategies, to help ensure authenticity, and to note rights management and collection management data; it will also need to interact with resource discovery metadata. The Cedars project is attempting to investigate some of these issues and will provide some demonstrator systems to test them.
Notes
This article was presented at the Joint RLG and NPO Preservation Conference: Guidelines for Digital Imaging, held September 28-30, 1998.
Critical Arguements
CA "Cedars is a project that aims to address strategic, methodological and practical issues relating to digital preservation (Day 1998a). A key outcome of the project will be to improve awareness of digital preservation issues, especially within the UK higher education sector. Attempts will be made to identify and disseminate: Strategies for collection management ; Strategies for long-term preservation. These strategies will need to be appropriate to a variety of resources in library collections. The project will also include the development of demonstrators to test the technical and organisational feasibility of the chosen preservation strategies. One strand of this work relates to the identification of preservation metadata and a metadata implementation that can be tested in the demonstrators." ... "The Cedars Access Issues Working Group has produced a preliminary study of preservation metadata and the issues that surround it (Day 1998b). This study describes some digital preservation initiatives and models with relation to the Cedars project and will be used as a basis for the development of a preservation metadata implementation in the project. The remainder of this paper will describe some of the metadata approaches found in these initiatives."
Conclusions
RQ "The Cedars project is interested in helping to develop suitable collection management policies for research libraries." ... "The definition and implementation of preservation metadata systems is going to be an important part of the work of custodial organisations in the digital environment."
SOW
DC "The Cedars (CURL exemplars in digital archives) project is funded by the Joint Information Systems Committee (JISC) of the UK higher education funding councils under Phase III of its Electronic Libraries (eLib) Programme. The project is administered through the Consortium of University Research Libraries (CURL) with lead sites based at the Universities of Cambridge, Leeds and Oxford."
Type
Web Page
Title
METS : Metadata Encoding and Transmission Standard
CA "METS, although in its early stages, is already sufficiently established amongst key digital library players that it can reasonably be considered the only viable standard for digital library objects in the foreseeable future. Although METS may be an excellent framework, it is just that and only that. It does not prescribe the content of the metadata itself, and this is a continuing problem for METS and all other schema to contend with if they are to realize their full functionality and usefulness."
Conclusions
RQ The standardization (via some sort of cataloging rules) of the content held by metadata "containers" urgently needs to be addressed. If it is not, the full value of any metadata scheme, no matter how extensible or robust, will not be realized.
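The "framework, not content" point made in this entry can be shown with a small sketch: METS supplies container sections such as `dmdSec` (descriptive) and a `structMap`, and the `mdWrap`/`xmlData` pair wraps a metadata record, but the record inside can follow any schema. Section names follow the METS schema; the embedded Dublin Core-flavored record is an illustrative stand-in, not prescribed by METS.

```python
# Sketch of METS as a container framework. The section names (mets,
# dmdSec, mdWrap, xmlData, structMap, div) follow the METS schema; the
# wrapped record could just as well be MODS, EAD, or a local schema --
# METS itself does not prescribe the metadata content.
import xml.etree.ElementTree as ET

mets = ET.Element("mets")

# Descriptive metadata section: a container whose content METS leaves open.
dmd = ET.SubElement(mets, "dmdSec", ID="DMD1")
wrap = ET.SubElement(dmd, "mdWrap", MDTYPE="DC")
xml_data = ET.SubElement(wrap, "xmlData")
ET.SubElement(xml_data, "title").text = "Sample digital object"

# Structural map: ties divisions of the object to the descriptive record.
struct_map = ET.SubElement(mets, "structMap")
ET.SubElement(struct_map, "div", LABEL="Whole object", DMDID="DMD1")

print(ET.tostring(mets, encoding="unicode"))
```

The continuing problem the annotation identifies lives inside `xmlData`: nothing in the framework governs what a "title" must contain or how it is formulated, which is why content rules are still needed.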
Type
Web Page
Title
Report of the Ad Hoc Committee for Development of a Standardized Tool for Encoding Finding Aids
This report focuses on the development of tools for the description and intellectual control of archives and the discovery of relevant resources by users. Other archival functions, such as appraisal, acquisition, preservation, and physical control, are beyond the scope for this project. The system developed as a result of this report should be usable on stand-alone computers in small institutions, by multiple users in larger organisations, and by local, regional, national, and international networks. The development of such a system should take into account the strategies, experiences, and results of other initiatives such as the European Union Archival Network (EUAN), the Linking and Exploring Authority Files (LEAF) initiative, the European Visual Archives (EVA) project, and the Canadian Archival Information Network (CAIN). This report is divided into five sections. A description of the conceptual structure of an archival information system, described as six layers of services and protocols, follows this introduction. Section three details the functional requirements for the software tool and is followed by a discussion of the relationship of these requirements to existing archival software applications. The report concludes with a series of recommendations that provide a strategy for the successful development, deployment, and maintenance of an Open Source Archival Resource Information System (OSARIS). There are two appendices: a data model and a comparison of the functional requirements statements to several existing archival systems.
Notes
3. Functional Requirements Requirements for Information Interchange 3.2: The system must support the current archival standards for machine-readable data communication, Encoded Archival Description (EAD) and Encoded Archival Context (EAC). A subset of elements found in EAD may be used to exchange descriptions based on ISAD(G) while elements in EAC may be used to exchange ISAAR(CPF)-based authority data.
Publisher
International Council on Archives Committee on Descriptive Standards
Critical Arguements
CA The Ad Hoc Committee agrees that it would be highly desirable to develop a modular, open source software tool that could be used by archives worldwide to manage the intellectual control of their holdings through the recording of standardized descriptive data. Individual archives could combine their data with that of other institutions in regional, national or international networks. Researchers could access this data either via a stand-alone computerized system or over the Internet. The model for this software would be the successful UNESCO-sponsored free library program, ISIS, which has been in widespread use around the developing world for many years. The software, with appropriate supporting documentation, would be freely available via an ICA or UNESCO web site or on CD-ROM. Unlike ISIS, however, the source code and not just the software should be freely available.
Conclusions
RQ "1. That the ICA endorses the functional requirements presented in this document as the basis for moving the initiative forward. 2. That the functional desiderata and technical specifications for the software applications, such as user requirements, business rules, and detailed data models, should be developed further by a team of experts from both ICA/CDS and ICA/ITC as the next stage of this project. 3. That following the finalization of the technical specifications for OSARIS, the requirements should be compared to existing systems and a decision made to adopt or adapt existing software or to build new applications. At that point in time, it will then be possible to estimate project costs. 4. That a solution that incorporates the functional requirements result in the development of several modular software applications. 5. That the implementation of the system should follow a modular strategy. 6. That the development of software applications must include a thorough investigation and assessment of existing solutions beginning with those identified in section four and Appendix B of this document. 7. That the ICA develop a strategy for communicating the progress of this project to members of the international archival community on a regular basis. This would include the distribution of progress reports in multiple languages. The communication strategy must include a two-way exchange of ideas. The project will benefit strongly from the ongoing comments, suggestions, and input of the members of the international archival community. 8. That a test-bed be developed to allow the testing of software solutions in a realistic archival environment. 9. That the system specifications, its documentation, and the source codes for the applications be freely available. 10. That training courses for new users, ongoing education, and web-based support groups be established. 11. That promotion of the software be carried out through the existing regional infrastructure of ICA and through UNESCO. 12. That an infrastructure for ongoing maintenance, distribution, and technical support be developed. This should include a web site to download software and supporting documentation. The ICA should also establish and maintain a mechanism for end-users to recommend changes and enhancements to the software. 13. That the ICA establishes and maintains an official mechanism for regular review of the software by an advisory committee that includes technical and archival experts."
SOW
DC "The development of such a system should take into account the strategies, experiences, and results of other initiatives such as the European Union Archival Network (EUAN), the Linking and Exploring Authority Files (LEAF) initiative, the European Visual Archives (EVA) project, and the Canadian Archival Information Network (CAIN)."
CA The metadata necessary for successful management and use of digital objects is both more extensive than and different from the metadata used for managing collections of printed works and other physical materials. Without structural metadata, the page image or text files comprising the digital work are of little use, and without technical metadata regarding the digitization process, scholars may be unsure of how accurate a reflection of the original the digital version provides. For internal management purposes, a library must have access to appropriate technical metadata in order to periodically refresh and migrate the data, ensuring the durability of valuable resources.
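The distinction drawn above can be illustrated with a small record: structural metadata binds individual page files into a usable work, while technical metadata records how the digitization was performed, which the library later needs for refreshing and migration. The field names below are hypothetical, chosen for the sketch rather than taken from any particular standard.

```python
# Illustration of structural vs. technical metadata for a digital work.
# Field names are hypothetical, not drawn from a specific standard.
digital_work = {
    "identifier": "book-0001",
    "structural": {
        # Without this ordering, the page image files are of little use.
        "pages": ["p001.tif", "p002.tif", "p003.tif"],
    },
    "technical": {
        # Lets scholars judge fidelity to the original, and lets the
        # library plan periodic refreshing and migration of the data.
        "capture_device": "flatbed scanner (hypothetical model)",
        "resolution_dpi": 400,
        "bit_depth": 8,
        "source_format": "image/tiff",
    },
}

def page_sequence(work):
    """Return the ordered (page number, file) pairs the structure defines."""
    return list(enumerate(work["structural"]["pages"], start=1))

print(page_sequence(digital_work))
```

Note how little of this resembles a catalog record for a printed book, which is the annotation's point: managing digital objects demands metadata that is both more extensive than and different from traditional bibliographic description.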
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Web Page
Title
Descriptive Metadata Guidelines for RLG Cultural Materials
To ensure that the digital collections submitted to RLG Cultural Materials can be discovered and understood, RLG has compiled these Descriptive Metadata Guidelines for contributors. While these guidelines reflect the needs of one particular service, they also represent a case study in information sharing across community and national boundaries. RLG Cultural Materials engages a wide range of contributors with different local practices and institutional priorities. Since it is impossible to find -- and impractical to impose -- one universally applicable standard as a submission format, RLG encourages contributors to follow the suite of standards applicable to their particular community (p.1).
Critical Arguements
CA "These guidelines . . . do not set a new standard for metadata submission, but rather support a baseline that can be met by any number of strategies, enabling participating institutions to leverage their local descriptions. These guidelines also highlight the types of metadata that enhance functionality for RLG Cultural Materials. After a contributor submits a collection, RLG maps that description into the RLG Cultural Materials database using the RLG Cultural Materials data model. This ensures that metadata from the various participant communities is integrated for efficient searching and retrieval" (p.1).
Conclusions
RQ Not applicable.
SOW
DC RLG comprises more than 150 research and cultural memory institutions, and RLG Cultural Materials elicits contributions from countless museums, archives, and libraries from around the world that, although they might retain local descriptive standards and metadata schemas, must conform to the baseline standards prescribed in this document in order to integrate into RLG Cultural Materials. Appendix A presents and evaluates the most common metadata standards with which RLG Cultural Materials is able to work.