CA There are many challenges in devising metadata schemas to manage records over time. Continuum thinking provides a conceptual framework to identify these problems.
Phrases
<P1> It is clear from the SPIRT Project definition that recordkeeping and archival control systems have always been about capturing and managing recordkeeping metadata. (p.30) <P2> One of the keys to understanding the Project's approach to what metadata needs to be captured, persistently linked to documentation of social and business activity, and managed through space and time, lies in the continuum view of records. In continuum thinking, [records] are seen not as 'passive objects to be described retrospectively,' but as agents of action, 'active participants in business processes and technologies.' (p.37)
Type
Electronic Journal
Title
The National Digital Information Infrastructure Preservation Program: Expectations, Realities, Choices and Progress to Date
CA The goals of this plan include the continued collecting of materials regardless of evolving digital formats, the long-term preservation of said materials and ensuring access to them for the American people.
Phrases
<P1> There is widespread support for a national initiative in long term preservation of digital content across a very broad range of stakeholder groups outside the traditional scholarly community (p.4) <warrant> <P2> Approaching persistent archiving from the perspective of infrastructure allows system designers to decouple the data storage from the various components that allow users to manage the data. Potentially, any component can be "swapped out" without affecting the rest of the system. Theoretically, many of the technical problems in archiving can be separated into their components, and as innovation occurs, those components can be updated so that the archive remains persistent in the context of rapid change. Similarly, as the storage media obsolesce, the data can be migrated without affecting the overall integrity of the system. (p.7)
Conclusions
RQ Scenario-planning exercises may help expose assumptions that could ultimately be limiting in the future.
CA Describes efforts undertaken at the National Library of New Zealand to ensure preservation of electronic resources.
Phrases
<P1> The National Library Act 1965 provides the legislative framework for the National Library of New Zealand '... to collect, preserve, and make available recorded knowledge, particularly that relating to New Zealand, to supplement and further the work of other libraries in New Zealand, and to enrich the cultural and economic life of New Zealand and its cultural interchanges with other nations.' Legislation currently before Parliament, if enacted, will give the National Library the mandate to collect digital resources for preservation purposes. <warrant> (p. 18) <P2> So, the Library has an organisational commitment and may soon have the legislative environment to support the collection, management and preservation of digital objects. ... The next issue is what needs to be done to ensure that a viable preservation programme can actually be put in place. (p. 18) <P3> As the Library had already begun systematising its approach to resource discovery metadata, development of a preservation metadata schema for use within the Library was a logical next step. (p. 18) <P4> Work on the schema was initially informed by other international endeavours relating to preservation metadata, particularly that undertaken by the National Library of Australia. Initiatives through the CEDARS programme, OCLC/RLG activities and the emerging consensus regarding the role of the OAIS Reference Model ... were also taken into account. <warrant> (p. 18-19) <P5> The Library's Preservation Metadata schema is designed to strike a balance between the principles of preservation metadata, as expressed through the OAIS Information Model, and the practicalities of implementing a working set of preservation metadata. The same incentive informs a recent OCLC/RLG report on the OAIS model. (p. 19) <P6> [I]t is unlikely that anything resembling a comprehensive schema will become available in the short term. However, the need is pressing. (p. 19)
<P7> The development of the preservation metadata schema is one component of an ongoing programme of activities needed to ensure the incorporation of digital material into the Library's core business processes with a view to the long-term accessibility of those resources. <warrant> (p. 19) <P8> The aim of the above activities is for the Library to be acknowledged as a 'trusted repository' for digital material which ensures the viability and authenticity of digital objects over time. (p. 20) <P9> The Library will also have to develop relationships with other organisations that might wish to achieve 'trusted repository' status in a country with a small population base and few agencies of appropriate size, funding and willingness to take on the role.
Conclusions
RQ There are still a number of important issues to be resolved before the Library's preservation programme can be deemed a success, including the need for: higher level of awareness of the need for digital preservation within the community of 'memory institutions' and more widely; metrics regarding the size and scope of the problem; finance to research and implement digital preservation; new skill sets for implementing digital preservation, e.g. running the multiplicity of hardware/software involved, digital conservation/archaeology; agreed international approaches to digital preservation; practical models to match the high level conceptual work already undertaken internationally; co-operation/collaboration between the wider range of agents potentially able to assist in developing digital preservation solutions, e.g. the computing industry; and, last but not least, clarity around intellectual property, copyright, privacy and moral rights.
SOW
DC OAIS emerged out of an initiative spearheaded by NASA's Consultative Committee for Space Data Systems. It has been shaped and promoted by the RLG and OCLC. Several international projects have played key roles in shaping the OAIS model and adapting it for use in libraries, archives and research repositories. OAIS-modeled repositories include the CEDARS Project, Harvard's Digital Repository, Koninklijke Bibliotheek (KB), the Library of Congress' Archival Information Package for audiovisual materials, MIT's D-Space, OCLC's Digital Archive and TERM: the Texas Email Repository Model.
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 1
The preservation of digital information for long periods of time is becoming feasible through the integration of archival storage technology from supercomputer centers, data grid technology from the computer science community, information models from the digital library community, and preservation models from the archivist's community. The supercomputer centers provide the technology needed to store the immense amounts of digital data that are being created, while the digital library community provides the mechanisms to define the context needed to interpret the data. The coordination of these technologies with preservation and management policies defines the infrastructure for a collection-based persistent archive. This paper defines an approach for maintaining digital data for hundreds of years through development of an environment that supports migration of collections onto new software systems.
ISBN
1082-9873
Critical Arguments
CA "Supercomputer centers, digital libraries, and archival storage communities have common persistent archival storage requirements. Each of these communities is building software infrastructure to organize and store large collections of data. An emerging common requirement is the ability to maintain data collections for long periods of time. The challenge is to maintain the ability to discover, access, and display digital objects that are stored within an archive, while the technology used to manage the archive evolves. We have implemented an approach based upon the storage of the digital objects that comprise the collection, augmented with the meta-data attributes needed to dynamically recreate the data collection. This approach builds upon the technology needed to support extensible database schema, which in turn enables the creation of data handling systems that interconnect legacy storage systems."
Phrases
<P1> The ultimate goal is to preserve not only the bits associated with the original data, but also the context that permits the data to be interpreted. <warrant> <P2> We rely on the use of collections to define the context to associate with digital data. The context is defined through the creation of semi-structured representations for both the digital objects and the associated data collection. <P3>A collection-based persistent archive is therefore one in which the organization of the collection is archived simultaneously with the digital objects that comprise the collection. <P4> The goal is to preserve digital information for at least 400 years. This paper examines the technical issues that must be addressed and presents a prototype implementation. <P5>Digital object representation. Every digital object has attributes that define its structure, physical context, and provenance, and annotations that describe features of interest within the object. Since the set of attributes (such as annotations) will vary across all objects within a collection, a semi-structured representation is needed. Not all digital objects will have the same set of associated attributes. <P6> If possible, a common information model should be used to reference the attributes associated with the digital objects, the collection organization, and the presentation interface. An emerging standard for a uniform data exchange model is the eXtended Markup Language (XML). <P7> A particular example of an information model is the XML Document Type Definition (DTD) which provides a description for the allowed nesting structure of XML elements. Richer information models are emerging such as XSchema (which provides data types, inheritance, and more powerful linking mechanisms) and XMI (which provides models for multiple levels of data abstraction). 
<P8> Although XML DTDs were originally applied to documents only, they are now being applied to arbitrary digital objects, including the collections themselves. More generally, OSDs can be used to define the structure of digital objects, specify inheritance properties of digital objects, and define the collection organization and user interface structure. <P9> A persistent collection therefore needs the following components of an OSD to completely define the collection context: Data dictionary for collection semantics; Digital object structure; Collection structure; and User interface structure. <P10> The re-creation or instantiation of the data collection is done with a software program that uses the schema descriptions that define the digital object and collection structure to generate the collection. The goal is to build a generic program that works with any schema description. <P11> The information for which driver to use for access to a particular data set is maintained in the associated Meta-data Catalog (MCAT). The MCAT system is a database containing information about each data set that is stored in the data storage systems. <P12> The data handling infrastructure developed at SDSC has two components: the SDSC Storage Resource Broker (SRB) that provides federation and access to distributed and diverse storage resources in a heterogeneous computing environment, and the Meta-data Catalog (MCAT) that holds systemic and application or domain-dependent meta-data about the resources and data sets (and users) that are being brokered by the SRB. <P13> A client does not need to remember the physical mapping of a data set. It is stored as meta-data associated with the data set in the MCAT catalog. <P14> A characterization of a relational database requires a description of both the logical organization of attributes (the schema), and a description of the physical organization of attributes into tables. 
For the persistent archive prototype we used XML DTDs to describe the logical organization. <P15> A combination of the schema and physical organization can be used to define how queries can be decomposed across the multiple tables that are used to hold the meta-data attributes. <P16> By using an XML-based database, it is possible to avoid the need to map between semi-structured and relational organizations of the database attributes. This minimizes the amount of information needed to characterize a collection, and makes the re-creation of the database easier. <warrant> <P17> Digital object attributes are separated into two classes of information within the MCAT: System-level meta-data that provides operational information. These include information about resources (e.g., archival systems, database systems, etc., and their capabilities, protocols, etc.) and data objects (e.g., their formats or types, replication information, location, collection information, etc.); Application-dependent meta-data that provides information specific to particular data sets and their collections (e.g., Dublin Core values for text objects). <P18> Internally, MCAT keeps schema-level meta-data about all of the attributes that are defined. The schema-level attributes are used to define the context for a collection and enable the instantiation of the collection on new technology. <P19> The logical structure should not be confused with database schema and are more general than that. For example, we have implemented the Dublin Core database schema to organize attributes about digitized text. The attributes defined in the logical structure that is associated with the Dublin Core schema contains information about the subject, constraints, and presentation formats that are needed to display the schema along with information about its use and ownership. 
<P20> The MCAT system supports the publication of schemata associated with data collections, schema extension through the addition or deletion of new attributes, and the dynamic generation of the SQL that corresponds to joins across combinations of attributes. <P21> By adding routines to access the schema-level meta-data from an archive, it is possible to build a collection-based persistent archive. As technology evolves and the software infrastructure is replaced, the MCAT system can support the migration of the collection to the new technology.
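The catalog lookup described in P11 to P13 can be illustrated with a minimal Python sketch. It is not the SDSC SRB or MCAT API: `MetadataCatalog`, `CatalogEntry`, `register`, `resolve`, `migrate`, the driver names, and the paths are all hypothetical, chosen only to show how a client can ask for a data set by logical name while the access driver and physical location are held as metadata in the catalog.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """System-level metadata for one data set (hypothetical fields)."""
    driver: str          # which access driver to use, e.g. "tape_archive"
    physical_path: str   # where the bits currently live
    attributes: dict = field(default_factory=dict)  # application-level metadata


class MetadataCatalog:
    """Toy stand-in for a metadata catalog: logical name -> physical mapping."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name, driver, physical_path, **attributes):
        self._entries[logical_name] = CatalogEntry(driver, physical_path, dict(attributes))

    def resolve(self, logical_name):
        """Return (driver, physical_path); the client never stores these itself."""
        entry = self._entries[logical_name]
        return entry.driver, entry.physical_path

    def migrate(self, logical_name, new_driver, new_physical_path):
        """Record a move to new storage; logical name and attributes are unchanged."""
        entry = self._entries[logical_name]
        entry.driver = new_driver
        entry.physical_path = new_physical_path


catalog = MetadataCatalog()
catalog.register("email/msg-0001", "tape_archive", "/hpss/vol7/0001.xml", creator="unknown")
before = catalog.resolve("email/msg-0001")
catalog.migrate("email/msg-0001", "object_store", "bucket-a/0001.xml")
after = catalog.resolve("email/msg-0001")
```

In this sketch the client keeps using the name `email/msg-0001` before and after the move; only the catalog entry changes, which is the property the quoted passages attribute to keeping the physical mapping in the catalog.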
Conclusions
RQ Collection-Based Persistent Digital Archives - Part 2
SOW
DC "The technology proposed by SDSC for implementing persistent archives builds upon interactions with many of these groups. Explicit interactions include collaborations with Federal planning groups, the Computational Grid, the digital library community, and individual federal agencies." ... "The data management technology has been developed through multiple federally sponsored projects, including the DARPA project F19628-95-C-0194 "Massive Data Analysis Systems," the DARPA/USPTO project F19628-96-C-0020 "Distributed Object Computation Testbed," the Data Intensive Computing thrust area of the NSF project ASC 96-19020 "National Partnership for Advanced Computational Infrastructure," the NASA Information Power Grid project, and the DOE ASCI/ASAP project "Data Visualization Corridor." Additional projects related to the NSF Digital Library Initiative Phase II and the California Digital Library at the University of California will also support the development of information management technology. This work was supported by a NARA extension to the DARPA/USPTO Distributed Object Computation Testbed, project F19628-96-C-0020."
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 2
"Collection-Based Persistent Digital Archives: Part 2" describes the creation of a one million message persistent E-mail collection. It discusses the four major components of a persistent archive system: support for ingestion, archival storage, information discovery, and presentation of the collection. The technology to support each of these processes is still rapidly evolving, and opportunities for further research are identified.
ISBN
1082-9873
Critical Arguments
CA "The multiple migration steps can be broadly classified into a definition phase and a loading phase. The definition phase is infrastructure independent, whereas the loading phase is geared towards materializing the processes needed for migrating the objects onto new technology. We illustrate these steps by providing a detailed description of the actual process used to ingest and load a million-record E-mail collection at the San Diego Supercomputer Center (SDSC). Note that the SDSC processes were written to use the available object-relational databases for organizing the meta-data. In the future, it may be possible to go directly to XML-based databases."
Phrases
<P1> The processes used to ingest a collection, transform it into an infrastructure independent form, and store the collection in an archive comprise the persistent storage steps of a persistent archive. The processes used to recreate the collection on new technology, optimize the database, and recreate the user interface comprise the retrieval steps of a persistent archive. <P2> In order to build a persistent collection, we consider a solution that "abstracts" all aspects of the data and its preservation. In this approach, data object and processes are codified by raising them above the machine/software dependent forms to an abstract format that can be used to recreate the object and the processes in any new desirable forms. <P3> The SDSC infrastructure uses object-relational databases to organize information. This makes data ingestion more complex by requiring the mapping of the XML DTD semi-structured representation onto a relational schema. <P4> The steps used to store the persistent archive were: (1) Define Digital Object: define meta-data, define object structure (OBJ-DTD) --- (A), define object DTD to object DDL mapping --- (B) (2) Define Collection: define meta-data, define collection structure (COLL-DTD) --- (C), define collection DTD structure to collection DDL mapping --- (D) (3) Define Containers: define packing format for encapsulating data and meta-data (examples are the AIP standard, Hierarchical Data Format, Document Type Definition) <P5> In the ingestion phase, the relational and semi-structured organization of the meta-data is defined. No database is actually created, only the mapping between the relational organization and the object DTD.
<P6> Note that the collection relational organization does not have to encompass all of the attributes that are associated with a digital object. Separate information models are used to describe the objects and the collections. It is possible to take the same set of digital objects and form a new collection with a new relational organization. <P7> Multiple communities across academia, the federal government, and standards groups are exploring strategies for managing very large archives. The persistent archive community needs to maintain interactions with these communities to track development of new strategies for data management and storage. <warrant>
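The split between a definition phase and a loading phase can be sketched with Python's standard library. Everything specific here is an assumption made for illustration: the element names, the `OBJECT_TO_COLUMNS` mapping, the `messages` table, and the use of an in-memory SQLite database in place of the object-relational databases the SDSC processes used. A plain dictionary stands in for the object DTD to DDL mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Definition phase (infrastructure independent): describe how the elements of
# the semi-structured object map onto relational columns. No database exists yet.
OBJECT_TO_COLUMNS = {      # hypothetical element -> (column, SQL type)
    "sender": ("sender", "TEXT"),
    "subject": ("subject", "TEXT"),
    "date": ("sent_on", "TEXT"),
}


def build_ddl(table, mapping):
    """Generate CREATE TABLE DDL from the element-to-column mapping."""
    columns = ", ".join(f"{col} {sql_type}" for col, sql_type in mapping.values())
    return f"CREATE TABLE {table} ({columns})"


def load(connection, table, mapping, xml_documents):
    """Loading phase: materialize the collection on a concrete database."""
    connection.execute(build_ddl(table, mapping))
    placeholders = ", ".join("?" for _ in mapping)
    for document in xml_documents:
        root = ET.fromstring(document)
        # Semi-structured input: an element missing from an object becomes NULL.
        row = [root.findtext(element) for element in mapping]
        connection.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)


archived = [
    "<message><sender>a@example.org</sender><subject>Hello</subject>"
    "<date>1999-01-05</date></message>",
    "<message><sender>b@example.org</sender><date>1999-01-06</date></message>",
]
db = sqlite3.connect(":memory:")
load(db, "messages", OBJECT_TO_COLUMNS, archived)
rows = db.execute(
    "SELECT sender, subject, sent_on FROM messages ORDER BY sent_on"
).fetchall()
```

Because the mapping is data and not code, the same archived objects could in principle be loaded under a different mapping to form a new collection, which is the point made in P6.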
Conclusions
RQ "The four major components of the persistent archive system are support for ingestion, archival storage, information discovery, and presentation of the collection. The first two components focus on the ingestion of data into collections. The last two focus on access to the resulting collections. The technology to support each of these processes is still rapidly evolving. Hence consensus on standards has not been reached for many of the infrastructure components. At the same time, many of the components are active areas of research. To reach consensus on a feasible collection-based persistent archive, continued research and development is needed. Examples of the many related issues are listed below:
Type
Electronic Journal
Title
Buckets: A new digital technology for preserving NASA research
CA Buckets are information objects designed to reduce dependency on traditional archives and database systems, thereby making them more resilient to the transient nature of information systems.
Phrases
Another focus of aggregation was including the metadata with data. Through experience, NASA researchers found that metadata tended to "drift" over time, becoming decoupled from the data it described or locked in specific DL systems and hard to extract or share with other systems. (p. 377) Buckets are designed to imbue the information objects with certain responsibilities, such as display, dissemination, protection, and maintenance of its contents. As such, buckets should be able to work with many DL systems simultaneously, and minimize or eliminate the necessary modification of DL systems to work with buckets. Ideally, buckets should work with everything and break nothing. This philosophy is formalized in the SODA DL model. The objects become "smarter" at the expense of the archives (that become "dumber"), as functionalities generally associated with archives are moved into the data objects themselves. (p. 390)
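The idea of moving display, dissemination, and protection into the object itself can be illustrated with a small Python sketch. This is not the NASA bucket implementation or its interface: the `Bucket` class, its method names, and the sample identifiers are hypothetical and only show metadata travelling with the data while the object enforces its own access rules.

```python
class Bucket:
    """Toy self-describing information object (hypothetical, for illustration)."""

    def __init__(self, identifier, metadata):
        self.identifier = identifier
        self.metadata = dict(metadata)   # metadata travels with the data
        self._elements = {}              # element name -> bytes
        self._readers = None             # None means anyone may read

    def add_element(self, name, payload):
        self._elements[name] = payload

    def restrict_to(self, readers):
        """Protection is enforced by the object, not by the archive."""
        self._readers = set(readers)

    def display(self):
        """The bucket renders its own summary without help from a DL system."""
        names = ", ".join(sorted(self._elements))
        return f"{self.identifier}: {self.metadata.get('title', 'untitled')} [{names}]"

    def disseminate(self, name, requester):
        """Hand out one element, refusing requesters who are not allowed."""
        if self._readers is not None and requester not in self._readers:
            raise PermissionError(f"{requester} may not read {name}")
        return self._elements[name]


bucket = Bucket("report-42", {"title": "Wind tunnel results"})
bucket.add_element("report.pdf", b"%PDF-")
bucket.add_element("data.csv", b"run,lift\n1,0.42\n")
bucket.restrict_to({"alice"})
summary = bucket.display()
```

An archive holding such objects only needs to store and return them, which is the sense in which the archive becomes "dumber" as the objects become "smarter".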
Conclusions
RQ The creation of high quality tools for bucket creation and administration is absolutely necessary. The extension of authentication and security measures is key to supporting more technologies. Many applications of this sort of information object independence remain to be explored.
Type
Electronic Journal
Title
Search for Tomorrow: The Electronic Records Research Program of the U.S. National Historical Publications and Records Commission
The National Historical Publications and Records Commission (NHPRC) is a small grant-making agency affiliated with the U.S. National Archives and Records Administration. The Commission is charged with promoting the preservation and dissemination of documentary source materials to ensure an understanding of U.S. history. Recognizing that the increasing use of computers created challenges for preserving the documentary record, the Commission adopted a research agenda in 1991 to promote research and development on the preservation and continued accessibility of documentary materials in electronic form. From 1991 to the present the Commission awarded 31 grants totaling $2,276,665 for electronic records research. Most of this research has focused on two issues of central concern to archivists: (1) electronic record keeping (tools and techniques to manage electronic records produced in an office environment, such as word processing documents and electronic mail), and (2) best practices for storing, describing, and providing access to all electronic records of long-term value. NHPRC grants have raised the visibility of electronic records issues among archivists. The grants have enabled numerous archives to begin to address electronic records problems, and, perhaps most importantly, they have stimulated discussion about electronic records among archivists and records managers.
Publisher
Elsevier Science Ltd
Critical Arguments
CA "The problem of maintaining electronic records over time is big, expensive, and growing. A task force on digital archives established by the Commission on Preservation and Access in 1994 commented that the life of electronic records could be characterized in the same words Thomas Hobbes once used to describe life: 'nasty, brutish, and short' [1]. Every day, thousands of new electronic files are created on federal, state, and local government computers across the nation. A small but important portion of these records will be designated for permanent retention. Government agencies are increasingly relying on computers to maintain information such as census files, land titles, statistical data, and vital records. But how should electronic records with long-term value be maintained? Few government agencies have developed comprehensive policies for managing current electronic records, much less preserving those with continuing value for historians and other researchers. Because of this serious and growing problem, the National Historical Publications and Records Commission (NHPRC), a small grantmaking agency affiliated with the U.S. National Archives and Records Administration (NARA), has been making grants for research and development on the preservation and use of electronic documentary sources. The program is conducted in concert with NARA, which in 1996 issued a strategic plan that gives high priority to mastering electronic records problems in partnership with federal government agencies and the NHPRC."
Phrases
<P1> How can data dictionaries, information resource directory systems, and other metadata systems be used to support electronic records management and archival requirements? <P2> In spite of the number of projects the Commission has supported, only four questions from the research agenda have been addressed to date. Of these, the question relating to requirements for the development of data dictionaries and other metadata systems (question number four) has produced a single grant for a state information locator system in South Carolina, and the question relating to needs for archival education (question 10) has led to two grants to the Society of American Archivists for curricular materials. <P3> Information systems created without regard for these considerations may have deficiencies that limit the usefulness of the records contained on them. <warrant> <P4> The NHPRC has awarded major grants to four institutions over the past five years for projects to develop and test requirements for electronic record keeping: University of Pittsburgh (1993): A working set of functional requirements and metadata specifications for electronic record keeping systems; City of Philadelphia (1995, 1996, and 1997): A project to incorporate a subset of the Pittsburgh metadata specifications into a new human resources information system and other city systems as test cases and to develop comprehensive record keeping policies and standards for the city's information technology systems; Indiana University (1995): A project to develop an assessment tool and methodology for analyzing existing electronic records systems, using the Pittsburgh functional requirements as a model and the student academic record system and a financial system as test cases; Research Foundation of the State University of New York-Albany, Center for Technology in Government (1996): A project to identify best practices for electronic record keeping, including work by the U.S. Department of Defense and the University of British Columbia in addition to the University of Pittsburgh. The Center is working with the state's Adirondack Parks Agency in a pilot project to develop a system model for incorporating record keeping and archival considerations into the creation of networked computing and communications applications.
<P5> No definitive solution has yet been identified for the problems posed by electronic records, although progress has been made in learning what will be needed to design functional electronic record keeping systems. <P6> With the proliferation of digital libraries, the need for long-term storage, migration and retrieval strategies for electronic information has become a priority for a wide variety of information providers. <warrant>
Conclusions
RQ "How best to preserve existing and future electronic formats and provide access to them over time has remained elusive. The answers cannot be found through theoretical research alone, or even through applied research, although both are needed. Answers can only emerge over time as some approaches prove able to stand the test of time and others do not. The problems are large because the costs of maintaining, migrating, and retrieving electronic information continue to be high." ... "Perhaps most importantly, these grants have stimulated widespread discussion of electronic records issues among archivists and record managers, and thus they have had an impact on the preservation of the electronic documentary record that goes far beyond the Commission's investment."
SOW
DC The National Historical Publications and Records Commission (NHPRC) is the outreach arm of the National Archives and makes plans for and studies issues related to the preservation, use and publication of historical documents. The Commission also makes grants to non-Federal archives and other organizations to promote the preservation and use of America's documentary heritage.
Type
Report
Title
Advice: Introduction to the Victorian Electronic Records Strategy (VERS) PROS 99/007 (Version 2)
This document is an introduction to the PROV Standard Management of Electronic Records (PROS 99/007), also known as the VERS Standard. This document provides background information on the goals and the VERS approach to preservation. Nothing in this document imposes any requirements on agencies.
Critical Arguments
CA The Victorian Electronic Records Strategy (VERS) addresses the cost-effective, long-term preservation of electronic records. The structure and requirements of VERS are formally specified in the Standard for the Management of Electronic Records (PROS 99/007) and its five technical specifications. This Advice provides background to the Standard. It covers: the history of the VERS project; the preservation theory behind VERS; how the five specifications support the preservation theory; a brief introduction to the VERS Encapsulated Object (VEO). In this document we distinguish between the record and the content of the record. The content is the actual information contained in the record; for example, the report or the image. The record as a whole contains the record content and metadata that contains information about the record, including its context, description, history, and integrity control.
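The distinction between a record and its content can be made concrete with a short Python sketch. It does not reproduce the VEO format defined in the VERS specifications: JSON, base64, and a SHA-256 digest are used here purely as stand-ins, and the function names `encapsulate`, `verify`, and `content_of` are hypothetical. The sketch only shows content and metadata wrapped as one object, with an integrity check covering both.

```python
import base64
import hashlib
import json


def encapsulate(content, metadata):
    """Wrap content and metadata in one record with a digest over both."""
    body = {
        "metadata": metadata,
        "content": base64.b64encode(content).decode("ascii"),
    }
    canonical = json.dumps(body, sort_keys=True).encode("utf-8")
    return {"body": body, "sha256": hashlib.sha256(canonical).hexdigest()}


def verify(record):
    """Recompute the digest; any change to content or metadata is detected."""
    canonical = json.dumps(record["body"], sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest() == record["sha256"]


def content_of(record):
    """Recover the original content bytes from the record."""
    return base64.b64decode(record["body"]["content"])


record = encapsulate(b"Annual report text", {"title": "Annual report"})
intact = verify(record)
record["body"]["metadata"]["title"] = "Altered title"
tampered_detected = not verify(record)
```

Because the digest covers the metadata as well as the content, altering the title alone is enough to make verification fail, while the content bytes remain recoverable.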
Conclusions
<RQ>
SOW
<DC>Public Record Office Victoria is the archives of the State Government of Victoria. They hold records from the beginnings of the colonial administration of Victoria in the mid-1830s to today and are responsible for ensuring the accountability of the Victoria State Government. 
Type
Report
Title
Management of Electronic Records PROS 99/007 (Version 2)
This document is the Victorian Electronic Records Strategy (VERS) Standard (PROS 99/007). This document is the standard itself and is primarily concerned with conformance. The technical requirements of the Standard are contained in five Specifications.
Accessed Date
August 24, 2005
Critical Arguments
CA VERS has two major goals: the preservation of electronic records and enabling efficient management in doing so. Version 2 has an improved structure, additional metadata elements, requirements for preservation and compliance requirements for agencies. "Export" compliance allows agencies to maintain their records within their own recordkeeping systems and add a module so they can generate the VERS format for export, especially for long term preservation. "Native" compliance is when records are converted to long term preservation format upon registration, which is seen as the ideal approach. ... "The Victorian Electronic Records Strategy (VERS) is designed to assist agencies in managing their electronic records. The strategy focuses on the data or information contained in electronic records, rather than the systems that are used to produce them."
SOW
<DC> "VERS was developed with the assistance of CSIRO, Ernst & Young, the Department of Infrastructure, and records managers across government. The recommendations included in the VERS Final Report issued in March 1999 provide a framework for the management of electronic records." ... "Public Record Office Victoria is the Archives of the State of Victoria. They hold the records from the beginnings of the colonial administration of Victoria in the mid-1830s to today."
1. Also at http://www.access.gpo.gov/uscode/title28a/28a_8_6_.html
2. As amended Feb. 28, 1966, eff. July 1, 1966; Mar. 2, 1987, eff. Aug. 1, 1987; Apr. 30, 1991, eff. Dec. 1, 1991.
CA This is the first of four articles describing Geospatial Standards and the standards bodies working on these standards. This article will discuss what geospatial standards are and why they matter, identify major standards organizations, and list the characteristics of successful geospatial standards.
Conclusions
RQ Which federal and international standards have been agreed upon since this article's publication?
SOW
DC FGDC approved the Content Standard for Digital Geospatial Metadata (FGDC-STD-001-1998) in June 1998. FGDC is a 19-member interagency committee composed of representatives from the Executive Office of the President, Cabinet-level and independent agencies. The FGDC is developing the National Spatial Data Infrastructure (NSDI) in cooperation with organizations from State, local and tribal governments, the academic community, and the private sector. The NSDI encompasses policies, standards, and procedures for organizations to cooperatively produce and share geographic data.
Type
Web Page
Title
Update on the National Digital Infrastructure Initiative
CA Describes progress on a five-year national strategy for preserving digital content.
Conclusions
RQ "These sessions helped us set priorities. Participants agreed about the need for a national preservation strategy. People from industry were receptive to the idea that the public good, as well as their own interests, would be served by coming together to think about long-term preservation. <warrant> They also agreed on the need for some form of distributed or decentralized solution. Like others, they realize that no library can tackle the digital preservation challenge alone. Many parties will need to come together. Participants agreed about the need for digital preservation research, a clearer agenda, a better focus, and a greater appreciation that technology is not necessarily the prime focus. The big challenge might be organizational architecture, i.e., roles and responsibilities. Who is going to do what? How will we reach agreement?"
Type
Web Page
Title
Practical Tools for Electronic Records Management and Preservation
"This briefing paper summarizes the results of a cooperative project sponsored, in part, by a research grant from the National Historical Publications and Records Commission. The project, called 'Models for Action: Practical Approaches to Electronic Records Management and Preservation,' focused on the development of practical tools to support the integration of essential electronic records management requirements into the design of new information systems. The project was conducted from 1996 to 1998 through a partnership between the New York State Archives and Records Administration and the Center for Technology in Government. The project team also included staff from the NYS Adirondack Park Agency, eight corporate partners led by Intergraph Corporation, and University at Albany faculty and graduate students."
Publisher
Center for Technology in Government
Critical Arguements
CA "This briefing paper bridges the gap between theory and practice by presenting generalizable tools that link records management practices to business objectives."
Type
Web Page
Title
Deliberation No. 11/2004 of 19 February 2004: "Technical Rules for Copying and Preserving Electronic Documents on Digital Media which are Suitable to Guarantee Authentic Copies"
CA Recognizes that preservation of authentic electronic records means preservation of authentic/true copies. Thus the preservation process is called substitute preservation, and the authenticity of a preserved document is not established on the object itself (as it was with traditional media), but through the authority of the preserver (and possibly a notary), who would attest to the identity and integrity of the whole of the reproduced documents every time a migration occurs. The preserver's task list is also noteworthy. Archival units description stands out as an essential activity (not replaceable by the metadata which are associated to each single document) in order to maintain intellectual control over holdings.
SOW
DC CNIPA (Centro Nazionale per l'Informatica nella Pubblica Amministrazione) replaced AIPA (Autorita' per l'Informatica nella Pubblica Amministrazione) in 2003. Such an Authority (established in 1993 according to art. 4 of the Legislative Decree 39/1993, as amended by art. 176 of the Legislative Decree 196/2003) operates as a branch of the Council of Ministers' Presidency with the mandate to put the Ministry for Innovation and Technologies' policies into practice. In particular, CNIPA is responsible for bringing about reforms relevant to PA's modernization, the spread of e-government and the development of nationwide networks to foster better communication among public offices and between citizens and the State. In the Italian juridical system, CNIPA's deliberations have a lower enabling power, but they nevertheless are part of the State's body of laws. The technical rules provided in CNIPA's deliberation 11/2004 derive from art. 6, par. 2 of the DPR 445/2000, which says: "Preservation obligations are fully satisfied, both for administrative and probative purposes, also with the use of digital media when the employed procedures comply with the technical rules provided by AIPA." In order to keep those rules up to date according to the latest technology, AIPA's deliberation no. 42 of 13 December 2001 on "Technical rules for documents reproduction and preservation on digital media that are suitable to guarantee true copies of the original documents" has been replaced by the current CNIPA deliberation.
The CDISC Submission Metadata Model was created to help ensure that the supporting metadata for these submission datasets meet the following objectives: Provide FDA reviewers with clear descriptions of the usage, structure, contents, and attributes of all datasets and variables; Allow reviewers to replicate most analyses, tables, graphs, and listings with minimal or no transformations; Enable reviewers to easily view and subset the data used to generate any analysis, table, graph, or listing without complex programming. ... The CDISC Submission Metadata Model has been defined to guide sponsors in the preparation of data that is to be submitted to the FDA. By following the principles of this model, sponsors will help reviewers to accurately interpret the contents of submitted data and work with it more effectively, without sacrificing the scientific objectives of clinical development.
Publisher
The Clinical Data Interchange Standards Consortium
Critical Arguements
CA "The CDISC Submission Data Model has focused on the use of effective metadata as the most practical way of establishing meaningful standards applicable to electronic data submitted for FDA review."
Conclusions
RQ "Metadata prepared for a domain (such as an efficacy domain) which has not been described in a CDISC model should follow the general format of the safety domains, including the same set of core selection variables and all of the metadata attributes specified for the safety domains. Additional examples and usage guidelines are available on the CDISC web site at www.cdisc.org." ... "The CDISC Metadata Model describes the structure and form of data, not the content. However, the varying nature of clinical data in general will require the sponsor to make some decisions about how to represent certain real-world conditions in the dataset. Therefore, it is useful for a metadata document to give the reviewer an indication of how the datasets handle certain special cases."
SOW
DC CDISC is an open, multidisciplinary, non-profit organization committed to the development of worldwide standards to support the electronic acquisition, exchange, submission and archiving of clinical trials data and metadata for medical and biopharmaceutical product development. CDISC members work together to establish universally accepted data standards in the pharmaceutical, biotechnology and device industries, as well as in regulatory agencies worldwide. CDISC currently has more than 90 members, including the majority of the major global pharmaceutical companies.
Type
Web Page
Title
CDISC Achieves Two Significant Milestones in the Development of Models for Data Interchange
CA "The Clinical Data Interchange Standards Consortium has achieved two significant milestones towards its goal of standard data models to streamline drug development and regulatory review processes. CDISC participants have completed metadata models for the 12 safety domains listed in the FDA Guidance regarding Electronic Submissions and have produced a revised XML-based data model to support data acquisition and archive."
Conclusions
RQ "The goal of the CDISC XML Document Type Definition (DTD) Version 1.0 is to make available a first release of the definition of this CDISC model, in order to support sponsors, vendors and CROs in the design of systems and processes around a standard interchange format."
SOW
DC "This team, under the leadership of Wayne Kubick of Lincoln Technologies, and Dave Christiansen of Genentech, presented their metadata models to a group of representatives at the FDA on Oct. 10, and discussed future cooperative efforts with Agency reviewers." ... "CDISC is a non-profit organization with a mission to lead the development of standard, vendor-neutral, platform-independent data models that improve process efficiency while supporting the scientific nature of clinical research in the biopharmaceutical and healthcare industries."
Type
Web Page
Title
Approaches towards the Long Term Preservation of Archival Digital Records
The Digital Preservation Testbed is carrying out experiments according to pre-defined research questions to establish the best preservation approach or combination of approaches. The Testbed will be focusing its attention on three different digital preservation approaches - Migration; Emulation; and XML - evaluating the effectiveness of these approaches, their limitations, costs, risks, uses, and resource requirements.
Language
English; Dutch
Critical Arguements
CA "The main problem surrounding the preservation of authentic electronic records is that of technology obsolescence. As changes in technology continue to increase exponentially, the problem arises of what to do with records that were created using old and now obsolete hardware and software. Unless action is taken now, there is no guarantee that the current computing environment (and thus also records) will be accessible and readable by future computing environments."
Conclusions
RQ "The Testbed will be conducting research to discover if there is an inviolable way to associate metadata with records and to assess the limitations such an approach may incur. We are also working on the provision of a proposed set of preservation metadata that will contain information about the preservation approach taken and any specific authenticity requirements."
SOW
DC The Digital Preservation Testbed is part of the non-profit organisation ICTU. ICTU is the Dutch organisation for ICT and government. ICTU's goal is to contribute to the structural development of e-government. This will result in improving the work processes of government organisations, their service to the community and interaction with the citizens. Government institutions, such as Ministries, design the policies in the area of e-government, and ICTU translates these policies into projects. In many cases, more than one institution is involved in a single project. They are the principals in the projects and retain control concerning the focus of the project. In the case of the Digital Preservation Testbed, the principals are the Ministry of the Interior and the Dutch National Archives.
Type
Web Page
Title
Appendix N to Proceedings of The Uniform Law Conference of Canada, Proposals for a Uniform Electronic Evidence Act
CA "First, there is a great deal of uncertainty about how the [Canada Evidence Act], particularly s. 30(6), will be applied, and this makes it difficult for the parties to prepare for litigation and for businesses to know how they should keep their records. Second, there are risks to the integrity of records kept on a computer that do not exist with respect to other forms of information processing and storage, and if alterations are made, either negligently or deliberately, they can be extremely difficult to detect. Third, s. 30(1) provides little assurance that the record produced to the court is the same as the one that was originally made in the usual and ordinary course of business, for while self-interest may be an adequate guarantee that most businesses will maintain accurate and truthful records, it is not true for many others. The second and third problems combined place the party opposing the introduction of computer-produced business records in a difficult situation."
SOW
DC The Uniform Law Conference of Canada undertook to adopt uniform legislation to ensure that computer records could be used appropriately in court.
Abstract The ability of investigators to share data is essential to the progress of integrative scientific research both within and across disciplines. This paper describes the main issues in achieving effective data sharing based on previous efforts in building scientific data networks and, particularly, recent efforts within the Earth sciences. This is presented in the context of a range of information architectures for effecting differing levels of standardization and centralization both from a technology perspective as well as a publishing protocol perspective. We propose a new Metadata Interchange Format (.mif) that can be used for more effective sharing of data and metadata across digital libraries, data archives and research projects.
Critical Arguements
CA "In this paper, we discuss two important information technology aspects of the electronic publication of data in the Earth sciences, metadata, and a variety of different concepts of electronic data publication. Metadata are the foundation of electronic data publications and they are determined by needs of archiving, the scientific analysis and reproducibility of a data set, and the interoperability of diverse data publication methods. We use metadata examples drawn from the companion paper by Staudigel et al. (this issue) to illustrate the issues involved in scaling-up the publication of data and metadata by individual scientists, disciplinary groups, the Earth science community-at-large and to libraries in general. We begin by reviewing current practices and considering a generalized alternative." ... "For this reason, we will first discuss different methods of data publishing via a scientific data network followed by an inventory of desirable characteristics of such a network. Then, we will introduce a method for generating a highly portable metadata interchange format we call .mif (pronounced dot-mif) and conclude with a discussion of how this metadata format can be scaled to support the diversity of interests within the Earth science community and other scientific communities." ... "We can borrow from the library community the methods by which to search for the existence and location of data (e.g., Dublin Core http://www.dublincore.org) but we must invent new ways to document the metadata needed within the Earth sciences and to comply with other metadata standards such as the Federal Geographic Data Committee (FGDC). To accomplish this, we propose a metadata interchange format that we call .mif that enables interoperability and an open architecture that is maximally independent of computer systems, data management approaches, proprietary software and file formats, while encouraging local autonomy and community cooperation."
Conclusions
RQ "These scalable techniques are being used in the development of a project we call SIOExplorer that can be found at http://sioexplorer.ucsd.edu although we have not discussed that project in any detail. The most recent contributions to this discussion and .mif applications and examples may be found at http://Earthref.org/metadata/GERM/."
SOW
DC This article was written by representatives of the San Diego Supercomputer Center and the Institute of Geophysics and Planetary Physics under the auspices of the University of California, San Diego.
Type
Web Page
Title
Softening the borderlines of archives through XML - a case study
Archives have always had trouble getting metadata in formats they can process. With XML, these problems are lessening. Many applications today provide the option of exporting data into an application-defined XML format that can easily be post-processed using XSLT, schema mappers, etc., to fit the archives' needs. This paper highlights two practical examples for the use of XML in the Swiss Federal Archives and discusses advantages and disadvantages of XML in these examples. The first use of XML is the import of existing metadata describing debates at the Swiss parliament whereas the second concerns preservation of metadata in the archiving of relational databases. We have found that the use of XML for metadata encoding is beneficial for the archives, especially for its ease of editing, built-in validation and ease of transformation.
Notes
The Swiss Federal Archives defines the norms and basis of records management and advises departments of the Federal Administration on their implementation. http://www.bar.admin.ch/bar/engine/ShowPage?pageName=ueberlieferung_aktenfuehrung.jsp
Critical Arguements
CA "This paper briefly discusses possible uses of XML in an archival context and the policies of the Swiss Federal Archives concerning this use (Section 2), provides a rough overview of the applications we have that use XML (Section 3) and the experiences we made (Section 4)."
Conclusions
RQ "The systems described above are now just being deployed into real world use, so the experiences presented here are drawn from the development process and preliminary testing. No hard facts in testing the sustainability of XML could be gathered, as the test is time itself. This test will be passed when we can still access the data stored today, including all metadata, in ten or twenty years." ... "The main problem area with our applications was the encoding of the XML documents and the non-standard XML document generation of some applications. When dealing with the different encodings (UTF-8, UTF-16, ISO-8859-1, etc) some applications purported a different encoding in the header of the XML document than the true encoding of the document. These errors were quickly identified, as no application was able to read the documents."
SOW
DC The author is currently a private digital archives consultant, but at the time of this article, was a data architect for the Swiss Federal Archives. The content of this article owes much to the work being done by a team of architects and engineers at the Archives, who are working on an e-government project called ARELDA (Archiving of Electronic Data and Records).
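The encoding problem reported in this entry (an XML header naming one encoding while the bytes use another) can be illustrated with a small check. The following is a minimal sketch in Python, not the tooling used by the Swiss Federal Archives; the function names are hypothetical, and the check is only an approximation: it handles ASCII-compatible declarations and catches only mismatches that make decoding fail.

```python
import re
from typing import Optional

# Matches the encoding pseudo-attribute of an XML declaration.
# Assumption: the declaration itself is readable as ASCII (so not UTF-16).
_DECLARATION = re.compile(rb'^<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None if absent."""
    if data.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte order mark
        data = data[3:]
    match = _DECLARATION.match(data)
    return match.group(1).decode("ascii") if match else None


def decodes_as_declared(data: bytes) -> bool:
    """True if the bytes decode cleanly under the declared encoding.

    XML defaults to UTF-8 when no encoding is declared.
    """
    encoding = declared_encoding(data) or "utf-8"
    try:
        data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        return False
    return True


text = '<?xml version="1.0" encoding="UTF-8"?><place>Zürich</place>'
honest = text.encode("utf-8")          # header and bytes agree
mislabeled = text.encode("iso-8859-1")  # header says UTF-8, bytes are Latin-1

print(decodes_as_declared(honest))
print(decodes_as_declared(mislabeled))
```

A check of this kind cannot detect every mismatch: any byte sequence decodes under ISO-8859-1, so a UTF-8 file mislabeled as ISO-8859-1 would pass and would need a content-level comparison instead.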
Type
Web Page
Title
Recommended Best Practices for Encoded Archival Description Finding Aids at the Library of Congress
The Library of Congress EAD Practices Working Group has drafted these proposed guidelines for the creation of EAD finding aids at the Library of Congress, a process which has included documenting current practices at the Library, examining other documented standards and practices, and addressing outstanding issues.  
Publisher
Library of Congress
Language
English
Critical Arguements
<CA>These guidelines are intended for use in conjunction with the EAD Tag Library Version 1.0 and EAD Application Guidelines, published by the Society of American Archivists and the Library of Congress and available online at http://www.loc.gov/ead/.
Conclusions
RQ
SOW
DC "The guidelines were made available to the Library of Congress EAD Technical Group for review, and many suggestions for improvement have been incorporated into this final draft which is now available for use by Library staff."
Type
Web Page
Title
NHPRC: Minnesota State Archives Strategic Plan: Electronic Records Consultant Project
National Historical Publications and Records Commission Grant No. 95-030
Critical Arguements
CA "The Electronic Records Consultant Project grant was carried out in conjunction with the strategic planning effort for the Minnesota Historical Society's State Archives program. The objective was to develop a plan for a program that will be responsive to the changing nature of government records." ... "The strategic plan that was developed calls for specific actions to meet five goals: 1) strengthening partnerships, 2) facilitating the identification of historically valuable records, 3) integrating electronic records into the existing program, 4) providing quality public service, and 5) structuring the State Archives Department to meet the demands of this plan."
"The ERMS Metadata Standard forms Part 2 of the National Archives' 'Requirements for Electronic Records Management Systems' (commonly known as the '2002 Requirements'). It is specified in a technology independent manner, and is aligned with the e-Government Metadata Standard (e-GMS) version 2, April 2003. A version of e-GMS v2 including XML examples was published in the autumn of 2003. This Guide should be read in conjunction with the ERMS Metadata Standard. Readers may find the GovTalk Schema Guidelines (available via http://www.govtalk.gov.uk ) helpful regarding design rules used in building the schemas."
Conclusions
RQ Electronically enabled processes need to generate appropriate records, according to established records management principles. These records need to reach the ERMS that captures them with enough information to enable the ERMS to classify them appropriately, allocate an appropriate retention policy, etc.
SOW
DC This document is a draft.
Type
Web Page
Title
Recordkeeping Metadata Standard for Commonwealth Agencies
This standard describes the metadata that the National Archives of Australia recommends should be captured in the recordkeeping systems used by Commonwealth government agencies. ... Part One of the standard explains the purpose and importance of standardised recordkeeping metadata and details the scope, intended application and features of the standard. Features include: flexibility of application; repeatability of data elements; extensibility to allow for the management of agency-specific recordkeeping requirements; interoperability across systems environments; compatibility with related metadata standards, including the Australian Government Locator Service (AGLS) standard; and interdependency of metadata at the sub-element level.
Critical Arguements
CA Compliance with the Recordkeeping Metadata Standard for Commonwealth Agencies will help agencies to identify, authenticate, describe and manage their electronic records in a systematic and consistent way to meet business, accountability and archival requirements. In this respect the metadata is an electronic recordkeeping aid, similar to the descriptive information captured in file registers, file covers, movement cards, indexes and other registry tools used in the paper-based environment to apply intellectual and physical controls to records.
Conclusions
RQ "The National Archives intends to consult with agencies, vendors and other interested parties on the implementation and continuing evolution of the Recordkeeping Metadata Standard for Commonwealth Agencies." ... "The National Archives expects to re-examine and reissue the standard in response to broad agency feedback and relevant advances in theory and methodology." ... "The development of public key technology is one area the National Archives will monitor closely, in consultation with the Office for Government Online, for possible additions to a future version of the standard."
SOW
DC "This standard has been developed in consultation with recordkeeping software vendors endorsed by the Office for Government Online's Shared Systems Initiative, as well as selected Commonwealth agencies." ... "The standard has also been developed with reference to other metadata standards emerging in Australia and overseas to ensure compatibility, as far as practicable, between related resource management tools, including: the Dublin Core-derived Australian Government Locator Service (AGLS) metadata standard for discovery and retrieval of government services and information in web-based environments, co-ordinated by the National Archives of Australia; and the non-sector-specific Recordkeeping Metadata Standards for Managing and Accessing Information Resources in Networked Environments Over Time for Government, Social and Cultural Purposes, co-ordinated by Monash University using an Australian Research Council Strategic Partnership with Industry Research and Training (SPIRT) Support Grant."
Joined-up government needs joined-up information systems. The e-Government Metadata Standard (e-GMS) lays down the elements, refinements and encoding schemes to be used by government officers when creating metadata for their information resources or designing search interfaces for information systems. The e-GMS is needed to ensure maximum consistency of metadata across public sector organisations.
Publisher
Office of the e-Envoy, Cabinet Office, UK.
Critical Arguements
CA "The e-GMS is concerned with the particular facets of metadata intended to support resource discovery and records management. The Standard covers the core set of 'elements' that contain data needed for the effective retrieval and management of official information. Each element contains information relating to a particular aspect of the information resource, e.g. 'title' or 'creator'. Further details on the terminology being used in this standard can be found in Dublin Core and Part Two of the e-GIF."
Conclusions
RQ "The e-GMS will need to evolve, to ensure it remains comprehensive and consistent with changes in international standards, and to cater for changes in use and technology. Some of the elements listed here are already marked for further development, needing additional refinements or encoding schemes. To limit disruption and cost to users, all effort will be made to future-proof the e-GMS. In particular we will endeavour: not to remove any elements or refinements; not to rename any elements or refinements; not to add new elements that could contain values contained in the existing elements."
SOW
DC The e-GMS is promulgated by the British government as part of its e-government initiative. It is the technical cornerstone of the e-government policy for joining up the public sector electronically and providing modern, improved public services.
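The future-proofing commitments quoted in this entry (no removal and no renaming of elements or refinements) lend themselves to a mechanical comparison between two versions of a schema. The sketch below is illustrative only and is not part of the e-GMS: the function name, the dictionary representation, and the refinement and element names other than 'title' and 'creator' are assumptions. A rename is indistinguishable from a removal plus an addition, so both are reported together. The third commitment, about overlapping new elements, requires human judgment and is not checked.

```python
from typing import Dict, List, Set

# A schema version is modelled as: element name -> set of refinement names.
Schema = Dict[str, Set[str]]


def compatibility_violations(old: Schema, new: Schema) -> List[str]:
    """List elements and refinements present in `old` but missing from `new`."""
    problems: List[str] = []
    for element, refinements in sorted(old.items()):
        if element not in new:
            problems.append(f"element removed or renamed: {element}")
            continue
        for refinement in sorted(refinements - new[element]):
            problems.append(
                f"refinement removed or renamed: {element}.{refinement}"
            )
    return problems


# Hypothetical versions for illustration.
v1: Schema = {"title": {"alternative"}, "creator": set()}
v2: Schema = {"title": {"alternative"}, "creator": set(), "audience": set()}
v3: Schema = {"title": set(), "author": set()}

print(compatibility_violations(v1, v2))  # additions only, nothing reported
print(compatibility_violations(v1, v3))  # 'creator' and 'title.alternative' gone
```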
During the past decade, the recordkeeping practices in public and private organizations have been revolutionized. New information technologies, from mainframes to PCs to local area networks and the Internet, have transformed the way state agencies create, use, disseminate, and store information. These new technologies offer a vastly enhanced means of collecting information for and about citizens, communicating within state government and between state agencies and the public, and documenting the business of government. Like other modern organizations, Ohio state agencies face challenges in managing and preserving their records because records are increasingly generated and stored in computer-based information systems. The Ohio Historical Society serves as the official State Archives with responsibility to assist state and local agencies in the preservation of records with enduring value. The Office of the State Records Administrator within the Department of Administrative Services (DAS) provides advice to state agencies on the proper management and disposition of government records. Out of concern over its ability to preserve electronic records with enduring value and assist agencies with electronic records issues, the State Archives has adapted these guidelines from guidelines created by the Kansas State Historical Society. The Kansas State Historical Society, through the Kansas State Historical Records Advisory Board, requested a program development grant from the National Historical Publications and Records Commission to develop policies and guidelines for electronic records management in the state of Kansas. With grant funds, the KSHS hired a consultant, Dr. Margaret Hedstrom, an Associate Professor in the School of Information, University of Michigan and formerly Chief of State Records Advisory Services at the New York State Archives and Records Administration, to draft guidelines that could be tested, revised, and then implemented in Kansas state government.
Notes
These guidelines are part of the ongoing effort to address the electronic records management needs of Ohio state government. As a result, this document continues to undergo changes. The first draft, written by Dr. Margaret Hedstrom, was completed in November of 1997 for the Kansas State Historical Society. That version was reorganized and updated and posted to the KSHS Web site on August 18, 1999. The Kansas Guidelines were modified for use in Ohio during September 2000.
Critical Arguements
CA "This publication is about maintaining accountability and preserving important historical records in the electronic age. It is designed to provide guidance to users and managers of computer systems in Ohio government about: the problems associated with managing electronic records, special recordkeeping and accountability concerns that arise in the context of electronic government; archival strategies for the identification, management and preservation of electronic records with enduring value; identification and appropriate disposition of electronic records with short-term value, and
Type
Web Page
Title
Record Keeping Metadata Requirements for the Government of Canada
This document comprises descriptions for metadata elements utilized by the Canadian Government as of January 2001.
Critical Arguements
CA "The Record Keeping Metadata is defined broadly to include the type of information Departments are required to capture to describe the identity, authenticity, content, context, structure and management requirements of records created in the context of a business activity. The Metadata model consists of elements, which are the attributes of a record that are comparable to fields in a database. The model is modular in nature. It permits Departments to use a core set of elements that will meet the minimum requirements for describing and sharing information, while facilitating interoperability between government Departments. It also allows Departments with specialized needs or the need for more detailed descriptions to add new elements and/or sub-elements to the basic metadata in order to satisfy their particular business requirements."
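The modular design described in this entry, a shared core set of elements plus department-specific additions, can be sketched as a simple data structure. This is a minimal illustration in Python under stated assumptions: the class, the method names, and the four core element names are hypothetical and are not taken from the Government of Canada requirements.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical core set; the real requirements define their own elements.
CORE_ELEMENTS = ("identifier", "title", "creator", "date")


@dataclass
class RecordMetadata:
    core: Dict[str, str]
    # Department-specific elements live beside the core, never replace it.
    extensions: Dict[str, str] = field(default_factory=dict)

    def missing_core(self) -> List[str]:
        """Core elements that are absent or empty."""
        return [name for name in CORE_ELEMENTS if not self.core.get(name)]

    def shared_view(self) -> Dict[str, str]:
        """Only the core elements: the part any department can interpret."""
        return {n: self.core[n] for n in CORE_ELEMENTS if n in self.core}


record = RecordMetadata(
    core={"identifier": "2001-0042", "title": "Licence file", "creator": "Dept. A"},
    extensions={"licence_class": "commercial"},  # hypothetical local element
)

print(record.missing_core())
print(record.shared_view())
```

Keeping extensions in a separate mapping means an exchange between departments can rely on the core alone, which is one way to read the interoperability goal stated in the quoted passage.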
Type
Web Page
Title
Capturing Electronic Transactional Evidence: The Future
This standard sets out principles for making and keeping full and accurate records as required under section 12(1) of the State Records Act 1998. The principles are: Records must be made; Records must be accurate; Records must be authentic; Records must have integrity; Records must be useable. Each principle is supported by mandatory compliance requirements.
Critical Arguements
CA "Section 12(1) of the State Records Act 1998 requires public offices to 'make and keep full and accurate records'. The purpose of this standard is to assist public offices to meet this obligation and to provide a benchmark against which a public office's compliance may be measured."
Conclusions
RQ None
SOW
DC This standard is promulgated by the State Records Authority of New South Wales, Australia, as required under section 12(1) of the State Records Act 1998.
Type
Web Page
Title
Archiving of Electronic Digital Data and Records in the Swiss Federal Archives (ARELDA): e-government project ARELDA - Management Summary
The goal of the ARELDA project is to find long-term solutions for the archiving of digital records in the Swiss Federal Archives. This includes accession, long-term storage, preservation of data, description, and access for users of the Swiss Federal Archives. The project is also coordinated with the Federal Archives' basic efforts to realize a uniform records management solution in the federal administration, and thereby to support the pre-archival creation of documents of archival value for the benefit of the administration as well as of the Federal Archives. The project is indispensable for the long-term execution of the Federal Archives Act. Older IT systems are being replaced by newer ones, and a complete migration of the data is sometimes not possible or too expensive. The number of small database applications built and maintained by people with no IT background is constantly increasing. More and more administrative bodies are introducing records and document management systems.
Publisher
Swiss Federal Archives
Publication Location
Bern
Critical Arguments
CA "Archiving in general is a necessary prerequisite for the reconstruction of governmental activities as well as for the principle of legal certainty. It enables citizens to understand governmental activities and ensures a democratic control of the federal administration. And finally are archives a prerequisite for the scientific research, especially in the social and historical fields and ensure the preservation of our cultural heritage. It plays a vital role for an ongoing and efficient records management. A necessary prerequisite for the Federal Archives in the era of the information society will be the system ARELDA (Archiving of Electronic Data and Records)."
Conclusions
RQ "Because of the lack of standard solutions and limited or lacking personal resources for an internal development effort, the realisation of ARELDA will have to be outsourced and the cooperation with the IT division and the Federal Office for Information Technology, Systems and Telecommunication must be intensified. The guidelines for the projects are as follows:
SOW
DC ARELDA is one of the five key projects in the Swiss government's e-government strategy.