Artiste is a European project developing a cross-collection search system for art galleries and museums. It combines image-content retrieval with text-based retrieval and uses RDF mappings to integrate diverse databases. The test sites, the Louvre, the Victoria and Albert Museum, the Uffizi Gallery and the National Gallery London, provide their own database schemas for existing metadata, avoiding the need for migration to a common schema. The system will accept a query based on one museum's fields and convert it, through an RDF mapping, into a form suitable for querying the other collections. Some of the image-processing algorithms are computationally slow, so the system is session-based, allowing the user to return to the results later. The system has been built within a J2EE/EJB framework, using the JBoss Enterprise Application Server.
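As a concrete illustration of the RDF-based field mapping described above, a minimal sketch might look like the following (the namespaces, field names, and the "mapsTo" property are invented for illustration; ARTISTE's actual mapping vocabulary is not shown here):

# Hypothetical sketch of cross-collection query translation via an RDF
# mapping, in the spirit of the system described above. All namespaces,
# field names, and the "mapsTo" property are invented for illustration.
from rdflib import Graph, Namespace

LOUVRE = Namespace("http://example.org/louvre/schema#")
VAM = Namespace("http://example.org/vam/schema#")
COMMON = Namespace("http://example.org/artiste/common#")

mapping = Graph()
# Each museum-specific field is mapped to a shared "common" property.
mapping.add((LOUVRE.auteur, COMMON.mapsTo, COMMON.creator))
mapping.add((VAM.artist, COMMON.mapsTo, COMMON.creator))

def translate(field, target_ns):
    """Translate one museum's query field into the target museum's field."""
    common = mapping.value(subject=field, predicate=COMMON.mapsTo)
    for candidate in mapping.subjects(COMMON.mapsTo, common):
        if candidate.startswith(str(target_ns)):
            return candidate
    return None

print(translate(LOUVRE.auteur, VAM))  # -> http://example.org/vam/schema#artist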
Secondary Title
WWW2002: The Eleventh International World Wide Web Conference
Publisher
International World Wide Web Conference Committee
ISBN
1-880672-20-0
Critical Arguments
CA "A key aim is to make a unified retrieval system which is targeted to usersÔÇÖ real requirements and which is usable with integrated cross-collection searching. Museums and Galleries often have several digital collections ranging from public access images to specialised scientific images used for conservation purposes. Access from one gallery to another was not common in terms of textual data and not done at all in terms of image-based queries. However the value of cross-collection access is recognised as important for example in comparing treatments and conditions of paintings. While ARTISTE is primarily designed for inter-museum searching it could equally be applied to museum intranets. Within a MuseumÔÇÖs intranet there may be systems which are not interlinked due to local management issues."
Conclusions
RQ "The query language for this type of system is not yet standardised but we hope that an emerging standard will provide the session-based connectivity this application seems to require due to the possibility of long query times." ... "In the near future, the project will be introducing controlled vocabulary support for some of the metadata fields. This will not only make retrieval more robust but will also facilitate query expansion. The LouvreÔÇÖs multilingual thesaurus will be used in order to ensure greater interoperability. The system is easily extensible to other multimedia types such as audio and video (eg by adding additional query items such as "dialog" and "video sequence" with appropriate analysers). A follow-up project is scheduled to explore this further. There is some scope for relating our RDF query format to the emerging query standards such as XQuery and we also plan to feed our experience into standards such as the ZNG initiative.
SOW
DC "The Artiste project is a European Commission funded collaboration, investigating the use of integrated content and metadata-based image retrieval across disparate databases in several major art galleries across Europe. Collaborating galleries include the Louvre in Paris, the Victoria and Albert Museum in London, the Uffizi Gallery in Florence and the National Gallery in London." ... "Artiste is funded by the European CommunityÔÇÖs Framework 5 programme. The partners are: NCR, The University of Southampton, IT Innovation, Giunti Multimedia, The Victoria and Albert Museum, The National Gallery, The research laboratory of the museums of France (C2RMF) and the Uffizi Gallery. We would particularly like to thank our collaborators Christian Lahanier, James Stevenson, Marco Cappellini, John Cupitt, Raphaela Rimabosci, Gert Presutti, Warren Stirling, Fabrizio Giorgini and Roberto Vacaro."
Type
Conference Proceedings
Title
Integrating Metadata Schema Registries with Digital Preservation Systems to Support Interoperability: A Proposal
There are a large number of metadata standards and initiatives that have relevance to digital preservation, e.g. those designed to support the work of national and research libraries, archives and digitization initiatives. This paper introduces some of these, noting that the developers of some have acknowledged the importance of maintaining or re-using existing metadata. It is argued here that the implementation of metadata registries as part of a digital preservation system may assist repositories in enabling the management and re-use of this metadata and may also help interoperability, namely the exchange of metadata and information packages between repositories.
Publisher
2003 Dublin Core Conference: Supporting Communities of Discourse and Practice-Metadata Research & Applications
Publication Location
Seattle, WA
Critical Arguments
CA "This paper will introduce a range of preservation metadata initiatives including the influential Open Archival Information System (OAIS) reference model and a number of other initiatives originating from national and research libraries, digitization projects and the archives community. It will then comment on the need for interoperability between these specifications and propose that the implementation of metadata registries as part of a digital preservation system may help repositories manage diverse metadata and facilitate the exchange of metadata or information packages between repositories."
Conclusions
RQ "The plethora of metadata standards and formats that have been developed to support the management and preservation of digital objects leaves us with several questions about interoperability. For example, will repositories be able to cope with the wide range of standards and formats that exist? Will they be able to transfer metadata or information packages containing metadata to other repositories? Will they be able to make use of the 'recombinant potential' of existing metadata?" ... "A great deal of work needs to be done before this registry-based approach can be proved to be useful. While it would undoubtedly be useful to have registries of the main metadata standards developed to support preservation, it is less clear how mapping-based conversions between them would work in practice. Metadata specifications are based on a range of different models and conversions often lead to data loss. Also, much more consideration needs to be given to the practical issues of implementation." 
SOW
DC Michael Day is a research officer at UKOLN, which is based at the University of Bath. He belongs to UKOLN's research and development team, and works primarily on projects concerning metadata, interoperability and digital preservation. 
This study focuses upon access to authentic electronic records that are no longer required in day-to-day operations and that have been set aside in a recordkeeping system or storage repository for future reference. One school of thought, generally associated with computer information technology specialists, holds that long-term access to electronic records is primarily a technological issue, and devotes little attention to authenticity. Another school of thought, associated generally with librarians, archivists, and records managers, contends that long-term access to electronic records is as much an intellectual issue as it is a technological issue. This latter position is clearly evident in several recent research projects and studies about electronic records whose findings illuminate the discussion of long-term access to electronic records. Therefore, a review of eight research projects highlighting findings relevant for long-term access to electronic records begins this chapter. This review is followed by a discussion, from the perspective of archival science, of nine questions that a long-term access strategy must take into account. The nine issues are: What is a document?; What is a record?; What are authentic electronic records?; What does "archiving" mean?; What is an authentic reformatted electronic record?; What is a copy of an authentic electronic record?; What is an authentic converted electronic record?; What is involved in the migration of authentic electronic records?; What is technology obsolescence?
Book Title
Authentic Electronic Records: Strategies for Long-Term Access
Publisher
Cohasset Associates, Inc.
Publication Location
Chicago
ISBN
0970064004
Critical Arguments
CA "Building upon the key concepts and concerns articulated by the studies described above, this report attempts to move the discussion of long-term access to electronic records towarad more clearly identified, generally applicable and redily im(TRUNCATED)
Conclusions
RQ
SOW
DC This book chapter was written by Charles M. Dollar for Cohasset Associates, Inc. Mr. Dollar has "twenty-five years of experience in working with electronic records as a manager at the National Archives and Records Administration, as an archival educator at the University of British Columbia, and a consultant to governments and businesses in North America, Asia, Europe, and the Middle East." Cohasset Associates Inc. is "one of the nation's foremost consulting firms specializing in document-based information management."
Type
Journal
Title
Migration Strategies within an Electronic Archive: Practical Experience and Future Research
Pfizer Central Research, Sandwich, England, has developed an Electronic Archive to support the maintenance and preservation of electronic records used in the discovery and development of new medicines. The Archive has been developed to meet regulatory, scientific and business requirements. The long-term preservation of electronic records requires that migration strategies be developed both for the Archive and for the records held within it. The modular design of the Archive will facilitate the migration of hardware components. Selecting an appropriate migration strategy for electronic records requires careful project management skills allied to appraisal and retention management. Having identified when the migration of records is necessary, it is crucial that alternative technical solutions remain open.
DOI
10.1023/A:1009093604632
Critical Arguments
CA Describes a system of archiving and migration of electronic records (Electronic Archive) at Pfizer Central Research. "Our objective is to provide long-term, safe and secure storage for electronic records. The archive acts as an electronic record center and borrows much from traditional archive theory." (p. 301)
Phrases
<P1> Migration, an essential part of the life-cycle of electronic records, is not an activity that occurs in isolation. It is deeply related to the "Warrant" which justifies our record-keeping systems, and to the metadata which describe the data on our systems. (p. 301-302) <warrant> <P2> Our approach to electronic archiving, and consequently our migration strategy, has been shaped by the business requirements of the Pharmaceutical industry, the technical infrastructure in which we work, the nature of scientific research and development, and by new applications for traditional archival skills. <warrant> (p. 302) <P3> The Pharmaceutical industry is regulated by industry Good Practice Guidelines such as Good Laboratory Practice, Good Clinical Practice and Good Manufacturing Practice. Adherence to these standards is monitored by Government agencies such as the U.S. Food and Drug Administration (FDA) and in Britain the Department of Health (DoH). The guidelines require that data relating to any compound used in man be kept for the lifetime of that compound during its use in man. This we may take to be 40 years or more, during which time the data must remain identifiable and reproducible in case of regulatory inspection. <warrant> (p. 302) <P4> The record-keeping requirements of the scientific research and development process also shape migration strategies. ... Data must be able to be manipulated as well as being identifiable and legible. <warrant> (p. 303) <P5> [W]e have adapted traditional archival theory to our working environment and the new imperatives of electronic archiving. We have utilised retention scheduling to provide a vehicle for metadata file description alongside retention requirements. We have also placed great importance on appraisal as a tool to evaluate records which require to be migrated. (p. 303) <P6> Software application information is therefore collected as part of the metadata description for each file. (p. 303) <P7> The migration of the database from one version to another or to a new schema represents a significant migration challenge in terms of the project management and validation necessary to demonstrate that a new database accurately represents our original data set. (p. 303-304) <P8> Assessing the risk of migration exercises is only one of several issues we have identified which need to be addressed before any migration of the archive or its components takes place. (p. 304) <P9> [F]ew organisations can cut themselves off totally from their existing record-keeping systems, whether they be paper or electronic. (p. 304) <P10> Critical to this model is identifying the data which are worthy of long-term preservation and transfer to the Archive. This introduces new applications for the retention and appraisal of electronic records. Traditional archival skills can be utilised in deciding which records are worthy of retention. Once they are in the Archive it will become critical to return time and again to those records in a process of "constant review" to ensure that records remain identifiable, legible and manipulatable. (p. 305) <P11> Having decided when to migrate electronic records, it is important to decide if it is worth it. Our role in Records Management is to inform the business leaders and budget holders when a migration of electronic records will be necessary. It is also our role to provide the business with an informed decision. A key vehicle in this process will be the retention schedule, which is not simply a tool to schedule the destruction of records.
It could also be used to schedule software versions. More importantly, with event driven requirements it is a vehicle for constant review and appraisal of record holdings. The Schedule also defines important parts of the metadata description for each file in the Archive. The role of appraisal is critical in evaluating record holdings from a migration point of view and will demand greater time and resources from archivists and records managers. (p. 305)
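A minimal sketch of the per-file metadata description the authors describe, combining source-application information (P6) with event-driven retention scheduling and "constant review" (P10, P11); all field names and values are invented for illustration:

# Hypothetical shape of a per-file metadata description in an electronic
# archive: source-application details plus an event-driven retention entry.
from datetime import date

archive_entry = {
    "file_id": "STUDY-1234/report.doc",
    "source_application": {"name": "WordPerfect", "version": "5.1"},
    "retention": {
        "trigger": "last use of compound in man",   # event-driven, per P11
        "minimum_years": 40,
        "next_review": date(2005, 1, 1),            # "constant review", per P10
    },
}

def due_for_review(entry, today):
    """Flag records whose scheduled appraisal/review date has passed."""
    return today >= entry["retention"]["next_review"]

print(due_for_review(archive_entry, date.today()))  # True once the date passes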
Conclusions
RQ "Any migration of electronic records must be supported by full project management. Migration of electronic records is an increasingly complex area, with the advent of relational databases, multi-dimensional records and the World Wide Web. New solutions must be found, and new research undertaken. ... To develop a methodology for the migration of electronic records demands further exploration of the role of the "warrant" both external and internal to any organisation, which underpins electronic record-keeping practices. It will become critical to find new and practical ways to identify source software applications. ... The role of archival theory, especially appraisal and retention scheduling, in migration strategies demands greater consideration. ... The issues raised by complex documents are perhaps the area which demands the greatest research for the future. In this respect however, the agenda is being set by vendors promoting new technologies with short-term business goals. It may appear that electronic records do not lend themselves to long-term preservation. ... The development, management and operation of an Electronic Archive and migration strategy demands a multitude of skills that can only be achieved by a multi-disciplinary team of user, records management, IT, and computing expertise. Reassuringly, the key factor in migrating electronic archives will remain people." (p. 306)
Type
Journal
Title
Managing the Present: Metadata as Archival Description
Traditional archival description undertaken at the terminal stages of the life cycle has had two deleterious effects on the archival profession. First, it has resulted in enormous, and in some cases insurmountable, processing backlogs. Second, it has limited our ability to capture crucial contextual and structural information throughout the life cycle of record-keeping systems that is essential for fully understanding the fonds in our institutions. This shortcoming has resulted in an inadequate knowledge base for appraisal and access provision. Such complications will only become more magnified as distributed computing and complex software applications continue to expand throughout organizations. A metadata strategy for archival description will help mitigate these problems and enhance the organizational profile of archivists, who will come to be seen as valuable organizational knowledge and accountability managers.
Critical Arguments
CA "This essay affirms this call for evaluation and asserts that the archival profession must embrace a metadata systems approach to archival description and management." ... "It is held here that the requirements for records capture and description are the requirements for metadata."
Phrases
<P1> New archival organizational structures must be created to ensure that records can be maintained in a usable form. <warrant> <P2> The recent report of the Society of American Archivists (SAA) Committee on Automated Records and Techniques (CART) on curriculum development has argued that archivists need to "understand the nature and utility of metadata and how to interpret and use metadata for archival purposes." <warrant> <P3> The report advises archivists to acquire knowledge on the meanings of metadata, its structures, standards, and uses for the management of electronic records. Interestingly, the requirements for archival description immediately follow this section and note that archivists need to isolate the descriptive requirements, standards, documentation, and practices needed for managing electronic records. <warrant> <P4> Clearly, archivists need to identify what types of metadata will best suit their descriptive needs, underscoring the need for the profession to develop strategies and tactics to satisfy these requirements within active software environments. <warrant> <P5> Underlying the metadata systems strategy for describing and managing electronic information technologies is the seemingly universal agreement amongst electronic records archivists on the requirement to intervene earlier in the life cycle of electronic information systems. <warrant> <P6> Metadata has loomed over the archival management of electronic records for over five years now and is increasingly being promised as a basic control strategy for managing these records. <warrant> <P7> However, she [Margaret Hedstrom] also warns that as descriptive practices shift from creating descriptive information to capturing description along with the records, archivists may discover that managing the metadata is a much greater challenge than managing the records themselves. <P8> Archivists must seek to influence the creation of record-keeping systems within organizations by connecting the transaction that created the data to the data itself. Such a connection will link informational content, structure, and the context of transactions. Only when these conditions are met will we have records and an appropriate infrastructure for archival description. <warrant> <P9> Charles Dollar has argued that archivists increasingly will have to rely upon and shape the metadata associated with electronic records in order to fully capture provenance information about them. <warrant> <P10> Bearman proposes a metadata systems strategy, which would focus more explicitly on the context out of which records arise, as opposed to concentrating on their content. This axiom is premised on the assumption that "lifecycle records systems control should drive provenance-based description and link to top-down definitions of holdings." <warrant> <P11> Bearman and Margaret Hedstrom have built upon this model and contend that properly specified metadata capture could fully describe systems while they are still active and eliminate the need for post-hoc description. The fundamental change wrought in this approach is the shift from doing things to records (surveying, scheduling, appraising, disposing/accessioning, describing, preserving, and accessing) to providing policy direction for adequate documentation through management of organizational behavior (analyzing organizational functions, defining business transactions, defining record metadata, identifying control tactics, and establishing the record-keeping regime).
Within this model archivists focus on steering how records will be captured (and that they will be captured) and how they will be managed and described within record-keeping systems while they are still actively serving their parent organization. <P12> Through the provision of policy guidance and oversight, organizational record-keeping is managed in order to ensure that the "documentation of organizational missions, functions, and responsibilities ... and reporting relationships within the organization, will be undertaken by the organizations themselves in their administrative control systems." <warrant> <P13> Through a metadata systems approach, archivists can realign themselves strategically as managers of authoritative information about organizational record-keeping systems, providing for the capture of information about each system, its contextual attributes, its users, its hardware configurations, its software configurations, and its data configurations. <warrant> <P14> The University of Pittsburgh's functional requirements for record-keeping provide a framework for such an information management structure. These functional requirements are appropriately viewed as an absolute ideal, requiring testing within live systems and organizations. If properly implemented, however, they can provide a concrete model for metadata capture that can automatically supply many of the types of descriptive information both desired by archivists and required for elucidating the context out of which records arise. <P15> It is possible that satisfying these requirements will contribute to the development of a robust archival description process integrating "preservation of meaning, exercise of control, and provision of access" within "one principal, multipurpose descriptive instrument" hinted at by Luciana Duranti as a possible outcome of the electronic era. <P16> However, since electronic records are logical and not physical entities, there is no physical effort required to access and process them, just mental modelling. <P17> Depending on the type of metadata that is built into and linked to electronic information systems, it is possible that users can identify individual records at the lowest level of granularity and still see the top-level process it is related to. Furthermore, records can be reaggregated based upon user-defined criteria through metadata links that track every instance of their use, their relations to other records, and the actions that led to their creation. <P18> A metadata strategy for archival description will help to mitigate these problems and enhance the organizational profile of archivists, who will come to be seen as valuable organizational knowledge and accountability managers. <warrant>
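A minimal sketch of the capture idea in P8, binding a record's content to its structure and to the business transaction that created it, with an event list supporting the reaggregation-by-use described in P17; the field names are assumptions for illustration:

# Minimal sketch: a record is only complete when content is bound to its
# structure and to the business-transaction context captured at creation.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Record:
    content: bytes
    structure: dict            # e.g. file format, encoding
    transaction: dict          # who acted, under what business function, when
    events: list = field(default_factory=list)   # every later use, per P17

def capture(content, fmt, actor, function):
    return Record(
        content=content,
        structure={"format": fmt},
        transaction={"actor": actor, "function": function,
                     "timestamp": datetime.utcnow().isoformat()},
    )

rec = capture(b"...", "text/plain", "payroll-clerk", "authorize-payment")
rec.events.append({"action": "viewed", "by": "auditor"})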
Conclusions
RQ "First and foremost, the promise of metadata for archival description is contingent upon the creation of electronic record-keeping systems as opposed to a continuation of the data management orientation that seems to dominate most computer applications within organizations." ... "As with so many other aspects of the archival endeavour, these requirements and the larger metadata model for description that they are premised upon necessitate further exploration through basic research."
SOW
DC "In addition to New York State, recognition of the failure of existing software applications to capture a full compliment of metadata required for record-keeping and the need for such records management control has also been acknowledged in Canada, the Netherlands, and the World Bank." ... "In conjunction with experts in electronic records managment, an ongoing research project at the University of Pittsburgh has developed a set of thirteen functional requirements for record-keeping. These requirements provide a concrete metadata tool sought by archivists for managing and describing electronic records and electronic record-keeping systems." ... David A. Wallace is an Assistant Professor at the School of Information, University of Michigan, where he teaches in the areas of archives and records management. He holds a B.A. from Binghamton University, a Masters of Library Science from the University at Albany, and a doctorate from the University of Pittsburgh. Between 1988 and 1992, he served as Records/Systems/Database Manager at the National Security Archive in Washington, D.C., a non-profit research library of declassified U.S. government records. While at the NSA he also served as Technical Editor to their "The Making of U.S. Foreign Policy" series. From 1993-1994, he served as a research assistant to the University of Pittsburgh's project on Functional Requirements for Evidence in Recordkeeping, and as a Contributing Editor to Archives and Museum Informatics: Cultural Heritage Informatics Quarterly. From 1994 to 1996, he served as a staff member to the U.S. Advisory Council on the National Information Infrastructure. In 1997, he completed a dissertation analyzing the White House email "PROFS" case. Since arriving at the School of Information in late 1997, he has served as Co-PI on an NHPRC funded grant assessing strategies for preserving electronic records of collaborative processes, as PI on an NSF Digital Government Program funded planning grant investigating the incorporation of born digital records into a FOIA processing system, co-edited Archives and the Public Good: Accountability and Records in Modern Society (Quorum, 2002), and was awarded ARMA International's Britt Literary Award for an article on email policy. He also serves as a consultant to the South African History Archives Freedom of Information Program and is exploring the development of a massive digital library of declassified imaged/digitized U.S. government documents charting U.S. foreign policy.
Type
Electronic Journal
Title
ARTISTE: An integrated Art Analysis and Navigation Environment
This article describes the objectives of the ARTISTE project ("An Integrated Art Analysis and Navigation Environment"), which aims to build a tool for the intelligent retrieval and indexing of high-resolution images. The ARTISTE project will address professional users in the fine arts as the primary end-user base. These users provide services for the ultimate end-user, the citizen.
Critical Arguments
CA "European museums and galleries are rich in cultural treasures but public access has not reached its full potential. Digital multimedia can address these issues and expand the accessible collections. However, there is a lack of systems and techniques to support both professional and citizen access to these collections."
Phrases
<P1> New technology is now being developed that will transform that situation. A European consortium, partly funded by the EU under the fifth R&D framework, is working to produce a new management system for visual information. <P2> Four major European galleries (the Uffizi in Florence, the National Gallery and the Victoria and Albert Museum in London, and the Louvre-related restoration centre, the Centre de Recherche et de Restauration des Musées de France) are involved in the project. They will be joining forces with NCR, a leading player in database and Data Warehouse technology; Interactive Labs, the new media design and development facility of Italy's leading art publishing group, Giunti; IT Innovation, Web-based system developers; and the Department of Electronics and Computer Science at the University of Southampton. Together they will create web-based applications and tools for the automatic indexing and retrieval of high-resolution art images by pictorial content and information. <P3> The areas of innovation in this project are as follows: Using image content analysis to automatically extract metadata based on iconography, painting style etc.; Use of high quality images (with data from several spectral bands and shadow data) for image content analysis of art; Use of distributed metadata using RDF to build on existing standards; Content-based navigation for art documents separating links from content and applying links according to context at presentation time; Distributed linking and searching across multiple archives allowing ownership of data to be retained; Storage of art images using large (>1 TeraByte) multimedia object-relational databases. <P4> The ARTISTE approach will use the power of object-relational databases and content-retrieval to enable indexing to be made dynamically, by non-experts. <P5> In other words ARTISTE would aim to give searchers tools which hint at links due to say colour or brush-stroke texture rather than saying "this is the automatically classified data". <P6> The ARTISTE project will build on and exploit the indexing scheme proposed by the AQUARELLE consortium. The ARTISTE project solution will have a core component that is compatible with existing standards such as Z39.50. The solution will make use of emerging technical standards XML, RDF and X-Link to extend existing library standards to a more dynamic and flexible metadata system. The ARTISTE project will actively track and make use of existing terminology resources such as the Getty "Art and Architecture Thesaurus" (AAT) and the "Union List of Artist Names" (ULAN). <P7> Metadata will also be stored in a database. This may be stored in the same object-relational database, or in a separate database, according to the incumbent systems at the user partners. <P8> RDF provides for metadata definition through the use of schemas. Schemas define the relevant metadata terms (the namespace) and the associated semantics. Individual RDF queries and statements may use multiple schemas. The system will make use of existing schemas such as the Dublin Core schema and will provide wrappers for existing resources such as the Art and Architecture Thesaurus in an RDF schema wrapper. <P9> The Distributed Query and Metadata Layer will also provide facilities to enable queries to be directed towards multiple distributed databases. The end user will be able to seamlessly search the combined art collection.
This layer will adhere to worldwide digital library standards such as Z39.50, augmenting and extending as necessary to allow the richness of metadata enabled by the RDF standard.
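A small sketch of the multi-schema RDF description outlined in P8, mixing the (real) Dublin Core element namespace with a hypothetical wrapper namespace standing in for an AAT thesaurus resource; the object URI and thesaurus terms are invented:

# Sketch of a single RDF description mixing terms from several schemas:
# Dublin Core for core description plus a hypothetical AAT-wrapper namespace.
from rdflib import Graph, Namespace, Literal, URIRef

DC = Namespace("http://purl.org/dc/elements/1.1/")   # real Dublin Core namespace
AAT = Namespace("http://example.org/aat-wrapper#")   # invented wrapper namespace

g = Graph()
painting = URIRef("http://example.org/uffizi/obj/42")
g.add((painting, DC.title, Literal("Primavera")))
g.add((painting, DC.creator, Literal("Botticelli, Sandro")))
g.add((painting, AAT.technique, AAT.tempera))        # thesaurus term as a URI

# One query drawing on both vocabularies:
for s in g.subjects(AAT.technique, AAT.tempera):
    print(g.value(s, DC.title))   # -> Primavera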
Conclusions
RQ "In conclusion the Artiste project will result into an interesting and innovative system for the art analysis, indexing storage and navigation. The actual state of the art of content-based retrieval systems will be positively influenced by the development of the Artiste project, which will pursue the following goals: A solution which can be replicated to European galleries, museums, etc.; Deep-content analysis software based on object relational database technology.; Distributed links server software, user interfaces, and content-based navigation software.; A fully integrated prototype analysis environment.; Recommendations for the exploitation of the project solution by European museums and galleries. ; Recommendations for the exploitation of the technology in other sectors.; "Impact on standards" report detailing augmentations of Z39.50 with RDF." ... ""Not much research has been carried out worldwide on new algorithms for style-matching in art. This is probably not a major aim in Artiste but could be a spin-off if the algorithms made for specific author search requirements happen to provide data which can be combined with other data to help classify styles." >
SOW
DC "Four major European galleries (The Uffizi in Florence, The National Gallery and the Victoria and Albert Museum in London and the Louvre related restoration centre, Centre de Recherche et de Restauration des Mus├®es de France) are involved in the project. They will be joining forces with NCR, a leading player in database and Data Warehouse technology; Interactive Labs, the new media design and development facility of Italy's leading art publishing group, Giunti; IT Innovation, Web-based system developers; and the Department of Electronics and Computer Science at the University of Southampton. Together they will create web based applications and tools for the automatic indexing and retrieval of high-resolution art images by pictorial content and information."
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 1
The preservation of digital information for long periods of time is becoming feasible through the integration of archival storage technology from supercomputer centers, data grid technology from the computer science community, information models from the digital library community, and preservation models from the archivists' community. The supercomputer centers provide the technology needed to store the immense amounts of digital data that are being created, while the digital library community provides the mechanisms to define the context needed to interpret the data. The coordination of these technologies with preservation and management policies defines the infrastructure for a collection-based persistent archive. This paper defines an approach for maintaining digital data for hundreds of years through development of an environment that supports migration of collections onto new software systems.
ISSN
1082-9873
Critical Arguments
CA "Supercomputer centers, digital libraries, and archival storage communities have common persistent archival storage requirements. Each of these communities is building software infrastructure to organize and store large collections of data. An emerging common requirement is the ability to maintain data collections for long periods of time. The challenge is to maintain the ability to discover, access, and display digital objects that are stored within an archive, while the technology used to manage the archive evolves. We have implemented an approach based upon the storage of the digital objects that comprise the collection, augmented with the meta-data attributes needed to dynamically recreate the data collection. This approach builds upon the technology needed to support extensible database schema, which in turn enables the creation of data handling systems that interconnect legacy storage systems."
Phrases
<P1> The ultimate goal is to preserve not only the bits associated with the original data, but also the context that permits the data to be interpreted. <warrant> <P2> We rely on the use of collections to define the context to associate with digital data. The context is defined through the creation of semi-structured representations for both the digital objects and the associated data collection. <P3> A collection-based persistent archive is therefore one in which the organization of the collection is archived simultaneously with the digital objects that comprise the collection. <P4> The goal is to preserve digital information for at least 400 years. This paper examines the technical issues that must be addressed and presents a prototype implementation. <P5> Digital object representation. Every digital object has attributes that define its structure, physical context, and provenance, and annotations that describe features of interest within the object. Since the set of attributes (such as annotations) will vary across all objects within a collection, a semi-structured representation is needed. Not all digital objects will have the same set of associated attributes. <P6> If possible, a common information model should be used to reference the attributes associated with the digital objects, the collection organization, and the presentation interface. An emerging standard for a uniform data exchange model is the eXtensible Markup Language (XML). <P7> A particular example of an information model is the XML Document Type Definition (DTD) which provides a description for the allowed nesting structure of XML elements. Richer information models are emerging such as XSchema (which provides data types, inheritance, and more powerful linking mechanisms) and XMI (which provides models for multiple levels of data abstraction). <P8> Although XML DTDs were originally applied to documents only, they are now being applied to arbitrary digital objects, including the collections themselves. More generally, OSDs can be used to define the structure of digital objects, specify inheritance properties of digital objects, and define the collection organization and user interface structure. <P9> A persistent collection therefore needs the following components of an OSD to completely define the collection context: Data dictionary for collection semantics; Digital object structure; Collection structure; and User interface structure. <P10> The re-creation or instantiation of the data collection is done with a software program that uses the schema descriptions that define the digital object and collection structure to generate the collection. The goal is to build a generic program that works with any schema description. <P11> The information for which driver to use for access to a particular data set is maintained in the associated Meta-data Catalog (MCAT). The MCAT system is a database containing information about each data set that is stored in the data storage systems. <P12> The data handling infrastructure developed at SDSC has two components: the SDSC Storage Resource Broker (SRB) that provides federation and access to distributed and diverse storage resources in a heterogeneous computing environment, and the Meta-data Catalog (MCAT) that holds systemic and application or domain-dependent meta-data about the resources and data sets (and users) that are being brokered by the SRB. <P13> A client does not need to remember the physical mapping of a data set.
It is stored as meta-data associated with the data set in the MCAT catalog. <P14> A characterization of a relational database requires a description of both the logical organization of attributes (the schema), and a description of the physical organization of attributes into tables. For the persistent archive prototype we used XML DTDs to describe the logical organization. <P15> A combination of the schema and physical organization can be used to define how queries can be decomposed across the multiple tables that are used to hold the meta-data attributes. <P16> By using an XML-based database, it is possible to avoid the need to map between semi-structured and relational organizations of the database attributes. This minimizes the amount of information needed to characterize a collection, and makes the re-creation of the database easier. <warrant> <P17> Digital object attributes are separated into two classes of information within the MCAT: System-level meta-data that provides operational information. These include information about resources (e.g., archival systems, database systems, etc., and their capabilities, protocols, etc.) and data objects (e.g., their formats or types, replication information, location, collection information, etc.); Application-dependent meta-data that provides information specific to particular data sets and their collections (e.g., Dublin Core values for text objects). <P18> Internally, MCAT keeps schema-level meta-data about all of the attributes that are defined. The schema-level attributes are used to define the context for a collection and enable the instantiation of the collection on new technology. <P19> The logical structure should not be confused with database schemas and is more general than that. For example, we have implemented the Dublin Core database schema to organize attributes about digitized text. The attributes defined in the logical structure that is associated with the Dublin Core schema contain information about the subject, constraints, and presentation formats that are needed to display the schema along with information about its use and ownership. <P20> The MCAT system supports the publication of schemata associated with data collections, schema extension through the addition or deletion of new attributes, and the dynamic generation of the SQL that corresponds to joins across combinations of attributes. <P21> By adding routines to access the schema-level meta-data from an archive, it is possible to build a collection-based persistent archive. As technology evolves and the software infrastructure is replaced, the MCAT system can support the migration of the collection to the new technology.
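A toy sketch of the "generic instantiation program" of P10: a schema description (here a small invented XML vocabulary standing in for the OSD/DTD components listed in P9) is read and used to regenerate a queryable collection on whatever database technology is current:

# Sketch: rebuild a collection from an infrastructure-independent schema
# description. The XML vocabulary below is invented for illustration.
import sqlite3
import xml.etree.ElementTree as ET

collection_xml = """
<collection name="email">
  <attribute name="sender"  type="TEXT"/>
  <attribute name="subject" type="TEXT"/>
  <attribute name="sent"    type="TEXT"/>
</collection>
"""

def instantiate(xml_doc, db):
    """Generate a table from the schema description, whatever its attributes."""
    root = ET.fromstring(xml_doc)
    cols = ", ".join(f'{a.get("name")} {a.get("type")}'
                     for a in root.findall("attribute"))
    db.execute(f'CREATE TABLE {root.get("name")} ({cols})')

db = sqlite3.connect(":memory:")
instantiate(collection_xml, db)
db.execute("INSERT INTO email VALUES (?, ?, ?)", ("a@b.org", "Hi", "1999-01-01"))

The point of the design is that only the schema description need survive technology migrations; the loading program is regenerated against each new storage system.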
Conclusions
RQ Collection-Based Persistent Digital Archives - Part 2
SOW
DC "The technology proposed by SDSC for implementing persistent archives builds upon interactions with many of these groups. Explicit interactions include collaborations with Federal planning groups, the Computational Grid, the digital library community, and individual federal agencies." ... "The data management technology has been developed through multiple federally sponsored projects, including the DARPA project F19628-95-C-0194 "Massive Data Analysis Systems," the DARPA/USPTO project F19628-96-C-0020 "Distributed Object Computation Testbed," the Data Intensive Computing thrust area of the NSF project ASC 96-19020 "National Partnership for Advanced Computational Infrastructure," the NASA Information Power Grid project, and the DOE ASCI/ASAP project "Data Visualization Corridor." Additional projects related to the NSF Digital Library Initiative Phase II and the California Digital Library at the University of California will also support the development of information management technology. This work was supported by a NARA extension to the DARPA/USPTO Distributed Object Computation Testbed, project F19628-96-C-0020."
Type
Electronic Journal
Title
Collection-Based Persistent Digital Archives - Part 2
"Collection-Based Persistent Digital Archives: Part 2" describes the creation of a one million message persistent E-mail collection. It discusses the four major components of a persistent archive system: support for ingestion, archival storage, information discovery, and presentation of the collection. The technology to support each of these processes is still rapidly evolving, and opportunities for further research are identified.
ISSN
1082-9873
Critical Arguments
CA "The multiple migration steps can be broadly classified into a definition phase and a loading phase. The definition phase is infrastructure independent, whereas the loading phase is geared towards materializing the processes needed for migrating the objects onto new technology. We illustrate these steps by providing a detailed description of the actual process used to ingest and load a million-record E-mail collection at the San Diego Supercomputer Center (SDSC). Note that the SDSC processes were written to use the available object-relational databases for organizing the meta-data. In the future, it may be possible to go directly to XML-based databases."
Phrases
<P1> The processes used to ingest a collection, transform it into an infrastructure independent form, and store the collection in an archive comprise the persistent storage steps of a persistent archive. The processes used to recreate the collection on new technology, optimize the database, and recreate the user interface comprise the retrieval steps of a persistent archive. <P2> In order to build a persistent collection, we consider a solution that "abstracts" all aspects of the data and its preservation. In this approach, data object and processes are codified by raising them above the machine/software dependent forms to an abstract format that can be used to recreate the object and the processes in any new desirable forms. <P3> The SDSC infrastructure uses object-relational databases to organize information. This makes data ingestion more complex by requiring the mapping of the XML DTD semi-structured representation onto a relational schema. <P4> The steps used to store the persistent archive were: (1) Define Digital Object: define meta-data, define object structure (OBJ-DTD) --- (A), define object DTD to object DDL mapping --- (B) (2) Define Collection: define meta-data, define collection structure (COLL-DTD) --- (C), define collection DTD structure to collection DDL mapping --- (D) (3) Define Containers: define packing format for encapsulating data and meta-data (examples are the AIP standard, Hierarchical Data Format, Document Type Definition) <P5> In the ingestion phase, the relational and semi-structured organization of the meta-data is defined. No database is actually created, only the mapping between the relational organization and the object DTD. <P6> Note that the collection relational organization does not have to encompass all of the attributes that are associated with a digital object. Separate information models are used to describe the objects and the collections. It is possible to take the same set of digital objects and form a new collection with a new relational organization. <P7> Multiple communities across academia, the federal government, and standards groups are exploring strategies for managing very large archives. The persistent archive community needs to maintain interactions with these communities to track development of new strategies for data management and storage. <warrant>
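A minimal sketch of the definition-phase artifacts in P4: an object-DTD-to-DDL mapping recorded as data, so the loading phase can regenerate the relational schema on new technology. The element names, table names, and SQL types are invented for illustration:

# Step (1): an object DTD and its mapping to relational DDL, held as data.
EMAIL_OBJ_DTD = """<!ELEMENT message (sender, subject, body)>"""

OBJ_TO_DDL = {
    "message/sender":  ("messages", "sender",  "VARCHAR(256)"),
    "message/subject": ("messages", "subject", "VARCHAR(1024)"),
    "message/body":    ("messages", "body",    "CLOB"),
}

def ddl_statements(mapping):
    """Loading phase: materialize CREATE TABLE statements from the mapping."""
    tables = {}
    for _, (table, column, sqltype) in mapping.items():
        tables.setdefault(table, []).append(f"{column} {sqltype}")
    return [f"CREATE TABLE {t} ({', '.join(cols)})" for t, cols in tables.items()]

print(ddl_statements(OBJ_TO_DDL)[0])
# CREATE TABLE messages (sender VARCHAR(256), subject VARCHAR(1024), body CLOB)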
Conclusions
RQ "The four major components of the persistent archive system are support for ingestion, archival storage, information discovery, and presentation of the collection. The first two components focus on the ingestion of data into collections. The last two focus on access to the resulting collections. The technology to support each of these processes is still rapidly evolving. Hence consensus on standards has not been reached for many of the infrastructure components. At the same time, many of the components are active areas of research. To reach consensus on a feasible collection-based persistent archive, continued research and development is needed. Examples of the many related issues are listed below:
Type
Electronic Journal
Title
Search for Tomorrow: The Electronic Records Research Program of the U.S. National Historical Publications and Records Commission
The National Historical Publications and Records Commission (NHPRC) is a small grant-making agency affiliated with the U.S. National Archives and Records Administration. The Commission is charged with promoting the preservation and dissemination of documentary source materials to ensure an understanding of U.S. history. Recognizing that the increasing use of computers created challenges for preserving the documentary record, the Commission adopted a research agenda in 1991 to promote research and development on the preservation and continued accessibility of documentary materials in electronic form. From 1991 to the present, the Commission has awarded 31 grants totaling $2,276,665 for electronic records research. Most of this research has focused on two issues of central concern to archivists: (1) electronic record keeping (tools and techniques to manage electronic records produced in an office environment, such as word processing documents and electronic mail), and (2) best practices for storing, describing, and providing access to all electronic records of long-term value. NHPRC grants have raised the visibility of electronic records issues among archivists. The grants have enabled numerous archives to begin to address electronic records problems, and, perhaps most importantly, they have stimulated discussion about electronic records among archivists and records managers.
Publisher
Elsevier Science Ltd
Critical Arguments
CA "The problem of maintaining electronic records over time is big, expensive, and growing. A task force on digital archives established by the Commission on Preservation and Access in 1994 commented that the life of electronic records could be characterized in the same words Thomas Hobbes once used to describe life: ÔÇ£nasty, brutish, and shortÔÇØ [1]. Every day, thousands of new electronic files are created on federal, state, and local government computers across the nation. A small but important portion of these records will be designated for permanent retention. Government agencies are increasingly relying on computers to maintain information such as census files, land titles, statistical data, and vital records. But how should electronic records with long-term value be maintained? Few government agencies have developed comprehensive policies for managing current electronic records, much less preserving those with continuing value for historians and other researchers. Because of this serious and growing problem, the National Historical Publications and Records Commission (NHPRC), a small grantmaking agency affiliated with the U.S. National Archives and Records Administration (NARA), has been making grants for research and development on the preservation and use of electronic documentary sources. The program is conducted in concert with NARA, which in 1996 issued a strategic plan that gives high priority to mastering electronic records problems in partnership with federal government agencies and the NHPRC.
Phrases
<P1> How can data dictionaries, information resource directory systems, and other metadata systems be used to support electronic records management and archival requirements? <P2> In spite of the number of projects the Commission has supported, only four questions from the research agenda have been addressed to date. Of these, the question relating to requirements for the development of data dictionaries and other metadata systems (question number four) has produced a single grant for a state information locator system in South Carolina, and the question relating to needs for archival education (question 10) has led to two grants to the Society of American Archivists for curricular materials. <P3> Information systems created without regard for these considerations may have deficiencies that limit the usefulness of the records contained on them. <warrant> <P4> The NHPRC has awarded major grants to four institutions over the past five years for projects to develop and test requirements for electronic record keeping: University of Pittsburgh (1993): A working set of functional requirements and metadata specifications for electronic record keeping systems; City of Philadelphia (1995, 1996, and 1997): A project to incorporate a subset of the Pittsburgh metadata specifications into a new human resources information system and other city systems as test cases and to develop comprehensive record keeping policies and standards for the city's information technology systems; Indiana University (1995): A project to develop an assessment tool and methodology for analyzing existing electronic records systems, using the Pittsburgh functional requirements as a model and the student academic record system and a financial system as test cases; Research Foundation of the State University of New York-Albany, Center for Technology in Government (1996): A project to identify best practices for electronic record keeping, including work by the U.S. Department of Defense and the University of British Columbia in addition to the University of Pittsburgh. The Center is working with the state's Adirondack Parks Agency in a pilot project to develop a system model for incorporating record keeping and archival considerations into the creation of networked computing and communications applications. <P5> No definitive solution has yet been identified for the problems posed by electronic records, although progress has been made in learning what will be needed to design functional electronic record keeping systems. <P6> With the proliferation of digital libraries, the need for long-term storage, migration and retrieval strategies for electronic information has become a priority for a wide variety of information providers. <warrant>
Conclusions
RQ "How best to preserve existing and future electronic formats and provide access to them over time has remained elusive. The answers cannot be found through theoretical research alone, or even through applied research, although both are needed. Answers can only emerge over time as some approaches prove able to stand the test of time and others do not. The problems are large because the costs of maintaining, migrating, and retrieving electronic information continue to be high." ... "Perhaps most importantly, these grants have stimulated widespread discussion of electronic records issues among archivists and record managers, and thus they have had an impact on the preservation of the electronic documentary record that goes far beyond the CommissionÔÇÖs investment."
SOW
DC The National Historical Publications and Records Commission (NHPRC) is the outreach arm of the National Archives and makes plans for and studies issues related to the preservation, use and publication of historical documents. The Commission also makes grants to non-Federal archives and other organizations to promote the preservation and use of America's documentary heritage.
Type
Electronic Journal
Title
Metadata: The right approach, An integrated model for descriptive and rights metadata in E-commerce
If you've ever completed a large and difficult jigsaw puzzle, you'll be familiar with that particular moment of grateful revelation when you find that two sections you've been working on separately actually fit together. The overall picture becomes coherent, and the task at last seems achievable. Something like this seems to be happening in the puzzle of "content metadata." Two communities -- rights owners on one hand, libraries and cataloguers on the other -- are staring at their unfolding data models and systems, knowing that somehow together they make up a whole picture. This paper aims to show how and where they fit.
ISSN
1082-9873
Critical Arguments
CA "This paper looks at metadata developments from this standpoint -- hence the "right" approach -- but does so recognising that in the digital world many Chinese walls that appear to separate the bibliographic and commercial communities are going to collapse." ... "This paper examines three propositions which support the need for radical integration of metadata and rights management concerns for disparate and heterogeneous materials, and sets out a possible framework for an integrated approach. It draws on models developed in the CIS plan and the DOI Rights Metadata group, and work on the ISRC, ISAN, and ISWC standards and proposals. The three propositions are: DOI metadata must support all types of creation; The secure transaction of requests and offers data depends on maintaining an integrated structure for documenting rights ownership agreements; All elements of descriptive metadata (except titles) may also be elements of agreements. The main consequences of these propositions are: A cross-sector vocabulary is essential; Non-confidential terms of rights ownership agreements must be generally accessible in a standard form. (In its purest form, the e-commerce network must be able to automatically determine the current owner of any right in any creation for any territory.); All descriptive metadata values (except titles) must be stored as unique, coded values. If correct, the implications of these propositions on the behaviour, and future inter-dependency, of the rights-owning and bibliographic communities are considerable."
Phrases
<P1> Historically, metadata -- "data about data" -- has been largely treated as an afterthought in the commercial world, even among rights owners. Descriptive metadata has often been regarded as the proper province of libraries, a battlefield of competing systems of tags and classification and an invaluable tool for the discovery of resources, while "business" metadata lurked, ugly but necessary, in distribution systems and EDI message formats. Rights metadata, whatever it may be, may seem to have barely existed in a coherent form at all. <P2> E-commerce offers the opportunity to integrate the functions of discovery, access, licensing and accounting into single point-and-click actions in which metadata is a critical agent, a glue which holds the pieces together. <warrant> <P3> E-commerce in rights will generate global networks of metadata every bit as vital as the networks of optical fibre -- and with the same requirements for security and unbroken connectivity. <warrant> <P4> The sheer volume and complexity of future rights trading in the digital environment will mean that any but the most sporadic level of human intervention will be prohibitively expensive. Standardised metadata is an essential component. <warrant> <P5> Just as the creators and rights holders are the sources of the content for the bibliographic world, so it seems inevitable they will become the principal source of core metadata in the web environment, and that metadata will be generated simultaneously and at source to meet the requirements of discovery, access, protection, and reward. <P6> However, under the analysis being carried out within the communities identified above and by those who are developing technology and languages for rights-based e-commerce, it is becoming clear that "functional" metadata is also a critical component. It is metadata (including identifiers) which defines a creation and its relationship to other creations and to the parties who created and variously own it; without a coherent metadata infrastructure e-commerce cannot properly flow. Securing the metadata network is every bit as important as securing the content, and there is little doubt which poses the greater problem. <warrant> <P7> Because creations can be nested and modified at an unprecedented level, and because online availability is continuous, not a series of time-limited events like publishing books or selling records, dynamic and structured maintenance of rights ownership is essential if the currency and validity of offers is to be maintained. <warrant> <P8> Rights metadata must be maintained and linked dynamically to all of its related content. <P9> A single, even partial, change to rights ownership in the original creation needs to be communicated through this chain to preserve the currency of permissions and royalty flow. There are many options for doing this, but they all depend, among other things, on the security of the metadata network. <warrant> <P10> As digital media causes copyright frameworks to be rewritten on both sides of the Atlantic, we can expect measures of similar and greater impact at regular intervals affecting any and all creation types: yet such changes can be relatively simple to implement if metadata is held in the right way in the right place to begin with.
<warrant> <P11> The disturbing but inescapable consequence is that it is not only desirable but essential for all elements of descriptive metadata, except for titles, to be expressed at the outset as structured and standardised values to preserve the integrity of the rights chain. <P12> Within the DOI community, which embraces commercial and library interests, the integration of rights and descriptive metadata has become a matter of priority. <P13> What is required is that the establishment of a creation description (for example, the registration of details of a new article or audio recording) or of change of rights control (for example, notification of the acquisition of a work or a catalogue of works) can be done in a standardised and fully structured way. <warrant> <P14> Unless the chain is well maintained at source, all downstream transactions will be jeopardised, for in the web environment the CIS principle of "do it once, do it right" is seen at its ultimate. A single occurrence of a creation on the web, and its supporting metadata, can be the source for all uses. <P15> One of the tools to support this development is the RDF (Resource Description Framework). RDF provides a means of structuring metadata for anything, and it can be expressed in XML. <P16> Although formal metadata standards hardly exist within ISO, they are appearing through the "back door" in the form of mandatory supporting data for identifier standards such as ISRC, ISAN and ISWC. A major function of the INDECS project will be to ensure the harmonisation of these standards within a single framework. <P17> In an automated, protected environment, this requires that the rights transaction is able to generate automatically a new descriptive metadata set through the interaction of the agreement terms with the original creation metadata. This can only happen (and it will be required on a massive scale) if rights and descriptive metadata terminology is integrated and standardised. <warrant> <P18>As resources become available virtually, it becomes as important that the core metadata itself is not tampered with as it is that the object itself is protected. Persistence is now not only a necessary characteristic of identifiers but also of the structured metadata that attends them. <P19> This leads us also to the conclusion that, ideally, standardised descriptive metadata should be embedded into objects for its own protection. <P20> It also leads us to the possibility of metadata registration authorities, such as the numbering agencies, taking wider responsibilities. <P21>If this paper is correct in its propositions, then rights metadata will have to rewrite half of Dublin Core or else ignore it entirely. <P22> The web environment with its once-for-all means of access provides us with the opportunity to eliminate duplication and fragmentation of core metadata; and at this moment, there are no legacy metadata standards to shackle the information community. We have the opportunity to go in with our eyes open with standards that are constructed to make the best of the characteristics of the new digital medium. <warrant>
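P15 names RDF, expressible in XML, as the structuring tool for this metadata. As a rough illustration of the kind of record the paper envisages (a creation identified by a DOI, its title held as free text per P11, and its rights owner held as a coded value), the following Python sketch, using only the standard library, emits a small RDF/XML description. The ex: vocabulary, the DOI, and the party identifier are all invented for illustration; they are not drawn from any DOI or INDECS specification.

import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/rights#"  # hypothetical rights vocabulary, not a real schema

ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("ex", EX)

root = ET.Element(f"{{{RDF}}}RDF")
desc = ET.SubElement(root, f"{{{RDF}}}Description",
                     {f"{{{RDF}}}about": "doi:10.1000/example-article"})  # invented DOI
title = ET.SubElement(desc, f"{{{DC}}}title")
title.text = "An Example Article"  # titles remain free text, per P11
owner = ET.SubElement(desc, f"{{{EX}}}rightsOwner")
owner.set(f"{{{RDF}}}resource", "urn:example:party:0001")  # coded party value, not a name

print(ET.tostring(root, encoding="unicode"))

Holding the owner as a resolvable coded value rather than a text string is what would let a change of rights ownership (P9) propagate by updating one party record instead of rewriting every description that points at it.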
Conclusions
RQ "The INDECS project (assuming its formal adoption next month), in which the four major communities are active, and with strong links to ISO TC46 and MPEG, will provide a cross-sector framework for this work in the short-term. The DOI Foundation itself may be an appropriate umbrella body in the future. We may also consider that perhaps the main function of the DOI itself may not be, as originally envisaged, to link user to content -- which is a relatively trivial task -- but to provide the glue to link together creation, party, and agreement metadata. The model that rights owners may be wise to follow in this process is that of MPEG, where the technology industry has tenaciously embraced a highly-regimented, rolling standardisation programme, the results of which are fundamental to the success of each new generation of products. Metadata standardisation now requires the same technical rigour and commercial commitment. However, in the meantime the bibliographic world, working on what it has always seen its own part of the jigsaw puzzle, is actively addressing many of these issues in an almost parallel universe. The question remains as to how in practical terms the two worlds, rights and bibliographic, can connect, and what may be the consequences of a prolonged delay in doing so." ... "The former I encourage to make a case for continued support and standardisation of a flawed Dublin Core in the light of the propositions I have set out in this paper, or else engage with the DOI and rights owner communities in its revision to meet the real requirements of digital commerce in its fullest sense."
SOW
DC "There are currently four major active communities of rights-holders directly confronting these questions: the DOI community, at present based in the book and electronic publishing sector; the IFPI community of record companies; the ISAN community embracing producers, users, and rights owners of audiovisuals; and the CISAC community of collecting societies for composers and publishers of music, but also extending into other areas of authors' rights, including literary, visual, and plastic arts." ... "There are related rights-driven projects in the graphic, photographic, and performers' communities. E-commerce means that metadata solutions from each of these sectors (and others) require a high level of interoperability. As the trading environment becomes common, traditional genre distinctions between creation-types become meaningless and commercially destructive."
Type
Report
Title
Mapping of the Encoded Archival Description DTD Element Set to the CIDOC CRM
The CIDOC CRM is the first ontology designed to mediate contents in the area of material cultural heritage and beyond, and has been accepted by ISO TC46 as a work item for an international standard. The EAD Document Type Definition (DTD) is a standard for encoding archival finding aids using the Standard Generalized Markup Language (SGML). Archival finding aids are detailed guides to primary source material which provide fuller information than that normally contained within cataloging records.
Publisher
Institute of Computer Science, Foundation for Research and Technology - Hellas
Publication Location
Heraklion, Crete, Greece
Language
English
Critical Arguements
CA "This report describes the semantic mapping of the current EAD DTD Version 1.0 Element Set to the CIDOC CRM and its latest extension. This work represents a proof of concept for the functionality the CIDOC CRM is designed for." 
Conclusions
RQ "Actually, the CRM seems to do the job quite well ÔÇô problems in the mapping arise more from underspecification in the EAD rather than from too domain-specific notions. "┬á... "To our opinion, the archival community could benefit from the conceptualizations of the CRM to motivate more powerful metadata standards with wide interoperability in the future, to the benefit of museums and other disciplines as well."
SOW
DC "As a potential international standard, the EAD DTD is maintained in the Network Development and MARC Standards Office of the Library of Congress in partnership with the Society of American Archivists." ... "The CIDOC Conceptual Reference Model (see [CRM1999], [Doerr99]), in the following only referred to as ┬½CRM┬╗, is outcome of an effort of the Documentation Standards Group of the CIDOC Committee (see ┬½http:/www.cidoc.icom.org┬╗, ÔÇ£http://cidoc.ics.forth.grÔÇØ) of ICOM, the International Council of Museums beginning in 1996."
Type
Report
Title
RLG Best Practice Guidelines for Encoded Archival Description
These award-winning guidelines, released in August 2002, were developed by the RLG EAD Advisory Group to provide practical, community-wide advice for encoding finding aids. They are designed to: facilitate interoperability of resource discovery by imposing a basic degree of uniformity on the creation of valid EAD-encoded documents; encourage the inclusion of particular elements; and develop a set of core data elements.
Publisher
Research Libraries Group
Publication Location
Mountain View, CA, USA
Language
English
Critical Arguements
CA The objectives of the guidelines are: 1. To facilitate interoperability of resource discovery by imposing a basic degree of uniformity on the creation of valid EAD-encoded documents and to encourage the inclusion of elements most useful for retrieval in a union index and for display in an integrated (cross-institutional) setting; 2. To offer researchers the full benefits of XML in retrieval and display by developing a set of core data elements to improve resource discovery. It is hoped that by identifying core elements and by specifying "best practice" for those elements, these guidelines will be valuable to those who create finding aids, as well as to vendors and tool builders; 3. To contribute to the evolution of the EAD standard by articulating a set of best practice guidelines suitable for interinstitutional and international use. These guidelines can be applied to both retrospective conversion of legacy finding aids and the creation of new finding aids.
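A minimal sketch of the kind of uniformity check objective 1 implies: scan a finding aid and report which core elements are absent. The core-element list here is an invented stand-in; RLG's recommended set is larger and element-specific.

import xml.etree.ElementTree as ET

# Invented stand-in for RLG's core-element recommendations
# (assumes an unnamespaced EAD 1.0 document).
CORE_ELEMENTS = {"eadheader", "eadid", "archdesc", "did", "unittitle"}

def missing_core_elements(path):
    """Report which core elements never appear in an EAD finding aid."""
    present = {element.tag for element in ET.parse(path).iter()}
    return CORE_ELEMENTS - present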
Conclusions
RQ
SOW
<DC> "RLG organized the EAD working group as part of our continuing commitment to making archival collections more accessible on the Web. We offer RLG Archival Resources, a database of archival materials; institutions are encouraged to submit their finding aids to this database." ... "This set of guidelines, the second version promulgated by RLG, was developed between October 2001 and August 2002 by the RLG EAD Advisory Group. This group consisted of ten archivists and digital content managers experienced in creating and managing EAD-encoded finding aids at repositories in the United States and the United Kingdom."
Type
Web Page
Title
Archiving The Avant Garde: Documenting And Preserving Variable Media Art.
Archiving the Avant Garde is a collaborative project to develop, document, and disseminate strategies for describing and preserving non-traditional, intermedia, and variable media art forms, such as performance, installation, conceptual, and digital art. This joint project builds on existing relationships and the previous work of its founding partners in this area. One example of such work is the Conceptual & Intermedia Arts Online (CIAO) Consortium, a collaboration founded by the BAM/PFA, the Walker Art Center, and Franklin Furnace, that includes 12 other international museums and arts organizations. CIAO develops standardized methods of documenting and providing access to conceptual and other ephemeral intermedia art forms. Another example of related work conducted by the project's partners is the Variable Media Initiative, organized by the Guggenheim Museum, which encourages artists to define their work independently from medium so that the work can be translated once its current medium is obsolete. Archiving the Avant Garde will take the ideas developed in previous efforts and develop them into community-wide working strategies by testing them on specific works of art in the practical working environments of museums and arts organizations. The final project report will outline a comprehensive strategy and model for documenting and preserving variable media works, based on case studies to illustrate practical examples, but always emphasizing the generalized strategy behind the rule. This report will be informed by specific and practical institutional practice, but we believe that the ultimate model developed by the project should be based on international standards independent of any one organization's practice, thus making it adaptable to many organizations. Dissemination of the report, discussed in detail below, will be ongoing and widespread.
Critical Arguements
CA "Works of variable media art, such as performance, installation, conceptual, and digital art, represent some of the most compelling and significant artistic creation of our time. These works are key to understanding contemporary art practice and scholarship, but because of their ephemeral, technical, multimedia, or otherwise variable natures, they also present significant obstacles to accurate documentation, access, and preservation. The works were in many cases created to challenge traditional methods of art description and preservation, but now, lacking such description, they often comprise the more obscure aspects of institutional collections, virtually inaccessible to present day researchers. Without strategies for cataloging and preservation, many of these vital works will eventually be lost to art history. Description of and access to art collections promote new scholarship and artistic production. By developing ways to catalog and preserve these collections, we will both provide current and future generations the opportunity to learn from and be inspired by the works and ensure the perpetuation and accuracy of art historical records. It is to achieve these goals that we are initiating the consortium project Archiving the Avant Garde: Documenting and Preserving Variable Media Art."
Conclusions
RQ "Archiving the Avant Garde will take a practical approach to solving problems in order to ensure the feasibility and success of the project. This project will focus on key issues previously identified by the partners and will leave other parts of the puzzle to be solved by other initiatives and projects in regular communication with this group. For instance, this project realizes that the arts community will need to develop software tools which enable collections care professionals to implement the necessary new description and metadata standards, but does not attempt to develop such tools in the context of this project. Rather, such tools are already being developed by a separate project under MOAC. Archiving the Avant Garde will share information with that project and benefit from that work. Similarly, the prospect of developing full-fledged software emulators is one best solved by a team of computer scientists, who will work closely with members of the proposed project to cross-fertilize methods and share results. Importantly, while this project is focused on immediate goals, the overall collaboration between the partner organizations and their various initiatives will be significant in bringing together the computer science, arts, standards, and museum communities in an open-source project model to maximize collective efforts and see that the benefits extend far and wide."
SOW
DC "We propose a collaborative project that will begin to establish such professional best practice. The collaboration, consisting of the Berkeley Art Museum and Pacific Film Archive (BAM/PFA), the Solomon R. Guggenheim Museum, Rhizome.org, the Franklin Furnace Archive, and the Cleveland Performance Art Festival and Archive, will have national impact due to the urgent and universal nature of the problem for contemporary art institutions, the practicality and adaptability of the model developed by this group, and the significant expertise that this nationwide consortium will bring to bear in the area of documenting and preserving variable media art." ... "We believe that a model informed by and tested in such diverse settings, with broad public and professional input (described below), will be highly adaptable." ..."Partners also represent a geographic and national spread, from East Coast to Midwest to West Coast. This coverage ensures that a wide segment of the professional community and public will have opportunities to participate in public forums, hosted at partner institutions during the course of the project, intended to gather an even broader cross-section of ideas and feedback than is represented by the partners." ... "The management plan for this project will be highly decentralized ensuring that no one person or institution will unduly influence the model strategy for preserving variable media art and thereby reduce its adaptability."
CA "The purpose of this document is: (1) To provide a better understanding of the functionality that the MPEG-21 multimedia framework should be capable of providing; (2) To offer high level descriptions of different MPEG-21 applications against which the formal requirements for MPEG-21 can be checked; (3) To act as a basis for devising Core Experiments which establish proof of concept; (4) To provide a point of reference to support the evaluation of responses submitted against ongoing MPEG-21 Calls for Proposals; (5) To be a 'Public Relations' instrument that can help to explain what MPEG-21 is about."
Conclusions
RQ not applicable
SOW
DC The Moving Picture Experts Group (MPEG) is a working group of ISO/IEC, made up of some 350 members from various industries and universities, in charge of the development of international standards for compression, decompression, processing, and coded representation of moving pictures, audio and their combination. MPEG's official designation is ISO/IEC JTC1/SC29/WG11. So far MPEG has produced the following compression formats and ancillary standards: MPEG-1, the standard for storage and retrieval of moving pictures and audio on storage media (approved Nov. 1992); MPEG-2, the standard for digital television (approved Nov. 1994); MPEG-4, the standard for multimedia applications; MPEG-7, the content representation standard for multimedia information search, filtering, management and processing; and MPEG-21, the multimedia framework.
There are many types of standards used to manage museum collections information. These "standards", which range from precise technical standards to general guidelines, enable museum data to be efficiently and consistently indexed, sorted, retrieved, and shared, both in automated and paper-based systems. Museums often use metadata standards (also called data structure standards) to help them: define what types of information to record in their database (or card catalogue); structure this information (the relationships between the different types of information). Following (or mapping data to) these standards makes it possible for museums to move their data between computer systems, or share their data with other organizations.
Notes
The CHIN Web site features sections dedicated to Creating and Managing Digital Content, Intellectual Property, Collections Management, Standards, and more. CHIN's array of training tools, online publications, directories and databases are especially designed to meet the needs of both small and large institutions. The site also provides access to up-to-date information on topics such as heritage careers, funding and conferences.
Critical Arguements
CA "Museums often want to use their collections data for many purposes, (exhibition catalogues, Web access for the public, and curatorial research, etc.), and they may want to share their data with other museums, archives, and libraries in an automated way. This level of interoperability between systems requires cataloguing standards, value standards, metadata standards, and interchange standards to work together. Standards enable the interchange of data between cataloguer and searcher, between organizations, and between computer systems."
Conclusions
RQ "HIN is also involved in a project to create metadata for a pan-Canadian inventory of learning resources available on Canadian museum Web sites. Working in consultation with the Consortium for the Interchange of Museum Information (CIMI), the Gateway to Educational Materials (GEM) [link to GEM in Section G], and SchoolNet, the project involves the creation of a Guide to Best Practices and cataloguing tool for generating metadata for online learning materials. " 
SOW
DC "CHIN is involved in the promotion, production, and analysis of standards for museum information. The CHIN Guide to Museum Documentation Standards includes information on: standards and guidelines of interest to museums; current projects involving standards research and implementation; organizations responsible for standards research and development; Links." ... "CHIN is a member of CIMI (the Consortium for the Interchange of Museum Information), which works to enable the electronic interchange of museum information. From 1998 to 1999, CHIN participated in a CIMI Metadata Testbed which aimed to explore the creation and use of metadata for facilitating the discovery of electronic museum information. Specifically, the project explored the creation and use of Dublin Core metadata in describing museum collections, and examined how Dublin Core could be used as a means to aid in resource discovery within an electronic, networked environment such as the World Wide Web." 
Abstract
The ability of investigators to share data is essential to the progress of integrative scientific research both within and across disciplines. This paper describes the main issues in achieving effective data sharing based on previous efforts in building scientific data networks and, particularly, recent efforts within the Earth sciences. This is presented in the context of a range of information architectures for effecting differing levels of standardization and centralization both from a technology perspective as well as a publishing protocol perspective. We propose a new Metadata Interchange Format (.mif) that can be used for more effective sharing of data and metadata across digital libraries, data archives and research projects.
Critical Arguements
CA "In this paper, we discuss two important information technology aspects of the electronic publication of data in the Earth sciences, metadata, and a variety of different concepts of electronic data publication. Metadata are the foundation of electronic data publications and they are determined by needs of archiving, the scientific analysis and reproducibility of a data set, and the interoperability of diverse data publication methods. We use metadata examples drawn from the companion paper by Staudigel et al. (this issue) to illustrate the issues involved in scaling-up the publication of data and metadata by individual scientists, disciplinary groups, the Earth science community-at-large and to libraries in general. We begin by reviewing current practices and considering a generalized alternative." ... 'For this reason, we will we first discuss different methods of data publishing via a scientific data network followed by an inventory of desirable characteristics of such a network. Then, we will introduce a method for generating a highly portable metadata interchange format we call .mif (pronounced dot-mif) and conclude with a discussion of how this metadata format can be scaled to support the diversity of interests within the Earth science community and other scientific communities." ... "We can borrow from the library community the methods by which to search for the existence and location of data (e.g., Dublin Core http://www.dublincore.org) but we must invent new ways to document the metadata needed within the Earth sciences and to comply with other metadata standards such as the Federal Geographic Data Committee (FGDC). To accomplish this, we propose a metadata interchange format that we call .mif that enables interoperability and an open architecture that is maximally independent of computer systems, data management approaches, proprietary software and file formats, while encouraging local autonomy and community cooperation. "
Conclusions
RQ "These scalable techniques are being used in the development of a project we call SIOExplorer that can found at http://sioexplorer.ucsd.edu although we have not discussed that project in any detail. The most recent contributions to this discussion and .mif applications and examples may be found at http:\\Earthref.org\metadata\GERM\."
SOW
DC This article was written by representatives of the San Diego Supercomputer Center and the Institute of Geophysics and Planetary Physics under the auspices of the University of California, San Diego.
Just like other memory institutions, libraries will have to play an important part in the Semantic Web. In that context, ontologies and conceptual models in the field of cultural heritage information are crucial, and the interoperability between these ontologies and models perhaps even more crucial. This document reviews four projects and models that the FRBR Review Group recommends for consideration as to interoperability with FRBR.
Publisher
International Federation of Library Associations and Institutions
Critical Arguements
CA "Just like other memory institutions, libraries will have to play an important part in the Semantic Web. In that context, ontologies and conceptual models in the field of cultural heritage information are crucial, and the interoperability between these ontologies and models perhaps even more crucial."
Conclusions
RQ 
SOW
DC "Some members of the CRM-SIG, including Martin Doerr himself, also are subscribers to the FRBR listserv, and Patrick Le Boeuf, chair of the FRBR Review Group, also is a member of the CRM-SIG and ISO TC46/SC4/WG9 (the ISO Group on CRM). A FRBR to CRM mapping is available from the CIDOC CRM-SIG listserv archive." ... This report was produced by the Cataloguing Section of IFLA, the International Federation of Library Associations and Institutions. 
This document is a draft version 1.0 of requirements for a metadata framework to be used by the International Press Telecommunications Council for all new and revised IPTC standards. It was worked on and agreed to by members of the IPTC Standards Committee, who represented a variety of newspaper, wire agencies, and other interested members of the IPTC.
Notes
Misha Wolf is also listed as an author.
Publisher
International Press Telecommunications Council (IPTC)
Critical Arguements
CA "This Requirements document forms part of the programme of work called ITPC Roadmap 2005. The Specification resulting from these Requirements will define the use of metadata by all new IPTC standards and by new major versions of existing IPTC standards." (p. 1) ... "The purpose of the News Metadata Framework (NMDF) WG is to specify how metadata will be expressed, referenced, and managed in all new major versions of IPTC standards. The NMF WG will: Gather, discuss, agree and document functional requirements for the ways in which metadata will be expressed, referenced and managed in all new major versions of IPTC standards; Discuss, agree and document a model, satisfying these requirements; Discuss, agree and document possible approaches to expressing this model in XML, and select those most suited to the tasks. In doing so, the NMDF WG will, where possible, make use of the work of other standards bodies. (p. 2)
Conclusions
RQ "Open issues include: The versioning of schemes, including major and minor versions, and backward compatibility; the versioning of TopicItems; The design of URIs for TopicItem schemes and TopicItem collections, including the issues of: versions (relating to TopicItems, schemes, and collections); representations (relating to TopicItems and collections); The relationship between a [scheme, code] pair, the corresponding URI and the scheme URI." (p. 17)
SOW
DC The development of this framework came out of the 2003 News Standards Summit, which was attended by representatives from over 80 international press and information agencies ... "The News Standards Summit brings together major players--experts on news metadata standards as well as commercial news providers, users, and aggregators. Together, they will analyze the current state and future expectations for news and publishing XML and metadata efforts from both the content and processing model perspectives. The goal is to increase understanding and to drive practical, productive convergence." ... This is a draft version of the standard.
Type
Web Page
Title
Interactive Fiction Metadata Element Set version 1.1, IFMES 1.1 Specification
This document defines a set of metadata elements for describing Interactive Fiction games. These elements incorporate and enhance most of the previous metadata formats currently in use for Interactive Fiction, and attempt to bridge them to modern standards such as the Dublin Core.
Critical Arguements
CA "There are already many metadata standards in use, both in the Interactive Fiction community and the internet at large. The standards used by the IF community cover a range of technologies, but none are fully compatible with bleeding-edge internet technology like the Semantic Web. Broader-based formats such as the Dublin Core are designed for the Semantic Web, but lack the specialized fields needed to describe Interactive Fiction. The Interactive Fiction Metadata Element Set was designed with three purposes. One, to fill in the specialized elements that Dublin Core lacks. Two, to unify the various metadata formats already in use in the IF community into a single standard. Three, to bridge these older standards to the Dublin Core element set by means of the RDF subclassing system. It is not IFMES's goal to provide every single metadata element needed. RDF, XML, and other namespace-aware languages can freely mix different vocabularies, therefore IFMES does not subclass Dublin Core elements that do not relate to previous Interactive Fiction metadata standards. For these elements, IFMES recommends using the existing Dublin Core vocabulary, to maximize interoperability with other tools and communities."
Conclusions
RQ "Several of the IFMES elements can take multiple values. Finding a standard method of expressing multiple values is tricky. The approved method in RDF is either to repeat the predicate with different objects, or create a container as a child object. However, some RDF parsers don't work well with either of these methods, and many other languages don't allow them at all. XML has a value list format in which the values are separated with spaces, however this precludes spaces from appearing within the values themselves. A few legacy HTML attributes whose content models were never formally defined used commas to separate values that might contain spaces, and a few URI schemes accept multiple values separated by semicolons. The IFMES discussion group continues to examine this problem, and hopes to have a well-defined solution by the time this document reaches Candidate Recommendation status. For the time being IFMES recommends repeating the elements whenever possible, and using a container when that fails (for example, JSON could set the value to an Array). If an implementation simply must concatenate the values into a single string, the recommended separator is a space for URI and numeric types, and a comma followed by a space for text types."
SOW
DC The authors are writers and programmers in the interactive fiction community.