22 August 2001


A Question of Balance

Private Rights and the Public Interest
in Scientific and Technical Databases

National Research Council, 1999


Chapter 3

Access to and Protections for Databases: Existing Policies and Approaches

The escalating drive to enhance legal protection for databases arises primarily from three developments. The first is the evolution of a digital world in which information is an increasingly important commercial commodity whose unauthorized appropriation can be accomplished cheaply and accurately, and the information broadly disseminated. The second is the 1991 U.S. Supreme Court decision in Feist Publications, Inc. v. Rural Telephone Service Co.,1 limiting copyright protection to creative works and denying protection to a "sweat-of-the-brow" database whose composition, even though it may require the investment of effort and resources, is not sufficiently creative in selection and arrangement to qualify for copyright protection. The third is the European Parliament's adoption in 1996 of the Directive on the Legal Protection of Databases (hereinafter, the E.U. Database Directive; the E.U. Directive)2 that requires countries of the European Union to adopt strong property protection for databases (see Appendix D for the full text of the E.U. Directive). The E.U. Directive also stipulates that sui generis protection for databases in Europe will be extended to foreign database rights holders only if their home countries have adopted substantially similar protection.3 These three developments have resulted in a perceived increased vulnerability of databases to misappropriation and to a new European legal regime that has been alleged to place U.S. database rights holders at a competitive disadvantage in Europe.4 It is the last factor that appears to concern private-sector scientific and technical (S&T) database producers and vendors the most, based on input received at the committee's January 1999 workshop. 
Nevertheless, it is important to note that all other laws protecting foreign rights holders in the European Union remain independently applicable to them under the national treatment clause of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement)5 and related conventions.

This chapter briefly describes the law and policy governing U.S. government databases; the existing legal, technical, and market-based measures that are available to protect private-sector databases in the United States; and the new E.U. Database Directive.

Access to U.S. Government-funded Scientific and Technical Data


The U.S. government is the world's largest creator, user, and disseminator of data and information, including the federal records and S&T databases that are considered highly valuable national assets. A basic principle underlying most U.S. information law is that democracy thrives and the economic and social benefits of information are maximized in society by fostering wide diversity in the creation, dissemination, and use of information.6 By extension, to gain the greatest economic and social benefits from government information assets, such information should be made available to all in the most efficient, timely, and equitable ways possible. U.S. laws and policies generally implement this proposition. In direct contrast to those laws that encourage protection of the proprietary rights of private-sector entities, U.S. domestic information policy at the federal level may be summarized as one comprising "a strong freedom of information law, no government copyright, fees limited to recouping the cost of dissemination, and no restrictions on reuse."7

U.S. law expressly forbids federal departments and agencies from claiming copyright in their written works, thereby placing these information resources in the public domain. The 1976 Copyright Act states that "[c]opyright protection under this title is not available for any work of the United States Government."8 The reasons are several. One is the fundamental belief that government copyright of public records is the antithesis of open access whereby an informed citizenry can check official actions and possible abuses. However, other values also are at work. Taxpayers should not have to pay twice for the same information--once for the cost of generating the work, and a second time to obtain it. Also important to avoid is the danger that government could exercise copyright in a manner that would burden free speech (e.g., so as to prevent critics from obtaining particular information at any price). Finally, individuals ought to be able to derive benefit from public goods (such as public S&T data and information) and enjoy improved educational opportunities through increased access to data and information, opportunities that are inherently beneficial in their own right.9 Thus, the position of Congress has been to support the development of secondary markets for government information by individuals and private businesses, and to otherwise encourage the distribution of government information in the public interest.

The U.S. Freedom of Information Act (FOIA)10 and the open records laws of the individual states11 together balance the right of citizens to be informed about government activities and the need to maintain the confidentiality of some government records. Both the national FOIA and state open records laws generally support a policy of broad disclosure by government. For instance, if a database held by a federal agency is determined to be an agency record, the record must be disclosed to any person requesting it unless the record falls within one of nine exceptions contained in the FOIA.12 Exceptions are construed narrowly by the courts so that disclosure is typically favored over non-disclosure. In responding to citizen requests for records, government agencies at most levels are authorized to recover the costs of responding to those requests.

Federal departments and agencies also have affirmative obligations to actively disseminate their information as defined by the provisions of OMB Circular A-130.13 They are particularly encouraged to disseminate raw content on which value-added products can be based and to do so at cost of distribution and through diverse channels, with no imposition of restrictions on the use of the data. The core provisions of OMB Circular A-130 were incorporated into the Paperwork Reduction Act of 1995,14 which additionally encourages agencies to use information technology to provide public access, rather than relying on cumbersome FOIA processes. Given federal agencies' expanding use of World Wide Web servers to meet their internal objectives, as well as to better implement the government's data-sharing policies, the additional cost of disseminating information to the public has become so negligible that many government databases are now made freely available to anyone with the ability to access them over the Internet.15

Open access policies specifically targeted at public and publicly funded S&T data availability and exchange, on agency and interagency program levels as well as internationally, have been adopted in recent years, especially in the area of environmental and earth science research.16 These policies all restate in similar terms the federal policy of full and open availability of data described in Chapter 1.

The Federal Acquisition Regulation (FAR) applies generally to all federal agency contractual relationships with the private sector.17 In subpart 27.4, "Rights in Data and Copyrights," the FAR delineates the respective rights and obligations of the government and the contractor regarding the use, duplication, and disclosure of data produced under contracts with the government.18 As a general proposition, the government acquires unlimited rights in most data first produced in the performance of a contract, while the contractor may receive limited rights in some data.

Article 36(a) of OMB Circular A-110, "Uniform Administrative Requirements for Grants and Agreements with Institutions of Higher Education, Hospitals, and Other Not-for-profit Organizations,"19 states that the recipient of a grant or agreement subject to this circular "may copyright any work that is subject to copyright and was developed, or for which ownership was purchased, under an award. The Federal awarding agency(ies) reserve a royalty-free, nonexclusive and irrevocable right to reproduce, publish or otherwise use the work for Federal purposes, and to authorize others to do so." Article 24(h) states, "Unless Federal awarding agency regulations or the terms and conditions of the award provide otherwise, recipients shall have no obligation to the Federal Government with respect to program income earned from license fees and royalties for copyrighted material, patents, patent applications, trademarks, and inventions produced under an award."

Similarly, the Grant Policy Manual of the National Science Foundation (NSF) specifies in Section 732.1 that the following principles govern the treatment of copyrightable material produced under NSF grants:20

a. NSF normally will acquire only such rights in copyrightable material as are needed to achieve its purposes or to comply with the requirements of any applicable government-wide policy or international agreement.
b. To preserve incentives for private dissemination and development, NSF normally will not restrict or take any part of income earned from copyrightable material except as necessary to comply with the requirements of any applicable government-wide policy or international agreement.
c. In exceptional circumstances, NSF may restrict or eliminate an awardee's control of NSF-supported copyrightable material and of income earned from it, if NSF determines that this would best serve the purposes of a particular program or grant.

Cooperative research and development agreements (CRADAs) provide yet another legal mechanism by which databases or services can be created to meet particular governmental needs.21 The CRADA legislation, however, creates an exception to the Freedom of Information Act. Databases created under a CRADA potentially may be withheld from citizen requests under FOIA.22

State and local governmental entities in the United States also create and maintain records and databases that have substantial value for various segments of the research and educational community. State governments historically have been a primary source of detailed information in the areas of health, welfare, education, labor markets, transportation, the environment, and criminal justice.23 Because communities have a great interest in knowing about themselves and their activities, local governments often produce detailed databases on the characteristics and attributes of physical, social, and human resources in the community that are unavailable from other sources.

The U.S. Copyright Act does not explicitly ban copyright claims in the works of state and local governments, as it does for the works of the U.S. government.24 As such, most state and local governments believe they have the option of asserting copyright in their public records if they choose to do so. Some legal scholars argue that although allowed by law, generally it is unwise economic and social policy for state and local governments to allow government commercialization of public information.25 Other legal scholars argue that claims of copyright by state and local governments in many of their works and databases are illegal.26 Under the patents and copyright clause of the U.S. Constitution, the argument is made that Congress lacks the ability to extend copyright beyond that which is necessary to provide "incentives" to authors to make their works available.27 When state or local government agencies collect information in response to a legislated obligation, it is the public need as defined by the legislative obligation that provides the incentive to gather information or to create a public record. If copyright failed to exist, the information would still be collected. This being the case, copyright provides no incentive for data collection and database production, and the works therefore may not be protected by copyright.28 Yet other legal scholars claim that government commercialization of public information raises significant First Amendment free speech issues.29 One argument is that contractual provisions that ban the reuse or further dissemination of public information or that establish varying fee structures depending on purpose of use might readily be used by government officials for ulterior motives of censorship or manipulation of public information for political purposes.

Regardless of the legal and economic arguments, many local and state agencies have pursued the imposition of copyright in at least some public records and databases, both hard copy and digital.30 These local government authorities perceive the possibility of paying for the creation and maintenance of local government information systems other than through general tax revenues. Restricting access to public records is contrary to the plain letter language of most state open records laws in the United States, and therefore explicit legislation is typically required to allow the restrictions. Those who seek to impose new access restrictions on citizens bear the burden of overcoming the underlying policy arguments on which the existing laws are based, foremost of which are that open access keeps government accountable and that open access to government information, subject to appropriate limitations based on privacy, confidentiality, national security, and other considerations, has far greater long-term economic benefits for a community than does pursuing revenue-generation approaches.

It is noteworthy that the United States has become a world leader in research and technology at a time when its domestic public information laws have been so divergent from those of most other nations. In general, the U.S. legal system allows greater access to and use of government information at the local, state, and national government levels than is allowed in other nations. U.S. law also grants individuals greater leeway to use the work products of others without permission than is often granted by the laws of other nations. The role of the U.S. legal system in supporting full and open access to scientific data for the academic and commercial sectors and the role of U.S. federal funding in defraying the costs of collecting and providing access to scientific data are factors that should not be overlooked when exploring the competitive success of U.S. scientists and businesses.

Existing Protections for Databases in the United States


Currently available legal protections of databases in the United States include copyright, private contracts and licensing, trade secret law, and state unfair competition law. Significant augmentation of the existing legal regime is provided for online databases by various technical means as well as by additional market-based measures.

Legal Protections

Copyright Law

A database can be protected under the Copyright Act as a "compilation," defined as a work that results from the collection and assembly of data that are "selected, coordinated, and arranged" in an original way.31 As the Supreme Court stated in the Feist decision, if the selection or arrangement of the data displays a "modicum of creativity" it is protectable by copyright.32 The term of copyright protection is long--the life of the author plus 70 years, or in the case of a work for hire, the shorter of 95 years from first publication or 120 years from the year of creation.33 An unauthorized reproduction that is not otherwise privileged by the law is illegal, and substantial civil and criminal remedies exist to punish infringers.34
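
The term rules just described reduce to simple arithmetic, which the following minimal sketch illustrates. The function names are hypothetical, and the sketch assumes that an unpublished work for hire is governed only by the 120-year limit; it ignores details such as the exact expiry date within a year and any special rules for older works.

```python
from typing import Optional


def individual_term_end(author_death_year: int) -> int:
    """Year in which protection ends for an individual author: life plus 70 years."""
    return author_death_year + 70


def work_for_hire_term_end(creation_year: int,
                           first_publication_year: Optional[int] = None) -> int:
    """Year in which protection ends for a work for hire.

    The term is the shorter of 95 years from first publication or
    120 years from creation. If the work was never published, only the
    creation-based limit is applied (an assumption of this sketch).
    """
    from_creation = creation_year + 120
    if first_publication_year is None:
        return from_creation
    return min(first_publication_year + 95, from_creation)


# A database compiled in 1999 and first published in 2000:
# min(2000 + 95, 1999 + 120) = 2095.
print(work_for_hire_term_end(1999, 2000))

# A compilation created in 1950 but first published in 1990:
# min(1990 + 95, 1950 + 120) = 2070, so the creation-based limit governs.
print(work_for_hire_term_end(1950, 1990))
```

The second case shows why the "shorter of" wording matters: a long delay between creation and publication shifts the controlling limit from the publication date to the creation date.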

Since the fall of 1998, copyright law has prohibited the manufacture and sale of devices designed primarily to circumvent technologies such as signal scrambling and encryption that are used to protect copyrighted works. Beginning in the fall of 2000, the law will also prohibit the attempt to circumvent such technological protections. The 1998 law, known as the Digital Millennium Copyright Act,35 contains some exemptions for libraries, educational institutions, law enforcement, and research activities, and opens the possibility of additional future exemptions to be made after further study of the statute's operation by the Librarian of Congress. The new statute also prohibits the removal or alteration of "copyright management information" from copyrighted works. "Copyright management information" essentially means any identifying mark, such as the name and address of an author or copyright owner, that is associated with a work.36

The copyright law, however, permits some unauthorized uses that are deemed to be "fair" or that are specifically exempted from infringement in the statute.37 Section 107 of the 1976 Copyright Act states that:

. . . the fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright. In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include--
(1)   the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
(2)   the nature of the copyrighted work;
(3)   the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
(4)   the effect of the use upon the potential market for or value of the copyrighted work.38

A more significant limitation on copyright protection, particularly for databases, is that copyright protects only the manner of expression and does not "extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery" incorporated into a copyrighted work.39 For some databases, the line between protected expression and unprotected facts can be difficult to identify. Generally speaking, however, copyright would most likely protect the selection or arrangement of a database, but not the data as such. Consider, for example, a scientific journal article reporting the results of an experiment, with the data from the experiment reproduced as an appendix in a database qualifying as a compilation. Copyright would protect the narrative description of the experiment, but subsequent researchers could use the findings of the experiment without permission. Moreover, other researchers could extract the data from the appendix.

Unlike copyright law in Europe and in many other countries, U.S. copyright law does not protect works of authorship created by the federal government.40 This asymmetry is significant, since a similar asymmetry with the European Union would exist upon enactment of any new database protection legislation in the United States.

Because of the speculative nature of the outcome of basic scientific research, much science is conducted with the use of public funds by researchers in government and in academia who do not directly depend financially on the economic exploitation of their results. The realities of the scientific process and the limited legal protection for data under copyright law have contributed to the tradition and culture within the scientific community of open sharing of ideas and data discussed in Chapters 1 and 2.

Private Contracts and Licensing

Rights holders of trade secrets may protect the confidentiality of their information by disclosing the information only to those who agree, in a binding contract, to keep it confidential and to refrain from reproducing the information. Where two individuals bargain face to face and one agrees to disclose in return for a promise of confidentiality, a court ordinarily will enforce the terms as it would any other contractual provision. If the information is sold outright to another party--such as when a book is sold to a consumer--contractual restraints on the buyer's right to use or resell the copy usually are not enforced.41 Increasingly, however, information products are not "sold," but rather are licensed with restrictive terms as to use.42 For example, all the commercial database vendors who participated in the committee's January 1999 workshop, as well as those not-for-profits that disseminated their data for a fee, indicated that they relied primarily on site licensing arrangements for disseminating their databases to customers. Because customers typically have the opportunity to read the license terms and conditions in advance, and even to make agreed-upon changes, such terms are ordinarily enforceable.

Database rights holders can easily protect their interests with a confidentiality agreement when distribution of a database is limited to those who directly contract with them. The fact that contract terms are only effective between the contracting parties and not binding on third parties who may get access to the database has been cited as a weakness,43 since many databases must be publicly distributed in order to be commercially viable. In recent years, however, rights holders of some digital databases have marketed their databases to the public using "shrink-wrap" license agreements. These are agreements that are enclosed within the package containing the database (usually a database on a CD-ROM or on other electronic media) and provide notice on the outside of the package that breaking the "shrink-wrap," or entering the electronic gateway if online, constitutes acceptance of the terms within. These terms often include restrictions on distribution of the database to others. The enforceability of these provisions remains uncertain, however. Some legal scholars argue that the shrink-wrap license constitutes a valid contract, while others believe that a contract cannot exist unless the parties have access to the terms prior to paying for the product.44 Similarly, some leading cases have upheld the shrink-wrap license and the terms that restrict use;45 other cases have refused to enforce the terms of a shrink-wrap license.46 Most shrink-wrap licenses permit the licensee to return the product, unused, if the terms are unacceptable. The reality is that most consumers do not read these license terms.

Private transactions are an important method of distributing valuable information and are increasingly the method of choice for providing access to data and information on the Internet. On the one hand, particularly with vulnerable digital information, the right to distribute information with restrictions on use allows original rights holders of databases to capture the economic returns from their initial investment. Moreover, private transactions are flexible in permitting the two parties to tailor their agreement to the mix of their particular interests, as long as they have the opportunity to negotiate the terms. On the other hand, enforcing "contractual" terms imposed through shrink-wrap licenses (and now "click-on" licenses in the online medium), which effectively are imposed on the public at large, may interfere with the balance between private property rights and public-interest access to information.47 A term in a shrink-wrap or click-on license that prohibits what would otherwise be a privileged use of the data might effectively limit scientists' access to or use of the raw materials necessary for their research, contrary to public-interest policies.

Trade Secret Law

State trade secret law protects valuable commercial information that is kept secret by its rights holder from unauthorized access or reproduction by improper means.48 Trade secret doctrines do not require that the information be kept absolutely secret, but the trade secret rights holder must take reasonable steps to maintain the confidentiality of the information. Trade secret law also might be applicable in the absence of any contractual provisions for confidentiality if the parties understood that the disclosure was in fact in confidence. Once information is made public it loses its secrecy and enters the public domain. It is then free for others to use, absent some other form of protection, such as copyright.

Unfair Competition Law in State Common Law

When Congress acts in areas in which it has authority (e.g., copyright, interstate commerce), federal law preempts state law when Congress explicitly so states in the legislation, or when state law would interfere with implementation of the federal law. The copyright system preempts state laws that duplicate or disrupt the protection accorded works of authorship by copyright.49 Thus, states may not grant a general right to database owners to prevent unauthorized reproduction or use of databases that qualify as an original work of authorship, and it is unlikely that they could grant a naked right against reproduction or use to unoriginal databases. However, in some states, a common law cause of action for misappropriation is understood to survive preemption and under limited circumstances may provide protection to some database owners from certain forms of unfair competition.

The doctrine of misappropriation derives from the early U.S. Supreme Court decision in International News Service v. Associated Press.50 In that case, a news wire service appropriated the dispatches of a competing service from published newspapers on the East Coast and published them simultaneously and in direct competition with the originating service on the West Coast. The Supreme Court, while denying that a statutory property right could exist in the news, found that the unauthorized appropriation in this case was prohibited because compiling the data gave rise to a "quasi-property right" and also because the unauthorized appropriation directly undermined the investment in news gathering of the originating service. Courts have usually been reluctant to apply the decision beyond the facts of the case, and the Restatement (Third) of Unfair Competition, section 38 (1995) described the misappropriation doctrine as lacking a coherent application.51

The most recent discussion of the misappropriation doctrine occurred in National Basketball Ass'n v. Motorola, Inc.52 Although the court recognized a limited scope for the doctrine--one confined to situations in which time-sensitive data were appropriated and used in direct competition with the originator--it refused the request of a professional basketball league to prohibit the unauthorized taking and distribution of the scores of its currently played games to paid subscribers of a paging service.

While database rights holders can and do assert protection under the misappropriation doctrine, the reluctance of the courts to apply this doctrine and the likelihood that its broad application would be preempted by the Copyright Act make the misappropriation doctrine of questionable value in protecting databases beyond those with extremely time-sensitive value, such as real-time stock-price quotations. However, nothing prevents Congress from developing a minimalist form of statutory protection that builds on this foundation.

Technological Protections

The danger of database misappropriation can be mitigated with increasing efficiency by technologies that help enforce the terms of licensing contracts, or that enable the rights holder to keep the database as a trade secret while also providing access to subsets of data at arm's length.53 A number of technological innovations have been developed to provide various forms of security, privacy protection, and intellectual property management. Table 3.1 provides a summary of some of these approaches. No form of computer protection is perfect, and no method will likely prevent copying of small amounts of data. Moreover, it is almost certain that every technological security method can eventually be countered through the use of other technological advances. The technological approaches that currently are available, however, can hinder or prevent, to varying degrees of efficiency, the wholesale copying and redistribution of databases without compensation to their rightful owners. These approaches are reviewed briefly in the discussion below.


A powerful and frequently cited technology for computer security is encryption, the encoding of data to make them unreadable to those who do not have the key for deciphering them. Encryption separates those who should have access from those who should not, and the level of security it provides increases with key length. Encryption is applicable to practically any kind of digital information. It is the technology of choice for protecting data in storage and during transmission over an insecure channel.

But encrypted data are only as secure as the key, and there are different approaches to the problem of protecting the key. Given the data and the key, a knowledgeable user can decrypt the data and then distribute them, or distribute the encrypted data together with the key. If the protections in the system are insufficient, there are various ways in which an attacker could obtain the key or gain access to an unencrypted form of data (e.g., digital music or video while it is being played back, or text or data while they are being displayed). Thus, although encryption is an important element of any modern computer security system, it often must be combined with other elements in a security architecture to achieve the degree of protection desired for the digital data or information product.
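
The dependence of encrypted data on the secrecy of the key can be illustrated with a deliberately simple sketch. It uses a one-time-pad style XOR cipher built only from the Python standard library; this is a teaching toy chosen for brevity, not the scheme any particular database vendor uses, and the record contents and function names are hypothetical.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (key must be as long as data)."""
    if len(key) != len(data):
        raise ValueError("key and data must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))


def keyspace_size(key_bits: int) -> int:
    """Number of possible keys of the given length; it doubles with every added bit."""
    return 2 ** key_bits


record = b"station=42;temp_c=17.3"          # hypothetical database record
key = secrets.token_bytes(len(record))      # random key, kept secret by the rights holder
ciphertext = xor_bytes(record, key)         # unreadable without the key

# Anyone who holds both the ciphertext and the key recovers the data exactly,
# and could then redistribute either the plaintext or the key itself.
recovered = xor_bytes(ciphertext, key)

# A key that differs in even one bit does not reproduce the record.
wrong_key = bytes([key[0] ^ 1]) + key[1:]
garbled = xor_bytes(ciphertext, wrong_key)

print(recovered == record, garbled == record)
```

The sketch makes the paragraph's point concrete: the cipher does nothing to stop an authorized key holder from passing along the decrypted record, which is why encryption is usually combined with other controls.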


The term "watermark" initially signified a special mark made in paper during its manufacture. The mark, which becomes visible when the paper bearing it is held up to a light, is taken as indicating an original. The term now covers a wide range of technologies for embedding information in digital files and rendered works, including text, pictures, and audio. As a technique for intellectual property protection, watermarks carry information that identifies a work or provides a means of tracing its purchaser or user. Watermarks can be visible or hidden. Hidden watermarks are designed to avoid interfering with the use of the data. For example, in digital music, watermarks can be encoded in such a way that they are not detectable to the human ear when the music is played. A watermark could be hidden in a database as extra (but unused) elements of data that would not interfere with information processing systems.

Watermarks do not prevent copying but could potentially provide a means for tracing the source of an unauthorized copy. This trace-back capability provided by watermarks is not necessarily foolproof; those who would misappropriate data or otherwise infringe on rights in a database might be able to write programs for tampering with or removing the watermarks. Such tampering may well be illegal under the recent Digital Millennium Copyright Act, however.
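
One way to picture a hidden database watermark of the kind described above is as a handful of extra records that differ for each licensee. The sketch below is a hypothetical illustration, not a method described in the report: the record layout, the PRODUCER_SECRET value, and the function names are all assumptions, and the decoy identifiers are derived with a keyed hash (HMAC) so that only the producer can regenerate them.

```python
import hashlib
import hmac

# Hypothetical secret held only by the database producer (an assumption).
PRODUCER_SECRET = b"example-secret-do-not-reuse"


def decoy_records(licensee_id, count=3):
    """Derive extra, unused rows that are unique to one licensee."""
    rows = []
    for i in range(count):
        message = f"{licensee_id}:{i}".encode()
        tag = hmac.new(PRODUCER_SECRET, message, hashlib.sha256).hexdigest()[:12]
        rows.append({"id": f"R-{tag}", "value": 0.0})
    return rows


def issue_copy(records, licensee_id):
    """Return the licensee's copy: the real records plus that licensee's decoys."""
    return list(records) + decoy_records(licensee_id)


def trace_source(leaked, licensee_ids):
    """Return the licensees whose decoy rows all appear in a leaked copy."""
    leaked_ids = {row["id"] for row in leaked}
    return [lic for lic in licensee_ids
            if all(d["id"] in leaked_ids for d in decoy_records(lic))]
```

As the text notes, such a scheme does not prevent copying, and a copier who identifies and strips the decoy rows defeats the trace; in this sketch trace_source would then return an empty list.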

Online Database Access Controls

Several kinds of technology have been developed to control or limit access to databases, particularly those available online. One of the simplest is an online regulator, which limits the quantity of information that can be downloaded by a given individual or site.54 The technology provides an impediment to automatic data-mining programs that would acquire a database's contents by merging the results of a large set of individual queries. A second approach to online database access control marks the data according to different levels and categories and regulates the availability of the information to authorized parties. This technology is relevant in business and intelligence applications, where access to information is regulated according to "need to know" tiers or categories of access.
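
The two approaches just described can be sketched in a few lines. This is a simplified illustration under stated assumptions rather than a description of any vendor's system: the class name DownloadRegulator, the per-user quota and time window, and the numeric "level" field on each record are all hypothetical.

```python
import time
from collections import defaultdict, deque


class DownloadRegulator:
    """Caps the number of records one user may retrieve per time window."""

    def __init__(self, max_records, window_seconds, clock=time.monotonic):
        self.max_records = max_records
        self.window_seconds = window_seconds
        self.clock = clock
        self._history = defaultdict(deque)  # user -> deque of (timestamp, count)

    def _used(self, user):
        """Records retrieved by this user inside the current window."""
        now = self.clock()
        history = self._history[user]
        # Drop downloads that have aged out of the window.
        while history and now - history[0][0] >= self.window_seconds:
            history.popleft()
        return sum(count for _, count in history)

    def allow(self, user, requested):
        """Record and permit the request only if it fits the remaining quota."""
        if self._used(user) + requested > self.max_records:
            return False
        self._history[user].append((self.clock(), requested))
        return True


def visible_records(records, clearance):
    """Keep only records whose access level does not exceed the user's clearance."""
    return [r for r in records if r["level"] <= clearance]
```

A regulator of this kind slows automated harvesting by a single account but, as note 54 acknowledges, does not stop a determined individual who downloads under multiple identities.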

Trusted Systems

"Trusted" systems are those that can be relied on to obey certain rules for distributing information. In the context of intellectual property protection, the rules take the form of a digital contract between the information provider and user. The contract spells out fees and other terms and conditions of use, such as the period of time over which the information can be used and whether the user is allowed to print out or make copies of the information for distribution or sale.
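
A digital contract of this kind can be thought of as a small data structure that the trusted system consults before honoring a request. The following sketch is hypothetical: the field names, the three actions ("view", "print", "copy"), and the rule that unknown actions are refused are assumptions made for illustration and do not describe any deployed system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DigitalContract:
    """Hypothetical usage terms agreed between information provider and user."""
    fee_paid: bool
    valid_from: date
    valid_until: date
    may_print: bool = False
    may_copy: bool = False


def is_permitted(contract, action, today):
    """Decide whether the contract allows the action on the given day."""
    if not contract.fee_paid:
        return False
    if not (contract.valid_from <= today <= contract.valid_until):
        return False
    if action == "view":
        return True
    if action == "print":
        return contract.may_print
    if action == "copy":
        return contract.may_copy
    return False  # unknown actions are refused by default
```

The check itself is simple; the difficulty discussed in the text lies in ensuring that the system running it cannot be tampered with.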

Trusted systems with secure hardware support have been used in military and intelligence organizations for several years and currently are in limited use in digital and networked publishing. For example, they support pay-per-view and subscription viewing of satellite television services. Software and hardware for distributing music via trusted systems are in early prototyping and testing stages, and systems supporting digital network document and software publishing are in limited use on personal computers and in "electronic books." Although trusted systems for database access could be developed, the technologies currently available for that purpose possess minimal security and control measures. The systems seldom have digital contract provisions specifying more than the duration of the subscription and perhaps the number of simultaneous users.

Several obstacles prevent effective, widespread deployment of trusted systems for databases. First, the legal standing of digital contracts enforced by machines has many practical limits with respect to enforceability and liability. Second, appropriate public-key infrastructure to support authentication and authorization is not yet widely used or available. Third, the only commercially viable approach to trusted systems for databases depends exclusively on software for implementing security, but software-only approaches are vulnerable to tampering and to widespread, catastrophic failure caused, for example, by computer viruses. Fourth, it can be both very costly and time consuming to attain the levels of confidence necessary to achieve a "trusted system," which conflicts with the rapid pace of the industry and time-to-market considerations.

Summary of Technological Protections

Several technological measures have been developed that can be deployed for management of intellectual property rights in databases. The main goal of protection is to prevent widescale unauthorized redistribution of databases without compensation to database rights holders. Although no totally effective technological solution has yet been developed to protect intellectual property comprehensively, several measures are already in use with increasingly satisfactory results. A potentially effective technological approach appears to be the use of trusted systems, with digital contracts that specify appropriate terms and conditions. These systems would use encryption technology for protecting databases during storage or communication, watermark technologies to enable tracing the source of pirated copies when such theft occurs, and database access controls and query governors to flexibly control database access.

Current limitations affecting technology available for protection of ownership rights in databases include absence of a widespread public-key infrastructure for encryption, legal uncertainties about the enforceability of digital contracts, and the relatively low level of security that is possible with software-only security systems. In addition, despite advances in technological measures for protecting digital databases, human fallibility--or overt malicious action--will continue to result in system security breaches for the foreseeable future.

Market-based Database Protection Through Updating and Customizing

There are various business practices that database vendors can use to protect their investments.55 One type of protection for databases arises from their commercial perishability. Many data become rapidly obsolete; consequently, databases are updated frequently. For example, meteorological data and stock market prices are provided in real time on a continuous basis. Some biotechnology databases are updated every night. Most commercially viable databases are updated at least annually. Since copying intrinsically introduces a lag, updating provides some level of protection against piracy, because the copier, like the database originator, must provide updates, thereby reducing the cost advantage of copying.56

Frequent updates can constrain the market price for a database because the most recent update competes in the market with its previous versions, which may be almost as useful. Database pricing will almost certainly permit recovering the cost of updating, but the cost of the original database might not be recoverable. Collaborations among the government, not-for-profit, and commercial sectors, however, can overcome some of these problems. For example, as discussed at the January 1999 workshop, commercial meteorological, geographic, and biotechnology database producers utilizing the original data made available freely under the federal government's full and open access mandate have successfully marketed and disseminated their value-added databases. The joint effort of the original public-data collector and commercial database value-adder and vendor accomplished the twin goals of enhanced data quality and wide dissemination at a reasonable price. Most important, in the context of this report, this broad distribution of data was achieved without statutory database protection.

Another market-based approach used by database producers and vendors to limit the potential for misappropriation, while meeting the needs of their customers, is production of customized or targeted versions of their databases. Different versions of the same database tailored to different market segments can appeal to a broader swath of the market while making it more difficult for an unfair competitor to steal all versions and undermine the customer base.57

Finally, database producers or vendors who have a well-established reputation in the market will have an advantage over most competitors who would copy their products. Customers are frequently willing to pay more to vendors who are reputed to sell quality databases and data products.

Tipping the Balance: The European Union's Database Directive


Other nations have legal and other protective measures for databases similar to those already in place in the United States, but a discussion of foreign law is beyond the scope of this study. There is, however, one important new legal development--the aforementioned E.U. Database Directive--that is particularly relevant to the present discussion, since it has been cited by commentators58 as well as by congressional legislators59 as a major driver for the adoption of a similar legal regime in the United States (see Appendix D for the full text of the directive).

The E.U. Database Directive requires that each member country of the European Union (and affiliated states) adopt legislation protecting databases.60 The E.U. Directive imposes a uniform copyright provision that protects only the "selection or arrangement" of the contents of a database that is the "author's own intellectual creation."61 Countries are permitted to provide for privileged unauthorized uses in accordance with the Berne Convention for the Protection of Literary and Artistic Works.62 The specific privilege recommended for not-for-profit educational or scientific uses is very narrowly limited, however, "for the sole purposes of illustration for teaching or scientific research as long as the source is indicated and to the extent justified by the non-commercial purpose to be achieved."63

The E.U. Directive also provides for an independent right to protect databases that are not protectable by copyright. The right attaches to any database that is a product of substantial investment and prohibits any extraction or reutilization of a substantial part of a protected database--judged qualitatively or quantitatively--without permission of the rights holder.64 The E.U. Directive provides that a noncopyrightable database is protected for 15 years from its date of completion.65 "Lawful users" of databases that have been made available to the public may extract or use insubstantial parts of the database for any purpose and may make other such use that does not conflict with the "normal exploitation of the database or unreasonably prejudice the legitimate interests of the maker of the database."66 Member states may, but are not required to, incorporate some very narrow and specific exceptions, including one for the purposes of illustration for teaching or scientific research that is more limited than the one provided for under copyright.67

Most significant, from the U.S. perspective, the E.U. Directive provides that member states should make the protection applicable only to databases owned by nationals or habitual residents of a member state or to databases owned by nationals of a third country only if the third country offers comparable protection to databases produced by nationals of a member state.68

Although preliminary drafts of the E.U. Database Directive were founded on an unfair competition law model of database protection, the final version was based on a strong property rights model. The initial right to exclude extraction or use applies even when the unauthorized use is not a competitive threat to the protected database. Only express privileged uses can escape potential liability. The privilege for scientific research appears to apply only for non-commercial purposes, and this is further qualified by an ambiguous limitation for the purposes of illustration. Since most scientific research has at least the potential for commercial application, including commercial publication, and is not simply for "illustration," the privilege may turn out to be a very narrow one, indeed, even if it is adopted by a member state. Similarly, the "insubstantial part" exception is undermined by the qualitative impact test. Moreover, the term of the right is 15 years, and potentially much longer, a very long period given the commercial half-life of many kinds of scientific data.69

When combined with unrestricted online licensing rights, strong database protection legislation such as the E.U. Directive subjects a research user of, say, a chemical handbook, to a starkly different situation than that experienced under traditional copyright law under the print paradigm. Table 3.2 provides a summary comparison of research user rights under these two legal regimes.70 The net result of unrestricted licensing coupled with strong statutory database protection is that the most borderline of all the objects of protection under intellectual property law--raw or factual data, whether S&T or any other--paradoxically receives the strongest scope of protection available from any intellectual property regime except, perhaps, patent law.71 The committee believes that the adoption of a law such as the E.U. Directive, either in the United States or internationally, would retard the advancement of science, the growth of knowledge, and opportunities for innovation.


1 499 U.S. 340 (1991). It is important to note, however, that Feist did not "overturn" the "sweat-of-the-brow" doctrine under copyright, which Congress had actually done already under the Copyright Act of 1976. Moreover, the sweat-of-the-brow doctrine under state law was never a prevailing legal approach.

2 Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the Legal Protection of Databases, 1996 O.J. (L77) 20. The E.U. Database Directive is reprinted as Appendix D of this report.

3 E.U. Database Directive (1996), note 2, Recital 56.

4 See testimony of Henry Horbaczewski, Reed Elsevier, Inc., on behalf of the Coalition Against Database Piracy at the June 15, 1999, hearing on H.R. 1858, the Consumer and Investor Access to Information Act of 1999, before the House Commerce Subcommittee on Telecommunications, Trade, and Consumer Protection, U.S. House of Representatives, U.S. Congress, Washington, D.C.

5 See Final Act Embodying the Results of the Uruguay Round of Multilateral Trade Negotiations, done at Marrakesh, Morocco, April 15, 1994, reprinted in The Results of the Uruguay Round of Multilateral Trade Negotiations: The Legal Texts 2-3 (GATT Secretariat ed., 1994); Marrakesh Agreement Establishing the World Trade Organization, Annex 1C: Agreement on Trade-Related Aspects of Intellectual Property Rights, Apr. 15, 1994. The TRIPS Agreement holds all member states of the World Trade Organization to a common set of intellectual property norms.

6 44 U.S.C., section 3506(d)(1)(A) (Supp. 1995).

7 Peter N. Weiss and Peter Backlund (1997), "International Information Policy in Conflict: Open and Unrestricted Access versus Government Commercialization," in Borders in Cyberspace: Information Policy and the Global Information Infrastructure, Brian Kahin and Charles Nesson, eds., MIT Press, Cambridge, MA.

8 17 U.S.C., section 105.

9 U.S. Congress, Office of Technology Assessment (1986), Intellectual Property Rights in an Age of Electronics and Information, U.S. Government Printing Office, Washington, D.C.

10 5 U.S.C., section 552.

11 R. Daugherty, G. Leslie, and L. Reis, eds. (1997), "Tapping Officials' Secrets: A State Open Government Compendium," Reporters' Committee for Freedom of the Press, Arlington, VA, available online at <>.

12 The nine exceptions are set forth in 5 U.S.C., section 552(b).

13 Office of Management and Budget (1993), Circular A-130, "Management of Federal Information Resources," U.S. Government Printing Office, Washington, D.C.

14 Paperwork Reduction Act of 1995, P.L. No. 104-13, 109 Stat. 163 (May 22, 1995), 44 U.S.C. Chapter 35.

15 For example, federal legislation and court decisions generally may be accessed online at <>, while many federal geographic data sets may be accessed online at <>. The databases being made available by federal agencies typically may be traced through their official Web sites indexed online at <>.

16 For a comprehensive collection of such policies in the environmental data area, see Interagency Data Management Working Group of the U.S. Global Change Research Program (1999), Data Access Policy Actions of Importance to Global Environmental Change Data Users, U.S. Global Change Research Program, Washington, D.C. See also, the Data Policies portion of the U.S. Global Change Research Program's Global Change Data and Information System Web site online at <>.

17 48 CFR, Chapter 1.

18 48 CFR, at subpart 27.4 on "Rights in Data and Copyrights."

19 Office of Management and Budget (1997), Circular A-110, "Uniform Administrative Requirements for Grants and Agreements with Institutions of Higher Education, Hospitals, and Other Not-for-profit Organizations," revised November 19, 1993; as further amended August 29, 1997.

20 National Science Foundation (1995), Grant Policy Manual, NSF 95-26, Arlington, VA, available online at <>.

21 15 U.S.C., section 3710a.

22 See DeLorme Publishing Company, Inc. v. National Oceanic and Atmospheric Administration, 917 F. Supp. 867 (DC Maine 1996), upholding 15 U.S.C. 3710a as a legislative exception to FOIA.

23 Weiss and Backlund (1997), note 7, p. 304.

24 17 U.S.C., section 105.

25 In support of the general proposition for all levels of government, see L. Ray Patterson and Craig Joyce (1989), "Monopolizing the Law: The Scope of Copyright Protection for Law Reports and Statutory Compilations," UCLA Law Review, Vol. 36, p. 719; J.H. Reichman and Pamela Samuelson (1997), "Intellectual Property Rights in Data?" Vanderbilt Law Review, Vol. 50, p. 51; and J. Littman (1992), "After Feist," U. Dayton Law Review, Vol. 17, p. 607. For a state-level statement of policy in general accord see Minnesota Government Information Access Council, Digital Democracy: Citizens' Guide for Government Policy in the Information Age, available online at <>.

26 H. Perritt (1996), Law and the Information Superhighway, John Wiley & Sons, New York, p. 484.

27 United States Constitution, Article I, Section 8, clause 8.

28 See generally the discussion in Perritt (1996), note 26, pp. 482-487.

29 See Henry Perritt, Jr. (1996), Section 11.10 First Amendment Role, Law and the Information Superhighway, Wiley Law Publications, New York, pp. 489-491; Philip H. Miller (1991), "Life After Feist: Facts, the First Amendment, and Copyright Status of Automated Databases," Fordham L. Rev., Vol. 60, pp. 507, 509; and Michael J. Haungs (1990), "Copyright of Factual Compilations: Public Policy and the First Amendment," Colum. J. Law & Soc. Probs., Vol. 23, pp. 347, 364.

30 Iver Petersen (1997), "Public Information, Business Rates: State Agencies Turn Data Base Records Into Cash Cows," New York Times, July 14, p. D1; For What It's Worth: A Guide to Valuing and Pricing Local Government Information (1996), Public Technology, Inc. Press, Washington, D.C. (Note: Public Technology, Inc. is a not-for-profit technology organization of the National League of Cities, the National Association of Counties, and the International City/County Management Association.) For a survey of local government policies relating to the distribution of digital geographic information, see H.J. Onsrud, J.P. Johnson, and J. Winnecki (1996), "GIS Dissemination Policy: Two Surveys and a Suggested Approach," Journal of Urban and Regional Information Systems, Vol. 8, No. 2, pp. 8-23.

31 17 U.S.C., section 101.

32 Feist, note 1.

33 17 U.S.C., section 302.

34 17 U.S.C., sections 501-512.

35 Digital Millennium Copyright Act, P.L. 105-304 (October 28, 1998), U.S. Congress, Washington, D.C.

36 17 U.S.C., sections 1201 and 1202.

37 17 U.S.C., section 107.

38 17 U.S.C., section 107.

39 17 U.S.C., section 102(b).

40 17 U.S.C., section 105.

41 The "first sale" doctrine of copyright law, 17 U.S.C., section 109, specifically authorizes the rights holder of a copy to "sell or otherwise dispose of" the copy. See also Bobbs-Merrill v. Straus, 210 U.S. 339 (1908).

42 For a discussion of the emerging legal issues pertaining to online database and information licensing, see the special issue on licensing of information and proposed changes to Article 2B of the Uniform Commercial Code (UCC) in Berkeley Technology Law Journal (1998), Vol. 13, No. 3, and in California Law Review (1999), Vol. 87, No. 1. Both present the results of a symposium, "Intellectual Property and Contract Law for the Information Age," held at the University of California, Berkeley, in April 1998. The symposium Web site is at <>. In April 1999, however, the efforts to amend UCC Article 2B were terminated and the proposed revisions to state law in this area were proposed instead as the "Uniform Computer Information Transactions Act." For comprehensive background information on the evolution of this issue, see generally "A Guide to the Proposed Uniform Computer Information Transactions Act" online at <>.

43 See U.S. Copyright Office (1997), Report on Legal Protection for Databases, U.S. Congress, Washington, D.C., available online at <>.

44 See generally, Mark Lemley (1995), "Intellectual Property & Shrinkwrap Licenses," USC L. Rev., Vol. 68, p. 1269.

45 ProCD v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996).

46 Vault Corp. v. Quaid Software Ltd., 655 F. Supp. 750 (1987).

47 See J.H. Reichman and Jonathan Franklin (1999), "Privately Legislated Intellectual Property Rights: Reconciling Freedom of Contract with Public Good Uses of Information," U. Penn. Law Review, Vol. 147, p. 875.

48 See generally, Restatement (Third) of Unfair Competition, sections 39-45 (1995).

49 Restatement (Third) of Unfair Competition, section 301 (1995). Cf. Sears, Roebuck & Co. v. Stiffel Co., 376 U.S. 225 (1964); Compco Corp. v. Day-Brite Lighting, Inc., 376 U.S. 234 (1964); and Bonito Boats Inc. v. Thunder Craft Boats, Inc. 489 U.S. 141 (1989).

50 248 U.S. 215 (1918).

51 But see Goldstein et al. v. California, 412 U.S. 546 (1973) allowing state misappropriation protection of noncopyrightable sound recordings.

52 105 F.3d 841 (2d Cir. 1997).

53 The committee did not focus extensively on the increasingly important area of technological protections for digital information, because a concurrent NRC report is examining this issue in depth. See Computer Science and Telecommunications Board, National Research Council (2000), The Digital Dilemma: Intellectual Property in the Information Age, National Academy Press, Washington, D.C., in press. For additional information on these technologies, see Mark Stefik and Teresa Lunt, "Overview of Technologies for Protecting and Misappropriating Digital Intellectual Property Rights: The Current Situation and Future Prospects," Chapter 5 in the committee's online Proceedings. See National Research Council (1999), Proceedings of the Workshop on Promoting Access to Scientific and Technical Data for the Public Interest: An Assessment of Policy Options, National Academy Press, Washington, D.C., <>. Also see Berkeley Technology Law Journal, Vol. 13, No. 3, Fall 1998 (issue dedicated to intellectual property and contract law); I. Trotter Hardy (1998), Project Looking Forward: Sketching the Future of Copyright in a Networked World, report prepared for the U.S. Copyright Office; Mark Stefik and Alexander Silverman (1997), "The Bit and the Pendulum: Balancing the Interests of Stakeholders in Digital Publishing," American Programmer, September, pp. 18-35 (also published in The Computer Lawyer, Vol. 16, No. 1, pp. 1-15, January 1999); Brian Kahin and Kate Arms, eds. (1996), Proceedings: Electronic Commerce for Content, Forum on Technology-Based Intellectual Property Management, Interactive Multimedia Association, Annapolis, MD; Bruce Schneier (1996), Applied Cryptography, second edition, John Wiley & Sons, New York; and Lars Lyberg, ed. (1993), Journal of Official Statistics, Special Issue on Confidentiality and Data Access, Vol. 9, No. 2, Statistics Sweden.

54 For example, several of the commercial and not-for-profit database vendors at the committee's January 1999 workshop noted that this online protection strategy, in conjunction with other available technical measures and well-structured licensing terms, provided satisfactory protection for their businesses, despite the possibility that a determined individual could access and download from a database under multiple identities, or collude with others to do so, in order to reconstruct illegally the entire database.

55 See Computer Science and Telecommunications Board (2000), The Digital Dilemma, note 53, in press.

56 See Stephen M. Maurer in Appendix C of the committee's online Proceedings, note 53.

57 See generally, Carl Shapiro and Hal R. Varian (1998), "Versioning: The Smart Way to Sell Information," Harvard Business Review, Nov.-Dec., pp. 106-114.

58 See Reichman and Samuelson (1997), note 25.

59 See statement by Senator Orrin Hatch on Database Antipiracy Legislation, Cong. Rec., Vol. 106, S. 316 (Jan. 19, 1999).

60 E.U. Database Directive, note 2, Article 16. The directive required all member states to comply with its requirements by January 1, 1998. Only a few had done so by that date, and not all countries had complied as of September 1999.

61 Id., Article 3(1).

62 Id., Article 6.

63 Id., Article 6(2)(b).

64 Id., Article 7.

65 Id., Article 10. Although the nominal term of protection is limited to 15 years, Article 10(3) has the effect of extending protection in perpetuity to databases that continue to be updated or revised pursuant to a "substantial new investment, evaluated qualitatively or quantitatively."

66 Id., Article 8(2).

67 Id., Article 9(b).

68 Id., Recital 56 and Article 11.

69 See the discussion under "Term of Protection."

70 See J.H. Reichman and Paul F. Uhlir (1999), "Database Protection at the Crossroads: Recent Developments and Their Impact on Science and Technology," Berkeley Technology Law Journal, Vol. 14, No. 2, pp. 799-821.

71 Reichman and Samuelson (1997), note 25, p. 94.

Chapter 4

Assessment and Recommendations

Our nation has a vibrant and demonstrably productive community of scientific and technical (S&T) database creators, disseminators, and users that has led the world. Advances in computing and communication technologies make S&T databases and the facts they contain increasingly valuable for producing new discoveries and for accelerating the growth of knowledge and the pace of innovation. The same technologies that facilitate the effective production, dissemination, and use of data, however, can also expedite their unauthorized dissemination and use, with the potential effect of undermining incentives to create new databases, facilitating unfair competition and wholesale piracy, and in the most extreme cases, exposing the original database rights holder to market failure.

As Chapter 3 points out, the current efforts to enact statutory federal database protection in the United States appear to be stimulated by three principal factors: (1) the possibility for rapid and complete database copying with the potential for instantaneous broad dissemination; (2) the gap in U.S. law created by the Feist1 decision, which invalidated copyright protection on the basis of investment and effort (i.e., "sweat-of-the-brow") alone; and (3) the E.U. Database Directive, which requires other nations to pass a similar law in order for their citizens to enjoy the E.U. Directive's protections in Europe, thereby providing a potentially unfair advantage to European competitors of the U.S. private sector.2

The committee believes, however, that the need for additional statutory protection has not been sufficiently substantiated. The high level of activity in the production and use of digital S&T (and other) databases in the United States serves as prima facie evidence that threats of misappropriation do not constitute a crisis. Nor do the existing legal, technical, and market-based measures leave proprietary databases in a chronic state of underprotection. The almost universal use of licensing, rather than sale, of online databases and other digital information, coupled with technological enforcement measures, on balance potentially provides much stronger protections to the licensors vis-à-vis their customers than they enjoyed prior to Feist and under the print media copyright regime (see Table 3.2 in Chapter 3). While some of the current law providing protection to database rights holders remains uncertain in terms of scope of applicability, the trend in recent years has been to broaden, rather than narrow, applicable intellectual property protections.

Moreover, strong statutory protection of databases would have significant negative impacts on access to and use of S&T databases for not-for-profit research and other public-interest uses. Nevertheless, although the committee opposes the creation of any strong new rights in compilations of factual information, it recognizes that limited new federal legal protection against wholesale misappropriation of databases may be appropriate. In particular, a balanced alternative to the highly protectionistic E.U. Database Directive could be achieved in a properly scoped and focused new U.S. law, one that might serve as a model for an international treaty in this area.

In this chapter, the committee examines several legislative options and related government activities, and recommends a number of legislative principles and policy actions to help inform the current debate. The chapter concludes with a recommendation directed specifically to the not-for-profit S&T community.

Assessment of Legislative Options, with Recommendations on Guiding Principles


The committee assessed and compared three separate proposals for increased protection of private-sector databases in the United States that were placed in the Congressional Record at the beginning of the 106th Congress.3 During the 105th Congress, the House of Representatives twice approved a measure establishing a specific statutory scheme for the protection of databases, as H.R. 2652,4 and as Title V of H.R. 2281,5 which was substantially the same as H.R. 2652.6 Both H.R. 2652 and H.R. 2281 were adopted on the suspension calendar by the House despite significant opposition from an array of scientific, educational, library, and consumer public-interest organizations and institutions, as well as from a number of commercial publishers and information technology services companies.7 These House bills applied both to databases that qualified for copyright protection and to noncopyrightable "sweat-of-the-brow" databases that did not.

The proposed legislation drew the concerned attention of the not-for-profit communities because expanding private property rights in factual databases could interfere with scientific progress and other public-interest uses of data.8 At the same time, some private-sector firms believe that their databases are vulnerable to misappropriation due to a gap in the law.9 Other reactions from the private sector included the perception that the proposal adopted by the House placed too many impediments to transformative uses of existing databases for commercial purposes.10 Perhaps most significant, the Administration presented its own consensus critique of the House bills on August 4, 1998,11 and the Department of Justice12 and the Federal Trade Commission13 issued legal memoranda outlining their concerns about the legislation's constitutionality and anticompetitive effects, respectively.

Soon after the approval of Title V of H.R. 2281 by the House in July 1998, Senator Orrin Hatch (R-UT), chairman of the Senate Committee on the Judiciary, initiated negotiations among the various interests, which continued from early August until early October.14 While substantial progress was made in this process and the needs of the science, education, and library communities were directly acknowledged, a consensus was not achieved before the 105th Congress adjourned in October 1998.15

Shortly after the 106th Congress convened in January 1999, Congressman Howard Coble (R-NC), chairman of the Subcommittee on Courts and Intellectual Property of the House Committee on the Judiciary, reintroduced, as H.R. 354,16 the proposal that had twice passed the House in the previous session. H.R. 354 included two changes to respond to concerns of the scientific community and other critics of the original legislative proposal.17 Thereafter, Senator Hatch inserted in the Congressional Record two other proposals--the first drafted by a coalition of commercial and not-for-profit interests (hereinafter, the Coalition Proposal)18 seeking much more limited protection than H.R. 354, and the second a draft bill that had emerged at the end of the 1998 negotiations sponsored by Senator Hatch (hereinafter, the Senate Discussion Draft).19

In the rest of this section the committee discusses several of the most important provisions of these three proposals and evaluates them in terms of their potential effects on access to and use of S&T data by public-interest users. In doing so, the committee recognizes that these proposals have changed and will continue to change before any one of them is considered for final adoption. Nonetheless, they serve as models for the types of issues that arise from the perspective of the research and education communities confronted with the prospect of legislative changes that would affect access to data.20

As noted in Chapter 1, because of the complex web of interdependent relationships among public-sector and private-sector database producers, disseminators, and users (see Table 1.1 in Chapter 1 for an indication), any action to increase the rights of persons in one category is likely to compromise the rights of the persons in the other categories, with potentially far-reaching negative consequences. A principal concern of the committee, therefore, is that the development of any new database protection measures aimed at protecting private-sector investments take into account the need to promote access to and subsequent use of S&T data and databases not only by the not-for-profit sector, but by commercial producers of derivative databases as well. Of course, it is in the common interest of both database rights holders and users--and of society generally--to achieve a workable balance among the respective interests so that all legitimate rights remain reasonably protected. Therefore, as a general guiding principle, the committee recommends that any new federal protection of databases should balance the costs and benefits of the proposed changes for both database rights holders and users.

The Standard of Harm

The key provision of the three legislative proposals defines the nature of protection accorded a database rights holder and establishes the standard of harm against which a defendant's liability is to be judged. As introduced in January 1999, H.R. 354 prohibited the "extraction or use" of a substantial part of a database if it results in "harm to the actual or potential market" for any product or service incorporating the database.21 A "potential market" includes any market the database rights holder "has current and demonstrable plans to exploit" or a market that is "commonly exploited by persons offering similar products or services." The Senate Discussion Draft narrowed the protection of actual markets to those markets commonly exploited by persons offering similar products.22

The Coalition Proposal took a different approach, prohibiting only the "duplication of another's database [and inclusion of those records in] . . . a database that competes with the original."23 To compete with the original database, the duplicate must be substantially identical to the original, must be shown to displace substantial sales or licenses of the original, and must be offered for sale or digitally distributed in such a manner as to "significantly diminish the incentive to invest" in developing the original database. The latter requirement may be interpreted as threatening the opportunity to recover a reasonable return on the investment in collecting or organizing the original database.

None of the three legislative proposals purported to create broad property rights in the original database, as the E.U. Directive does. However, by expanding protection to "potential markets," H.R. 354 would allow the rights holder to foreclose markets or uses beyond the rights holder's actual use. This has the effect of granting exclusive rights to the original database rights holder in uses unknown at the time of the database's creation. The limitation of the H.R. 354 language stating "current demonstrable plans to exploit" is unclear because the time at which "current plans" is to be measured is not stated. Does "current" mean at the time the extraction and use occur, at the time the user develops the new market, or at the time the database rights holder brings suit? A scientific researcher might discover an entirely new application for a database, only to be foreclosed from such use if the original database rights holder were subsequently to develop "current and demonstrable plans" to exploit that application as an additional market. For example, scientist A has a database consisting of human gene sequences potentially useful for locating genes controlling certain diseases, but does not know of any particular sequences that are valuable for this purpose. By extraction and use of A's database, scientist B discovers a set of sequences that seem particularly valuable for further experimentation and makes this subset of sequences available to the scientific community. In doing so, scientist B could violate the protection provided by H.R. 354. Although protection of original, noncopyrightable databases with a strong, copyright-like property right may encourage additional investment in producing databases, it simultaneously discourages others from investing in discovery of new uses for existing databases and elevates the cost of using them. 
In principle, the public benefits most from the weakest legal incentives for encouraging such investments, and intellectual property theory has always promoted the open availability of facts. For the creation of legal incentives greater than this, the former chairman of the House Committee on the Judiciary, Robert Kastenmeier, required proponents of new intellectual property rights to meet a very heavy burden.24

The Senate Discussion Draft provided considerably narrower protection by requiring a showing of "substantial" harm to the actual or neighboring market,25 which was defined in the Proposed Conference Report Language as "harm [that] is such as to significantly diminish the incentive to invest in gathering, organizing or maintaining the database."26 The harm test in the Coalition Proposal was similarly circumscribed, requiring both a displacement of substantial sales and a showing that the unauthorized use "significantly diminished the incentive to invest in the collecting or organizing of the protected database."27 These latter two formulations expressly acknowledged that not all duplications are actionable, even if used for commercial purposes (e.g., in distant markets) or for pro-competitive purposes by honest means. The intent was to recognize that competitors who add value and generate socioeconomic benefits should not incur liability if they do not directly harm the market of the original database rights holder, i.e., if they do not compete unfairly.

The committee believes that strong protection based on a broadly framed standard of harm test, such as the one proposed by H.R. 354, poses a number of potential problems for research, education, and other public-interest users, as well as for legitimate private-sector, value-adding database producers. As a general rule, the stronger the statutory protection, the greater the encumbrances will be on the reuse and transformation of data received by second-generation database producers and users. One person's derivative use can be characterized as an infringement on the original database rights holder's product; where the bar is set will determine to what extent database producers and disseminators will be enriched at the expense of all socially and economically valuable downstream uses.

As noted in Chapter 1 (and indicated in Table 1.1), many organizations are users as well as producers and vendors of S&T databases, as, for example, when they draw on one or more databases to search for cross-disciplinary associations or to create a derivative or value-added database targeted to a competing or entirely new market. Private-sector creators of derivative databases therefore have conflicting interests in protection: protection of source databases might deprive them of access, but insufficient protection for their own creations might make them vulnerable to copying. A concern of the committee, therefore, is that any new protection judged to be necessary must take into account the need to promote access to and subsequent use of S&T data and databases not only by the not-for-profit sector, but by commercial producers of derivative databases as well.28

A major negative effect of a strong standard of harm test would be to raise the resale prices for value-added or derivative databases, as well as to inhibit their production. Value-adding database producers that use multiple data sources to create new products, as is common in both the private and the public sector, are particularly penalized by a strong standard of harm test.29 Although the consequences would be difficult to measure, strong new rights for database rights holders would probably result in a broad loss of research opportunities.30 If, for example, potential users opted to engage in other professional activities rather than deal with more expensive and onerous restrictions on database use, the probability of subsequent discoveries, innovations, and advances in knowledge would decrease, not only because of the reduced number of users, but also because the remaining database users would be constrained in their activities. Downstream commercial providers who must pay license fees to the rights holders of sole-source databases can recover such fees only if they themselves charge more for access, costs that are passed down the chain of derivative products to all users, including investigators in not-for-profit institutions.

By making entry into a market more expensive, greater statutory protection also could increase the likelihood that small or niche markets, which are commonplace for many S&T databases, would be served by sole-source providers. A higher cost of entry typically deters entrants and allows the first entrant to act as a monopolist.31 A sole source may then use its market power to inhibit the development of derivative databases if these are interpreted as undermining the investments in the original database, even if such derivative uses are in completely different markets or are protected as "permitted acts" under a statute. Monopoly power could be exercised over the data in many areas of research, because most observational databases cannot be independently recreated after the fact, and it is economically inefficient and undesirable to require independent, redundant collection of original data in activities that use very high cost systems. As the Federal Trade Commission cautioned in its analysis of the predecessor bill to H.R. 354, ". . . policies that further entrench the market power of single-source data providers could have an unintended, undesirable impact on competition and innovation because of the significant potential for anticompetitive conduct in single-source database markets."32 The law should encourage competition because competition leads to lower prices, resulting in broader use and, hence, further discovery and innovation.

Increased license fees or unreasonable restrictions on subsequent uses or redissemination of data would negatively affect both government and not-for-profit database value adders or disseminators in other ways as well. For example, European government meteorological data providers, who are already benefiting from the stronger protections offered by the E.U. Database Directive, are placing various use and redistribution restrictions on the National Oceanic and Atmospheric Administration (NOAA), asking NOAA to enforce these restrictions in the United States, contrary to existing U.S. law and policy. Such encumbrances from private-sector sources would be exacerbated by any database legislation that, similar to the E.U. Database Directive, extended protection to elements not now protected. Government S&T managers, in particular, are concerned that they do not receive enough funding to pay license fees and enforce restrictive provisions, in addition to meeting the costs of data collection and database preparation, and anticipate that they might have to decline data contributed by private-sector sources (as well as public-sector European sources) that carry high royalties or restrictions on subsequent distribution that require enforcement by the user.

With increased statutory protection for databases and the accompanying higher transaction costs, scientists and educators in the not-for-profit sector might no longer be able to afford access to newly proprietary data sources or to enforce subsequent access and use restrictions on the data obtained from those sources, contrary to existing norms and practice.33 Not-for-profit research institutions tend to be conservative, risk-averse organizations that err on the side of caution, and they would likely institute guidelines prohibiting any database research activity that might potentially expose them to liability under a new legislative regime and to costly litigation. Such a possibility is particularly problematic given the uncertainty about what portions of databases would be deemed "qualitatively substantial" by the rights holders in each case and about what they would view as a "reasonable use" by not-for-profit entities. Such defensive measures would serve to further restrict, perhaps even beyond what might be allowed under the law, what scientists and educators can do with databases that they lawfully obtain.

Providing stronger property rights for databases that contain information of high commercial value, such as in the area of genomic research, can have the opposite of the intended effect, because the price of access to these databases is inversely related to the number of users who will access them. Hence, from an S&T perspective, the goal is to encourage the generation of publicly funded, and therefore readily available, collections of data in key scientific areas, where the use of this information is of potentially great commercial value, and to discourage the tendency for private companies to capture this information and restrict access to a limited audience. Promoting broad access to publicly generated databases has the additional benefit of fostering active competition and value-adding activity since all commercial and academic organizations would have access to this information.

Moreover, enhancing database protection would also serve as an incentive to both government agencies and not-for-profit organizations to privatize or commercialize their research databases. Such action would have the undesirable outcome of reducing the number of databases in the public domain and thus would have a chilling effect on the full and open data exchange and sharing ethos that benefits so many areas of scientific and engineering research.

Since a strong case for significantly greater protection of databases has not been made, primarily because existing protections already go a long way toward protecting database providers, the committee believes that the concerns regarding increased encumbrances on access and use, as well as the potential for higher prices and related transaction costs, cannot be ignored. In light of these concerns, the committee recommends that any new federal statutory protection of databases should limit any additional protection to prohibition of acts of unauthorized taking that cause substantial competitive injury to the database rights holder in the rights holder's actual market. The standard of harm should be sufficiently clear to permit good-faith users to know when they are infringing on a database rights holder's rights and should not undermine the nation's capabilities for innovation or competition in the marketplace. Such a formulation would help prevent undue and inappropriate interference with scientific inquiry and with other traditional and customary public-interest uses of data, as well as promote legitimate and socially beneficial commercial competitive activities.

Scope of Protection

The first section of all three of the legislative proposals considered by the committee defined a database as a "collection of information collected and organized for the purpose of facilitating access to discrete items of the information." All three proposals also provided protection to databases developed through the investment of substantial monetary or other resources. "Information" was defined to include data, facts, or other intangible material capable of being collected and organized in a systematic way. The Coalition Proposal, however, excluded "works of authorship"--a term applicable to subject matter protected by the copyright system. Such an exclusion would deny protection to copyrightable works such as anthologies of an author's works or a scientific journal that might otherwise be regarded as a database of articles. The H.R. 354 and Senate Discussion Draft proposals included these works, consistent with the subject-matter scope of the E.U. Directive. The committee believes that the inclusion of collections of works of authorship, which are already unambiguously protected by copyright, is both unnecessary and unsupportable. If the purpose of this legislation is to fill a purported gap in the legal protection currently available to noncopyrightable databases, then that scope of protection should not extend so broadly as to cover fully copyrightable anthologies, journals, and textbooks. The committee therefore recommends that the subject-matter scope of any new federal statutory protection of databases be constrained to databases comprising a collection of discrete facts and items of information, and expressly exclude collections of copyrightable material, which is already protected. Further, protection under any new statute should extend only to a database that is the product of a substantial investment, and not to any idea, fact, procedure, system, method of operation, concept, principle, or discovery disclosed by the database.

Term of Protection

H.R. 354,34 its predecessor bills, and the Senate Discussion Draft35 all provided protection for a nominal term of 15 years, reflecting an effort to meet the E.U. Database Directive's term of protection36 in response to the E.U. Directive's attempt to assert a reciprocity requirement. However, no analysis or empirical study supports the choice of 15 years of protection by the European Union, in comparison with other potential terms of protection. The committee believes that unquestioningly adopting an arbitrary term of protection developed by a foreign power without any experience as to its potential effects would be unsupportable. Further, the European Union's reciprocity requirement contravenes a well-established U.S. government policy requiring national treatment under all international intellectual property law agreements.37 Nonetheless, one key difference between the two congressional proposals above and the E.U. Directive is that the E.U. Directive allows the period of protection for the entire database to be extended for another 15 years with each substantial new investment--thereby providing the rights holder with the possibility of perpetual protection of the entire database--while the two congressional proposals tried to limit the extension of protection to the new data that might be added.38

Despite these efforts to limit protection to 15 years, the committee is concerned about such a long term of initial protection for factual data, whether for research and education purposes or any other uses, especially if the standard of harm remains as strong as proposed in H.R. 354. Indeed, there has been a complete failure on the part of the proponents for a 15-year term of protection either to justify that term, independent of the arbitrary decision made by the framers of the E.U. Directive, or to compare it with other, shorter terms of protection used in other intellectual property law models.39 Although, in comparison, the term of copyright is much longer, historically the term has been justified and set according to an author's likely lifetime, plus some additional time in which the author's heirs can benefit.40 Moreover, there are other constitutionally imposed limits on both the scope and length of such protection. If a database meets the constitutional and statutory requirements for copyright, then the rights holder can obtain the longer term of protection that copyright law affords. However, the committee has been unable to find any rationale for the 15-year term and, based on market factors relevant to databases, questions that length of protection for noncopyrightable databases or "substantial portions thereof."

The committee notes that the average high-activity life span of original data in an online commercial database is approximately 3 years.41 Consequently, most of the incentive for creating and distributing databases comes from the return on investment achieved in the first 3 years, when demand for and use of databases are highest. It is important to note, however, that there is a significant difference between how long databases have value and how long statutory protection for noncopyrightable databases should be accorded. Although there are S&T databases in rapidly progressing areas of research, such as some in the life sciences, whose research and commercial values plummet very quickly as new data supersede old, many research endeavors, such as the study of environmental trends, longitudinal socioeconomic studies, and various types of historical analyses, not only depend on consistently collected long-term data sets, but also require access to both current and historical data for comprehensive and comparative study and verification. The committee believes that any term of protection that is set should have a duration deemed sufficient to create incentives for producing original new databases. It should not be set to assure that rights holders capture all the value, since that would require an exceedingly long period to cover all cases and would constitute establishing protected markets that are inappropriate. As a general rule, the broader and stronger the scope of protection, the shorter the period of protection needs to be to provide an appropriate incentive for creating a database. In any event, the case for the term of protection to be used should be made by those who are seeking the new protection, and this has not been done.

Neither H.R. 35442 nor the Senate Discussion Draft43 completely addressed the potential problem of extending protection to substantial portions of old data that may become inseparably intermingled with new data, which raises the issues of how to effectively track such activity and how to provide adequate notice of which substantial portions remain subject to statutory protection.44 Important factors to consider in this regard are scientific and legal authentication methods and the necessary documentation (metadata) requirements. Adequate database authentication and documentation are essential not only as prerequisites for accurately tracking the term of protection for any given database, but also for improving the reliability and value of databases for research and other uses. In addition, although the committee did not explicitly consider it, some observers have noted that a registration system similar to the one administered by the Copyright Office for copyrighted works could be helpful in notifying users about the expiration of the term of protection for any given database.45

Both proposals also allowed for 15 years of retroactive application of protection by not limiting causes of action to databases created on or after the date of enactment of the legislation. The committee finds little justification for legislation that is supposed to be necessary to stimulate and protect new investment to apply to databases already created without the benefit of such protection.

The Coalition Proposal traded off a much weaker standard of harm and scope of protection for an unlimited term of duration--basically for as long as the database has some commercial value to the rights holder. The cause of action was limited to a duplication of a database that is placed in direct commercial competition,46 and this was coupled with additional exemptions for scientific, educational, or research uses.47 The committee believes that the unlimited term of duration proposed in the Coalition Proposal is not unreasonable in the overall context of that proposed bill, since it would also provide some added protection to those databases that have significant commercial value beyond 15 years and that have long existed. It is important to emphasize, however, that any further strengthening of the standard of harm under this proposal would likely make the unlimited term of duration not only unacceptable, but unconstitutional.48 Finally, neither the problem of identifying substantial portions of old versus new data nor the problem of retroactive application of the statute arose in the Coalition Proposal.

As a general principle, the committee recommends that the term of protection in any new federal statutory protection of databases be limited to a period of time sufficient to provide incentives found necessary for the creation of new databases. If legislation with a fixed term of protection is adopted, an appropriate term of protection most likely should be substantially shorter than the proposed 15-year term. It should also be based on an analysis of the economics of the database industry, rather than set arbitrarily.

The committee also recommends that any new legislation with a fixed term of protection require database rights holders to provide notice of expiration of the term of protection. Specifically, any such legislation should:

Finally, the committee recommends that protection be applied only to databases created after the effective date of any new legislation, in recognition that a major purpose of enacting enhanced protection is to provide additional incentives for the development of new databases.

Exemptions for Not-for-Profit Research and Education

It might be asked why not-for-profit research and education should have access to data on better terms than commercial enterprises, and why those communities should get special "subsidies" from database producers. After all, they do not receive parallel subsidies from suppliers of laboratory mice or telescopes.

In Chapter 2, the committee argues that the price of access to commercial databases will typically be higher than the efficient-access price, leading to inefficient use. (This is not true of (nonproprietary) laboratory mice or telescopes. Mice and telescopes do not have the increasing-returns cost structure of a database, and their price in a competitive market will be the "efficient-access" price.) The consequent reduction in data use could negatively affect both commercial users and public users of databases, but commercial users have an advantage. They can recover some of the high access fees by pricing their own products appropriately. In contrast, the revenues of public users, such as public research laboratories and universities, come mostly from public agencies (taxpayers). The scientific community legitimately predicts that any increase in the price of access to data will not be compensated by increased public subsidies.49 Hence publicly funded users are likely to be more negatively affected than commercial users by new database protection legislation.

The consequent reduction in resources available for education and science would be particularly damaging because education and science have a public-good aspect. They generate "nonexcludable benefits," sometimes called externalities, that go beyond any benefits that could be realized on their own ledgers or to their own constituents. All of society benefits when children are educated, when a cure is discovered for a disease, or when significant trends in climate are detected, and society will receive these benefits even without paying directly for them. To ensure an appropriate level of investment in these activities--a level proportionate to the benefits likely to be achieved--some public subsidy is required. This explains, in part, the long-standing U.S. tradition of public education and public funding of research--a tradition that can claim a significant share of the credit for our economic and social standing.

The privileged status of public-interest data users was in fact recognized to varying degrees by all three of the legislative proposals under consideration. They all attempted to grant education, science, and research more leeway in utilizing protected databases, but they varied with respect to their degree of anticipated effectiveness.

H.R. 354 and the Senate Discussion Draft initially permitted extraction or use of information for not-for-profit educational, scientific, or research purposes, as long as the use does not interfere with the database rights holder's "actual market."50 Under this provision, research that produces a product that potentially, or in fact, opens a new market not exploited by the database rights holder would not violate the law. Both proposals went further, to protect some educational or research activities, even if they do interfere with the rights holder's actual market. H.R. 354 provided a fact-dependent exemption similar to the fair-use privilege in copyright law.51 An individual act of use or extraction of another's database for teaching or research would be privileged if reasonable under the circumstances.52 The reasonableness of the use or extraction was to be determined by consideration of four factors: the commercial or not-for-profit nature of the use or extraction, the good faith of the user, whether the portion used or extracted is incorporated into an independent work, and whether the use or extraction is in the same field as the original database. However, notwithstanding these factors, a use or extraction would not be privileged if it was likely to become a market substitute for all or part of the original database.53

The Senate Discussion Draft proposed that anyone could use a protected database for purposes of "illustration, explanation or example, comment or criticism, internal verification, or scientific or statistical analysis of the portion used" and further authorized not-for-profit scientific, educational, or research activities "for similar customary or transformative purposes."54 This exemption was limited if substantial harm accrued to the original database rights holder because the use was more than reasonable and customary, consisted of a substitute for the original database, was intended to avoid payment of reasonable fees for use of a database specifically marketed for educational, scientific, or research purposes, or was part of a pattern of systematic use.55

Because the Coalition Proposal prohibited the duplication of a database only if the duplication displaces substantial sales or licenses of the original database, there was less need for a strong research privilege. Nonetheless, that proposal provided that uses for science, research, or education are privileged unless the uses are part of a "consistent pattern" designed to compete directly with the original database, or to avoid reasonable fees for access to a database specifically designed for a scientific, research, or educational program.56 Under the Coalition Proposal, the use of another's database for scientific or technical research would be permitted unless the use directly undermines the incentive to invest in the original database.57 This model, based on unfair competition law, strongly acknowledged the value of promoting unfettered scientific research but still offered protection to prevent database rights holders from being the victims of commercial misappropriation. Finally, the Coalition Proposal recognized the fairness of reasonable access charges for databases whose only purpose is for scientific research.58

However, H.R. 354 and, to a lesser extent, the Senate Discussion Draft would represent a considerable risk for the conduct of research and education. The nature and value of the products of research frequently are unknown until well after the research is conducted. Scientists will not know whether a particular use of a database under H.R. 354 will affect the "actual market" for the original or whether a court will subsequently weigh the factors in such a way as to make the use privileged. The Senate Discussion Draft, which allowed not-for-profit scientific research for "similar customary or transformative purposes," still withdrew the privilege if the ultimate result would be "likely to serve as a substitute" for the original database.59 Thus, under both of these proposals, a researcher could not be certain of the authority to utilize another's database until the results of the research were known. Also, neither a researcher nor any other user would know with any certainty what the database rights holder might consider to be a quantitatively or qualitatively substantial part of the rights holder's database, and the rights holder would have every incentive to define its protected domain as broadly as possible. This uncertainty could discourage uses that would otherwise occur under a more privileged access and use provision.60 Moreover, the "actual market" of the originator might be undermined in many ways other than by direct competition, such as by research demonstrating the original database was inaccurate, was built on false premises, or did not do what it was marketed to do.

The uncertain results of research make license transactions for the use of databases intrinsically difficult. Although a licensor will want to establish a fee that protects against any reduction in value of the licensor's database and that provides for sharing in the economic value of the resulting product, the value of the research results cannot be known in advance, which makes it difficult to define either "the harm to the original database" or the value of the database.

Both H.R. 354 and the Senate Discussion Draft also prevented patterns of systematic extraction or uses of individual items of information or other insubstantial parts of a database,61 and the H.R. 354 exemption for science was limited to an "individual act of use."62 Both of these limitations failed to reflect the nature of some scientific research. For example, the potential for new treatments for disease represented by the development of databases arising out of the Human Genome Project is likely to be achieved by the systematic and continuous analysis of the resulting databases rather than by an individual use. The Coalition Proposal permitted research using existing databases unless the purpose of the researcher was direct competition with the database rights holder, an approach that fosters continuous discovery using databases.63

The provisions proposed in H.R. 354 and the Senate Discussion Draft also should be contrasted with the operation of fair use under the Copyright Act. Copyright fair use, like the H.R. 354 provision, depends on an after-the-fact balancing of factors to assess whether a subsequent otherwise infringing use is privileged. However, copyright law protects only the expression of a protected work and not the ideas or facts contained therein. Scientists may freely mine the world of existing copyright-protected works for the data upon which their research depends, without fear of liability. One of the significant aspects of a database--a collection of facts--is that it can be difficult to draw the line between facts and expression. Neither H.R. 354 nor the Senate Discussion Draft purported to do so, but each prohibited the "extraction and use" of a substantial part (measured qualitatively or quantitatively) of a database.64 The traditional methods of scientific research--as well as the mining of existing storehouses of ideas or facts (i.e., databases) upon which to build knowledge--would be placed at risk by these proposals. The Coalition Proposal, in addition to its narrow scope of initial prohibition, also expressly exempted from protection any "idea, fact, procedure, system, method of operation, concept, principle or discovery,"65 consistent with the Copyright Act.66 This breadth of exemption is also recommended by the committee.

Even the limited privileges offered to teachers and researchers in H.R. 354 and in the Senate Discussion Draft could be further undermined by the enforceability of contractual limitations on use of databases, or by technical measures that can prevent uses even if privileged. Neither H.R. 354 nor the Senate Discussion Draft imposed limits on the ability of a rights holder of a database to impose additional restrictions on use by contract or technical measures.67 The Senate Discussion Draft did, however, attempt to temper unreasonable contractual overrides, particularly in cases of sole-source databases, by raising the issue of the potential application of the legal doctrine of misuse in the draft legislative history68 and by including such issues as part of a required review of the effects of the legislation69 (see the next section). The Coalition Proposal, on the other hand, adopted a "misuse" provision, which would authorize courts to deny relief to a database rights holder if "permitted acts" of database use are "frustrated by contractual arrangements or technological measures" or if "access to information necessary to research" is prevented.70

As noted above, research and education produce externalities that confer benefits on society at large. It is not clear that private parties through contracts will take these externalities into account when negotiating a license to use an existing database. Where this is true, licensors are likely to authorize fewer uses of their databases than would be socially optimal. Reliance on private decisions to ensure widespread availability of scientific and technical data runs the risk of interfering with research and education. Therefore, the creation of unprecedented new federal statutory rights for rights holders of databases must be balanced by some affirmative duties not to unilaterally override by contract the public-interest exemptions and other permitted uses allowed under any new law.71

One approach would be to affirmatively state by legislative provision that traditional or customary scientific, educational, and research uses could not constitute infringements under unfair competition database legislation. This would hold true even if such traditional or customary public-interest uses caused potential or actual economic harm to the rights holder in a database. This approach closely parallels the existing "first sale" doctrine in that although current purchasing and subsequent lending of traditional books and journals (such as is done by libraries) may reduce the sales of these works and allegedly harm the economic interests of publishers, the sharing of legally acquired works is so important to society's scientific, educational, and social advancement that the potential or actual harms to authors and publishers are considered inconsequential when balanced against the benefits to society of allowing sharing. The customary and traditional practices of the research and educational communities were formed under the copyright law milieu, which achieved a careful balance of the rights of rights holders and users over time. The balance of interests struck by the law in paper publishing environments has worked well, and an analogous balance has to be developed in electronic sharing environments, particularly for scientific and technical databases.

It is important that scientists, educators, and other public-interest users not be deprived by contract of any rights carved out for traditional and customary scientific uses in any legislation that may be adopted.72 In addition, integrative and derivative uses extending from other integrative and derivative uses have become one of the major methods of scientific inquiry today. As discussed in Chapter 1, scientific inquiry involves not only controlled observation and confirmation based on the published data, information, and knowledge from books, journals, and other intellectual works, but also the mining of electronic data sets that may have been gathered for scientific or other purposes. The sifting and winnowing of data and knowledge from all available sources contribute indispensably to the advancement of knowledge and the development of yet further derivative databases upon which others may build. Thus there is a need to minimize the barriers to access and use of facts and compilations of factual information, not increase them. It is extremely important that traditional and transformative uses of scientific and technical data be allowed by right without requiring permission of rights holders, similar to the legal situation for hard-copy documents under the "first sale" doctrine.

The "fair use" doctrine under copyright law and the "first sale" doctrine apply to all intellectual works and not just to "traditional or customary scientific, educational, and research uses" as proposed in the paragraphs above. Thus the balance between the exclusivity interests of the commercial community and the openness and sharing interests of the scientific and education communities is not fully preserved by the proposed legislation's exemptions for scientific inquiry.

The committee therefore recommends that any new legislation that may be adopted expressly continue to provide legal rights of access to and uses of proprietary databases equivalent to those that not-for-profit researchers, educators, and other public-interest users enjoyed under traditional or customary practice prior to enactment. Courts should be allowed to invalidate any non-bargained73 licensing terms that are shown to interfere unduly with otherwise legislatively permitted customary uses by not-for-profit entities. Additional steps need to be taken by the government and by the research, education, and other public-interest communities, however, in the implementation of a new database protection regime to help ensure that the traditional and customary rights of public-interest data users are not unduly compromised. These steps are discussed in some detail in the section below titled "Assessment of Policy Options, with Recommendations for Government Action."

Periodic Assessments of Effects Under Any New Statute

As pointed out above in this chapter, the Commission of the European Communities (CEC) conducted no economic studies of the database industry, or of the potential effects of different models and provisions, to specifically support the drafting of the E.U. Database Directive. The only significant economic analysis done in the United States with regard to the pending legislation was an article commissioned by two of the principal supporters of H.R. 354, Reed-Elsevier, Inc., and the Thomson Corporation.74 While neither the CEC nor the U.S. Congress has undertaken such studies in advance of their legislative initiatives, both legislative bodies have implicitly recognized that some negative effects are likely to be generated by this new legal regime. The E.U. Directive requires the CEC to submit to the European Parliament, the Council, and the Economic and Social Committee of the CEC a report

. . . on the application of this Directive, in which, inter alia on the basis of specific information supplied by the Member States, it shall examine in particular the application of the sui generis right, including Articles 8 and 9, and shall verify especially whether the application of this right has led to abuse of dominant position or other interference with free competition which would justify appropriate measures being taken, including the establishment of non-voluntary licensing arrangements.75

Neither H.R. 354 nor its predecessor bills in the House, H.R. 2652 or Title V of H.R. 2281, had any provision for review of the economic effects of the bill on competition, consumers, or public-interest users. In contrast, the Senate Discussion Draft did provide for the conduct of a "Study Regarding the Effect of the Act" by the General Accounting Office, in consultation with the Register of Copyrights and the Department of Justice, within 5 years of enactment and every 10 years thereafter.76 The issues for study that the Senate Discussion Draft would require are fully reproduced below, not only because they represent concerns regarding effects that might arise as a direct consequence of the enactment of this type of legislation, but also because they form the basis for a core set of questions that can be addressed independently by those studying the effects of the bill.

     (b)   ELEMENTS FOR CONSIDERATION--The study conducted under subsection (a) shall consider--
(1)   the extent to which the ability of persons to engage in the permitted acts under this Act has been frustrated by contractual arrangements or technological measures;
(2)   the extent to which information contained in databases that are the sole source of the information contained therein is made available through licensing or sale on reasonable terms and conditions;
(3)   the extent to which the license or sale of information contained in databases protected under this Act has been conditioned on the acquisition or license of any other product or service, or on the performance of any action, not directly related to the license or sale;
(4)   the extent to which the judicially-developed doctrines of misuse in other areas of the law have been extended to cases involving protection of databases under this Act;
(5)   the extent, if any, to which the provisions of this Act constitute a barrier to entry, or have encouraged entry into, a relevant database market;
(6)   the extent to which claims have been made that this Act prevented access to valuable information for research, competition, or innovation purposes and an evaluation of these claims;
(7)   the extent to which enactment of this Act resulted in the creation of databases that otherwise would not exist; and
(8)   such other matters necessary to accomplish the purpose of the report.77

This type of monitoring and review of the effects of database protection legislation should focus not only on national database activities, but on international ones as well, since the market for all online databases is inherently international, as are many S&T research activities. Although the committee believes that such periodic reviews would be important, particularly if they are not carried out prior to enactment of any new legislation, there are a number of other aspects to consider. There are several government entities that Congress might consider in addition to, or instead of, the three suggested above, including the Congressional Budget Office, the Department of Commerce, and the Federal Trade Commission, all of which would bring relevant expertise and interests to such a review. Also, because of the complexity of the issues to be examined, the rapidly changing nature of digital information technologies, and the lack of empirical data to fully support the analysis of these issues, the federal entities charged with doing the review should consider, in conjunction with the various stakeholder groups, what kind of data would be desirable to track and should initiate some means of doing so. Finally, the committee suggests that instead of an assessment of effects under the statute, Congress consider enacting a sunset provision that would take effect after a 5-year period and place the burden of proof and action on those who would want the legislation to continue. In light of the concerns about the possible unintended effects of such legislation, the rapid pace of change in digital information and network technologies, and the need to exercise due caution in the enactment of any such legislation, a sunset provision with the possibility of renewal may be the better option.

The committee therefore recommends that any new legislation provide either for a sunset provision with the possibility to renew, or for periodic assessments of the effects of new statutory database protection on competition in the database market and on consumers of databases, as well as on access to and use of data--including S&T data--by not-for-profit, public-interest users, in order to enable timely and appropriate revisions of legislation as needed.

Exemptions for Government Databases

As discussed throughout this report, S&T databases created either by the government or with government funding provide the largest and, in many scientific areas, the most important source of data for research and education. Moreover, existing U.S. law and policy prohibit proprietary protection of data or information created at the federal level, and generally limit such protection at the state and local levels as well. Maintaining these exemptions is of crucial importance, not only to public-interest users of databases in the research and educational communities but to all citizens. It therefore is not surprising that all three legislative proposals attempted to provide broad exemptions for government or government-funded databases.

Nevertheless, there are some notable differences among these proposals. Under H.R. 354, protection provided by the proposed legislation would not extend to collections of information gathered by or for a governmental entity, whether federal, state, or local.78 For example, databases compiled by state or municipal governments in conjunction with their geographic information systems (GISs) would not be protected, although state and local governments would retain any copyright, contractual/licensing, trade secret, and technological protections that currently apply. H.R. 354 made it clear that the proposed legislation, and presumably federal copyright law, would preempt conflicting state laws.79 This means that state misappropriation laws could not be applied by a state or local agency in a claim against a commercial business or vice versa.

H.R. 354 provided protection for collections of information created by state and federal educational institutions engaged in the course of education or scholarship.80 With respect to civil lawsuits, the court would be required to reduce or remit entirely monetary relief "in any case in which a defendant believed and had reasonable grounds for believing that his or her conduct was permissible under this chapter, if the defendant was an employee or agent of a not-for-profit educational, scientific, or research institution, library, or archives acting within the scope of his or her employment."81 A similar non-applicability provision would apply for criminal offenses under that proposed legislation.82 This allowance, however, is merely an "innocent infringer" provision; once the not-for-profit researcher or educator is notified the first time, this immunity would be removed.

These liability relief provisions would not apply to state or local government agencies under H.R. 354. Thus, if a state or municipal government GIS operation used data from a commercial collection of information without permission, the government operation and its employees, even when acting within the course of their employment, would be subject to the liability provisions set forth in H.R. 354. Criminal penalties would apply in situations where losses to the commercial company aggregated to more than $10,000 in a year or if the state or local government ran its operation for "direct or indirect commercial advantage or financial gain."83 On the other hand, and again by example, commercial firms extracting government data from state or municipal GIS operations would not be subject to either civil or criminal liability provisions under that proposed legislation.

Under the Coalition Proposal, prohibitions against duplication would not apply to "government databases."84 Here, however, "government database" was defined as being "a database (A) that has been collected or maintained by the United States of America; or (B) that is required by federal statute or regulation to be collected or maintained, to the extent so required."85 Under the Coalition Proposal, state and local governments would gain most of the protections of the new database legislation, in a manner similar to that of commercial firms, and could choose to not avail themselves of the proposed legislative provisions at their option. This option is similar to the current situation in which many local governments choose not to seek copyright protection for their public records and databases. Under the liability provisions it is also clear that state and local governments would be in a position similar to that of commercial firms in being able to seek damages from private and government competitors who duplicate their data and compete with their income streams. Like H.R. 354, the Coalition Proposal stated that the new federal legislation would preempt conflicting state laws.

The research and education communities generally believe that the greatest benefits would accrue if state and local governments followed the federal government principles of "a strong freedom of information law, no government copyright, fees limited to recouping the cost of dissemination, and no restrictions on reuse."86 This view appears to be gaining currency even in Europe. A green paper issued by the Commission of the European Communities in January 1999 comes to the conclusion that public-sector information is a key resource for Europe and suggests that E.U. nations should more closely follow the model of U.S. federal government policies with regard to promoting broader access to government databases.87 Regardless of the merits of providing open access to government information, the individual states in the United States traditionally have determined which policies they will follow in providing access to their state and local government information. The Coalition Proposal appeared to support continuation of this state and local government self-determination concept.

The Senate Discussion Draft more closely paralleled H.R. 354 than it did the Coalition Proposal in its effect on government data collections. For instance, the term "government databases" was again defined to include databases of government entities at all levels--federal, state, or local.88 Therefore the protection and liability ramifications of the Senate Discussion Draft for governments would be similar in many respects to those of H.R. 354.

The Senate Discussion Draft theoretically would allow a library, archive, educational, scientific, or research institution to extract government data contained in a commercial database.89 However, the requester would bear heavy burdens regarding data identification and proof of need, and the extraction would be allowed only in the unlikely event that the information was still available in its original format in the commercial database and separate from other portions of the commercial database. The requesting not-for-profit organization also would need to pay the costs of fulfilling the user request. From a practical perspective, it is difficult to envision the ability of libraries and educational institutions to successfully pursue such extractions.

The adoption of a strong standard of harm also could accelerate the privatization of government data dissemination with potentially negative results, as noted in Chapter 2. While privatization of some government functions may, under appropriate circumstances, produce net benefits, the committee urges caution in the context of government S&T database privatization in light of the public-good aspects of the data.90 Because government databases are not and will not be legally protectable and the government's policy is to make public data broadly available, any entity can take the original data, add value, and redisseminate them as it wishes. In such cases, the original government data are nonetheless supposed to be (but are not always) maintained and to continue to be made available by the government source or archive.

There are some situations in which the government seeks to transfer the data dissemination function to a private-sector party, whether not-for-profit or commercial, on either a nonexclusive or an exclusive basis. The benefit of nonexclusive licensing is that several disseminators can compete in the market, and so tend toward providing access at a reasonable price for end users. In practice, however, few organizations would be willing to enter into a formal agreement to handle a major data management and dissemination function on behalf of the government unless they could see some reasonable opportunity to at least recoup their operating expenses. Therefore, most such data management and dissemination functions are performed at university or other not-for-profit institution data centers that are partly subsidized by the originating government agency and are limited either by the terms of their agreement with the government or by their institutional charter (or both) to recovering operating costs, but prohibited from earning a profit.

In certain instances, government agencies have given exclusive licenses to commercial firms for dissemination of government data.91 If the government agency subsequently does not continue to maintain and make available the original database, an exclusive license will result in a monopoly situation, which can lead to higher prices. Since stronger statutory database protection is likely to enhance the potential for profit by commercial data distributors, it also is likely to encourage the licensing of government data dissemination functions, perhaps on a de facto exclusive basis and without appropriate safeguards, thus defeating the existing open access and use law and policy of the U.S. government. As discussed in Chapter 2, the negative aspects of this trend are further exacerbated by other legislative initiatives that require government science agencies to purchase data from the private sector, rather than generate those same or similar data as a public good.

Finally, the exceptions for instructional and library uses articulated in the Senate Discussion Draft and in H.R. 354 were weak. Uses by academic institutions and libraries regarded as fair use under the Senate Discussion Draft are explicitly stated and represent a very narrow and qualified set of activities.92 For example, display of the contents of a data collection during teaching in a normal class session that also is viewed by students at a distance enrolled in World Wide Web-based instruction would be in violation of the provisions of the Senate Discussion Draft. Numerous additional impositions on efficient approaches to instruction were raised by the Senate Discussion Draft; however, H.R. 354 did not even have a separate exemption for those activities.

The Senate Discussion Draft, and especially H.R. 354, focused primarily on strengthening proprietary protection without adequately balancing the public-interest, consumer, and commercial competitive values. The Coalition Proposal, by attempting to minimize social costs while providing some additional protection for commercial databases, appeared to arrive at a result that would be far more acceptable for maintaining the viability and vitality of the academic and research communities upon which innovation and broad-based economic growth ultimately depend.

The committee thus recommends that although private-sector databases derived from government data should be eligible for protection, protection should not be extended to databases collected or maintained by the government. Any new legislation should expressly affirm the need for continuation of existing legal norms for wide distribution of government data and of data created pursuant to a government mandate or funding.


Assessment of Policy Options, with Recommendations for Government Action

In this section the committee discusses a number of actions that should be taken by various government institutions to help promote access to and use of S&T databases for the public interest. The areas addressed include promoting availability of government S&T data; maintaining nonexclusive rights in government-funded databases by not-for-profit institutions and their employees; organizing discussions of licensing terms for not-for-profit uses of commercial S&T databases; improving the understanding of complex economic aspects of S&T database activities; and promoting international access to S&T data. Although the committee believes that its recommended actions in these areas ought to be undertaken whether or not any new statutory database protection is enacted by Congress, all of these actions will take on an increased urgency and importance if relatively strong new proprietary rights in databases are established by federal statute.

Promoting Availability of Government Scientific and Technical Data

Increased proprietary protection for commercial databases could have a significant effect on government data collection and distribution efforts. Because researchers and educators likely would be more constrained in their use of data drawn from commercial databases, they might have to request additional funds for the purchase and administration of proprietary data or ask federal agencies to collect and maintain more S&T data on a nonproprietary basis. Thus budgetary strains could increase for federal agencies trying to meet the data needs of their own researchers as well as those related to fulfilling institutional mandates.

Under appropriate circumstances and conditions, government partnering with the private sector--especially not-for-profit institutions--in accomplishing data collection and maintenance can be highly beneficial and effective. The committee fully endorses the existing policy and practice of the federal government, as expressed through OMB Circular A-130, to make public S&T (and other) databases openly available at the lowest possible prices. It is through this policy of efficient-access pricing that the taxpayer derives maximum value from the government's very substantial investments in its collections of data.

Consistent with current practice, government S&T agencies must not make their databases--whether created and owned by them or under their control--available on an exclusive basis. They also should continue to maintain under their own control and archive all S&T databases that have value for research and that are otherwise being disseminated on behalf of the government by a private-sector organization or company. Such control should be maintained through physical possession or by appropriate contractual provisions. The long-term maintenance of public databases and archiving of data in readily accessible formats are essential to ensure their availability for reuse in future research or to confirm the results of research already conducted, among other uses.93 Large quantities of government and government-funded data at all levels are lost, discarded, or rendered inaccessible owing to technological change or defects. Although this constitutes a major information management and policy issue in its own right that is beyond the scope of this report, the trend toward greater private-sector management and dissemination of public data--which the committee believes would increase under stronger statutory protection of databases--makes it even more important for government agencies to pay attention to this issue. Without adequate safeguards to ensure long-term preservation of public data created or disseminated on behalf of the government by private-sector entities, even larger amounts of such data may be lost or become inaccessible over time.

Finally, in making its data broadly available, the government should require that all private-sector disseminators or transformative users of its data identify the government source(s) of the data being used. Indeed, the same practice should be followed with regard to all sources of data from the private sector as well. Identifiers on privately disseminated government data will serve the objectives of providing notice to all users that they can contact the government agency source to obtain the original data, making the public aware of its government's activities, and giving proper credit where credit is due. Improving public awareness is an important objective, because all too often the public lacks a full appreciation of the benefits it derives from taxpayer-funded data-related activities.

The committee therefore recommends that the following actions be taken by all government entities. Scientific and technical data owned or controlled by the government should be made available for use by not-for-profit and commercial entities alike on a nonexclusive basis and should be disseminated to all users at no more than the marginal cost of reproduction and distribution, whenever possible. While the private sector's creation of derivative databases from government data should be encouraged, the source of the original government data must ensure that those original data remain openly available. Any information product derived from a government database also should be required to carry an identifier stating the government source(s) used.

Maintaining Nonexclusive Rights by Not-for-Profits in Government-funded Databases

Science best advances by promoting a culture of openness and sharing, whereas individual commercial companies best advance by maintaining control and secrecy. The tension between these two cultures has been attenuated through the "first sale" doctrine, whereby a purchaser of a book, journal, or other intellectual work is free to use the facts and ideas in the work and the publisher is not able to prevent the purchaser from placing the intellectual work in a library or passing it on to another person for similar uses, regardless of the medium in which the work is presented.94 As discussed in Chapter 3, this right is being superseded increasingly by licensing arrangements in the online digital environment. In the academic sector, such institutional licenses typically permit users to share with collaborators or colleagues on other campuses, so long as the sharing is not systematic. In fact, in some cases, universities are now able to negotiate licenses that "meet or exceed" their users' needs.95 Nevertheless, in order to maintain a reasonable balance between the scientific and education communities' interests in openness and sharing, on the one hand, and the commercial community's interests in exclusivity on the other, some minimal constraints ought to be placed on the commercial community to guarantee researchers and educators access to and unfettered use of facts, data, and intellectual works published by their peers.

The existing practice in the publication of research results has been for researchers to pay page charges and to contractually give up exclusive copyright in their works in order to have their articles published in the primary journals read by their peers. Because authors transfer exclusive copyright in their work, they may be legally obligated to ask publishers for permission to distribute copies of their authored articles to their own students and to their close research associates. As one participant in the committee's January 1999 workshop wryly noted, "We have a great system: we pay to publish and we pay to get it back." In addition, the pricing structure that seems to maximize profits for commercial scientific publishers is one that limits acquisition of journals to the elite academic libraries and researchers that can afford them.96 Many academic institutions now have difficulty paying library subscription rates, and their researchers, professors, and students thus lack convenient access to many journals, even to those in which they publish.

Such concerns will only increase as electronic publishing becomes more widespread. It is already common practice in electronic publishing--and one of its tremendously productive features--to link electronic articles to the data sets upon which the research results depend. If legislation protecting databases is enacted, the current practice of requiring scientific authors to give up exclusive rights in their research articles on a take-it-or-leave-it basis could be extended to the data sets underlying the results reported in the research articles. Ceding control of databases created in the not-for-profit sector, especially those created with taxpayer support, to private-sector vendors that can establish their own terms for access to and use of the underlying research data is thus a major concern. In scientific disciplines where marketplace competition is highly constrained or absent, there is a need to provide safeguards against monopolistic practices.

The committee therefore believes it important to initiate a "safety net" approach in the digital database context to help preserve the balance previously provided by the "first sale" doctrine. This approach will help ensure public access to data and databases developed in whole or in substantial part at federal government expense. Databases developed primarily with government funds should not fall under the exclusive control of private parties such that dissemination of the data to the public or other scientists is limited. Nor should public access to government-funded databases be highly constrained. The economic basis for funding science from governmental funds is that the research produces public goods. One researcher's use of these public goods does not decrease the value and benefits to others of the public goods.

Specifically, for any research accomplished wholly or in substantial part with federal funds, universities and not-for-profit organizations should be required by the funding agency to retain nonexclusive rights in any resulting databases. Under OMB Circular A-110, federal grant recipients have initial control over the intellectual property and databases that have been produced from their federally funded projects.97 The primary concern under a new statutory regime is the inequality in bargaining power between large publishers and individual researchers and scientific authors. Based on past practices, the committee is concerned that many researchers may be required to give up exclusive rights in the databases produced at federal expense in return for having their research results published.

If new database legislation is enacted, publishers may request rights in both the intellectual work (i.e., typically the journal article), as well as rights to the collected data sets from which the intellectual work arose or upon which the work may depend and which might usefully be linked to the article in electronic publishing environments. If the negotiated contract provides for reasonable access to and use of the government-funded data for further scientific work, it is unlikely that the right of the researcher to independently provide access to the government-funded data would ever have to be invoked. However, the proposed provision provides a safety mechanism. The retained right of the researcher to distribute the data is likely to be invoked only in the unusual situation in which data gathered through a federally funded project or grant have been transferred through contract from an academic institution to a commercial entity with highly constrained access to and use of the data.

This approach has a relatively narrow application. The proposed requirement would not apply generally to copyrighted works that may have been produced using federal funds (e.g., research articles), nor would it apply to state or local government databases or to databases generally. The requirement also would not automatically apply to databases created with only partial (e.g., less than half) funding from the federal government, unless specifically agreed to by the parties. The provision would allow not-for-profit institutions and researchers to share underlying federally funded data with others regardless of contract provisions with the private sector, but would impose no affirmative requirement on them to share such data. Universities and other not-for-profits, of course, could not distribute any value-added features provided by the private-sector publisher to the government data unless agreed to by contract with that entity or otherwise permitted by law.

Based on the foregoing discussion, the committee recommends that federal funding agencies should require university and other not-for-profit researchers or their employing institutions that use federal funds, wholly or in substantial part, in creating databases not to grant exclusive rights to such databases when submitting them for publication or for incorporation into other databases.

Organizing Discussions of Licensing Terms for Not-for-Profit Uses of Commercial Scientific and Technical Databases

Whether or not new database protection legislation is adopted, the committee believes that representatives from the not-for-profit research and education communities should engage in a series of discussions with commercial database publishers and vendors in different market segments in order to achieve a better understanding of their respective needs and concerns and thus foster the development of mutually acceptable licensing terms that can reduce uncertainty and transaction costs. Such discussions would be especially important in the months and years immediately after enactment of any new federal database statute, since there would be many definitions and concepts that would not have been fully defined and that would be subject to broadly divergent interpretations by different parties. One person's legitimate derivative use may be another's harmful infringement.

Previously established guidelines or understandings concerning copyrighted works will not in most cases be transferable to the database context and therefore will most likely confuse user communities, without the benefit of a fresh set of clarifying discussions. Further complications will arise if currently copyrighted works, such as journals, textbooks, reference books, and other anthologies, are also included in the definition of protected databases or "collections of information" under any new U.S. legislation, as they already are in the European Union. In addition to promoting some mutual understanding regarding licensing terms, clarifying discussions might help prevent unnecessary conflicts and litigation.

It is unrealistic to assume that a model contract or even standard individual contract terms could be developed to cover all or perhaps even most such transactions. As discussed in Chapter 1, a key characteristic of S&T data is the heterogeneity of data types, sources, and uses. The expectation of developing a one-size-fits-all approach would be not only illusory and impossible, but also ultimately harmful. To avoid becoming futile, discussions among stakeholders must be founded on realistic and well-focused objectives that would have a reasonable chance of success.

In establishing such discussions, it is essential that representatives of all major stakeholders be involved so that all relevant interests and viewpoints can be considered. For example, the committee would not endorse a process such as the one that resulted in the "Agreement on Guidelines for Classroom Copying."98 That agreement has perhaps been workable for campus administrators, campus libraries, and the photocopying centers on campuses, but not for students and faculty, who were not involved as stakeholders in the discussions. Examples of both classroom guidelines99 and of existing digital licensing terms and phrases, and their evaluation from the not-for-profit perspective,100 may be found online on the World Wide Web already.

Private-sector S&T database producers and disseminators should remain cognizant of the social value of their products, particularly for not-for-profit research, education, and other public-interest uses. Database vendors whose primary source of revenue lies outside the not-for-profit S&T communities should endeavor to provide public-interest users access to their databases on favorable terms. Database vendors whose primary source of revenue is in the S&T research and education community should be encouraged to provide access on favorable terms once a reasonable return on investment has been achieved.

Indeed, as noted in Chapter 2, all pricing inhibits access, especially for those researchers who do not have adequate and strong institutional funding, whether academic, research institute, or industrial. Of course, the limitations are a matter of degree, depending on level and pattern of pricing. The goal should be to bring all sectors into a cooperative system in which data are made widely and readily available for scientific and educational use at as low a total cost (to the user population and society as a whole) as possible, and to do that within an environment that encourages, rather than inhibits, the inquisitiveness and inventiveness of the user while encouraging the entrepreneurship of suppliers. It is in the common interest of both database rights holders and users--and of society generally--to achieve a workable balance among the respective interests so that all legitimate rights remain reasonably protected.

The participants in these discussions would be primarily representatives of commercial S&T database disseminators and their government agency and not-for-profit-sector users. The committee makes its recommendation to the administration, rather than directly to those communities, however, because it believes that the discussions should be held under a convener such as the Copyright Office, which has the greatest subject matter expertise in these issues within the government. Such a focused venue would not only help stimulate progress on important issues, but would also mitigate the potential for accusations of collusion or conspiracy under federal antitrust laws. The committee therefore recommends that the Copyright Office sponsor discussions between the representatives of private-sector producers of databases and user stakeholder representatives from government agencies and not-for-profit groups to help develop a common understanding and optimal terms for the licensing of S&T databases and data products.

Improving the Understanding of Complex Economic Aspects of Scientific and Technical Database Activities

Although a detailed economic analysis is well beyond the scope of the committee's charge for this study, this report raises significant questions throughout regarding the adoption of statutory database protection, the economic underpinnings of different types and mixes of provisions, and the potential effects of both an overall statutory regime and specific provisions on various segments of the database industry and on the relationships among the different parties involved in creating, disseminating, and using S&T databases. Certainly, at a minimum, the questions raised in the E.U. Database Directive and in the Senate Discussion Draft, as well as any other questions that are ultimately identified in the course of the legislative process, should be the subject of more detailed study in advance of any legislatively mandated report on effects of increased protection. Such a study would help provide a comprehensive base of knowledge with which to officially evaluate the effects of new statutory protection of databases. In addition to these broad economic issues, the committee suggests that research be devoted to, among others, the following specific issues affecting the creation, dissemination, and use of S&T databases by the government and by the not-for-profit and for-profit sectors:

The committee recommends that the Congressional Research Service, the National Science Foundation, the Department of Commerce, and other federal science agencies, as considered appropriate, should undertake and fund external research that investigates the changing and complex economic aspects of S&T database activities, particularly in the context of any new legislative database protection measures that may be enacted and in support of the legislative principle recommended above regarding the conduct of periodic assessments of the effects of any new statutory protection of databases.

Promoting International Access to Scientific and Technical Data

It is a truism that science knows no boundaries and that practically all research that is conducted on an open basis also involves international collaboration to some degree. Some research, such as that in observational space and Earth sciences, is inherently international and cannot be conducted successfully without either the collection of global data or access to foreign databases.101 As a result, the U.S. government science agencies in recent decades have concluded thousands of bilateral and multilateral general S&T cooperation and specific research program agreements.102 These agreements will take on added significance with the implementation of the E.U. Database Directive and with possible adoption of restrictive database protection legislation in the United States and elsewhere, since the negotiated terms of those agreements can specify the terms under which databases related to the research in question can be accessed and used. As the world's largest producer and disseminator of S&T data, the U.S. government has significant leverage in negotiating appropriate terms for the exchange and use of public data with other nations.

At the same time, the committee agrees with the Administration's concerns regarding the E.U. Directive's reciprocity provision and supports the U.S. Trade Representative's (USTR's) placement of that topic on the Administration's 1998 Special 301 Review.103 The committee would go one step further, however, and suggest that the USTR and other appropriate entities within the Administration negotiate with the Commission of the European Communities to review and revise its E.U. Directive, based on the substantial criticisms of that new legal regime in this report and in other position statements and articles cited above. If the U.S. Congress enacts a new database protection statute based on properly balanced unfair competition principles, the committee urges the USTR, the U.S. Patent and Trademark Office, and other appropriate administration officials to promote that statute as a model for international database protection within the World Intellectual Property Organization (WIPO).

The committee recommends that all departments and agencies of the federal government should continue to adopt international S&T agreements that include provisions to facilitate access to S&T data across national boundaries and should conduct periodic reviews of international policies and agreements to promote conformity to the above principles.

In addition, the committee recommends that the U.S. government should negotiate with the Commission of the European Communities to revise its highly protectionist E.U. Database Directive.


Finally, there is the question of what the research and education community should do in the event that highly restrictive statutory protection of databases is enacted by Congress. Certainly, leaders in all the major not-for-profit research, higher education, and library associations and in many individual institutions have voiced their concerns about the legislative proposals that have been introduced in the Committee on the Judiciary in the House of Representatives in both the 105th and 106th Congress.104 As the various critics and this report point out, such a new statutory regime could have many negative effects, among them significant changes in the terms for access to and use of databases sold or licensed by the commercial sector, the possibility of increased economic exploitation on a proprietary basis of heretofore openly available S&T databases by not-for-profit researchers and educators and their institutions, and the stimulation of further privatization of such public-good activities by the government.

The question thus raised is what actions the not-for-profit community should take on its own behalf if restrictive new provisions are enacted that encourage or exacerbate these negative effects. Proponents of new legislation rightly point out that any new law would not require individuals or organizations to make use of the new protections, and that if the not-for-profits are concerned about these legal developments as a matter of principle, they should resist the temptation to adopt proprietary restrictions on their databases. The committee agrees that much of the responsibility for maintaining a policy of full and open data availability in the academic community rests with the community, which will have to act to ensure continuation of the broad sharing of data and research results. Nevertheless, the committee also believes that the pressures to commercialize and privatize currently open data sources would increase inevitably under a regime such as the E.U. Directive or the one proposed by H.R. 354, and that fully maintaining the customary or traditional approaches to data exchange would prove to be difficult. In the event that the universe of public domain S&T data is found to be shrinking unacceptably, additional defensive measures may have to be taken to reinvigorate a robust public-interest sector for such data.

Therefore, as its last recommendation, the committee urges that the not-for-profit S&T community continue to promote and adhere to the policy of full and open exchange of data at both the national and international levels.


1 Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991).

2 Although certain database vendors might be at some competitive disadvantage in the European Union, the committee believes that a less protectionist law in the United States that encourages the use of factual data for both public interest and commercial purposes will benefit the U.S. economy and society to a greater extent.

3 Cong. Rec., Vol. 106, S. 316 (Jan. 19, 1999).

4 H.R. 2652, the "Collections of Information Antipiracy Act," 105th Congress (1997).

5 H.R. 2281, Title V, the "Collections of Information Antipiracy Act," 105th Congress (1998).

6 The only significant change in Title V of H.R. 2281 was to remove "potential markets" from the ambit of liability for not-for-profit uses in Section 1403, Permitted Acts (a) Educational, Scientific, Research, and Additional Reasonable Uses, which was amended as follows:

"(1) Certain Not-for-profit Educational, Scientific, or Research Uses.-- . . . no person shall be restricted from extracting and using information for not-for-profit educational, scientific, or research purposes in a manner that does not harm directly the actual [or potential] market for the product or service referred to in section 1402." [words in brackets deleted].

7 See the testimony of the not-for-profit sector cited in note 8 below, and of the commercial opponents to the legislation in note 10 below. In addition, over 130 organizations and companies signed a position statement critical of H.R. 354 that was placed in the public record by Dr. James Neal, director of the Milton S. Eisenhower Library at Johns Hopkins University and president of the Association of Research Libraries, during the March 18, 1999, Hearing on H.R. 354, the "Collections of Information Antipiracy Act," held by the Subcommittee on Courts and Intellectual Property of the Committee on the Judiciary of the U.S. House of Representatives [hereinafter March 18, 1999, Hearing]. A copy of the position statement and the full list of signatories may be found online at <>.

8 See testimony by Wm. A. Wulf, president of the National Academy of Engineering on behalf of the National Academies, J.H. Reichman, professor at the Vanderbilt University School of Law, and James G. Neal, director of the Milton S. Eisenhower Library at Johns Hopkins University and president of the Association of Research Libraries, at the October 23, 1997, Hearing on H.R. 2652, the "Collections of Information Antipiracy Act," held by the Subcommittee on Courts and Intellectual Property of the Committee on the Judiciary of the U.S. House of Representatives [hereinafter October 23, 1997, Hearing].

9 Id. See testimony by Paul Warren of Warren Publishing, Inc., on behalf of the Coalition Against Database Piracy. See also testimony by Robert E. Aber, senior vice president and general counsel, the NASDAQ Stock Market, Inc., on behalf of the Information Industry Association, at the February 12, 1998, Hearing on H.R. 2652, the "Collections of Information Antipiracy Act," held by the Subcommittee on Courts and Intellectual Property of the Committee on the Judiciary of the U.S. House of Representatives [hereinafter February 12, 1998, Hearing].

10 Id. See testimony by Jonathan Band, partner, Morrison & Foerster LLP, on behalf of the On-Line Banking Association, and by Tim Casey of MCI, Inc. on behalf of the Information Technology Association of America at the February 12, 1998, Hearing.

11 See letter from Andrew J. Pincus, general counsel of the Department of Commerce, to The Honorable Orrin G. Hatch, chairman, Senate Committee on the Judiciary, August 4, 1998, summarizing "a number of concerns" of the Administration with H.R. 2652.

12 See memorandum for William P. Marshall, associate White House counsel, from William Michael Treanor, deputy assistant attorney general, Office of Legal Counsel, Department of Justice, July 28, 1998, regarding "Constitutional Concerns Raised by the Collections of Information Antipiracy Act, H.R. 2652."

13 See letter from Robert Pitofsky, chairman, Federal Trade Commission, to The Honorable Tom Bliley, chairman, Committee on Commerce, U.S. House of Representatives, September 28, 1998, regarding potential anti-competitive effects of Title V of H.R. 2281.

14 These negotiations were conducted in closed sessions with representatives of the principal organizations that had previously taken a public position on the House bills. The Intellectual Property Counsel to Senator Hatch, Edward Damich, moderated the negotiation process.

15 For a detailed discussion of the Senate negotiations and the legislative process associated with the database protection legislation in the U.S. Congress through early April 1999, see generally J.H. Reichman and Paul F. Uhlir (1999), "Database Protection at the Crossroads: Recent Developments and Their Impact on Science and Technology," Berkeley Technology Law Journal, Vol. 14, pp. 793-834.

16 H.R. 354, the "Collections of Information Antipiracy Act," 106th Congress (1999).

17 The two changes made in H.R. 354 by the House Subcommittee on Courts and Intellectual Property included an attempt to eliminate the potential for indefinitely prolonging the 15-year duration of protection in section 1408 (c), and expanding the scope of the exemption for certain not-for-profit educational, scientific, and research uses in section 1403 (a), both of which are discussed in more detail later in this chapter.

18 See the "Database Fair Competition and Research Promotion Act of 1999," Cong. Rec., Vol. 106, S. 316 (Jan. 19, 1999).

19 Id., "Chapter 14--Protection of Databases," S. 322-326.

20 At the time of this writing, the House Committee on Commerce has introduced and marked up a slightly modified version of the Coalition Proposal. See H.R. 1858, The Consumer and Investor Access to Information Act of 1999, 106th Congress, May 20, 1999. The House Committee on the Judiciary also has marked up H.R. 354, which includes a number of significant revisions. Because the study committee had already written its report, it was not able to consider these additional changes to the proposed legislation. Nevertheless, the committee believes that its analysis and recommendations remain relevant to the ongoing debate concerning this legislation, as well as to any eventual implementation of a statutory database protection regime. Any bill that is finally adopted, if any, most likely will be substantially further modified. For this reason, the committee presents its legislative recommendations as guiding principles, rather than as specific legislative language.

21 H.R. 354, section 1402.

22 Section 1301(3).

23 Section 1401. The House Committee on Commerce bill, H.R. 1858, has extended that prohibition to include a "discrete section" of a database.

24 See Robert W. Kastenmeier and Michael J. Remington (1985), "The Semiconductor Chip Protection Act of 1984: A Swamp or Firm Ground?," Minn. L. Rev., Vol. 70, p. 417, establishing a stringent four-part test for assessing the merits of any proposed intellectual property protection for new technologies.

25 Section 1302.

26 Proposed Conference Report Language, Section 1302, at 33.

27 Section 1405 (4).

28 Indeed, many commercial entities have expressed concerns about such effects of strong database protection. See, for example, the testimony and position statement cited in note 10.

29 As noted by Nobel laureate Joshua Lederberg in his testimony on behalf of the National Academies and the American Association for the Advancement of Science at the March 18, 1999, Hearing, note 7, the "recent advent of digital technologies for collecting, processing, storing, and transmitting data has led to an exponential increase in the number of databases created and used. A hallmark trait of modern research is to obtain and use dozens, or even hundreds of databases, extracting and merging portions of each to create new databases and new sources of knowledge and innovation."

30 See Reichman and Uhlir (1999), note 15, p. 820.

31 See Laura D'Andrea Tyson and Edward F. Sherry (1997), Statutory Protection for Databases: Economic & Public Policy Issues, research paper prepared under contract to Reed-Elsevier, Inc. and The Thomson Corporation, and presented as testimony on behalf of the Information Industry Association at the October 23, 1997, Hearing, note 8. Tyson and Sherry, however, generally argue that there are not many instances in the commercial database industry in which sole sources dominate the market and can prevent or inhibit entry. Although the committee did not analyze the entire database market in this study, it did find that in many S&T areas, including practically all observational databases, the data sources are unique.

32 Federal Trade Commission letter, note 13, p. 2.

33 Examples of this problem are already abundant in the restrictions on experimental research uses of patentable or otherwise protected innovations in the biotechnology sector. See M.A. Heller and R.S. Eisenberg (1998), "Can Patents Deter Innovation? The Anticommons in Biomedical Research," Science, Vol. 280, p. 698. Such a problem would be more insidious in the case of noncopyrightable factual databases, which are subpatentable innovations that do not merit strong property rights and that have been used much more widely and openly in research to date.

34 Section 1408(c).

35 Section 1310(c).

36 Article 10.

37 See Andrew Pincus testimony on behalf of the Administration from March 18, 1999, Hearing, note 7, which states in part: "The Administration opposes such 'reciprocity' requirements, both domestically and internationally. We believe that commercial laws (including intellectual property and unfair business laws) should be administered on national treatment terms, that is, a country's domestic laws should treat a foreign national like one of the country's citizens. This principle is embodied in Article 3 of the Agreement on Trade-Related Aspects of Intellectual Property Rights (TRIPS Agreement) as well as more generally in the Paris Convention for the Protection of Industrial Property and the Berne Convention for the Protection of Literary and Artistic Works. The Administration believes that Congress should craft U.S. database protection to meet the needs of the American economy. . . .", p. 32.

38 Failure to limit the term of protection, coupled with a strong proprietary right such as the one proposed in H.R. 354, has led some legal commentators to question the constitutionality of such a provision under U.S. law. See generally William Michael Treanor (1998), note 12, and Marci Hamilton, Cardozo Law School, letter to Howard Coble, chairman of the Subcommittee on Courts and Intellectual Property, House Committee on the Judiciary, February 10, 1998, 5 p.

39 See, for example, J.H. Reichman and Pamela Samuelson (1997), "Intellectual Property Rights in Data?" Vanderbilt Law Review, Vol. 50, pp. 137-152.

40 See 17 U.S.C., section 302.

41 See generally, Martha E. Williams (1984-1999), Information Market Indicators: Information Center/Library Market--Reports 1-60, Information Market Indicators, Inc., Monticello, IL.

42 Section 1408(c).

43 Section 1310. However, the Hatch Discussion Draft did include a provision for voluntary deposit of databases with the Copyright Office in Section 1311.

44 For a discussion of how such legislation might address this problem, see testimony of Andrew J. Pincus, March 18, 1999, note 7.

45 Id.

46 Section 1405(4).

47 Section 1402(e).

48 See documents referenced in note 38.

49 See National Research Council (1997), Bits of Power: Issues in Global Access to Scientific Data, National Academy Press, Washington, D.C., p. 114.

50 Section 1403(a)(1) in H.R. 354, and section 1303(c) in the Hatch Discussion Draft.

51 See 17 U.S.C., section 107, and the discussion of fair use in Chapter 3.

52 Section 1403(a)(2).

53 Section 1403(a)(2)(A).

54 Section 1304(a).

55 Section 1304(b).

56 Section 1402(e).

57 Section 1405(4).

58 Section 1402(e).

59 Section 1304(b)(2).

60 See Reichman and Uhlir (1999), note 15, p. 815.

61 Section 1403(b) and section 1303(a), respectively.

62 Section 1403(a)(2).

63 Section 1402(e).

64 H.R. 354, section 1402; Hatch Discussion Draft, section 1302.

65 Section 1402(d).

66 See 17 U.S.C., section 102(b).

67 Section 1405(e) in H.R. 354, and section 1306(e) in the Hatch Discussion Draft.

68 Section 1306 of the Proposed Conference Report language, pp. 36-37.

69 See section 4(a), Study Regarding the Effect of the Act.

70 Section 1407(b).

71 For a discussion of proposed "public-interest unconscionability" clauses in licensing agreements for copyrighted works and noncopyrightable databases, see J.H. Reichman and Jonathan A. Franklin (1999), "Privately Legislated Intellectual Property Rights: Reconciling Freedom of Contract with Public Good Uses of Information," U. Penn. L. Rev., Vol. 147, No. 4, pp. 929-951.

72 For a discussion of legal and policy issues at the interface of contract law and copyright law, see David Nimmer, Elliot Brown, and Gary N. Frischling (1999), "The Metamorphosis of Contract into Expand," Cal. L. Rev., Vol. 87, p. 19, and Charles R. McManis (1999), "The Privatization (or 'Shrink-Wrapping') of American Copyright Law," Cal. L. Rev., Vol. 87, p. 1763.

73 By "non-bargained term" the committee means any term, usually contained in a standard form contract, over which, as a practical matter, no actual bargaining by the parties to the contract takes place.

74 See Tyson and Sherry (1997), note 31.

75 Article 16(3). Articles 8 and 9, which are referred to in Article 16(3), concern "Rights and obligations of lawful users" and "Exceptions to the sui generis right," respectively. It should be noted that until a very last-minute decision by the E.C. Council of Ministers, the proposed Database Directive contained a mandatory compulsory license for sole-source providers. See Reichman and Samuelson (1997), note 39, p. 87.

76 Section 4(a).

77 Id., (b).

78 Section 1404(a)(1).

79 Section 1405(b).

80 Section 1404(a)(1).

81 Section 1406(e).

82 Section 1407(a)(2).

83 Section 1407(a)(1).

84 Section 1403(a).

85 Section 1405(5).

86 Peter N. Weiss and Peter Backlund (1997), "International Information Policy in Conflict: Open and Unrestricted Access versus Government Commercialization," in Borders in Cyberspace: Information Policy and the Global Information Infrastructure, Brian Kahin and Charles Nesson, eds., MIT Press, Cambridge, MA.

87 DGXIII (1998), Public Sector Information: A Key Resource for Europe, "Green Paper on Public Sector Information in the Information Society," European Commission, Luxembourg. Available online at <>.

88 Section 1301(6).

89 Section 1305(b).

90 See National Research Council (1997), Bits of Power, note 49, pp. 116-124 for a discussion of circumstances in which privatization of the government's data dissemination function is appropriate and inappropriate.

91 See the Landsat privatization example discussed in National Research Council (1997), Bits of Power, note 49, pp. 121-123.

92 Section 1307.

93 See generally National Research Council (1995), Preserving Scientific Data on Our Physical Universe: A New Strategy for Archiving Our Nation's Scientific Resources, National Academy Press, Washington, D.C.

94 17 U.S.C., section 109.

95 Personal communication from Ann Okerson, Yale University, September 1999. An example of this type of licensing language is contained in Academic Press's IDEAL license (for 200+ journals):

At the same time, the traditional user rights under the first sale doctrine are in danger of being significantly eroded by the Uniform Computer Information Transactions Act, which is currently being considered for enactment at the state level. See generally, "A Guide to the Proposed Uniform Computer Information Transactions Act" at <>.

96 See Association of Research Libraries (1999), ARL Statistics: 1997-98, Martha Kyrillidou et al., eds., Association of Research Libraries, Washington, D.C., also available online at <>, showing trends in the average rise in costs of serial (journal) subscriptions between 1986 and 1998, pp. 8-9. For a retrospective look at these issues see <>.

97 Office of Management and Budget (1997), Circular A-110, "Uniform Administrative Requirements for Grants and Agreements with Institutions of Higher Education, Hospitals, and Other Not-for-profit Organizations," revised November 19, 1993; as further amended August 29, 1997.

98 For a history of academic fair use and classroom guidelines, see Kenneth D. Crews (1993), Copyright, Fair Use and the Challenge of Universities, University of Chicago Press, Chicago, IL.

99 Some examples of classroom guidelines may be found online at <>.

100 See <>, and type "licensing" in the "Search" box.

101 For example, for a comprehensive listing of most internationally available data sets from space missions, see the NASA Goddard Space Flight Center's National Space Science Data Center home page online at <>. For a listing of many international Web sites covering all aspects of Earth science data, see the NASA Global Change Master Directory online at <>.

102 For a discussion of some of the large international research programs, see National Research Council (1997), Bits of Power, note 49, pp. 58-61.

103 See Pincus statement, March 18, 1999, Hearing, note 7, p. 33.

104 See the testimony given by Wulf, Reichman, and Neal at the October 23, 1997, Hearing, note 8; by Stewart at the February 12, 1998, Hearing, note 9; and by Lederberg, Phelps, and Neal at the March 18, 1999, Hearing, note 29. See also the position statement signed by representatives of many of these organizations, note 7.
