16 April 2005
March 2, 2005
Background
The Advanced Research and Development Activity (ARDA) is a U.S. intelligence community (IC) center for conducting advanced research and development related to information technology (IT). ARDA sponsors high risk, high payoff research designed to produce new technology to address some of the most important and challenging IT problems faced by the intelligence community. The research is currently organized into five technology thrusts: Information Exploitation, Quantum Information Science, Global Infosystems Access, Novel Intelligence from Massive Data, and Advanced Information Assurance. More information is available at http://www.ic-arda.org/ .
The IC uses a specialized information infrastructure and a unique security environment that must be able to acquire, retain, and provide access to highly sensitive information for many years. In this environment, relying solely on the commercial sector to satisfy IC information assurance requirements is unacceptable. Relying on commercial off-the-shelf (COTS) products for certain security-critical components within the IC information infrastructure incurs even greater risk when these components are developed outside the purview of the IC or IC-sponsored organizations. The Advanced Information Assurance (IA) research thrust within ARDA's overall R&D program is tasked with providing tailored security solutions for the IC to fill any perceived security gaps in the IC's information infrastructure. Its program is currently focused on the following areas: (1) countering the insider threat; (2) cyber intelligence; (3) high assurance for IC information infrastructure; (4) new defensive concepts; and (5) quantum cryptography.
As part of its overall IC security research program, ARDA's Information Assurance research thrust is initiating research in traceback within information networks used by the intelligence community, such as NIPRNET, SIPRNET, JWICS, and IC enclaves.
Program Overview
We seek to develop tools and techniques for the traceback of attacks carried out over information networks to their originating source. Attacks on information assets of the Intelligence Community (IC) can occur over restricted or open networks using the Internet Protocol (IP). These attacks begin with an originating host and may pass information through various stepping stone hosts to reach a controller node, which in turn might control a number of zombie hosts that have malicious software implanted within them (most often without their owners' knowledge or consent). Upon a signal, these zombies may attack one or more target machines, either to perform a denial of service attack or to modify or exfiltrate information from them. In between participating hosts, the packet stream may transit many network devices such as routers, switches or network address translation (NAT) devices. (The distinguishing feature of a network device is that it does not initiate or terminate packet streams, but merely directs or modifies them. When an attack packet stream transits one of these devices, the device becomes a "Router on Path" or a "NAT on Path". For the purposes of this document, routers and switches will be referred to as routers.)
Exfiltrated information may go to a receiver host that, in turn, is separated from the originating host by a series of stepping stone machines. The originating host, stepping stones, controller, zombies, and receiver hosts might be considered part of an uncooperative environment. Some of the zombies and the target machine(s) would be sited within a cooperative environment, where "cooperation" means that the IC may influence the content, configuration and operation of the hosts. Others may reside within uncooperative or hostile sites. Figure 1 provides a schematic of the attack structure described above.
Figure 1 - Process diagram of attack scenario
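The attack structure described above can be pictured as a directed graph of hosts. The following is a minimal sketch, assuming (unrealistically) that every hop is already known; the host names and the `trace_back` helper are hypothetical and serve only to illustrate the roles of originating host, stepping stones, controller, zombies, and target.

```python
from collections import deque

# Hypothetical topology; each edge points in the direction attack traffic flows.
ATTACK_EDGES = {
    "origin": ["stone1"],
    "stone1": ["stone2"],
    "stone2": ["controller"],
    "controller": ["zombie1", "zombie2"],
    "zombie1": ["target"],
    "zombie2": ["target"],
}


def reverse_edges(edges):
    """Map each host to the hosts that send traffic to it."""
    upstream = {}
    for src, dsts in edges.items():
        for dst in dsts:
            upstream.setdefault(dst, []).append(src)
    return upstream


def trace_back(edges, victim):
    """Walk backward from the victim and return hosts with no upstream sender."""
    upstream = reverse_edges(edges)
    seen = {victim}
    queue = deque([victim])
    sources = []
    while queue:
        host = queue.popleft()
        parents = upstream.get(host, [])
        if not parents and host != victim:
            sources.append(host)  # nothing further upstream: candidate origin
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sources


print(trace_back(ATTACK_EDGES, "target"))
```

In practice the edges are exactly what is unknown, so each backward step in this walk stands for a collection and analysis problem of the kind this announcement solicits.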
In the event of such an attack, it is vital that the IC be able to perform attack attribution - to identify the individual or group responsible for a computer network attack. Generally, attack attribution is accomplished by the integration of information from many sources.
We are concerned with the means that are available to develop information relating to a specific attack using solely technical means, which we call "traceback" to distinguish it from other methods. The best information that can be developed through traceback is the identity of the computer host originating the attack. Many times, only much less specific information will result. Nonetheless, the information generated during traceback may be combined with other information - either information gleaned by traceback or from other attacks, or information from other sources altogether - to yield attack attribution.
Ideally, traceback reveals a specific network location, resulting in the identity of the attacking host. If the actual attacking host cannot be found, the location may be bounded to increasingly less exclusive network locations, such as (in decreasing order of specificity):
Originating LAN
Attacker's Access Provider
Originating Autonomous System [1] containing the attacking host
___________________
[1] We define an autonomous system (AS) as a set of hosts that are reachable only via a set of core routers under a single management.
The attacker's Access Provider gives the attacking host access to the worldwide network. Whether the attacking host is a single host in a private home, a host on a small network such as an Internet café, or a host in a large company intranet, the Access Provider provides the aggregation and Point of Presence (POP) equipment that leads to the network core and worldwide access.
Traceback may proceed from participating host to participating host, or from AS to AS, or from router to router, etc. Note that techniques may be needed to distinguish a participating host, LAN, or AS (that is, one through which attack packets pass) from the originating host, LAN, or AS. Further, the attack packet stream may transit an AS that contains no participating host. In this case techniques may be needed to distinguish this AS from one that contains a participating host.
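The ordering of localization levels described above (attacking host, then originating LAN, Access Provider, and AS) can be expressed as a simple ranked type. This is an illustrative sketch only; the `Locality` enum, the `best_bound` helper, and the labels are hypothetical names that do not come from this announcement.

```python
from enum import IntEnum


class Locality(IntEnum):
    """Lower value means a more specific network location."""
    HOST = 0
    LAN = 1
    ACCESS_PROVIDER = 2
    AUTONOMOUS_SYSTEM = 3


def best_bound(findings):
    """Return the most specific (locality, label) pair, or None if nothing is known."""
    if not findings:
        return None
    level = min(findings)
    return level, findings[level]


# Hypothetical partial result: the host and LAN could not be determined.
findings = {
    Locality.AUTONOMOUS_SYSTEM: "AS-example",
    Locality.ACCESS_PROVIDER: "example-isp",
}
print(best_bound(findings))
```

A tool that reports its result this way makes explicit how tightly it was able to bound the source when the attacking host itself could not be identified.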
Research Solicited
We are soliciting research that will significantly improve the science and practice of network traceback, and are seeking tools and techniques to increase our knowledge of the true source of an attack. We are seeking solutions for both IPv4 and IPv6 networks. We are particularly interested in tracing attacks involving confidentiality and integrity of information on IC networks. Therefore, techniques designed for tracing anonymous packet flooding attacks that cause denial of service (DoS and DDoS) in IC networks back to their source are not of interest.
Classified white papers up through TS/SCI may be submitted if that is deemed necessary to describe the proposed research. Contact Robert Husnay, husnayr@rl.af.mil, 315-330-4821 for classified submission procedures. We anticipate that the research outlined in the white papers can be adequately described in an unclassified manner, even for those efforts that may result in classified proposals. Organizations without facility clearances are encouraged to team with cleared organizations in order to ease the transition of their research into an operational setting.
Research projects are expected to result in useful software tools and/or experimental development models (i.e. brassboards) that form a functional capability addressing some important aspect of the traceback problem. They should discuss means for collection of information, analysis, and testing and evaluation of the resulting system within a (simulated or actual) network testbed. Projects specializing in highly novel and interesting techniques for just collection or just analysis will be considered, however, if deemed to be of "breakthrough" quality and importance. White papers should discuss why the proposed traceback solution is different from previous traceback solutions that can be found in the literature. They should also discuss the assessment criteria to be used for the proposed research. What are the measure(s) of success? How are they to be measured? What auxiliary resources are necessary for this measurement to take place? How would the evaluation take place within a network testbed?
The tools developed should be useful in a standalone manner, and not depend on other proprietary tools or frameworks. They should be able to operate across combinations of cooperative, non-cooperative, and hostile networks.
Traceback
We are focused on tools and techniques for tracing attacks that involve single
packets, encrypted payloads, "stepping stones" (compromised hosts), and similar
attack attributes. We are seeking traceback solutions that perform in one
or more of the following network environments: cooperative, non-cooperative,
and hostile. Solutions for the non-cooperative and hostile network scenarios
are of particular interest. We seek to develop a suite of IP traceback
techniques that require:
White papers should identify what concealment methods their proposed tool
can use to mask its operation, and what concealment methods used by an adversary
it can overcome. An adversary's obfuscation techniques can include:
We are specifically interested in traceback techniques that can operate under
one or more of the following conditions:
We are seeking techniques that can trace the origin of a single IP packet delivered by a TCP/IP network in the recent past. Techniques for tracking individual packets in a network must be efficient and scalable.
Wireless ad hoc networks are self-organizing systems formed by cooperating nodes. The topology of these networks is dynamic, decentralized and ever-changing, because nodes can move arbitrarily. We are seeking IP traceback technologies for wireless ad hoc networks. Proposed solutions may involve the application of existing IP traceback techniques or novel approaches that perform more efficiently than existing techniques. Traceback involving wireless networks will often involve dealing with encrypted communications.
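One approach discussed in the open literature for single-packet traceback is for routers to log compact digests of the packets they forward, so that a packet observed at the victim can later be matched against those logs. The sketch below illustrates that idea only; it is not a method specified by this announcement. The `Router` class, the choice of hashed fields, and the 8-byte digest length are assumptions made for illustration, and the addresses come from the documentation ranges.

```python
import hashlib


def packet_digest(src, dst, ident, payload):
    """Hash fields assumed not to change in transit (a hypothetical choice)."""
    material = f"{src}|{dst}|{ident}|".encode() + payload[:8]
    return hashlib.sha256(material).digest()[:8]


class Router:
    """Toy router that remembers a digest of every packet it forwards."""

    def __init__(self, name):
        self.name = name
        # A plain set keeps the sketch simple; a deployed system would more
        # likely use a space-efficient structure such as a Bloom filter.
        self.seen = set()

    def forward(self, src, dst, ident, payload):
        self.seen.add(packet_digest(src, dst, ident, payload))

    def saw(self, digest):
        return digest in self.seen


def routers_on_path(routers, src, dst, ident, payload):
    """Return the names of routers whose logs contain the packet's digest."""
    digest = packet_digest(src, dst, ident, payload)
    return [r.name for r in routers if r.saw(digest)]


r1, r2, r3 = Router("r1"), Router("r2"), Router("r3")
attack = ("192.0.2.1", "198.51.100.7", 4242, b"exfil-request")
for router in (r1, r3):          # the attack packet transits r1 and r3 only
    router.forward(*attack)
r2.forward("192.0.2.9", "198.51.100.7", 17, b"benign")

print(routers_on_path([r1, r2, r3], *attack))
```

This sketch assumes that every router cooperates and that the hashed fields survive the path unchanged. A NAT on Path rewrites addresses and would break the match, which is one reason the non-cooperative and hostile cases are harder.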
Collection
Information is the essential ingredient for solving the traceback problem. In almost all cases, the relevant information does not reside on or within the victim's host or system. The pertinent data may lie within the victim's domain, outside the victim's domain but still within cooperating parts of the infrastructure, or far removed from the victim in uncooperative and possibly hostile parts of the infrastructure. The information needs extend across a wide range of possible data, including survey and network mapping data that might be gathered by an ongoing background collection function, as well as immediate, real-time data on an ongoing attack or trace.
The objective of research on collection is the development of access methods to enable the collection of pertinent data. The techniques may be either active or passive depending on particular circumstances. The techniques may include the active generation of data by the sensor as well as passive collection.
Research in the collection area would include identification of the high-value
information with respect to the traceback problem, approaches to identifying
the location of traceback-pertinent data, and approaches to developing access
to that information. The research efforts should result in new, innovative
and high-value identification, access and collection techniques. Examples
of capabilities developed in the collection area could include:
Research on collection should consider means and technologies to be used in host devices in order to watermark or otherwise tag network traffic (e.g., time perturbation, resetting protocol parameters). However, we are particularly interested in techniques that do not require alteration of network packets or violate IP protocols, because such alteration may reduce the stealthiness of traceback methods. Tools and techniques should be developed to gather information from various available network elements (e.g., routers, switches, gateways, NAT front-ends, firewalls, VPNs, DSL/Cable modems).
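As an illustration of a passive technique that does not alter packets, the sketch below compares the inter-packet timing of two observed flows to judge whether one might be a relayed copy of the other, as could happen at a stepping stone. It is a toy example under strong assumptions (no packet loss, no repacketization, little jitter); the function names and the timestamps are hypothetical.

```python
from math import sqrt


def gaps(timestamps):
    """Inter-packet delays for a list of arrival times in seconds."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def timing_similarity(flow_a, flow_b):
    """Correlate the inter-packet delays of two flows over their common length."""
    gaps_a, gaps_b = gaps(flow_a), gaps(flow_b)
    n = min(len(gaps_a), len(gaps_b))
    if n < 2:
        return 0.0
    return pearson(gaps_a[:n], gaps_b[:n])


inbound = [0.00, 0.10, 0.35, 0.40, 0.90, 1.00]
relayed = [t + 0.05 for t in inbound]            # same flow, fixed relay delay
unrelated = [0.00, 0.30, 0.40, 0.80, 0.85, 1.00]

print(timing_similarity(inbound, relayed))
print(timing_similarity(inbound, unrelated))
```

The relayed flow correlates almost perfectly with the inbound flow, while the unrelated flow does not. An adversary who adds random delay or chaff packets would weaken this signal, so a result of this kind is a hint for the analyst, not proof.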
Analysis
The information currently available for traceback generally does not provide a direct approach to determining the source of an attack. At best, a careful and informed analysis of the data may provide hints and clues regarding the attacker, but not much more. The analysis tends to be piecemeal and disconnected, held together by the participation of, and intimate involvement of, the analyst. The process is manpower intensive. The scope, scale, duration, detail and stages of traceback are extending beyond the capacity of one or a few analysts to "hold it all together." A further exacerbation of the analysis problem is the rapid expansion of both the complexity of the IC networks and the intense detail knowledge that an analyst requires. Consequently, new and innovative approaches to traceback analysis must be developed.
We are seeking capabilities, among others, that support an "electronic playbook" for traceback - an implementation that provides a capability to capture expert knowledge that is relevant to a variety of traceback scenarios to include the selection of tools based on certain pre-conditions. This expert knowledge may then be used by other analysts, or recalled at a later time by an individual analyst.
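The "electronic playbook" idea of selecting tools based on pre-conditions can be sketched as a small rule store. The example below is only one possible shape for such a capability; the `Play` and `Playbook` classes, the condition labels, and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Play:
    """One piece of recorded expert knowledge: use `tool` when preconditions hold."""
    name: str
    tool: str
    preconditions: frozenset


@dataclass
class Playbook:
    plays: list = field(default_factory=list)

    def record(self, name, tool, preconditions):
        """Capture an analyst's recommendation for later reuse."""
        self.plays.append(Play(name, tool, frozenset(preconditions)))

    def applicable(self, observed):
        """Return the plays whose preconditions are all among the observed facts."""
        observed = set(observed)
        return [p for p in self.plays if p.preconditions <= observed]


book = Playbook()
book.record("relay check", "flow-timing-correlator",
            {"encrypted_payload", "stepping_stone_suspected"})
book.record("digest lookup", "router-digest-query",
            {"single_packet", "cooperative_network"})

scenario = {"single_packet", "cooperative_network", "encrypted_payload"}
for play in book.applicable(scenario):
    print(play.name, "->", play.tool)
```

Recording preconditions as explicit sets lets a second analyst query the playbook with only the facts known so far, which fits the partial-knowledge setting described in this section.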
The analysis portion of a research project will of course be expected to incorporate information items and flows developed by the collection aspects of the project.
The current analytical approaches span a wide range of both form and content variation in the available data. These approaches also have considerable latency: they are slow to assimilate data and slow to develop trace hypotheses. New and innovative solutions need to be developed to help the analyst store, manipulate, mine and extend the very large knowledge bases supporting traceback, and then use them to effect traceback measures and analyze the results. Algorithms, heuristics and data mining capabilities are needed that can quickly perform data gap analysis, hypothesis generation, inferencing, and pruning - particularly based on partial knowledge as well as correlation of attack information from widely-spaced inputs and historical events.
Testing
Projects should result in a capability that can be initially evaluated within a realistic (but possibly simulated) networked environment within fifteen months of project start. More in-depth test and evaluation of prototype systems would be conducted within the optional phase two of projects, for those projects that are funded to continue into that phase. Sufficient test and evaluation should be performed during the project's first phase to assist ARDA in evaluating progress that would merit continued (phase two) funding. Tools and techniques should be ready for operational evaluation at the conclusion of phase two of the project.
We intend to utilize contractor support to assist us in the evaluation of the traceback tools being developed. Principal Investigators (PIs) will be required to interact with this 'evaluation contractor' with regard to evaluation plans and procedures. The evaluation contractor may also witness some of the PI's traceback testing.
Traceback Testbed
We believe that a 'one size fits all' government testbed is not feasible for the testing of different traceback technologies. Therefore ARDA will not provide traceback testbed network facilities.
We mention the following testbeds, which may be relevant to individual projects, to aid PIs in their testing and evaluation during their projects. We encourage this type of testing, but recognize that it may not be possible during Phase I of the project.
One such testbed is DETER/EMIST, operated by the USC Information Sciences Institute, and funded by NSF and the Department of Homeland Security's HSARPA. DETER is a homogeneous emulation cluster based upon the University of Utah's Emulab software. It implements network services such as DNS and BGP, with added features for containment, security, and usability. The network is distributed, and can be accessed via tunneling from remote sites. The companion EMIST program provides attack scenarios, attack simulators, generators for topology and background traffic, and tools to monitor and summarize experimental results. It promotes scientifically rigorous testing frameworks and methodologies. Further information is available at http://www.isi.edu/deter/ .
A second potential testbed is the R&D Experimental Collaboration (RDEC) program, operated for the intelligence community by the Advanced Systems & Concepts (ASC) division of SAIC. PlanetLab (www.planet-lab.org) and the Information Operations (IO) Test Range currently under development are also mentioned for your consideration. In some cases, realistic simulation of operational environments may also be considered as a viable testbed.
http://www1.eps.gov/spg/USAF/AFMC/AFRLRRS/Reference%2DNumber%2DBAA%2D05%2D04%2DIFKA/SynopsisP.html
Document Type: Presolicitation Notice
Solicitation Number: Reference-Number-BAA-05-04-IFKA
Posted Date: Mar 02, 2005
Original Response Date:
Current Response Date:
Original Archive Date:
Current Archive Date:
Classification Code: A -- Research & Development
Naics Code: 541710 -- Research and Development in the Physical, Engineering, and Life Sciences