National Laboratory for Applied Network Research
Measurement and Operations Analysis Team
Second Year Program Plan, April 1999 to March 2000
Network measurements and analysis require adaptability to an ever changing networking landscape. Accordingly, the initial MOAT assumptions were continually adjusted during the first year to reflect the evolving requirements of NSF's HPC environment. As we look towards the second year, we expect that fulfilling the goals of the cooperative agreement will require further adjustments, and this second year program plan can only reflect our current thinking.
As an example of a shift in the first year, discussions with the parties involved in HPC made it highly desirable to deploy a large number of Active Monitors to allow for HPC-site to HPC-site network performance assessments, and to use those to augment MCI's intra-vBNS measurements. Ideally, these two activities will present enough information to the community to reduce the need for further substantive measurement activities by individual institutions. A project like NLANR has to be flexible enough to constantly revisit goals and objectives, within the established scope and context of the Cooperative Agreement and in close consultation with the NSF, and to adjust the implementation to meet evolving needs. This is particularly critical in the face of changes to the HPC environment, such as multiple networks, increased complexity, more users and applications, additional institutions, and so on. The NLANR workscope has been, and will continue to be, to increase the utility of networking services to and among sites.
This Second Year Program Plan covers the April 1, 1999 to March 31, 2000 time frame.
The primary thrust areas of passive and active measurements and their network analysis will continue in the second year. This encompasses both MOAT-internal work and collaborations with outside parties. These collaborations were seeded and started in the first year, with multiple universities either working with NLANR or having expressed interest in collaborations.
Secondary areas include routing analysis based on BGP information, both at the vBNS level and at an Internet systemic level, and the use of MCI's SNMP data to augment our observations.
A further area of attention is tool development. A significant amount of work has already been invested in Cichlid, an OpenGL-based 3D server/client data visualization tool. This work is now being made available to the community, and we expect continued development and improvement.
During the next few months we expect to augment the current base of OC3 monitors with OC12-capable machines. The development of the OC12 data collection software, done at CAIDA, has exposed significant difficulties at the interface between the Applied Telecom OC12 cards and the host software. While new and updated versions of both hardware and software promise improvement, we have also begun a collaboration, initially as a contingency, with the University of Waikato in New Zealand on their OC12 monitoring cards. Given the difficulties with the existing cards, and the interest of the Waikato group in not just being vendors but really working with us on monitoring technologies, our current assumption is that their cards will become strategic once they are proven to work. Those cards are also expected to be capable of Packet Over Sonet (POS) as well as ATM. As needed, we will discuss implementation of additional technologies.
We currently have two pairs of Applied Telecom cards available and, once the software is sufficiently stable, expect to deploy one machine with a pair of those cards at each of NCSA and SDSC. We are expecting a pair of OC12 cards from the University of Waikato very soon, and will deploy them at either the University of Washington in Seattle or UC Berkeley. Those OC12 cards will be able to run at either OC12 or OC3 speed, and will also be capable of ATM and POS.
We also expect two pairs of cards from the University of Waikato that will allow ATM-based DS3 monitoring. No sites have been identified for the two deployable units yet.
Given the need to extend the scope of the measurements to more parts of the HPC community, an opportunity has arisen, based on discussions instigated by Internet2 representatives, to augment current efforts with measurements from probes on the new Abilene network. We would like to build and deploy at least 20 more passive monitors in the context of the Abilene network, consisting of a mixture of OC3 and OC12, as well as ATM and IP over Sonet. Initially, the most interesting deployment sites will be Abilene's connections to gigaPOPs.
Traces will continue to be made available to the research community, both for independent research by other institutions and for collaborative activities with MOAT.
The AMP project will expand as more and more measurement machines are being deployed. If possible, we would like to reach all NSF supported HPC sites, and create a comprehensive active measurement substrate to gain insight into performance behaviors and metrics of advanced environments. This will support the advanced networking agenda of the NSF, as well as HPC campus communities and service providers.
As more and more sites come online, the measurement mesh, a virtual overlay across the HPC environment, will become more and more complex. It will become important to create a more structured environment, most likely measurement hierarchies that interconnect local and regional environments. Some of the measurement machines will be used to prototype such an environment.
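To illustrate why a flat mesh becomes harder to manage as sites are added, the following minimal sketch compares the number of directed measurement pairs in a full mesh with that of a hypothetical two-level hierarchy. The function names, the near-even split of sites into regions, and the use of one representative machine per region for inter-regional measurements are illustrative assumptions, not part of the AMP design.

```python
def full_mesh_pairs(n_sites: int) -> int:
    """Directed (source, destination) pairs when every site probes every other site."""
    return n_sites * (n_sites - 1)


def hierarchical_pairs(n_sites: int, n_regions: int) -> int:
    """Directed pairs in a hypothetical two-level measurement hierarchy.

    Assumptions (for illustration only): sites are split as evenly as
    possible into regions, each region runs a full mesh internally, and
    one representative per region measures to every other representative.
    """
    if not 1 <= n_regions <= n_sites:
        raise ValueError("n_regions must be between 1 and n_sites")
    base, extra = divmod(n_sites, n_regions)
    # 'extra' regions get one additional site when the split is uneven.
    sizes = [base + 1] * extra + [base] * (n_regions - extra)
    local = sum(size * (size - 1) for size in sizes)
    regional = n_regions * (n_regions - 1)
    return local + regional


if __name__ == "__main__":
    for sites in (20, 50, 100):
        print(sites, full_mesh_pairs(sites), hierarchical_pairs(sites, 10))
```

Under these assumptions, 100 sites need 9,900 directed pairs in a full mesh but 990 when grouped into 10 regions, at the cost of no longer measuring every site-to-site path directly.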
The current AMP implementation performs synchronous measurements based on ICMP and routing probes. In addition, event/user-driven throughput tests are being made available, albeit with access controls to prevent misuse. Additional functionality can be added over time, e.g., some concepts derived from the NIMI project.
Result data is currently accessible via the http://moat.nlanr.net/AMP web page. In addition, initial 3D visualizations using the Cichlid tool have been demonstrated.
The Active Measurement Program is a valuable long-term project, both for the vBNS and for any other large-scale networking environment. By placing measurement units inside the infrastructure, instead of collecting measurements from an external viewpoint, network administrators and researchers can more clearly understand the source of network outages and latency problems.
Effective analysis of historical SNMP data from inside MCI's vBNS domain will contribute to the other active and passive analytical methods deployed inside NLANR's realm. Of specific value towards comparisons will be data derived from the AMP active measurement machines.
This subject is still in early phases of conceptualization, and will need to be revised during the year.
MOAT has two BGP data sources, a daily snapshot from a route server at the University of Oregon, and a continuous BGP session with a vBNS node. Just as with SNMP, the BGP data analysis is still in an early phase, although a student intern has started to work on this topic.
New features will evolve the Cichlid visualization tool into a new version. Those include an expansion of the number of potential real-time data modeling applications by adding support for vertex and edge sets. We will also enhance the user interface for manipulating the graphs with the addition of a second window with intuitive controls.
Vertex and edge sets can be used to model many types of real-world events. We plan to implement a 3-D vertex/edge graph display and server formatting in the coming quarters. This can be used to illustrate properties of both the vertex nodes (for instance, IP addresses) and the edge paths between them (for instance, traffic along a route). A researcher who writes a vertex/edge Cichlid server will specify presentation details in the server code, such as the vertex coordinates inside a sphere or cube, and attribute value ranges that determine object colors and sizes in the graph presentation rendered by the Cichlid client. The server will only send updates to the client about vertices and edges that change, ensuring Cichlid will continue to be the best tool for visualizing real-time data quickly and efficiently on standard workstations.
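The idea of sending only changed vertices and edges can be sketched as a simple state difference. This is not Cichlid's actual protocol or API; the function names, the dictionary-based state, and the attribute keys ("size", "color") are hypothetical and chosen only to show the principle.

```python
from typing import Any, Dict, Hashable, List, Tuple

# Hypothetical state: maps a vertex or edge key to its display attributes.
State = Dict[Hashable, Dict[str, Any]]


def diff_state(previous: State, current: State) -> Tuple[State, List[Hashable]]:
    """Return (changed, removed): entries that are new or modified, and keys that vanished."""
    changed = {key: attrs for key, attrs in current.items()
               if previous.get(key) != attrs}
    removed = [key for key in previous if key not in current]
    return changed, removed


def apply_update(state: State, changed: State, removed: List[Hashable]) -> State:
    """Client side: rebuild the full state from the old state plus one update."""
    new_state = dict(state)
    new_state.update(changed)
    for key in removed:
        new_state.pop(key, None)
    return new_state


if __name__ == "__main__":
    old = {"10.0.0.1": {"size": 1, "color": "green"},
           "10.0.0.2": {"size": 2, "color": "green"}}
    new = {"10.0.0.1": {"size": 1, "color": "green"},
           "10.0.0.2": {"size": 5, "color": "red"},
           "10.0.0.3": {"size": 1, "color": "green"}}
    changed, removed = diff_state(old, new)
    print(changed, removed)  # only 10.0.0.2 and 10.0.0.3 are transmitted
    assert apply_update(old, changed, removed) == new
```

In this sketch the unchanged vertex 10.0.0.1 is never retransmitted, so the update size tracks the rate of change in the data rather than the size of the graph.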
Later in year two, we may implement bar graph stacks, so that different subsets of the data in a single bar can be indicated by color separations within the bar. In an animated bar graph of traffic volume across particular routes, for example, each bar could be stacked with colors to proportionately represent different protocols in use along each route.
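As a rough illustration of the stacking arithmetic, the sketch below divides one bar into colored segments in proportion to per-protocol traffic. The function name, the input format, and the sample protocol counts are assumptions made for this example; they are not taken from Cichlid.

```python
from typing import Dict, List, Tuple


def stack_segments(counts: Dict[str, float],
                   bar_height: float) -> List[Tuple[str, float, float]]:
    """Split one bar into (label, bottom, top) segments proportional to each count.

    'counts' maps a label (e.g. a protocol name) to its traffic volume on
    one route; 'bar_height' is the total height of that route's bar.
    """
    total = sum(counts.values())
    if total <= 0:
        return []  # nothing to draw for an empty or all-zero bar
    segments = []
    bottom = 0.0
    for label, value in counts.items():
        height = bar_height * value / total
        segments.append((label, bottom, bottom + height))
        bottom += height
    return segments


if __name__ == "__main__":
    # Hypothetical per-protocol volumes for a single route.
    for segment in stack_segments({"TCP": 60, "UDP": 30, "ICMP": 10}, 10.0):
        print(segment)
```

Each segment starts where the previous one ends, so the segments together always fill the bar exactly, whatever the protocol mix.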
As the SNMP data analysis matures, we would expect to use Cichlid for visualizing the SNMP data tree and understanding the relationships between the SNMP MIB variables as they change through time. This will enable NLANR to find trends in historical data and near real-time measurements, potentially leading to more predictive theories of network behavior.
Using Cichlid for simultaneous visualization of multiple live data sources, and using ever-smaller server formatting and client rendering hardware, researchers will be able to evaluate details of their changing environments, even with data from remote probes. Cichlid is not only a very useful tool for network researchers in the HPC environment, but also a major step forward in distributed real-time data collection, analysis, and visualization.
As MOAT moves into its second year of high-performance network measurement research, we will continue to seek collaborations with the HPC community. These were seeded and started in the first year, by means of both explicit collaborations and making network data publicly available. We would like to help universities cultivate a new generation of students who know how networks work internally, who more thoroughly understand their behavior, and who can use this knowledge to design and build new high performance networks.
As we have done in the first year, we intend to continue to involve both undergraduate and graduate students in the MOAT workscope.
We would like to hold two workshops with the objective of assessing the requirements of high performance scientific applications across high performance networks. The first should set the stage for a framework for discussions by assessing activities and existing machinery, and how to apply those to the current environments. The second workshop will focus on actually building the framework in the context of upcoming multi-network HPC environments.