National Laboratory for Applied Network Research
Measurement and Operations Analysis Team

First Year Annual Report and Second Year Program Plan

Table of Contents

Summary for 1998

First quarter, 1998
Second quarter, 1998
Third quarter, 1998
Future (fourth quarter) activities, 1999

Second Year Program Plan


Summary for 1998

The primary focus of the MOAT activity has been the initiation, design, and deployment of a Network Analysis Infrastructure (NAI). The focus on analysis emphasizes that measurements should not be viewed as an end in themselves; rather, the derived data should be applied toward answering questions that make a difference in the HPC environment. In that context, the analysis results are to be applied to critical areas such as traffic engineering, usage accounting, and workload profile distributions.

The initial components of the NAI are central equipment to absorb the collected data and provide computing and result presentation engines, as well as passive and active monitors deployed throughout the country. By the end of 1998, the deployed passive monitors were publishing approximately two to three gigabytes (uncompressed) per day of collected packet traces and flow and transaction summaries of the traces. The initial active monitors were also starting to produce data, which will soon be made available to the community. In addition, SNMP and BGP routing data are being collected, although little effort has gone into analysis of those data sets so far.


First quarter, 1998 -- April 1998 to June 1998

The NLANR's Measurement and Operations Analysis Team early on defined four primary areas of interest in the high-performance networking environment: passive monitoring, active measurements, SNMP/MIB data analysis, and Internet routing. The initial focus was on passive monitoring.

Passive monitoring of HPC environments, such as the vBNS, means recording and analyzing packet header traces from the network substrate without injecting additional data. For this effort, MOAT has primarily adopted a FreeBSD version of OC3mon and adapted it to the NLANR environment. These monitors have been deployed at several HPC institutions to serve multiple distributed data analysis tasks.


NLANR/MOAT deployed passive monitoring machines

Privacy of the collected data is an important consideration, and data that is being made available has its IP addresses encoded, as described in http://moat.nlanr.net/Traces/. Information on the monitors is available at http://moat.nlanr.net/Coral.

A continuously evolving summary slide presentation of the NAI is available at http://moat.nlanr.net/Presentations/NAI.

Toward the end of the first quarter, Tony McGregor, as part of his half year sabbatical with NLANR, began the conceptualization and development of the NLANR Active Measurement Program. This report discusses his work and the AMP project in further detail in the report of third quarter activities, and in the program plan.

MOAT developed a prototype VRML visualization tool for the vBNS SNMP data sent to NLANR every day by the vBNS team at MCI. However, the bulk of SNMP data analysis was postponed until the second year, when the data could be more completely integrated into the more pressing research of passive and active monitoring. Similarly, MOAT explored the realm of systemic Internet routing in the context of BGP and autonomous system path length data, and did draw some initial results.

Arguably, the most important achievement of the first quarter was the initial development of a framework system for data collection, analysis, archival, and publication, supporting MOAT's current and planned agenda. MOAT continues to explore ways of making the collected data useful to other researchers, including the Datacube, a web-based interface for accessing the huge matrix of data along three axes: the collection project, the collection date, and the origin of the data.

The full report of the first quarter of year one is on-line at http://moat.nlanr.net/Reports/MOAT1stq/.


Second quarter, 1998 -- July 1998 to September 1998

Early in the second quarter, MOAT implemented the central components of its network data collection and analysis infrastructure, by networking several high-performance Pentium II machines together as a central research server cluster. The nai.nlanr.net machine collects raw data from the distributed Network Analysis Infrastructure machines, performing some analysis and sanitizing the data for privacy in a reasonably secure environment. The moat.nlanr.net machine publishes and performs analysis on public data, and serves other NLANR web pages. Four other single-processor machines act as data analysis computing engines for researchers needing processor power and storage capability for large traces.

MOAT deployed more than 10 OCXmon passive monitors, which collect large amounts of packet header trace data that are then analyzed, sanitized, and published for research use. Looking ahead to the Active Measurement Program, MOAT determined a need for more than just a storage and publishing infrastructure. Existing 2-D graphing tools proved of limited use in handling the complex, multi-dimensional, and massive volumes of data collected in these projects. Finding no visualization packages capable of displaying large, live datasets with 3-D graphs in real time, UCSD student Jeff Brown created the Cichlid tool, using the standard OpenGL graphics library.

Cichlid runs on Linux, FreeBSD, and Silicon Graphics systems. The Linux OpenGL-emulating library (Mesa) includes support for 3DFX-based graphics accelerator cards. The Cichlid visualization system is designed for a distributed environment, with server machines collecting and formatting the data and transmitting it to client machines. A client machine can open connections to multiple data servers, displaying several animated 3-D graphs of live, changing data sets that the user can manipulate.

Still images of animated Cichlid graphs, shown in the next graphic, demonstrate the versatility of the system. The purple skin-like surface of the middle graph in the left panel is a NURBS (Non-Uniform Rational B-Spline) surface mapping. Splines are a way of joining polynomial pieces so that they blend smoothly into a single surface. Jeff Brown's implementation of NURBS surfaces in Cichlid allows researchers to view their live data as it changes and shifts, revealing qualities of the changing values that would not be as easily seen in the cubic bar-graph representations, since the NURBS surface often acts as a noise-reducing function. In some applications the bar graph is the more useful representation, and Jeff included customizable color features so researchers can create both useful and visually appealing representations of their data.


Snapshots of animated Cichlid graphs made with real-time data.

It is important to point out that Cichlid is not a solution useful only for MOAT's needs, but a data visualization utility of value to the larger scientific research community. Cichlid is discussed in greater detail in the next sections.

In addition to the development of Cichlid and other progress in deploying the passive monitoring infrastructure, members of MOAT prepared presentations and demonstrations using Cichlid servers for SC98, the high performance computing and networking conference in Orlando, Florida. During this quarter, MOAT also focused on initial planning and development for the Active Measurement Program (AMP), discussed in the next sections. MOAT member Tony McGregor also worked on IPMP, an Internet Protocol Measurement Protocol. IPMP is a proposed protocol designed to solve problems that arise when ICMP is used to measure network latency, and to enhance capabilities for active monitoring.

More information about MOAT activities in the second quarter of its first year can be accessed via the web at http://moat.nlanr.net/Reports/MOAT2ndq/.


Third quarter, 1998 -- October 1998 to December 1998

During the third quarter, MOAT accomplished significant work in the following areas:

Passive measurement data analysis activities

MOAT automated the collection of passive monitoring data, encoding gigabytes of data every night, and making the results publicly available on the MOAT web and FTP server. In addition, analysis results are automatically loaded into the Datacube.

Brynjar Viken collected packet header traces on disk for more detailed analysis later on. In cooperation with David Koester, SCinet98 Network Architect, Brynjar began writing a study of the traffic dynamics on the SCinet98 network. Brynjar expects effort on this research paper to be complete in the first quarter of 1999.

MOAT provided some assistance to Mike Tesch of CAIDA in his development work on a usable OC12mon unit, by helping with hardware selection and testing. Some computer hardware was also made available to CAIDA for the OC12mon development work.

It is highly desirable to involve a broad set of students and faculty in the network analysis activities, and MOAT attempted to initiate collaborations with people at multiple other institutions. The hope is to stimulate more interest in research into the workload of real Internet environments by making data, computing resources, and more generally collaboration opportunities available. The main active collaborations in the third quarter involved a doctoral student at Princeton University and multiple students at UCSD. These students utilized not only the collected data, but also the compute engines instrumented by MOAT.

AMP - active measurement project

Tony McGregor, research faculty at the University of Waikato in New Zealand, works on high performance network analysis. During his half year sabbatical with NLANR, he began to lead the NLANR Active Measurement Program (AMP), which is building the active, traffic-injecting infrastructure that allows for performance assessments of HPC networks.

At the start of the third quarter, Tony completed a paper analyzing the performance of world wide web traffic on asymmetric satellite networks, which are increasingly used to route data traffic to parts of the world where terrestrial connections incur high cost, as in the case of New Zealand and parts of Asia (http://www.nlanr.net/~tonym/spie/).

As a next step, Tony constructed a round trip time (RTT) and route measuring system for both point-to-cloud measurements (in which a single node measures RTT to all the other nodes) and 10-unit matrix mesh measurements (in which each node measures RTT to each other node). Tony presented the results in a web interface in three forms: as tables with line graphs drawn on the page using GNUplot, through the Otter route visualization system developed at CAIDA, and through a data processing server he wrote for the Cichlid distributed visualization system. Together these form an elaborate set of network data visualization tools for use by network administrators at institutions experiencing connectivity problems, as well as by researchers studying network behavior and dynamics.
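The two measurement layouts can be sketched as follows. This is a minimal illustration, not the AMP implementation: the node names are hypothetical, and `probe_rtt` is a stand-in that returns synthetic values where a real monitor would send probe packets and time the replies.

```python
from itertools import permutations

# Hypothetical node names, for illustration only.
NODES = ["site-a", "site-b", "site-c"]

def probe_rtt(src, dst):
    """Stand-in for a real probe: returns a synthetic RTT in milliseconds.

    The value is derived from the node positions so the example is
    deterministic; it does not model any real network.
    """
    return float(10 * (NODES.index(src) + 1) + NODES.index(dst))

def point_to_cloud(src, nodes):
    """One node measures RTT to every other node."""
    return {dst: probe_rtt(src, dst) for dst in nodes if dst != src}

def full_mesh(nodes):
    """Every node measures RTT to every other node (ordered pairs)."""
    return {(src, dst): probe_rtt(src, dst)
            for src, dst in permutations(nodes, 2)}

cloud = point_to_cloud("site-a", NODES)
mesh = full_mesh(NODES)
```

With N nodes, a point-to-cloud run yields N-1 measurements, while a full mesh yields N*(N-1) ordered pairs, which is 90 for a ten-node mesh.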

The RTT graphing tools allow administrators to easily track changes in network performance over time, using historical archived data and almost present-moment measurements. By clicking on intuitive links in the RTT graphs, a network admin can view a series of outputs from the traceroute command from each node to all the other nodes at any particular time. The history of traceroutes can be consolidated visually using CAIDA's Otter tool to find inconsistent and looping routes.


Current web-based AMP data analysis. Clicking the various links displays traceroute data, or opens datasets in the Otter or Cichlid tools. Future AMP plans include implementation of similar visualization schemes for data collected by the AMP measurement machines in the vBNS infrastructure.

The OC3mon passive measurement machines served as the initial platforms for concurrently running Tony's active RTT measurements and traceroutes, since the AMP infrastructure was still in its initial development phase. The data is brought back to a central server and structured there. When the tool user clicks on a web link, Tony's Cichlid server processes the data and feeds it to the user's Cichlid client for instant 3-D visualization. Tony wrote two types of server data formats for Cichlid visualization of the early active measurements, which he presented at SC98. In the first (not pictured here), bars along one axis depict packet RTT from the measurement node to each other node, and the other, longer axis represents time. A long red bar on the negative side of the graph represents packet loss. Although the bar graph mode does not always show the packet loss clearly, the Cichlid client's real-time NURBS surface rendering shows a clear deep trough in the graph when the measurement node loses connectivity to another node. The hidden label on the packet loss bar tells a user the name and address of the unreachable machine. These labels can be cross-referenced with the 10-square mesh of RTTs displayed in the second data format graph to determine whether the site is actually down, or whether the route from the local measurement node to the site is malfunctioning.


The Cichlid animated real-time NURBS surface mapping can help elucidate details in a changing environment that might otherwise go unnoticed if represented as quickly changing bar graphs. This graph on the right, from Tony's second type of Cichlid server, represents a full mesh of round-trip times measured to and from every host in a ten-by-ten matrix. In a perfect network, the left and right sides of the graph (split along the diagonal top to bottom) would be symmetric, but the NURBS surface reveals differences in traffic patterns depending on the direction of traffic flow at the time. This may be due to inconsistent routing, which can be further examined by using Otter to display the same dataset.

The second type, which can also be used in real time, shows asymmetries between RTTs collected to and from the same machines, indicating possible asymmetric routes or unidirectional latency. Again, the NURBS surface illustrates clear distinctions between stable, smoothly graphed networks, and turbulent, or 'knobby' networks.
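The asymmetry that the mesh graph makes visible can also be checked numerically. The sketch below is an assumption-laden illustration rather than part of the AMP tools: the matrix values are synthetic, and the threshold is an arbitrary cutoff chosen for the example.

```python
def asymmetric_pairs(rtt, threshold_ms):
    """Return (i, j, difference) for node pairs whose two directions differ.

    rtt[i][j] is the RTT in milliseconds measured by node i to node j;
    None marks a missing sample (including the unused diagonal).
    """
    found = []
    n = len(rtt)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = rtt[i][j], rtt[j][i]
            if a is None or b is None:
                continue  # cannot compare without both directions
            diff = abs(a - b)
            if diff > threshold_ms:
                found.append((i, j, diff))
    return found

# Synthetic 3x3 mesh for illustration.
rtt = [
    [None, 20.0, 35.0],
    [21.0, None, 50.0],
    [60.0, 51.0, None],
]
pairs = asymmetric_pairs(rtt, threshold_ms=5.0)
```

Here only the pair of nodes 0 and 2 is flagged, since its two directions differ by 25 ms; such a pair would be a candidate for closer inspection of its routes.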


Early AMP deployment sites

In December of 1998, MOAT sent invitations to NSF's HPC award sites around the country, asking whether they would agree to locate one of the AMP active monitors on their networks, specifically at a location representative of the services exchanged between the HPC network and the local site. As a result, almost 20 machines were initially deployed, with more expected in the first quarter of 1999.

The AMP project will provide a means of assessing site-to-site performance and of comparing the results to the intra-vBNS measurements that MCI is undertaking. The differentiation between intra-vBNS and inter-site measurements will help identify weak components in a heterogeneous networking environment that spans multiple technologies and administrations. Already it has shown that out-of-the-box, untuned computers with commonly low window sizes do not perform well in an HPC environment. This will become a serious area of investigation in the second program year.
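The effect of a small window can be made concrete with the standard bound that a sender with at most one window of data in flight per round trip cannot exceed the window size divided by the RTT. The figures below are illustrative assumptions, not measurements from this report.

```python
def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput with one window in flight per round trip."""
    return window_bytes * 8 / rtt_seconds

def window_needed_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return link_bps * rtt_seconds / 8

# Illustrative values (assumed for this example).
rtt = 0.070               # 70 ms round trip time
small_window = 8 * 1024   # 8 KiB window on an untuned host
oc3_bps = 155.52e6        # OC-3 line rate in bits per second

limit = max_throughput_bps(small_window, rtt)
needed = window_needed_bytes(oc3_bps, rtt)
print(f"window-limited throughput: {limit / 1e6:.2f} Mb/s")
print(f"window needed to fill the link: {needed / 1024:.0f} KiB")
```

Under these assumptions the untuned host is limited to under 1 Mb/s regardless of the link speed, while filling the link would require a window of more than a megabyte.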

Cichlid real-time visualization

Cichlid developer Jeff Brown did not attend SC98, but helped Tony and Brynjar prepare their data processing servers. Jeff is the author of the distributed visualization software, for which he wrote over 8000 lines of C code.

In this third quarter, Jeff maintained and improved the software further. He added several significant capabilities:

Additionally, Jeff wrote extensive documentation for the Cichlid server API. This will help other researchers who want to use Cichlid data processing servers to format their data in a way that can be visualized with the client software. The next section discusses future plans for Cichlid development, as well as some potential uses of this powerful and versatile visualization package.

Brynjar Viken is a doctoral candidate in the Department of Telematics at the Norwegian University of Science and Technology in Trondheim, working with NLANR to study network behavior. Brynjar wrote Cichlid data processing servers to instantly visualize bi-directional traffic, collected with NLANR's OC3mon passive monitors, between the deployment sites and the SCinet98 conference floor network at SC98.

Using data collected by an OC3mon unit, Brynjar demonstrated distributed real-time visualization at the conference, using his data servers to format the data in a pre-programmed matrix. The 3-D graphing client rendered this data in a rotating, viewpoint-shifting space, using either bar graphs or NURBS surface modeling to illuminate particularly useful details of the traffic behavior.


This Cichlid graph plots byte usage on the SCinet98 network by the most commonly used TCP and UDP ports. From left to right, the graph displays usage of file transfer, telnet, e-mail, web, AOL, Usenet news, bytex, port 6970, and other types of traffic. The Cichlid client animated this graph in real time, and displayed each new iteration along the axis closest to this viewpoint. Further examples of the work at the SC98 conference can be seen in the OCXmon environment section at http://moat.nlanr.net/SC98Demos/.
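The per-service byte totals plotted in this graph amount to an aggregation over traffic records keyed by port. The sketch below assumes a simplified record layout of (source port, destination port, byte count) and uses only a few well-known port assignments; it is not the format used by the OC3mon tools or the Cichlid servers.

```python
from collections import Counter

# Well-known port assignments; anything not listed is counted as "other".
PORT_LABELS = {20: "ftp", 21: "ftp", 23: "telnet",
               25: "e-mail", 80: "web", 119: "news"}

def bytes_by_service(flows):
    """Sum bytes per service label.

    Each flow is a (src_port, dst_port, byte_count) tuple. A flow is
    attributed to the first of its two ports found in PORT_LABELS.
    """
    totals = Counter()
    for src_port, dst_port, byte_count in flows:
        label = (PORT_LABELS.get(src_port)
                 or PORT_LABELS.get(dst_port)
                 or "other")
        totals[label] += byte_count
    return totals

# Synthetic flow records for illustration.
flows = [(1025, 80, 1500), (80, 1026, 40000),
         (1027, 25, 900), (6970, 1028, 12000)]
totals = bytes_by_service(flows)
```

Checking both ports lets the sketch count traffic in either direction of a connection toward the same service.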

SC98 - presentations and SCinet98 live demo

Tony McGregor and Brynjar Viken attended the SC98 high performance networking and computing conference in Orlando, Florida, to communicate with industry and academic partners and competitors about the year's developments in new computing.

The initial objectives for Cichlid were defined by the need for demonstrations at SC98: to develop a distributed, multi-use visualization tool for live data. Jeff Brown, in collaboration with others, undertook the development in the second and third quarters, ahead of the SC98 conference, to bring the software's functionality to a point where it served a useful purpose at the supercomputing and high-performance networking conference. The goal was to take the data collected by the Network Analysis Infrastructure and present it instantly in a form that yields useful insight. At SC98, Tony and Brynjar achieved that objective by demonstrating both active and passive data using this versatile data presentation package.


Future (fourth quarter) activities, 1999

Brynjar Viken plans to finish his paper on the traffic dynamics of SC98's conference floor network (SCinet98), in collaboration with its architect, David Koester. Brynjar (1) is an example of how NLANR is helping to develop the next generation of talent needed to advance networking and allow for insightful evolution of the overall environment.

The initial response to the AMP invitations was very promising, resulting in a significant number of machines very quickly. We expect many more HPC network administrators to be receptive to an AMP collaboration, especially given NSF's support and the inclusion in Bill Decker's HPC newsletter.

In a joint agreement between the University of California at San Diego and the University of Waikato, Tony McGregor will continue to lead the AMP activity from New Zealand after his six-month San Diego sabbatical ends. This includes the development and deployment of the AMP system and analysis of the collected data, perhaps even expanding the scope and usefulness of the active monitors to a global scale.

Footnote:

In early February, Brynjar Viken will depart San Diego to return to Trondheim and continue his doctoral studies. The work he accomplished to this point with MOAT integrally complements his future studies, theories, and models of high-performance network dynamics, in which he will use MOAT packet trace datasets. Brynjar hopes to continue expanding the global measurement collaborations with NLANR from his vantage point in Norway, and is also interested in measurements of the Norwegian academic networks.

Brynjar's main research area is measurements and analysis of IP traffic. This includes investigations of various strategies to perform measurements, studies of phenomena observed in collected measurement data and simulations of network behavior. Brynjar plans to use NLANR/MOAT measurement data for parts of his research and his work will be based on both simulations and measurements.

Brynjar may also collaborate with the Department of Systems Engineering and Telematics at Sintef (The Foundation of Scientific and Industrial Research of the Norwegian Institute of Technology) and Uninett, the Norwegian academic network for research and education. Sintef and Uninett have started a joint Internet project which addresses, among other topics, synthesized generation of Internet traffic, measurements of quality of service, and accounting/billing. Related information is available from Sintef and Uninett.