Contents

Agenda

List of Participants

Executive Summary

Synopsis

Comments by Participants
Jill Gemmill
Chris Thomas

Presentation Materials
June 29
June 30

 

 

Proceedings prepared by
Todd Hansen, Mike Gannis, and Hans-Werner Braun

About the cover:

This image is from the Abilene Weather Map (http://hydra.uits.iu.edu/~abilene/traffic/abilene.html), a real-time monitor of traffic between Abilene network core nodes developed by the Abilene Network Operations Center at Indiana University. Matt Zekauskas (Internet2) gave a presentation on Abilene NOC tools at the workshop that described this utility. Image used with permission.

The workshop and these proceedings were sponsored by National Science Foundation Cooperative Agreement No. ANI-9807479 with the National Laboratory for Applied Network Research at the University of California, San Diego. The Government has certain rights to this material. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or other institutions.

Workshop Agenda

Tuesday, 29 June 1999

Introduction

Hans-Werner Braun, National Laboratory for Applied Network Research

Goals and Measurements/Analysis Needs for High-Performance Communities

NSF perspective

Bill Decker, NSF

I2/Abilene perspective

Matt Zekauskas, Advanced Network Services, for Guy Almes

NGI perspective

Phil Dykstra, Army Research Laboratory

Open discussions

Demonstrations

Activities Overview for Measurement/Analysis:
What is being measured? What is being analyzed? What is being made available?

vBNS

Kevin Thompson, MCI WorldCom

Abilene

Matt Zekauskas, Advanced Network Services

Supporting Measurement/Analysis Activities

Surveyor

Matt Zekauskas, Advanced Network Services

NLANR measurement and analysis activities

Hans-Werner Braun, National Laboratory for Applied Network Research

NLANR Active Measurement Project (AMP)

Tony McGregor, University of Waikato, NZ, and NLANR

NGI activities

Phil Dykstra, Army Research Laboratory

Open discussions

 

Wednesday, 30 June 1999

Presentations by Site Representatives:
What do we measure?
What do we analyze?
What do we make available?
Expectations for collaborations and
concerted activities.
Measurement/analysis issues.

Bill Decker, NSF

Ronn Ritke, UCLA

John Cleary, WAND project at University of Waikato, NZ

David Moore, CAIDA/CoralReef

Mark Foster, NASA

Andy Germain, NASA

Les Cottrell, SLAC

Linda Winkler, ANL

Henk Uijterwaal, RIPE-NCC, Europe

Andrew Adams, PSC (NIMI)

John Jamison, StarTap

Discussions and Demonstrations

Discussion
Extension of the Freedom of Information Act to apply to data acquired through federally funded research.

Open discussion
Workshop objectives, network performance assessment, and reporting of results to oversight groups.

Additional demonstrations

List of Participants

Name                    Affiliation

Andrew K. Adams         NLANR: NCNE @ PSC
Hans-Werner Braun       UCSD/NLANR
Javad Boroumand         NSF
Jeff Brown              UCSD/NLANR
Richard Carlson         DOE
John G. Cleary          University of Waikato
Neil Cotofana           UCSD/NLANR
Les Cottrell            ES-net, SLAC/HENP
Bill Decker             National Science Foundation
Phil Dykstra            DREN/JET/DOD/ARL
Mark Foster             NASA Ames/NREN
Mike Gannis             SDSC/NPACI/NLANR
Jill Gemmill            University of Alabama at Birmingham
Mario Gerla             UCLA
Andy Germain            NASA – EOS
Todd Hansen             UCSD/NLANR
Dave Hyatt              University of New Mexico
John Jamison            EVL/UIC and StarTap
Sunil Kalidindi         Advanced Network Services
Daniel Karrenberg       RIPE-NCC
Landy Manderson         University of Alabama at Birmingham
Tony McGregor           University of Waikato/NLANR
David Mitchell          NCAR
Keith Monroe            University of Florida
David Moore             UCSD/CAIDA
Richard R. Moore        Michigan State University
Greg Redder             Colorado State University
Sandra Redman           University of Alabama at Huntsville
Ronn Ritke              UCLA
Medi Sandidi            UCLA
David Shealy            University of Alabama at Birmingham
David Sutherland        Advanced Network Services
Chris Thomas            UCLA
Kevin Thompson          MCI WorldCom
Henk Uijterwaal         RIPE-NCC
Alan Verlo              EVL/UIC and StarTap
Linda Winkler           Argonne National Lab
Matthew J Zekauskas     Advanced Network Services

Measurement and Analysis
Collaborations Workshop

Executive Summary

The purpose of this workshop was to assess current measurement and analysis capabilities and to find new areas for future collaborations among researchers and high-performance network service providers.

Participants described ongoing passive and active measurement programs – MCI WorldCom using OCXmon, the Internet2 Surveyor project, NLANR’s passive and active monitoring, BGP/SNMP analysis, and development of analysis and visualization tools – and discussed potential objectives for the measurement groups to accomplish in the immediate and extended future.

Several issues were recurring themes throughout the workshop:

· The "last mile" problem on research institution campuses is ubiquitous and very difficult to overcome, and campus network administrators are not fully sensitized to the problem.

· End-to-end performance is the true test of connectivity; campus-to-campus performance measures have little bearing on the effective throughput that a researcher sees.

· Representatives of several universities expressed strong interest in having a machine (possibly a portable active measurement system) that could be used to test their campus networks to assist in locating problems.

· An immediate problem is finding a way to present meaningful results in a simple, easy-to-understand format to management and to oversight groups. Continued funding of high-performance networks depends on being able to demonstrate adequate return on investment; hard facts that quantify improvements in throughput and in the productivity of researchers are needed.

· A common problem is the question of how to determine what performance an individual researcher should expect from a specific HPC connection.

· Guaranteed quality of service (QoS) seems to be a significant challenge for all of the high-speed networks.

Several conclusions emerged from the presentations and discussions:

· We must set up and perform measurements to acquire information that will demonstrate the value of HPC networks to the research community. By the end of 1999 we should have an initial set of results that can quantify improvements in throughput and in the productivity of researchers, to justify continued Federal support and show return on investment.

· There is no single-tool solution to the problem of performance analysis. Instead, it is only by the suitable application of all of the available tools that we can gain an accurate picture of the network.

• Achieving good end-to-end performance on high-performance networks will require overcoming the "last mile" problem on campuses. This can be done by working with end users, campus network administrators, and decision-makers, and by providing campuses the monitoring devices and measurement and analysis software tools that we are developing.

• Areas with good potential for collaboration include performing more in-depth analysis (perhaps based on thesis projects in CS and Engineering departments), refining and extending analysis and visualization tools, and extending the measurement and analysis infrastructures towards users and their applications.

Synopsis

The purpose of the workshop held at the San Diego Supercomputer Center (SDSC) on June 29 and 30, 1999, was to assess current measurement and analysis capabilities and to find new areas for future collaborations among researchers and high-performance network service providers.

Hans-Werner Braun, Principal Investigator of the NLANR Measurement and Analysis group at SDSC and local host of the workshop, opened the session with a general introduction.

Goals and Measurements/Analysis Needs for High-Performance Communities

Bill Decker, Director of the NSF’s Advanced Network Infrastructure program, gave the first technical presentation, on the NSF perspective and the issues and tensions that High Performance Connections (HPC) members face. Measurement and analysis is important as a legitimate and fruitful field of research, and the NSF desires to promote this field. In addition, it is incumbent on HPC grant awardees to demonstrate that their activities are useful and are generating an adequate return on investment. The Government Performance and Results Act of 1993 (GPRA) charges Federal agencies to account for program results through the integration of strategic planning, budgeting, and performance measurement. Measurable results are needed in order to demonstrate to Congress and to oversight groups such as the President’s Information Technology Advisory Committee (PITAC) that the networks’ performance meets expectations and that the networks are worth the investment of time and money. Decker went on to discuss the NGI review process and the problems we ran into during the last review. It is up to us to prove to funding sources and to on-campus administrators that high-performance networks are worthwhile and make a difference to scientists’ ability to conduct scientific research.

Matt Zekauskas presented the Internet2 measurement perspective, and described the goals of the Internet2 project, their measurement and analysis program, and their current measurement efforts. He identified one of the more important points as the question of troubleshooting application-to-application performance – when a user says "My application has a problem," how do we determine where that problem is located? Zekauskas’ discussion of Internet2 measurement tools seemed to reinforce the conclusion that there is no single-tool solution to the problem of performance analysis. Instead, it is only by the suitable application of all of the available tools that we can gain an accurate picture of the network. He stressed the importance of a consistent measurement scheme across the Internet2 structure for active, passive, and utilization metrics. He then described the measurement plans of the Internet2 project, which include debugging tools for QoS and multicast measurements and a simple method for reporting the location of loss when it occurs.

Following Zekauskas’ presentation of the Internet2 perspective, a discussion ensued over the definition of service levels. Several issues surfaced during the discussion:

· In principle, QoS can be verified by active and passive measurement tools. No measurements currently are made based on individual applications. Applications should make their own performance measurements. As QoS evolves, specifications will be written regarding levels of service, and tools will be created to test them.

· The so-called "last mile" problem: even though a campus may be a High Performance Connection site, the on-campus networks typically are not equipped to handle the high data rates needed to move information to the desktop or lab of a researcher who is not located inside the campus computer center building. This seems to be ubiquitous and very difficult to overcome, and campus network administrators may not be fully sensitized to the problem.

· Who will pay to maintain the high-performance connections, once the initial grants have ended?

· How can we best present the data that our measurement activities generate? Most attendees have to report to their program or campus to justify the investment in an HPC connection, and these administrators are not interested in pages of data – instead they want to be shown in an easily understood format that HPC networks are a quantifiable improvement over the commodity networks. Graphs are a good approach, but something more tangible often is needed: for example, comparing an application with high-speed data throughput to one without can give people an idea of the new capabilities opened up by high-performance networking.

Phil Dykstra of the Army Research Laboratory presented the Next-Generation Internet (NGI) perspective. He gave an overview of the Defense Research and Engineering Network (DREN) and described the Large Scale Networking (LSN) Joint Engineering Team (JET), a coordinating team for six major networks (Abilene, DREN, ESnet, NISN, NREN, and vBNS). According to Dykstra, NGI has three goals: (1) to enable research, (2) to develop testbeds, and (3) to enable high-speed and/or data-intensive applications. There are two identified subdivisions of the testbed goal:

· Goal 2.1 – hook up 100 sites at 100 times current end-to-end network speeds (essentially 100 Mbps)

· Goal 2.2 – hook up 10 sites at 1000 times current end-to-end network speeds (essentially 1 Gbps)

Regarding goal 2.1, developing a list of qualified sites is a considerable challenge. In addition to the "last mile" problem, it is difficult to identify the point of contact at each institution.

Dykstra made it clear that NGI is concerned with application-to-application performance and not just with backbone metrics. This means we need to get measurements from the researchers’ perspectives. He also mentioned the NGI/PITAC desire for an "Internet Traffic Report" similar to the one currently available for the commodity Internet – a simple presentation that gives non-specialists a good idea of the payoff that these really expensive networks provide. However, the commercial "traffic report" is widely derided within the measurement and analysis community as hopelessly inaccurate.

Another point (which was confirmed by almost everyone at the conference) was that users’ machines are not tuned for high-speed networks. The problem is that most computers come from the manufacturer with a very poorly tuned network stack whose default TCP buffers are far smaller than the bandwidth-delay product of a long, fast path, which holds their throughput well below the capabilities of their connections. Who is going to go in and fix every researcher’s machine to allow it to attain the performance the researcher expects? One proposed step to a solution was a Web page that a user could access that would return an analysis of what is wrong with the network tuning of the researcher’s machine.
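The tuning problem can be made concrete with a little arithmetic. The following is a minimal sketch, not a tool presented at the workshop: the 100 Mbps rate echoes Goal 2.1, while the 70 ms round-trip time and the 65,535-byte window (the largest TCP window without window scaling) are illustrative assumptions.

```python
def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this speed and RTT full."""
    return bandwidth_bps * rtt_s / 8.0


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Upper bound on TCP throughput when at most one window is sent per RTT."""
    return window_bytes * 8.0 / rtt_s


# Assumed, illustrative values (not measurements from the workshop).
LINK_BPS = 100e6          # 100 Mbps end-to-end path
RTT_S = 0.070             # 70 ms round-trip time
UNTUNED_WINDOW = 65535    # largest TCP window without window scaling

bdp = bandwidth_delay_product_bytes(LINK_BPS, RTT_S)
achievable = min(LINK_BPS, window_limited_throughput_bps(UNTUNED_WINDOW, RTT_S))

print(f"Bandwidth-delay product: {bdp / 1024:.0f} KiB")
print(f"Throughput with an untuned window: {achievable / 1e6:.1f} Mbps")
```

Under these assumptions the path needs roughly 850 KiB in flight, while the untuned window caps throughput at about 7.5 Mbps, which is why a fast connection alone does not give a researcher fast transfers.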

Inefficiencies of current techniques and protocols interfere with good network performance. TCP sometimes must deal with bandwidth-delay products that span four orders of magnitude, but has a difficult time adapting dynamically when the average length of an individual HTTP connection is only 12 packets. On the other hand, the 1500-byte maximum packet length of Ethernet introduces other inefficiencies that adoption of 9180-byte packets would mitigate (although longer packets could lead to shorter packet sequences). Dykstra also discussed routing problems he had encountered in which data had been sent over a higher-latency but higher-bandwidth connection when he would have seen better throughput using an available low-bandwidth connection with less latency. He also brought up the firewall packet filtering problem; Hans-Werner Braun warned that even if we find another way to do performance measurements, eventually it too will be blocked by security measures.
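The packet-size trade-off can be sketched numerically. This is an illustration only: it assumes 40 bytes of header per packet (a 20-byte IPv4 header plus a 20-byte TCP header, no options) and ignores link-layer framing.

```python
import math

HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def payload_per_packet(mtu_bytes):
    """User data carried by one full-sized packet."""
    return mtu_bytes - HEADER_BYTES


def payload_efficiency(mtu_bytes):
    """Fraction of each full-sized packet that is user data."""
    return payload_per_packet(mtu_bytes) / mtu_bytes


def packets_needed(transfer_bytes, mtu_bytes):
    """Number of packets required to carry a transfer of the given size."""
    return math.ceil(transfer_bytes / payload_per_packet(mtu_bytes))


# Data carried by a 12-packet HTTP connection at the 1500-byte Ethernet limit.
http_transfer_bytes = 12 * payload_per_packet(1500)

for mtu in (1500, 9180):
    print(
        f"MTU {mtu}: {payload_efficiency(mtu):.1%} payload, "
        f"{packets_needed(http_transfer_bytes, mtu)} packets "
        f"for {http_transfer_bytes} bytes"
    )
```

With these assumptions the larger packets raise payload efficiency from about 97.3% to about 99.6%, but the same 12-packet transfer shrinks to 2 packets, leaving TCP even fewer round trips in which to adapt.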

NGI has several immediate needs: to start tabulating and testing the 100 testbed sites to accomplish "Goal 2.1," to set up a measurement mesh between NGI sites, to keep better track of the top speeds and applications on their networks, to focus more on the campus and end systems, and to build some sort of meaningful "traffic report." Dykstra finished his presentation with a list of unanswered questions:

· Can we define a campus DMZ measurement machine?

· Can we define a test suite?

· Can we create a Next-Generation Internet traffic/status report?

· In the presence of burstiness, are five-minute samples good enough?

· What is the relationship between utilization and packet loss?

· Are there wave phenomena in Internet traffic?
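The question about five-minute samples can be illustrated with a small sketch. The traffic pattern below is hypothetical, chosen only to show how an average over a long window can hide a short period of saturation.

```python
def window_average(utilization_samples):
    """Mean utilization over one sampling window (values between 0.0 and 1.0)."""
    if not utilization_samples:
        raise ValueError("need at least one sample")
    return sum(utilization_samples) / len(utilization_samples)


# Hypothetical link: 5% utilization for five minutes (300 one-second samples),
# except for a single 10-second burst that saturates the link.
per_second = [0.05] * 300
per_second[120:130] = [1.0] * 10

five_minute_mean = window_average(per_second)
one_second_peak = max(per_second)

print(f"Five-minute average: {five_minute_mean:.1%}")
print(f"One-second peak:     {one_second_peak:.0%}")
```

In this made-up case the five-minute figure is about 8%, although the link was fully loaded for ten seconds, long enough to cause loss that the average would never reveal.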

An open discussion of time synchronization followed Dykstra’s presentation. The basic question is what accuracy is sufficient for different types of measurements. For example, NTP assumes a symmetric path when making its time calculations; the error that asymmetric paths introduce makes NTP-synchronized clocks unsuitable for one-way delay measurements, so GPS is commonly used. GPS timing provides sub-microsecond accuracy but is expensive to implement.
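The effect of path asymmetry on NTP can be sketched with the standard four-timestamp offset calculation. The delays and clock offset below are assumed values for illustration, and the simulation helper is hypothetical rather than part of any NTP implementation.

```python
def ntp_offset_estimate(t1, t2, t3, t4):
    """Standard NTP clock-offset estimate from the four exchange timestamps.

    t1: client send time, t4: client receive time (client clock)
    t2: server receive time, t3: server send time (server clock)
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0


def simulate_exchange(true_offset, fwd_delay, rev_delay, t1=0.0, server_hold=0.001):
    """Hypothetical helper: timestamps for one exchange, given the true
    server-minus-client clock offset and the one-way delays in each direction."""
    t2 = t1 + fwd_delay + true_offset      # arrival, read on the server clock
    t3 = t2 + server_hold                  # reply sent, server clock
    t4 = (t3 - true_offset) + rev_delay    # arrival, read on the client clock
    return t1, t2, t3, t4


TRUE_OFFSET = 0.005  # assumed: server clock is 5 ms ahead of the client

symmetric = ntp_offset_estimate(*simulate_exchange(TRUE_OFFSET, 0.020, 0.020))
asymmetric = ntp_offset_estimate(*simulate_exchange(TRUE_OFFSET, 0.030, 0.010))

print(f"Symmetric path error:  {(symmetric - TRUE_OFFSET) * 1000:.1f} ms")
print(f"Asymmetric path error: {(asymmetric - TRUE_OFFSET) * 1000:.1f} ms")
```

The estimate is off by half the difference between the forward and reverse delays, so a path with 30 ms one way and 10 ms the other yields a 10 ms clock error, comparable to the one-way delays being measured.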

Activities Overview for Measurement/Analysis:
What is being measured?
What is being analyzed?
What is being made available?

Kevin Thompson presented information on the vBNS (currently 101 sites) and MCI WorldCom’s measurement/analysis efforts. Their measurements include SNMP data, passive data (OCXmon), and active measurements and performance tests (throughput, treno, mping). He reviewed the results he was getting and explained the graphs and data samples he brought, then gave three examples of how this data helps MCI WorldCom determine what is going on with the vBNS: a case of sub-optimal routing, a denial of service attack, and a router configuration problem. Thompson explained the use of the tool set in locating problems. In several real-world cases he had been able to use AMP data to confirm his findings and demonstrate the accuracy of his results. Thompson then discussed QoS and various ideas for ensuring it. The vBNS engineering group considered putting a passive monitor at each site and having it intercept packets and reroute them based on their precedence. It was an interesting discussion and led directly into other resources they want to develop for the future, including institution-specific traffic profiles, a more direct correlation of their measurements with NLANR’s, and new visualization methods and measurement techniques for multicast and IPv6.

Matt Zekauskas gave a presentation on the Abilene network and the tools used by its Network Operations Center to diagnose problems. (Steven Wallace of Indiana University, who originally was scheduled to give this talk, was unable to attend.) One of these is the NEMO network monitor, which determines network utilization based on SNMP data. The Abilene Weather Map (http://hydra.uits.iu.edu/~abilene/traffic/abilene.html) uses loss data collected for MRTG and can generate animations for time-lapsed analysis. It even has an error high water mark that lingers for 24 hours. He then went on to discuss a possible method for processing SNMP data called RPG, which allows anyone to connect to it and run a limited set of commands to gather statistics on network performance (see http://hydra.uits.iu.edu/~abilene/proxy/). From a security point of view, this appeared to be a very undesirable idea, but the command set is limited and usage is rate-limited, so it seems to be fairly secure. Zekauskas also described Abilene’s efforts to make their tools available through a higher-education/Internet2-friendly license.

Supporting Measurement/Analysis Activities

Matt Zekauskas gave another presentation on the Internet2 Surveyor active measurement project. Surveyor’s objective is to "create technology and infrastructure to allow users and service providers to have an accurate understanding of the performance and reliability of paths through the Internet." He described the IETF’s effort to develop a standard for IP Performance Metrics (IPPM) and explained how Surveyor’s one-way delay measurements fit into the total picture. Delay measurements are important because the minimum delay (the propagation delay) and the variation in delay (the queuing delay) set constraints on network performance; large delays make sustaining high-bandwidth flows more difficult; and erratic variations in delay can interfere with real-time applications.
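The distinction between minimum delay and delay variation can be sketched as a simple summary of one-way delay samples. This is an illustration under stated assumptions, not Surveyor's actual analysis: the function name and the sample values are hypothetical, and a lost probe is represented as None.

```python
from statistics import median


def summarize_one_way_delays(samples_ms):
    """Split one-way delay samples into a propagation floor and queuing variation.

    samples_ms: delays in milliseconds; None marks a lost probe.
    """
    received = [d for d in samples_ms if d is not None]
    if not received:
        raise ValueError("no probes were received")
    floor = min(received)                       # approximates propagation delay
    queuing = [d - floor for d in received]     # delay above the floor
    return {
        "min_delay_ms": floor,
        "median_queuing_ms": median(queuing),
        "max_queuing_ms": max(queuing),
        "loss_fraction": (len(samples_ms) - len(received)) / len(samples_ms),
    }


# Hypothetical samples for a single path: mostly near 30 ms, one queuing
# spike, and two lost probes.
samples = [30.1, 30.4, None, 31.0, 30.1, 45.2, 30.2, None, 30.3, 30.1]
summary = summarize_one_way_delays(samples)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Reporting the floor separately from the variation keeps a long but stable path from being confused with a short but congested one.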

Surveyor monitors can be used for problem determination, assisting with network engineering by identifying trends and loads, monitoring QoS, providing feedback to advanced applications, and acquiring information for general network research. A single Surveyor system is placed at each location to measure one-way delay between it and all of the other machines. The systems’ clocks are synchronized by GPS, and results are sent to a centralized database. The then-current Surveyor constellation consisted of 55 machines with 1883 paths. He then went on to discuss the lessons learned: routing is asymmetric; even when routing is symmetric, queuing is asymmetric; Surveyor can detect layer 2 changes (SONET failover, ATM routing) and can determine the direction of a routing problem; they have observed low delay with loss, and high delay without loss; HPC networks do indeed provide low-latency, low-loss paths; and HPC links do fall back (frequently or for long periods) to commodity network paths. One of Zekauskas’ concerns is that as universities obtain HPC connections they will lose the chance to study interesting commodity Internet data. He also faces the challenge of making all of this data easy to use. "Because there is so much data, it’s also important to highlight the important data, and to develop analyses that provide useful summaries."

Hans-Werner Braun presented NLANR’s Network Analysis Infrastructure. The NLANR Measurement and Analysis team collects traces from passive monitors, end-to-end performance data based on active measurements from AMP, SNMP performance data, and BGP routing data. All of this data is used to promote research activities and to verify the performance of the HPC connections. His team also develops tools to analyze and visualize the collected data. All of the data is collected on central machines, sanitized to protect privacy, and made available for access by other people who want to do research. The wide-area network analysis infrastructure effort is working towards a systemic view of Internet complexity.

NLANR’s Coral/OC3mon/OC12mon passive monitoring devices use optical beamsplitters that tap into fiber-optic media and analyze the data passing in both directions without introducing delays in the data stream. Braun then described AMP, the NLANR Active Measurement Project, which measures end-to-end performance and generates data for research and engineering applications. AMP places machines at representative campus locations as opposed to the end points of the backbones. The NLANR team also creates and provides visualization tools, specifically Cichlid (pronounced "sik-lid"), which is now a true multi-platform tool with a Windows client binary and Windows server API available (see http://moat.nlanr.net/Software/Cichlid/). Braun showed some Cichlid visualizations of data analysis results. At the end of his presentation Braun showed a prototype message generated by passive monitoring analysis software for determining the largest flows measured during each day.

Tony McGregor gave a more detailed presentation of NLANR’s AMP project, which focuses on the NSF-supported HPC community. The AMP effort measures round trip times, loss, and routes, and currently has about 65 monitors on about 5000 paths. AMP is a "full mesh system," in that each system measures results from every other system in the AMP constellation. The AMP researchers want to determine application-to-application performance as accurately as possible. A machine is available to any NSF-funded site – NLANR pays for the cost of the machine and performs most of the system administration (although some units occasionally need to be manually rebooted on site). The data can be accessed through a Web interface, soon to be complemented by a data grid. It also can be accessed using two different Cichlid servers, for which McGregor gave a real-time demo. Raw data also is available upon request and through a data query engine on the Web page. The constellation does a full mesh of fpings every minute and a traceroute every ten minutes. It can also do a number of throughput tests upon request. Work in progress includes event detection based on a process chart and localization techniques that essentially are route decomposition to determine the location of problems. McGregor also mentioned that the IPMP protocol will address these issues more thoroughly. More information on IPMP is available from http://www.nlanr.net/ActMon/IPMP/.

Phil Dykstra gave a presentation about JET research on the DREN network. The JET team uses Surveyor machines (in Maryland, California, Alaska, and Hawaii), and also receives SNMP data from their routers. Dykstra presented examples of various results that they derive from their measurements, which he used to illustrate some of the problems they have discovered (e.g., asymmetric routing and unstable paths).

Dykstra described DREN Connectivity Surveys, in which he determined the locations of approximately 300 DOD Principal Investigators from their phone numbers and did connectivity tests to their e-mail servers. These surveys are needed because DREN is a cloud network provided by AT&T, and they need to be sure they are getting what they are paying for. He identified some useful tools that would make connectivity surveys easier to generate, including:

· A way to convert the IP address to the latitude and longitude of a site (and preferably to its country identifier, too)

· A means of translating IP addresses to ASNs and ASNs to names

· A network performance measurement service

· Portable Surveyor machines

· A test machine at every campus DMZ

Mention of portable Surveyor machines prompted discussion about developing a portable AMP or Surveyor system that could be used at universities to diagnose "last mile" problems and could utilize data from the existing AMP/Surveyor constellations.

Presentations by Site Representatives:
What do we measure?
What do we analyze?
What do we make available?
Expectations for collaborations and concerted activities.
Measurement/analysis issues.

Bill Decker identified expectations for measurement and analysis research from the perspective of the NGI and the need for information:

• Treating the NGI as a testbed – Using the results of our research to see how wide-area networks are performing.

• Enabling applications – The NSF would like to identify next-generation applications that have been enabled by the network (applications that would not exist, or would not be as effective, without it).

• Citing results – NSF is very interested in being able to cite the knowledge, tools, and results gained from NGI networking.

Decker urged the workshop participants to identify the short-term and long-term measurement, analysis, and reporting tasks that need to be accomplished to inform the President’s Information Technology Advisory Committee (PITAC) and the NGI review in early calendar year 2000. It would be very good for the network research and analysis community to be able to demonstrate that scientists can do better science using high-performance networks because we have used our measurements to improve the functionality of these networks.

Ronn Ritke, a UCLA graduate student, presented information on his work with measurement and analysis at UCLA and on the vBNS. UCLA’s vBNS measurement collaborations include hosting an active monitor, working with Hans-Werner Braun to create a report on approximating vBNS end-to-end performance, and testing a vBNS ATM trace supplied by MCI WorldCom for Long Range Dependence (LRD). Feedback indicated that determining the flow duration would be a useful metric for understanding the peaks in throughput.

John Cleary of the WAND group at the University of Waikato in New Zealand presented an overview of their work with DAG ATM cell-capture boards, which they are developing to do traces on high-bandwidth connections. WAND uses GPS as a time source; the group uses the timing version of the Trimble Marine GPS board (which costs about $500). One of the group’s activities is the analysis of voice-over-IP applications between New Zealand and SDSC. Models under investigation include wavelet transforms (with the University of Melbourne), mixture of exponentials, and non-parametric weighted models. The WAND group makes their own measurements, but they face a difficult problem making the data available since transferring data across the Pacific is expensive. They are very interested in validation of their traces and their analysis through the use of other tools.

David Moore of CAIDA at SDSC gave a presentation on the CoralReef software package, which analyzes OC-3 and OC-12 traces from a network card in real time or recorded traces in post-collection mode. (The package is a descendant of Coral, hence the name.) CoralReef handles most of the file formatting for you; you simply write C or Perl code to analyze each packet. There is a separate library for unpacking network objects, so it can even pull the packet apart for you. CoralReef is useful for analyzing packet types other than IP (for example, it can also work with ATM cells). CoralReef is available to the public (see http://www.caida.org/Tools/CoralReef/).

Jeff Brown of the NLANR Measurement and Analysis Team demonstrated his newest Cichlid server, which is capable of displaying information in additional graphical formats – in particular a vector graphing scheme that he used to generate a "sugar cube" traffic visualization of the vBNS.

Mark Foster of NASA Ames spoke on NASA Information Power Grid (IPG) traffic measurements for demonstrations of preferential treatment. NASA Ames researchers are working to develop and deploy passive monitoring and measurement systems (similar to OCXmon devices) at selected locations on NREN to make long-duration measurements for investigation of Quality of Service.

Andy Germain described NASA EOS network performance testing. His group is interested in active testing of end-to-end performance, without visibility into the internal structure of the network. They want easy determination of where the problem is. Germain is also interested in applying the same tools to a related international project.

Les Cottrell of SLAC discussed collaborative measurement and analysis activities with ESnet from a high energy and nuclear physics (HENP) and XIWT (Cross-Industry Working Team) viewpoint. Cottrell’s work involves end-to-end active measurements via ping; he characterized the probes as "lightweight, low-impact, and hierarchical." The ESnet and HENP monitoring is conducted at 18 sites in 27 countries (approximately 1500 pairs). There are 50 "beacon sites" monitored by all stations. XIWT. Cottrell’s main interests include: end-to-end performance measurements and "their relation to things we can do something about" – the impact of routes, peering, and congestion points; long-term results for baselines, trend-spotting, and troubleshooting; the relationship of measurements to applications (e.g., QoS, bulk data transfer, jitter, Web use, multimedia and real-time applications); how to analyze data and summarize results; the appropriateness of tools for the job, calibration and its pitfalls, and things to avoid; and how best to aggregate data in analyses by time and by group (e.g., ESnet, vBNS, tld, collaborative enterprises). Cottrell also discussed the correlation of his measurements with results from the ping, Surveyor, AMP, RIPE, and ANX projects.

Cottrell also expanded on the recurring issue of how to analyze and summarize the measurement information in a way that isn’t "bogus." What does an executive want to see in order to understand the improvements of the network over a commodity link? There may not be a good way to do this, but we have to try.

Linda Winkler of Argonne National Laboratory gave a presentation on the role of measurements in making applications more effective. APAN is planning to conduct a measurement effort. "How can we instrument the measurement techniques so we can know whether we’re helping the applications?" Whatever system is set up should be based on a persistent infrastructure, "since we’ll have to do it over and over again." One of her major points was the need to understand the requirements of the individual application, and that can’t be done without sitting down with the application developer.

Henk Uijterwaal gave a presentation on the RIPE-NCC Test-Traffic Project. The Test-Traffic project is an implementation of the IETF IPPM Internet drafts on one-way delays, packet losses, and routing vectors. Like Surveyor, the devices have GPS time capability. "A general problem is that it’s hard to run an antenna into a computer room, especially when the cable is limited to 40 or 50 meters." There currently are 43 measurement stations in the field, most of them in the RIPE-NCC service area (Europe, the Middle East, North Africa, and the former Soviet Union). They provide daily plots showing the network delays to the sites hosting their test boxes, and are working on out-of-tolerance alarms and long-term trend maps for the networks operated by their customers. They also look at routing vectors and churning of routes to get a general idea of what changes in topology are occurring. RIPE-NCC is very interested in analyses that will characterize network Quality of Service.
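As a rough illustration of the kind of metrics described above, the following sketch computes one-way delay and packet loss for one direction of a path from send and receive timestamps. It assumes the two clocks are synchronized (for example, via GPS). The function name, data layout, and sample values are hypothetical and are not taken from the Test-Traffic software.

```python
from statistics import median


def one_way_stats(sent, received):
    """Summarize one-way delay and loss for one direction of a path.

    sent:     dict mapping probe sequence number -> send time (seconds)
    received: dict mapping probe sequence number -> receive time (seconds)
    Both clocks are assumed to be synchronized (e.g., via GPS).
    """
    # A probe counts as delivered only if its sequence number was seen
    # at the receiver; everything else is treated as lost.
    delays = [received[seq] - t for seq, t in sent.items() if seq in received]
    lost = len(sent) - len(delays)
    return {
        "probes": len(sent),
        "lost": lost,
        "loss_fraction": lost / len(sent) if sent else 0.0,
        "min_delay": min(delays) if delays else None,
        "median_delay": median(delays) if delays else None,
    }


# Hypothetical sample: four probes sent, probe 3 never arrives.
sent = {1: 0.000, 2: 1.000, 3: 2.000, 4: 3.000}
received = {1: 0.042, 2: 1.040, 4: 3.055}
stats = one_way_stats(sent, received)
print(stats)
```

Without synchronized clocks the subtraction above would mix clock offset into every delay value, which is why the GPS antenna problem matters in practice.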

Andrew Adams from NCNE at PSC presented information on the NIMI (National Internet Measurement Infrastructure) software system. NIMI facilitates widespread deployment of measurement platforms to determine network characteristics using active tools such as ping and throughput. It is designed to be dynamic and scalable as well as high-level and uniform so users can collect the data from different clients in a uniform format for later analysis. It is still undergoing testing; they are waiting to see how it scales when they increase deployment to 25 machines. For future versions they want to provide public key serving, time synchronization tools, multicast measurement tools, and maybe a directory service.

The next talk was by J.J. Jamison, on StarTap and their plans for performance measurements. He presented an interesting idea of peering multiple AMP/Surveyor constellations through StarTap – for example, to show the connectivity of the APAN network relative to the nationwide AMP constellation. This would require figuring out how to do analysis without a full mesh of measurement systems. He finished by pointing out the value of having third-party systems such as AMP and Surveyor, as they give validity to internal measurements.

Discussions

Bill Decker led a discussion on OMB Circular A-110: "Uniform Administrative Requirements for Grants and Agreements With Institutions of Higher Education, Hospitals, and Other Non-Profit Organizations." Language was included in the omnibus spending bill directing OMB to revise Circular A-110 to require federal agencies to ensure that all data produced under grants are made available pursuant to procedures established under the Freedom of Information Act (FOIA). This legislation has generated serious concerns within the scientific and academic communities.

This concerns network researchers because we at times collect data that must be "sanitized" before being made public to protect the privacy of users of the network. If we have to make the data available to anyone on request, then it will become very difficult to gain access to the data in the first place. Many network providers are balking at the idea of providing data under those rules; universities are investigating the proposed rule’s consequences and have found the problems due to concerns over privacy to be extremely limiting. (Also, as written this rule affects every project that receives any federal funds, regardless of the percentage of the total funding.) "Research of any kind will be devastated by this regulation," commented Tracie Monk of CAIDA at UCSD.

When this was concluded, the discussion turned to the ways in which measurement and analysis activities can support better operation of the network and better service for end users.

One topic concerned possible ways in which our measurement and analysis projects or methods could help identify problems on a campus. Several participants suggested that portable Surveyor or AMP active monitor devices would be extremely useful inside campus networks. These units could be used by on-campus researchers or could be provided to campus network managers to localize problems or even to alert them that problems exist. Managers regard Network Operations Centers as binary – they’re either up or they’re down. They have no tools to diagnose "poor" performance.

Javad Boroumand asked: if we could have whatever we wanted, how many AMP machines per campus would we like to have? Would we like to have software on every desktop system on campus? If so, what?

Hans-Werner Braun noted that distributed meshes of measurement machines need to be as nearly identical as possible for consistent, meaningful results.

Tony McGregor asked whether it would be a good idea to pick a couple of example campuses to focus on. What kind of machines would be appropriate? Any particular configuration? Could we try both? We could install an AMP standard configuration, and also take measurements from a "minimalist" configuration. Perhaps we could use some small, non-AMP machines as "satellite" units for on-campus measurements with an AMP unit.

Bill Decker asked what measurement, analysis, and reporting tasks the workshop participants can accomplish to inform the PITAC review in January 2000 and the NGI review in spring of 2000. It would be very good for the network research and analysis community to be able to demonstrate that scientists can do better science using high-performance networks because we have used our measurements to improve the functionality of these networks. What can we report by January, and how can we make that happen? What can we do in the next few months to create a set of reports and measurements that we can use to justify our answers to PITAC questions?

It was pointed out that NGI needs to document good performance as well as bad.

Phil Dykstra identified action items for NGI:

• Start tabulating and testing the "100 sites" of goal 2.1

• Set up a measurement mesh between NGI sites

• Build an acceptable "traffic report"

• Keep better track of the top speeds and applications on the network

• Put greater focus on the campus and on end systems.

A consensus developed that the first step is to start to enumerate and test the 100 sites as required by NGI goal 2.1. The task is more difficult than it sounds, as many "last mile" problems will persist. In the end it was decided that someone must be assigned by NGI to handle the task.

Regarding identification of the 100 sites, Dykstra developed a list of questions for which each site could supply answers:

A) Site name and preferred short name

B) Point of Contact

C) Lat/Long or Phone number (to determine geographic location)

D) Test address (IP address that’s representative of site’s connectivity)

E) Network connections (DREN, Abilene, vBNS, ESnet, commodity Internet, etc.)

F) Speed of campus connection

G) Measurement machines (AMP, Surveyor, NIMI, Netperf)

H) Number of desktop systems that link to the high-performance network
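One way to keep the answers to these questions in a consistent form is a simple record per site. The sketch below is only an illustration: the field names, the `missing_answers` helper, and the example values (including the documentation-range IP address) are hypothetical and were not part of the workshop discussion.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SiteRecord:
    """One site's answers to questions A through H (hypothetical layout)."""
    name: str                                   # A) site name
    short_name: str                             # A) preferred short name
    point_of_contact: str                       # B)
    location: str                               # C) lat/long or phone number
    test_address: str                           # D) representative IP address
    network_connections: List[str] = field(default_factory=list)   # E)
    campus_speed_mbps: Optional[float] = None                       # F)
    measurement_machines: List[str] = field(default_factory=list)  # G)
    hp_desktops: Optional[int] = None                               # H)

    def missing_answers(self):
        """Return the names of questions that have not been answered yet."""
        return [name for name, value in vars(self).items()
                if value is None or value == "" or value == []]


# Hypothetical, partially completed entry.
site = SiteRecord(
    name="Example University",
    short_name="ExU",
    point_of_contact="",
    location="",
    test_address="192.0.2.10",
    network_connections=["vBNS"],
)
print(site.missing_answers())
```

A helper like `missing_answers` makes it easy to see which sites still need follow-up before testing can begin.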

We also need to advise PITAC about technical issues and meaningful network performance metrics. However, simple performance questions, such as "what can an individual biology researcher expect to get from an HPC connection" are also meaningful, and we need to work hard to answer them.

Hans-Werner Braun noted that it is important for us (the workshop participants, and the networking measurement and analysis community in general) to communicate our needs, goals, and objectives to decision makers and funding agencies.

The discussion turned to research concerns. Most of these can best be expressed in the form of questions; for some there are no obvious answers.

Why aren’t more of the research results published? What’s the level of linkage between the "pure" network research community and this community?

How can we best ensure validation of data? The consensus was that this could be accomplished by comparing measurements from AMP, Surveyor, and other measurement systems.

What about involving other disciplines in our efforts? For example, where are the statisticians? (It turns out that WAND has statisticians on the project.) What about economists, to facilitate a more meaningful discussion on the relationship of network performance to return on investment?

Suppose the NSF were to put out a new program announcement ... What should it emphasize?

Can we set up a common list of all researchers gathering data and analyzing traffic using AMP/Surveyor/NIMI/Skitter methods?

On conclusion of the discussions, two demonstrations were presented. One showed off the capabilities of Cichlid servers (including one developed at CMU that shows the growth of the vBNS); the other showed a head tracker that lets you fly through a Cichlid visualization with your head.

Participants were encouraged to contribute their comments after the workshop; those comments are included in the next section of the proceedings.

Comments By Participants

Jill Gemmill

University of Alabama at Birmingham

Here are my thoughts on the need for measurement as seen from a campus and end-user point of view.

First, the people most interested in solving a problem are the ones most impacted by the problem. Therefore, the availability of tools that could be used directly by an end user (or, worst case, one support person away from the end user) to diagnose and troubleshoot would be valuable. Examples include:

· Some "standard" portable desktop unit with a known high performance throughput, ability to time precisely (on a human time scale) transfers of very large files (disk performance), and a variety of other workstation performance metrics. I suppose there would have to be one or more identical destination workstations; an end user could unplug their workstation, plug in the portable unit, and compare the known performance to that of their own workstation. It would give them some clues about where to look to increase workstation performance. The same equipment could be used to see if campus network performance matches the expected throughput and so on.

· Some "AMP-type" monitors that were distributed across a campus; they would do pairwise comparisons mostly across campus, and would have a very limited set of across-backbone pairings.

· Eventually it would be nice to see some correlation of performance and underlying network architecture, down to the campus level.

Bill asked what could be measured by January:

· If "which 100 institutions are connected" is an important question, it seems NSF should shake loose some staff time to see which of the 150 vBNS funded institutions are connected. (Traceroutes show the path taken to the home page of the institution).

· NLANR DAST has written code that takes into account TCP window size, MTU and so on. Since FTP appears to be the #1 application on the vBNS, could some timing comparison be done of the transfer time at 100 Mb/s of a 2 GB file "before" and "after"?
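The arithmetic behind such a "before" and "after" comparison can be sketched as follows. This is an illustration only, not the NLANR DAST code. It assumes that 2 GB means 2 x 10^9 bytes, that the round-trip time is 70 ms (a hypothetical cross-country value), and that the "before" case uses a 65,535-byte TCP window, the largest possible without window scaling. Protocol overhead and loss are ignored.

```python
def transfer_seconds(file_bytes, rate_bps):
    """Time to move file_bytes at a sustained rate of rate_bps."""
    return file_bytes * 8 / rate_bps


def window_limited_rate_bps(window_bytes, rtt_seconds):
    """Upper bound on TCP throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def required_window_bytes(rate_bps, rtt_seconds):
    """Bandwidth-delay product: window needed to keep the path full."""
    return rate_bps * rtt_seconds / 8


FILE_BYTES = 2 * 10**9        # assumed meaning of "2 GB"
LINK_BPS = 100 * 10**6        # 100 Mb/s
RTT = 0.070                   # assumed round-trip time, in seconds
DEFAULT_WINDOW = 65535        # maximum window without window scaling

# "After": the window is large enough, so the link rate is the limit.
after = transfer_seconds(FILE_BYTES, LINK_BPS)

# "Before": throughput is capped by window / RTT, about 7.5 Mb/s here.
before_rate = min(LINK_BPS, window_limited_rate_bps(DEFAULT_WINDOW, RTT))
before = transfer_seconds(FILE_BYTES, before_rate)

# Window needed to reach 100 Mb/s at this RTT, about 875,000 bytes.
needed = required_window_bytes(LINK_BPS, RTT)

print(f"after tuning:  {after:.0f} s")
print(f"before tuning: {before:.0f} s at {before_rate / 1e6:.2f} Mb/s")
print(f"window needed: {needed:.0f} bytes")
```

Under these assumptions the untuned transfer takes roughly 13 times longer than the tuned one, which is the kind of difference a timing comparison would be expected to show.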

Chris Thomas

UCLA Research Technology Services – Office of Academic Computing

NSF wants to find out whether the money they’re spending on high performance networks is yielding value, which is reasonable enough. However, it seems to me that the questions they’re asking Bill Decker are not the correct questions. I understand Bill’s concern about not being able to make that bold a statement to the people asking the questions; however, the nature of the questions is significant with a potentially large impact on future funding of high performance network efforts, and needs to be addressed.

The crux of the matter is that while the networking efforts (e.g. vBNS) which NSF has funded are *necessary* to enabling high performance application to application connectivity, those efforts are not *sufficient* to do so. That is, a high performance backbone is certainly a necessary first step. But then the onramp/offramp issues must be addressed. This means that each campus must take the necessary steps to provide high performance from the campus connection to the vBNS to the desktop, and to tune the desktop machine (e.g., appropriate TCP window size et al).

This onramp/offramp enablement will naturally trail the availability of the high performance backbone since it is a "chicken and egg" issue. There is no point in providing 100 Mb/s switched connectivity from the researchers’ desktops to the campus edge until a high performance backbone is available. There is also a time lag while researchers learn about the availability of the high performance backbone and begin to design research projects which can take advantage of it. One site mentioned a 5 year ramp up. This is discouraging, but it may be realistic.

We’ve observed that the high speed backbones seem to do exactly what they are supposed to. In UCLA’s case, we have an OC-12 CalREN2 connection to the other eight schools on the ring, and an OC-3 connection to the vBNS. As far as we can tell, we get 622 Mb/s between our GSR and the other GSRs, with almost zero jitter or packet loss. But that’s a very different question than asking whether every researcher gets 100 Mb/sec application to application performance across the network.

There was some discussion about NSF wanting to know how many of the desktops on a campus had high performance connectivity. Again, this doesn’t seem to be the best question. We have at least 10,000 desktops at UCLA (maybe twice that). 95% of them have no need for anything better than shared 10 Mb/s connectivity (although the campus goal is to provide switched 10 Mb/s to all workstations as the standard). This includes a large number of student labs, lots of administrative workstations, etc. Of course, all workstations on campus are routed through the high performance networks when connecting to other sites on those same networks, and will benefit by whatever degree their local attachments allow. But we’ve identified a relatively small initial number of researchers’ workstations which need high performance and are working on enabling those for real high performance. Statements about the 10,000 workstations are going to swamp the meaningful work in a mass of meaningless statistics.

One thing an answer to NSF’s questions should focus on is that in addition to raw bandwidth, there should be emphasis on the lower latency and very much lower packet loss being provided by the vBNS as compared to the commodity networks. This is something which is there today and which benefits almost all applications, regardless of the bandwidth they are currently getting.

There was a lot of discussion on the second day of the meeting about grandiose plans to roll out large numbers of additional measurement machines. While I believe that NSF should probably be explicitly funding measurement activities, it is clear that rolling out hundreds of machines is not something which can occur in time to respond to the current questions. What could be done is to use existing AMP and Surveyor data from in-place machines and to possibly roll out simple, easy-to-install software packages. I would suggest that Netperf is a good candidate for this (http://www.netperf.org/) and it’s free. It has two disadvantages – it is open to denial of service attacks and it only measures bandwidth, not latency or (directly, anyway) packet loss. UCLA is currently providing a "public" Netperf machine (netperf.ucla.edu) to allow other HPC sites to measure the bandwidth of their connections to the UCLA campus.

Lastly, there was a suggestion that most of us had old junker machines which could be installed as measurement devices at no cost. Unfortunately, our experience argues to the contrary. While such machines can reply to pings, the NSF questions request the demonstration of bandwidth. Older machines, even slightly older machines, have neither the CPU horsepower nor the bus bandwidth to drive a network card at anywhere near 100 Mb/sec. Any machines to be deployed as measurement machines should be able to obtain network speeds of 93-95 Mb/sec. Machines which cannot do this will fail to demonstrate the bandwidth inherent in today’s high performance networks. I agree with Hans-Werner’s observation that distributed meshes of measurement machines need to be as nearly identical as possible. Even if this means fewer machines, it also means valid dependable data.
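The 93-95 Mb/sec figure is consistent with the framing overhead of 100 Mb/s Ethernet. The sketch below is an illustration added for context, not part of the original comments. It estimates the best-case TCP payload rate assuming a 1500-byte MTU, 20-byte IPv4 and TCP headers, and standard Ethernet per-frame overhead (8 bytes of preamble, 14 bytes of header, 4 bytes of frame check sequence, and a 12-byte inter-frame gap). It ignores acknowledgement traffic and assumes a full-duplex link.

```python
LINE_RATE_MBPS = 100.0
MTU = 1500                       # bytes of IP packet per Ethernet frame
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble, header, FCS, inter-frame gap
IP_HEADER = 20                   # IPv4 header without options
TCP_HEADER = 20                  # TCP header without options


def tcp_goodput_mbps(tcp_option_bytes=0):
    """Best-case TCP payload rate on the link, in Mb/s."""
    payload = MTU - IP_HEADER - TCP_HEADER - tcp_option_bytes
    wire_bytes = MTU + ETH_OVERHEAD   # bytes of link time used per frame
    return LINE_RATE_MBPS * payload / wire_bytes


no_options = tcp_goodput_mbps()         # about 94.9 Mb/s
with_timestamps = tcp_goodput_mbps(12)  # about 94.1 Mb/s with TCP timestamps

print(f"no TCP options:      {no_options:.2f} Mb/s")
print(f"with TCP timestamps: {with_timestamps:.2f} Mb/s")
```

A machine that sustains 93-95 Mb/sec is therefore running close to the limit of the link itself, so a lower result points at the host rather than the network.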

Presentation Materials

The following pages reproduce materials that were presented in the form of slides, handouts, or screen images during the workshop.