NLANR, Summary quarterly status report, 1 jan 96 to 31 mar 96

National Laboratory for Applied Network Research

Summary quarterly status report
1 jan 96 to 31 mar 96

Cooperative Agreement No. NCR-9415666 with the National Science Foundation

Contents (activities by site):
Intro
MCI
Cornell Theory Center
National Center for Atmospheric Research
National Center for Supercomputing Applications (NCSA)
Pittsburgh Supercomputing Center (PSC)
SDSC
Agenda for next quarter/year

Recent operations status reports:
1996 quarter 1 report (1 jan - 31 mar 96)
1995 annual report (1 may - 31 dec 95)

Introduction

As new opportunities for fostering the evolution of the national network infrastructure arise, the Division of Networking and Communications Research Infrastructure (DNCRI) of the National Science Foundation is focusing on facilitating network research activities on leading edge networks. In pursuit of this agenda, DNCRI supports the National Laboratory for Applied Network Research (NLANR), a coordination and research grounding function initially seeded from the current NSF supported supercomputing sites: Cornell Theory Center (CTC), National Center for Atmospheric Research (NCAR), National Center for Supercomputing Applications (NCSA), Pittsburgh Supercomputing Center (PSC), and the San Diego Supercomputer Center (SDSC). A primary objective of NLANR is to support researchers on the NSF very high speed Backbone Network Services (vBNS), a national network research vehicle that connects the 5 sites mentioned above at high bandwidth.

Since its inception in May 1995, NLANR has undertaken a number of activities in line with its chartered mission and workscope, which fall into three areas:

  1. gathering, presenting, and leveraging information about the network
  2. collaboration environments
  3. easing access to and use of network research resources by the R&E community

During the first quarter of 1996, the sites continued supporting vBNS researchers. A vBNS Technical Coordinating Committee meeting was held at SDSC on February 16, where plans were discussed to deploy test network enhancements to support OC-12 HIPPI-HIPPI connectivity between SDSC and PSC. OC-12 connectivity to the Network Engineering Lab in Richardson and the Internet Engineering Lab in Reston is part of the planned test network topology. At this meeting the National Science Foundation also clarified the approval policy for vBNS use so that projects that need high-speed connectivity between supercomputer centers can just inform the Technical Committee mailing list and begin using the vBNS. In addition, Hans-Werner Braun has assisted NSF/DNCRI with their newly announced Connections program by handling questions from several reporters on networking issues surrounding the program.

MCI

Joel Apisdorf at MCI is developing an OC-3 packet/cell monitor. He presented its status at the NLANR Traffic Statistics and Measurement Workshop held in San Diego February 19-20, and demonstrated IP flow-based monitoring on an OC-3 vBNS trunk leaving SDSC. The data produced by the monitor is compatible with interactive query and visualization tools already in use at SDSC to analyze FIX-West traffic.

A measurement study is underway in the test network to evaluate the Early Packet Discard algorithm as implemented in the Fore Systems ASX-200BX ATM switch. Results so far show significant improvement in TCP performance. More measurements with larger numbers of simultaneous TCP flows are being made to complete the study.

More information on traffic statistics and engineering activities is available in the vBNS monthly reports.

Cornell Theory Center (CTC)

CTC established a desktop video conference routed across the vBNS for a vTCC meeting, using a CU-SeeMe reflector (logic.tc.cornell.edu). The reflector attached to a multicast group, with nv clients using CU-SeeMe encoding and CU-SeeMe clients connected to the reflector. The conference worked but had few participants. An attempt to add a second CU-SeeMe reflector to this multicast group uncovered a bug: multiple CU-SeeMe video streams between the reflectors were not distinguished across the multicast path and were combined into one. Bruce Johnson will work with the developers to resolve this bug.

Regarding vBNS routing, CTC does not yet have the physical connection needed to BGP peer with the other sites, but hopes to before the end of April.

National Center for Atmospheric Research (NCAR)

Personnel
Chris Fair continues as the NCAR NLANR-funded engineer. NLANR-funded technician support continues to be provided by existing NCAR staff network technicians.

As SDSC staff hosted by NCAR, Duane Wessels began working on March 11, funded to work on NLANR and caching research and engineering.

Accomplishments
Fair worked with SDSC NLANR personnel to reconfigure the SDSC C90 to accommodate HIPPI fabric changes on the NCAR J9 used for DCSL. He analyzed IP performance over the vBNS between the NCAR Cray J9 and the SDSC C90 to observe TCP ARQ performance over a high bandwidth-delay product ATM network.

Fair worked with Claffy, Wessels and Digital Equipment to troubleshoot and correct an illegal-length FDDI packet problem on the NCAR DMZ that was adversely affecting the Boulder web caching machine. They observed two bugs. First, the Cisco router sends bad packets, which seems to be stoppable with the command NO CBUS CACHE. Second, the FDDI interface in the Digital Alpha halts after receiving a certain number of malformed packets, preventing further access to the machine via that interface. We have still not found an FDDI driver that works perfectly. One seems to work okay but sometimes runs out of mbufs (on the NCSA and FIX-West machines).

Fair installed a Fore Systems ATM NIC in the second Indigo 2 workstation and configured software for Classical IP ATM connectivity to all vBNS endpoints. He configured local ATM for direct connectivity from the NCAR Onyx, magic-atm.scd.ucar.edu, through the vBNS Lightstream at NCAR to the CTC NetStar. Fair also configured the Onyx for direct Classical IP connectivity to kasina.nlanr.net, Claffy's workstation at SDSC, in support of the NCAR Visualization Lab demo in mid-February 1996. He troubleshot and corrected an atmarp problem between the NCAR Onyx and the PSC NetStar: the NetStar was not responding to atmarp requests from the ARP server at NCAR, which prevented revalidation of the PVC connection. NCAR NLANR personnel coordinated and managed the installation of the US WEST OC-3 connecting the NCAR Mesa and Foothills Labs in January 1996. Fair configured an NCAR premise ATM switch for full mesh PVC connectivity between sugarloaf-atm.scd.ucar.edu and all vBNS Cisco and NetStar routers and Cisco Lightstreams.

Fair attended the vBNS/NLANR workshop held at SDSC on February 16, 1996.

NLANR personnel prepared an ATM testbed between the Mesa and Foothills Laboratories, including two Cisco 7507 ATM routers and Fore Systems ATM switches.

NCAR NLANR personnel selected and ordered Fore NICs for the current vBNS-attached machines at NCAR.

NCAR requested clarification on the forthcoming NSF policy for additional university sector vBNS connectivity.

Applications
Distributed Climate Simulation Laboratory (DCSL)
NCAR installed new Middlepark file server funded from DCSL project funds. Middlepark has on order 90GB of RAID3 and 63GB of RAID5 storage. Purchase Requests have been submitted to add 183GB of RAID3 and 43GB of RAID5. The total storage of Middlepark will be 378GB, with 272GB of RAID3 and 106GB of RAID5 storage.

FDDI load testing will commence shortly in preparation for the move to Middlepark. Middlepark is configured with four FDDI interfaces, one OC-3 ATM interface, and one HIPPI interface.

This machine will also be our testbed for the *FS MSS project, which will use the DMIG DMAPI hooks in the SGI kernel to attach the MSS to the file server as an HSM. This server will be our prototype for the Data Park and for file service front-ends to the MSS in general.

NCAR visualization staff developed an interface to CSM (Climate System Model) data for the VIS5D application. Additional enhancements were made to this visualization application, including the ability to display stereo 3D imagery.

An experimental remote CSM data site was established on the direct-ATM-connected NLANR machine, kasina.nlanr.net. Kasina's HTTP server was configured to support new datatypes associated with the CSM data, and a collection of CSM data was organized on the system. Sample datasets contained variables for both the atmospheric and ocean component of the coupled climate model.

As an experiment, a stock Netscape client was run on NCAR's magic system (2-processor SGI Onyx Reality Engine) and used to access the CSM data website on kasina. When one of the datasets is selected, it is downloaded to magic over the vBNS using the HTTP protocol. Netscape then automatically loads the customized visualization application.

The visualization application then loads the CSM dataset into memory making it available for interactive 3D exploration and animation. When we have demonstrated this experiment, we have employed an active-stereo configuration on the Onyx coupled with a large, high-resolution screen. The net effect is to demonstrate a prototype virtual environment for browsing large complex climate datasets on the web using a high-bandwidth, wide-area network.

Performance was less than thrilling but didn't offer too many surprises. Cursory comparisons indicated that as a data transfer protocol, HTTP was a bit slower than "rcp" and slower than "ftp" by perhaps 25%. In practice, we saw HTTP rates roughly equivalent to that of Ethernet (1.2MB/s) or slower. Again, this is no surprise, as the latency between kasina and magic is generally about 40ms. Kasina and magic are both SGI machines and thus have default buffer sizes of 60KB; with at most one buffer in flight per round trip, the overall transfer rate is bounded by 60KB / 40ms = 1.5MB/s. Climate datasets are routinely several gigabytes, so faster transfer rates would be of great benefit for highly interactive activities such as browsing. Even so, our datasets were typically less than 100MB and represented several variables and a year of data, a reasonably good "test".
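
The transfer-rate estimate above can be reproduced with a short calculation. The following is a minimal sketch, not part of the original experiment: it assumes decimal units (1KB = 1000 bytes), and the function names are chosen here purely for illustration.

```python
def window_limited_throughput(buffer_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput (bytes/s) when at most one buffer's worth
    of data can be in flight per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return buffer_bytes / rtt_seconds


def buffer_needed(target_bytes_per_s: float, rtt_seconds: float) -> float:
    """Buffer size (bytes) needed to sustain a target rate at a given RTT."""
    return target_bytes_per_s * rtt_seconds


KB = 1000  # assumption: decimal kilobytes, matching the 1.5MB/s figure

# Figures from the report: 60KB default buffers, about 40ms latency.
bound = window_limited_throughput(60 * KB, 0.040)
print(f"window-limited bound: {bound / 1e6:.2f} MB/s")

# Hypothetical target, for illustration only: 10 MB/s over the same path.
needed = buffer_needed(10e6, 0.040)
print(f"buffer for 10 MB/s at 40ms: {needed / KB:.0f} KB")
```

The same relation shows why larger buffers matter on this path: at a fixed round-trip time, the achievable rate scales linearly with the buffer size.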

Ultimately, it may be desirable to have tunable buffer sizes for HTTP as well as "ftp" and "rcp". The protocol is certainly seeing heavy use for transferring data and this can be expected to grow at a tremendous rate. Question: what are the tradeoffs in configuring an HTTP connection for interactive use vs. data transfer? Another possibility is to employ customized ftp applications and use them as transfer agents in conjunction with data-browsing web clients.

One can easily imagine extensive climate and/or climatological datasets sited all across the web and available for downloading as well as browsing and study. Ultimately, scientists will want to explore these datasets interactively and the scenario described in this experiment is a fairly practical model.

Internet Data Distribution System Project: no update

HPCC Grand Challenge Astrophysical Turbulence: U.Col should have OC-3 to NCAR in April. (ed: was active as of 19 apr 96; NSF approved the application on 22 apr 96.)

HPCC Grand Challenge Geophysical Turbulence Project
NCAR has worked with NLANR personnel at PSC and CTC to establish ATM connectivity over the vBNS between the Crays at PSC and the SGI Onyx at NCAR, and between the IBM SP2 at CTC and the SGI Onyx at NCAR.

Large data transfers are taking place between PSC and NCAR, with the same to commence soon between CTC and NCAR. Final routing and PVC configurations are being made for that connectivity.

National Center for Supercomputing Applications (NCSA)

NCSA was involved with BGP peering, and has been testing and debugging its route server implementation, which is now operational. NCSA ported public-domain FTP tools to its SGI Power Challenge array and modified them to use large TCP window sizes. This allowed users to take full advantage of the vBNS's high bandwidth: FTP throughput between NCSA and PSC increased by nearly an order of magnitude, from 2 megabytes/second to 13 megabytes/second.
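
As an illustration of what requesting a large TCP window looks like from an application, here is a minimal sketch using the standard Python socket module. It is not NCSA's modified FTP code; the function name and the 512KB figure are assumptions chosen for the example, and the operating system may adjust or cap the size it actually grants.

```python
import socket


def make_bulk_transfer_socket(buffer_bytes: int) -> socket.socket:
    """Create a TCP socket and request larger send/receive buffers.

    The buffers must be requested before connecting so that the larger
    window can be negotiated when the connection is set up.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buffer_bytes)
    return sock


# Hypothetical request size; the kernel decides what it really grants.
sock = make_bulk_transfer_socket(512 * 1024)
granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"send buffer granted: {granted_send} bytes")
print(f"receive buffer granted: {granted_recv} bytes")
sock.close()
```

Reading the values back with getsockopt is worthwhile because system-wide limits can silently reduce the requested size.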

Randy Butler is now a technical advisor to Steve Goldstein for the Group of Seven Information Society Global Interoperability for Broadband Networks (GIBN). He attended his first meeting in January, hosted by France, and presented a strategy for the interconnection of Japan, North America and Europe. The next meeting is scheduled for April 28 and 29 in Berlin. The focus of the meeting will again be on the interconnection issues, with heavy involvement from the carriers. Mr. Butler will continue to pursue activities and connection issues that may involve the vBNS. For more information see http://www.ncsa.uiuc.edu/General/GIBN/

As a result of the I-WAY demonstration at SC95, NCSA submitted a proposal to ARPA to fund "Towards a Persistant I-WAY". The networking group at NCSA contributed to two parts of that proposal. One is to provide a common access point to interconnect the major U.S. testbed networks. The U.S. testbeds, including the vBNS, suffer from a lack of connectivity to each other, something that Japan, Canada and Europe have worked hard on. We believe that a common interconnection point, not unlike the NAP design, will enable the interconnection of the U.S. testbeds in a reliable fashion. The proposal asked for FTE dollars to support the effort, and at least one major network vendor has agreed to support the effort with equipment loans. We also hope to use this Testbed Access Point (TAP) as a site to interconnect the international testbeds.

The second piece of the netdev component is the development of real-time network performance tools. Central to this is the extension of the MCI/NSF funded monitors into real-time, distributed, programmable monitors capable of gathering both IP and ATM statistics. The work focuses on specific ATM VCs and on feeding this information back into performance tools being built by Dr. Daniel Reed of the University of Illinois.

NCSA continues its upgrade of the local area network to support switched Ethernets and ATM. Many components have now been tested in our testbed and deployment is scheduled. The migration includes an ATM backbone replacement for our FDDI ring.

A good paper resulted from the I-WAY experience and the usage of the vBNS: Galaxies Collide on the I-WAY: An Example of Heterogeneous Wide Area Collaborative Supercomputing, by Michael Norman, Peter Beckman, Greg Bryan, John Dubinski, Dennis Gannon, Lars Hernquist, Kate Keahey, Jeremiah Ostriker, John Shalf, Joel Welling, and Shelby Yang. It will be published in the summer 96 edition of International Journal of Supercomputer Applications and High Performance Computing.

Charlie Catlett gave an overview of the NLANR activities on a panel at Interop with George Strawn and Chas Lee.

Pittsburgh Supercomputing Center (PSC)

PSC focused on facilitating the use of the vBNS by both the networking and applications communities, including ongoing vBNS support, testing, in-house TCP/IP research on the vBNS, and co-coordination of the vTCC.
vBNS Connectivity
PSC is currently upgrading its machine room network to include additional ATM equipment. While not yet connected to the vBNS, we expect that this equipment will support a wider variety of ATM based connections through PSC's facilities. PSC is also working with MCI on site planning for the OC-12 upgrade to the vBNS test network, including discussions with SDSC which will also be on the test net.

We are currently working with three different research projects with equipment located in our machine room, directly connected to the vBNS. This past quarter we have provided some hardware support for the University of Pittsburgh based ATM Traffic Analysis project, including troubleshooting network problems associated with the vBNS and the tester. We have also recently connected an Essential Systems NetHiWay router to the vBNS for testing. We are also working with Hui Zhang from CMU to connect a CMU test ATM environment to the vBNS test network. This application, recently approved by the vBNS vTCC, will use existing fiber between the CMU CS department and PSC's machine room. We have begun working with both MCI and CMU to specify the requirements associated with the connection, which should be in place in April.

This past quarter we have worked considerably with MCI to improve routing on the vBNS. During January, PSC received special permission from the NSF to place all PSC and associated traffic on the vBNS during a catastrophic outage of our commodity connection. This exercise pointed out a number of issues associated with routing configurations (site, vBNS and NAP) over this infrastructure. Since then, the group has agreed on, and PSC has implemented, routing in support of full intercenter traffic flowing over the vBNS.

Facilitating Research
In response to concerns from researchers, we have begun discussions to facilitate and streamline requests by vBNS researchers for machine time allocations at multiple sites. Currently researchers must not only request vBNS time, but must also separately request the corresponding machine time at each Supercomputing Center. The current proposal suggests a single point of contact who would work with the researcher as well as the allocations group at each center to process the allocations request. Long term, we would like to automate the process by connecting an allocations request form to the current vBNS request form. When requiring machine time at multiple sites, the researcher would fill out both the vBNS request and the allocation request forms. The allocation request form would be forwarded to the allocations group at one center (PSC has volunteered to prototype and test this process), who would then facilitate the process. While this process does not eliminate the need to request allocations at multiple sites (a necessity due to the different allocations processes at each center), it provides an informed single point of contact for the researcher. We hope that this will stimulate more application requests and use of the vBNS.

Jamshid Mahdavi has continued in his capacity as co-chair of the vTCC. This past quarter, he not only chaired his share of the vTCC meetings (with pre-meeting notes), but also helped plan and organize the February vBNS technical meeting.

TCP Analysis
Mathis and Mahdavi (PSC) have continued their work on TCP performance issues this past quarter. Specifically, they finished work on the SACK Internet Draft and presented it at the March 1996 IETF. This draft is currently slated to become an IETF RFC. Both are currently working on implementations of the RFC, focusing on NetBSD and Digital Unix. These implementations will be tested over both the vBNS and the commodity network as soon as they are completed.
Treno
PSC's treno server, introduced in the last quarterly report, has proved to be quite popular with the network community. In the past quarter it has been used over 550 times, providing complete responses for 490 of those requests. Since its deployment last fall, it has been used over 1000 times. The code itself, available from the PSC WWW server, has been retrieved 98 times since the beginning of the year.

Mahdavi is currently working on server modifications which will allow it to use the vBNS and an encapsulation scheme (GRE) to get to arbitrary points in the Internet. If the US based scheme works well using the vBNS, he hopes to expand it to include international links as well. We plan to deploy this new server along with a directly connected vBNS treno server by the beginning of next quarter.

SDSC

SDSC NLANR activities have focused on multicast and collaboration environment support, as well as research and infrastructural issues in gathering, presenting, and leveraging information about the Internet.

Making NLANR a viable research vehicle for the community requires research on the network itself, including an overlapping area of interest between research and operations. NSF supported an NLANR workshop on Internet statistics measurement and analysis, which gathered representatives of four diverse but interdependent communities: Internet service providers, equipment vendors, researchers, and large user constituencies. The objective of the workshop was to explore the possible scope of an agenda for concerted Internet statistics collection efforts in a post-NSFNET environment. Specifically, the transition away from the NSFNET backbone paradigm into one of competing commercial service providers has led to an environment where there is little opportunity for users, researchers, and even ISPs themselves, to investigate and diagnose network behavioral difficulties.

The workshop provided a forum for ISPs, their upstream suppliers, and their downstream consumers to articulate their needs for and constraints on statistics collection. The final report for the workshop was presented to the Federal Networking Council for their consideration at the April 1996 meeting, where statistics collection was a primary agenda item.

NLANR activities have included focusing on the needs articulated at the workshop:

In line with the goal of using the information gathered from network analysis to improve the performance of the architecture, NLANR continues work on its information caching prototype. A key result of Internet workload analysis has been the recognition of an increasing proportion of web traffic as measured at reachable locations, together with simulation results indicating that caching could leverage the high degree of redundancy in web document transmissions.

Caching system prototype

Participation in the NLANR caching project continues to grow. The machines arrived from Digital in late November 1995 and were up and running within a couple of days. Each machine has 10 Gbytes of disk space and 128 Mbytes of RAM. The caches used the original Harvest cache software (http://harvest.cs.colorado.edu/), designed for performance and hierarchical scaling. In March 1996 Duane Wessels assumed the technical lead for the project, and he coordinates a team of community volunteers interested in maintaining the free version of the cache. (Peter Danzig, architect of the original cache, is providing commercialization support for the Harvest cache.)

We had initially hoped to attract a number of people and organizations in the U.S. to use the NLANR caches. We would encourage them to run the Harvest cache at their site with one or more of the NLANR sites as a parent cache. As it turned out, however, we began to get inquiries from international organizations. Similar caching systems were being deployed (or were already in place) in New Zealand, Australia, the U.K., Poland and other countries.

Even now we have more users from outside the U.S. Following is a list of sites using the NLANR caches:

The sites shown in bold have established ``mutual parent'' relationships with us. That is, we agree to be their parent cache for URLs located in the U.S., and they are our parent for URLs located within their country.

So far, we have not placed any restrictions on using the caches. We accept connections from any address and will process any URL request. This allows, for example, sites in New Zealand to retrieve URLs from the U.K. through us.
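
The ``mutual parent'' arrangement described above can be sketched as a simple routing rule. The following is an illustrative sketch only: it assumes that a URL's country is inferred from the top-level domain of its host name, which the report does not state, and the function name and peer cache host names are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def pick_parent(url: str, parents_by_tld: Dict[str, str]) -> Optional[str]:
    """Return the peer cache acting as parent for this URL, or None.

    Assumption: the country of a URL is taken from the last label of
    its host name (for example "nz" or "uk").
    """
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return parents_by_tld.get(tld)


# Hypothetical peers that agreed to be our parent for their own country.
peers = {
    "nz": "cache.example.nz",
    "uk": "cache.example.uk",
}

print(pick_parent("http://www.example.ac.nz/data.html", peers))
print(pick_parent("http://www.example.com/index.html", peers))
```

A URL with no matching peer returns None, in which case the cache would fetch the object itself or from another cache.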

Usage Patterns

A busy cache will typically handle 50,000 requests and serve 800 Mbytes of Web objects per day. Cache hit rates are usually in the range of 15-20 percent and a couple of percentage points higher when weighted by byte volume.
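
The difference between the two hit rates mentioned above can be made concrete with a short sketch. The log entries below are made-up illustrative values, not NLANR measurements, and the record layout (hit flag, object size) is an assumption for the example.

```python
from typing import List, Tuple

# Each record: (was_cache_hit, object_size_in_bytes). Illustrative data only.
LogRecord = Tuple[bool, int]


def request_hit_rate(log: List[LogRecord]) -> float:
    """Fraction of requests answered from the cache."""
    if not log:
        return 0.0
    return sum(1 for hit, _ in log if hit) / len(log)


def byte_hit_rate(log: List[LogRecord]) -> float:
    """Fraction of bytes served that came from the cache."""
    total = sum(size for _, size in log)
    if total == 0:
        return 0.0
    return sum(size for hit, size in log if hit) / total


sample_log = [
    (True, 3000),
    (False, 2000),
    (False, 4000),
    (False, 1000),
    (False, 2000),
]

print(f"request hit rate: {request_hit_rate(sample_log):.0%}")
print(f"byte hit rate:    {byte_hit_rate(sample_log):.0%}")
```

The byte-weighted rate exceeds the request rate whenever the objects that hit are larger than average, as in this sample.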

By far, the largest number of requests are in the .com domain. Often 60% of all requests are for .com URLs. Another 20% are for .edu, .net, and .org.

We calculate that the caches are currently only being used to 5% of their capacity. The cache software should be able to handle one million requests and 16 Gbytes per day. However, at this rate, we would be serving twice as much data per day as we could store on disk; additional disk space would be desirable.
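
The 5% figure can be checked against the numbers given in this section. The sketch below assumes that the typical load of 50,000 requests and 800 Mbytes is a daily total and that 1 Gbyte = 1000 Mbytes; the variable and function names are illustrative.

```python
# Figures taken from this section of the report.
requests_per_day = 50_000
mbytes_per_day = 800
max_requests_per_day = 1_000_000
max_gbytes_per_day = 16
disk_gbytes_per_machine = 10

MBYTES_PER_GBYTE = 1000  # assumption: decimal units


def utilization(current: float, capacity: float) -> float:
    """Fraction of capacity currently in use."""
    return current / capacity


request_util = utilization(requests_per_day, max_requests_per_day)
byte_util = utilization(mbytes_per_day, max_gbytes_per_day * MBYTES_PER_GBYTE)

# Daily volume at full load relative to the raw disk size of one machine.
daily_volume_vs_disk = max_gbytes_per_day / disk_gbytes_per_machine

print(f"request utilization: {request_util:.0%}")
print(f"byte utilization:    {byte_util:.0%}")
print(f"full-load daily volume / raw disk: {daily_volume_vs_disk:.1f}x")
```

Both measures come out at 5%. Against the raw 10 Gbytes per machine the full-load volume is 1.6 times the disk size; the space actually usable for cached objects is presumably smaller, which would bring the ratio closer to the factor of two quoted above.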

A full collection of statistics can be found at http://www.nlanr.net/Cache/Statistics/.

Network Configuration

Because some of the cache machines are located at supercomputer center sites, they are able to communicate with each other over the vBNS. This is advantageous because it provides a high-speed cache-to-cache communication channel. It means that in most cases, there is very little penalty to retrieve an object through another NLANR cache. The FIX-West cache is not on the vBNS.

Unfortunately for many users, the supercomputer centers are not optimal locations within the Internet topology. Ideally the caches would be placed at top-level interconnection points (such as the FIXes and NAPs). Because the supercomputer sites are one or two ISPs ``underneath'' the backbone providers, U.S. users may find little benefit in using an NLANR cache. The FIX-West cache is actually in a very good location, which makes it more popular than the others. We continue trying to convince NAP service providers to participate in this project.

Problems Encountered

To date most of our problems have been with the caching machines themselves. A number of them report frequent SCSI errors and one disk drive has already been replaced. Another machine has problems dealing with illegal packets on the FDDI ring and must be rebooted a couple of times per week.

The Harvest cache software has proven to be stable, but there are a few problems which must still be fixed. Most notably, support for the conditional GET in HTTP is missing. Handling FTP objects has also been a real challenge.
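
For readers unfamiliar with the conditional GET, the sketch below shows the general idea using the HTTP/1.0 If-Modified-Since header. It is not Harvest code; the function names are hypothetical, and the example only builds the request text and interprets a status code rather than talking to a real server.

```python
from email.utils import formatdate


def build_conditional_get(host: str, path: str, cached_mtime: float) -> str:
    """Build an HTTP/1.0 request asking for the object only if it has
    changed since the cached copy was last modified (seconds since epoch)."""
    stamp = formatdate(cached_mtime, usegmt=True)
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        f"If-Modified-Since: {stamp}",
        "",
        "",
    ]
    return "\r\n".join(lines)


def choose_body(status_code: int, cached_body: bytes, new_body: bytes) -> bytes:
    """Pick which copy to serve after a conditional GET."""
    if status_code == 304:  # Not Modified: the cached copy is still valid
        return cached_body
    if status_code == 200:  # the origin server sent a fresh copy
        return new_body
    raise ValueError(f"unhandled status code: {status_code}")


request = build_conditional_get("www.example.com", "/index.html", 0)
print(request)
```

With this mechanism a cache can revalidate a stale object at the cost of a short header exchange instead of transferring the whole object again.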

There have been no real problems with cache users. In one case some people in the Former Soviet Union were using a US University-wide cache instead of one of the six NLANR caches. We encourage people to send us a note when they start using one of our caches.

Issues and Future Work

Upcoming activities will be in areas to:

Multicast infrastructure

NLANR is investing resources into multimedia conferencing and supporting multicast communications infrastructure to use and provide a development environment for such tools on the vBNS. Claffy continued to work with MCI to get the Mbone infrastructure running smoothly, and supported several academic conferences in San Diego with vBNS Mbone tunnel support.

Most notably, NLANR/SDSC helped host NANOG at SDSC in February, including providing Mbone transmission across a wireless link, which seemed to work quite well. Max Okumoto architected the wireless support, Jeff Winkler coordinated the Mbone transmissions, and Charlotte Smart provided administrative coordination.

Claffy is also working on developing tools for Mbone visualization with Tamara Munzner at Stanford, Bill Fenner at Xerox, and Eric Hoffman at Ipsilon. They submitted a paper, Visualizing the Global Topology of the MBONE, to Information Visualization '96, a symposium associated with IEEE Visualization '96, to be held in San Francisco, Oct 27 - Nov 1 1996. The tool outlined in the work enables one to zoom in on logical subsets of the Mbone infrastructure such as the vBNS-related subset of tunnels.


Collaboration environments

NLANR is also interested in applications that facilitate network-based research collaboration. It is supporting the development of text-based online collaboration communities, in particular, Oceana, a K-12 project sponsored in conjunction with Digital Equipment Corporation, and NLANR MOO, an environment for parties interested in NLANR issues to chat, explore, and build within an online community.

Claffy has submitted to NSF a proposal for the development of an Internet engineering curriculum to enable instructors to share instructional resources (e.g., slides, problem sets, exams, outlines, notes). The proposal is still under review.


Agenda for next quarter/year

The above activities are, in the final analysis, intended to help NCRI provide a viable network environment to the R&E community, in order that scientists can take maximum advantage of community computational resources, many of which are also provided by the National Science Foundation (i.e., the supercomputers (vector and mpp), visualization/graphics engines, workstation clusters, data archives, mass storage, and AFS services of the NSF-sponsored supercomputer centers). NLANR facilitates such collaboration in conjunction with various other organizations, some of which are actively supporting the collaboration as well. NCRI also recognizes that many of the networking as well as computational technologies will find their way outside of the NSF-spawned R&E environment, just as they did with the NSFNET program. NCRI continues to take care to optimize the interaction between the research infrastructure and the market-provided infrastructure, while pushing the technology envelope gracefully, such as with the recently announced NSFNET Connections Program. The neutrality of the constituents of NLANR also makes it a good basis for infrastructural and interagency network activities, such as assisting NSF with this program, and continued collaboration among U.S. federal agencies.
24 apr 96, comments: info@nlanr.net.