Workshop Notes
- logistical matters (fire alarm, parking, attendees, etc.)
- background – see slides (HPIIS review, suggestions in the review, set of metrics)
- workshop overview – needs, classifications, map of network applications to classifications, discussion of the needs
Tom Defanti: Euro-Link, STAR TAP & StarLight [slides]
- Euro-Link - one of three cooperative agreements to provide copayments to support international connectivity
- links to Moscow
- scientific applications -- networked experiments for CERN (high energy particle physics)
- found a typical 100 Mbit network would not work for the science
- future plans for Euro-Link (has about 1.5 years to go; talking with NSF to find out what happens next): grid work (data grid, transatlantic grid), DTF network (optical switching), Canarie, SurfNet, etc., all going into the optical wavelength arena; on a 100 Mbps network, applications had to be well scheduled and the link was easily maxed out
- Grids are a big deal
- STAR TAP, Internet2, Abilene
- multiple countries, multiple accesses to major world networks
- AADS NAP, unofficial equipment (low access)
- continually attracts customers; always changing; new circuits from the same and new networks
- predicting network technology in the future is difficult - want estimates of what will be here in 2008; DREN, ESNet, NREN, NISN, vBNS are all connected to STARnet; Internet2 and Abilene
- STAR TAP has connections to a number of countries and sites around the world (Japan, Korea, Taiwan, Singapore, Australia, China, Norway, Iceland, Sweden, Finland, Denmark, Russia, France, Netherlands, CERN, Israel, Ireland, Belgium, and the UK)
- STAR TAP: engineering support, QoS test bed, marketing (documentation, conference, education outreach), team building
- engineering support from the Global NOC at Indiana University and STAR TAP engineers
- IPv4 routing, IPv6 routing, multicast, Globus, NLANR AMP, Web caching, QoS test bed, application network performance metrics, experimental protocols
- external partner circuits: FAPESP (Brazil), RNPS (Brazil), HEANET (OC3), BELNET (OC3), KISTI (OC3); five new circuits coming in this month (by 24 August 2001)
- Ameritech Chicago NAP: ATM hub for STAR TAP and MREN; approximately 130 connections, peaks to 6Gbps, problem: port speed capped at OC12c; collocation for STAR TAP service is two racks
StarLight - jointly managed and engineered by the International Center for Advanced Internet Research (Northwestern University), the Electronic Visualization Lab (EVL at the University of Illinois at Chicago), and Argonne National Lab
- large colocation facility with space, power and fiber made available to the university and a big collaborator, as a point of presence in Chicago
- currently present: Ameritech, AT&T, Qwest
- production GigE and trial 10GigE switch/router facility for high-performance access to participating networks
- operational with Cisco, Juniper switches, 40 racks
- soon to be an optical switching facility for wavelengths
- connected to STAR TAP via two OC12 ATM circuits
- connected to SurfNet (two OC12 POS), Abilene, CAnet, IWIRE, DTF, NORDUNET, CERN
- StarLight goals:
- full production GigE NAP and R&E networks,
- metropolitan optical switching at 10Gb,
- international wavelength switching hub,
- facility for hosting experiments like DWDM, lambda conversion, optical routing, ultra high definition video and VR, terascale computing, and petabyte data mining
- StarLight encourages NLANR/MNA group cooperation
- StarLight: many POPs nearby; major collocation facility; carriers such as Ameritech, Qwest, AT&T (see slide)
- GigE infrastructure -> 10GigE operational and in trial mode
- DTF: 4x10GigE
- Can give away Qwest connectivity to friends.
- NEEDS:
- application monitoring on 10Gigabit networks
- to know how much bandwidth the applications require
- to know what is available for measurement; how to measure; what equipment is available and useable; what information is already there due to current equipment
- is it a foregone conclusion that reliable protocols cannot be done with a single TCP implementation? --unknown; would require a change in every kernel on the planet
- SCTP as opposed to RUDP? --unknown, will look up
- what granularity of monitoring do you want/expect? should bandwidth be requested? on what time scale should leases be granted?
- goal is to have this automated
- Application Centric Monitoring Of Optical Networks
- need to know how much bandwidth is available so that programs can adapt accordingly
- need to know how much bandwidth an application actually uses (to make responsible QoS allocations)
- what kind of trusted box is needed to manage requests and queries to our GigE switches and routers
- assume the applications will do per-flow monitoring themselves
- wide area optically connected PC clusters
- teranode monitor tool: monitor CPU utilization, available memory and bandwidth per PC
- Questions:
- is there any standard protocol for these optical switches? (SNMP, etc.)
- can a query tell us how much bandwidth is available and how much is going through any given path?
- Does this query occur over an external ATM link or is this done over the optical network?
- Can we deploy a monitoring server at each switch without incredible degrees of security?
- Can current tools be reused for GigE and beyond?
- How can we verify that a path has been created and our packets are going over it?
- Who is building the middleware?
- New protocol work: what would be done by NLANR pros vs. grad student, how to propagate tools over the network
- Goal: try to figure out the underlying networks and which protocols to use
- Examine protocols like parallel TCP, FEC, UDP to find tweakable parameters, etc.
- Extend the high performance protocols
- SCTP as opposed to RUDP?
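The questions above about how much bandwidth is going through a given path can be illustrated with a small sketch. It assumes a switch or router exposes a 32-bit octet counter (as SNMP interface counters commonly do) and that two readings are taken a known number of seconds apart. The function names, the sample readings and the 1 Gbps capacity are illustrative assumptions, not something presented at the workshop.

```python
COUNTER32_MODULUS = 2 ** 32  # a 32-bit counter wraps back to zero at this value


def average_rate_bps(first_octets, second_octets, interval_seconds):
    """Average bit rate between two octet-counter readings.

    Assumes the counter wrapped at most once between the two readings.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # The modulo makes the difference correct even if the counter wrapped once.
    delta_octets = (second_octets - first_octets) % COUNTER32_MODULUS
    return delta_octets * 8 / interval_seconds


def available_bps(link_capacity_bps, used_bps):
    """Capacity left on the link, never reported as negative."""
    return max(link_capacity_bps - used_bps, 0.0)


if __name__ == "__main__":
    # Hypothetical readings taken 10 seconds apart on a GigE port;
    # the second reading is smaller because the counter wrapped.
    used = average_rate_bps(4_294_000_000, 500_000_000, 10)
    print(f"used: {used / 1e6:.1f} Mbps")
    print(f"available: {available_bps(1e9, used) / 1e6:.1f} Mbps")
```

One consequence for GigE and beyond: at 1 Gbps a 32-bit octet counter wraps in roughly 34 seconds, so the polling interval has to be shorter than that, or a wider counter has to be used, for this arithmetic to stay valid.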
Julio Ibarra: AMPATH Status Report [slides]
- provides/wants to provide connectivity to Latin America
- very little connectivity to Latin America; all connectivity is from Chicago (north!)
- they want to emulate HPIIS
- at the infancy of their project
- started in March of 2000 by FIU (Florida International University)
- AMPATH is a project to interconnect the R&E networks in South and Central America, the Caribbean, and Mexico through optical fiber (currently have submarine cables along Florida coast)
- 10 DS3s, Cisco GSR 12012, Lucent CBS 500 ATM, Juniper routers, collocation in the NAP of the Americas (looks like an amusement park) and shares an OC3
- provides connectivity to Puerto Rico, Mexico, Chile, Panama, Colombia, Peru, Argentina, Brazil, Venezuela, St. Croix, through Abilene, NAP of the Americas and perhaps STAR TAP
- project timeline is to connect the above countries within 12 months
- started providing connectivity in June of this year
- metrics:
- bandwidth
- traffic characteristics
- ports (flow metrics and NetFlow)
- link latency over time
- application level performance
- goal: fully utilize the donated DS3s
- activities:
- optical wavelengths
- optical routing
- ipv6
- multicast
- QoS
- access grid
- MPLS tunnels
- wireless to wired networks
- Global Crossing's network, SAC PAC MAC networks, NAPotA, Chile(REUNA), Brazil(RNP), Puerto Rico (UPR)
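The port-level flow metric in the AMPATH list above can be sketched as a simple aggregation over flow records. This is a minimal illustration assuming each record has already been reduced to a (source port, destination port, byte count) tuple; the record layout, the `PORT_NAMES` table and the function names are hypothetical and do not describe AMPATH's actual tooling or the NetFlow export format.

```python
from collections import defaultdict

# Hypothetical port-to-application table; a real deployment would need its own.
PORT_NAMES = {20: "ftp-data", 80: "http", 5001: "iperf"}


def bytes_per_application(flow_records):
    """Sum bytes per application, keyed on the destination port.

    Each record is a (src_port, dst_port, byte_count) tuple; ports that
    are not in PORT_NAMES are grouped under "other".
    """
    totals = defaultdict(int)
    for _src_port, dst_port, byte_count in flow_records:
        totals[PORT_NAMES.get(dst_port, "other")] += byte_count
    return dict(totals)


def traffic_shares(totals):
    """Convert byte totals to fractions of all traffic seen."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: count / grand_total for name, count in totals.items()}
```

Classifying by port alone is coarse: applications that use dynamic ports all land in "other", which is one reason application-level measurement is listed as a separate metric.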
John Hicks: TransPAC Overview and Monitoring [slides]
TransPAC and Global NOC report (Indiana University)
- high-performance network between Asia-Pacific and the US
- part of the HPIIS program
- Chris Robb, the network engineer (works with Abilene), is deeply involved with application monitoring and analysis
- scavenger service
- what is scavenger service? A less-than-best-effort service, kind of like nice; traffic tagged with the scavenger DiffServ marking is low priority
- GRAPE project, sends results (Terabyte scale) to HPSS over complex route; kept proprietary for one year and then made public
- Juniper donated an M5 for testing
- overall tests were positive (200mb data stream)
- TransPAC runs from Tokyo XP to STAR TAP in Chicago
- TransPAC began at 35 Mbps and is now OC3
- TransPAC measurement: classification, metrics, TransPAC tools, OCX-mon, data archive
- metrics:
- bandwidth of physical network
- identify and profile individual applications
- verify QoS
- real-time application monitoring
- TransPAC tools:
- general purpose Linux box
- traceroute (from STAR TAP, Tokyo XP)
- reverse traceroute
- pinger node
- mrtg
- flow scans
- NetFlow tools
- TransPAC weather map
- OC3mon
- need help to determine archiving policies (Should raw data be archived? How long shall we keep it? Who do we turn to?)
- measurement platform, would like more applications to run on this Linux box located at STAR TAP (real-time application monitor very desirable, archiving policies a question, legal questions)
- concerns:
- equity between northern and southern routes
- trace data archival
- put tools in the hands of users
- NOC provides: continuous operations, trouble ticket reporting, staff (15), supported by Abilene
- the aim is to get better tools to application people
- application bandwidth primary concern
- how many people would be using this tool at any given time?
- application scientists on either end
- engineering and NOC staff
- histogram to provide to NSF or other people
- NeTraMet with nifty may be suitable for these analyses
- may not cover more than a few clients
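The scavenger service mentioned earlier in the TransPAC notes relies on the sender tagging its own traffic. A minimal sketch of how an application might do that on an IPv4 UDP socket follows. It assumes DSCP 8 (binary 001000, class selector CS1) as the scavenger code point, which is the value commonly associated with scavenger traffic; the notes do not state which value TransPAC used, and the function names are illustrative.

```python
import socket

# DSCP 8 (binary 001000, class selector CS1) is assumed here as the scavenger
# code point; confirm the value with the network operator before relying on it.
SCAVENGER_DSCP = 8


def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP into the upper six bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_scavenger_socket():
    """Create a UDP socket whose outgoing IPv4 packets carry the scavenger mark."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_TOS sets the whole TOS byte, so the DSCP has to be shifted first.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(SCAVENGER_DSCP))
    return sock
```

The marking only lowers priority if the routers along the path are configured to honor it; elsewhere the packets are treated as ordinary best-effort traffic. Support for `IP_TOS` also varies by operating system.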
Jörg Micheel: Overview of Measurement Tools [slides]
Network Data Collection and Analysis
- currently have implementation of an OC48 monitor
- what tools are available and what can be tailored to meeting attendee needs
- data collection: using existing equipment (routers), passive data collection (OCX-mons), active data collection (AMP, IPPM, etc.)
- router data collection, two options 1. SNMP 2. NetFlow
- passive data collection: using dedicated data collection systems like PMA, CAIDA, Sprint, AT&T; you get to know what the network does to your applications; passive is the hardest of all techniques
- active data collection: easy to install ping/traceroute/test reflectors (AMP, RIPE TTM, NIMI), standard metrics, AMP scales fast and easily, need much more than one-way delay figures
- existing equipment, passive data collection, and active data collection available to users
- many public tools at no cost for visualization, but figures are very coarse and don't provide insight into problems of clients
- NetFlow (Cisco standard) with plenty of tools, still no detail
- dedicated data collectors, best detail, no automated tools that fit all bills
- passive monitoring is the hardest method of all
- active measurement scales and is standardized; lots of tools out there: AMP, RIPE TTM, Surveyor, NIMI
- doesn't tell you what's going on in the network. GPS is hard to deal with, need network insights
Bud Hale: NLANR Measurement and Analysis Group Activities [slides]
Brief overviews of
Greg Cole: MIRnet Overview and Monitoring (Unable to present; [slides] [Quicktime version])
Andy Germain: NASA EOS Active Network Performance Testing [slides]
- EOS is not all of NASA
- CEOS: shares data; lets users find the data they are looking for. Individual organizations have different policies for delivering data
- IGOS: strategy for avoiding redundancy in data stored
- members have budgets for testing...
- traceroutes show these networks have connections through many networks.
- active testing
- hourly tests to each site
- using Iperf more often than others because of support on multiple operating systems.
- test throughput for 30-60 seconds, don't want to overload the network though.
- pings get blocked a lot these days, not as useful
- [Web page demo]
- CRS throughput tests; the likely limitation is a 4 Mbps link in Japan
- hard to look into events after they are over
- two day latency to see data from [NASDA]
- other sites that use HPIIS links?
- Euro-Link or MIRnet?
- don't do MIRnet; look at JANET
- JANET in London, 35 Mbit currently; UUNET in Norway doesn't do very well;
- [showing off other measurements on Web site]
- most of Europe is 1 Mbps throughput
- Israel is down for a while
- Argentina? or Chile?
- Argentina, satellite link; 300-500 Kbit peaks; would like AMPATH to improve these links but hasn't seen it yet; unable to get through to Russia
- Nordunet running 45 Mbps into St. Petersburg
- MIRnet only 6 Mbps to Moscow
- Nordunet stops for political reasons at Moscow State U.
- Web100 group products may be able to measure directly what EOS is doing indirectly
- users don't really care
- it's one of those things that users don't know they care about
- in international community, window size and circuit size are limitations...
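The remark that window size and circuit size limit international transfers follows from simple arithmetic: a single TCP stream can send at most one window per round trip. The sketch below works through that relationship. The 200 ms round-trip time and the other figures in the usage note are illustrative assumptions, not measurements from the EOS tests.

```python
def max_tcp_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on a single TCP stream: one full window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps, rtt_seconds):
    """Bandwidth-delay product: the window required to keep the link full."""
    if rtt_seconds < 0:
        raise ValueError("rtt_seconds must not be negative")
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    rtt = 0.2  # hypothetical 200 ms intercontinental round trip
    print(max_tcp_throughput_bps(65535, rtt))  # largest window without scaling
    print(window_needed_bytes(155e6, rtt))     # window to fill an OC3-class link
```

With these assumed numbers, a 65,535-byte window caps a stream at about 2.6 Mbps regardless of circuit size, while filling a 155 Mbps circuit over the same path would need a window of about 3.9 MB. This is also why the parallel TCP streams raised in the StarLight discussion help: each stream contributes its own window.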
- Notes taken by David Cheney and Jose Otero -