Summary of Research Activities - November 2004
Passive Measurement and Analysis (PMA) Project
~ Continuing development of new metrics and real-time analysis for PMA
The focus of the real-time work was preparing for, demonstrating, and following up on the live real-time demo of OC192MON at SC2004. The real-time Web application was further polished and minor problems were resolved, to the point where it was quite stable. A sizable amount of time was spent finding a bug in RRDtool. With help from Chris, Klaus was able to resolve the problem and improve the RRD output for the demo. As expected, there were several logistical problems and kinks to be resolved; however, there were no program crashes at SC2004. Some critical parts of the program had had just half an hour of testing beforehand, so this was an excellent result. Klaus D. provided support from San Diego to Jörg in Pittsburgh. We could monitor a 14 Gbit/s transfer without packet loss. Jörg reported that it caught the eye of quite a few folks at the exhibition and was very well received.
We operated the OC192MON from Monday, November 8th through Thursday, November 11th, 2004. Most of the time the OC192MON was collecting and analyzing data in real time, with one major gap between Tuesday night and Wednesday morning, during which the system was collecting IP packet header trace data. All data captured at SC2004 was saved; retrieval from Pittsburgh to San Diego took some time. The tool for rebuilding the graphs from this database was improved in anticipation of publishing the data on PMA. The RRD database data was converted to HTML, but to publish the pages via NLANR/MNA the style had to be changed (no frames, added m4 macros, etc.). At the end of November, there were approximately 6000 Web pages. http://pma.nlanr.net/Special/sc04rt.html
While he was in San Diego, Jörg and Klaus D. had a conference call with Klaus Mochalski in Leipzig, primarily discussing how we should collaborate in the future.
Work began on making the real-time application code more generic, and some minor bugs were fixed. A problem with a new program from Klaus M. was traced to the library after some effort. We are also working on a real-time delay program (with Klaus M.). Currently, it crashes after one day, but we anticipate resolving these bugs as well.
~ Special Traces
Video Conferencing Trace Data I ~ A one-hour IP header trace of the H.323/H.263 video teleconference call (VTC) between Jörg in New Zealand and Hans-Werner in San Diego was published. http://pma.nlanr.net/Special/vtce1.html
We published two data sets on worm/virus-related requests that had been posted earlier in the year by AT&T Research and NCSC.
The HPWREN trace data was reprocessed to fix the interface 1/2 sequencing errors that we discovered with Matthew back in September; the resulting gunzip/tshseq/gzip run kept the PMA data server busy for two days in a row. Initial network analysis regarding application requirements on HPWREN was performed. To support this, the output file formats of some of the daily trace analysis software on ittrack were modified slightly. The trace collection and analysis software was also changed to retain the most recent day's trace and analysis files, plus two more days. A white paper on the analysis, Network bandwidth performance disparity across science applications, was written and posted. http://hpwren.ucsd.edu/WP/20041120/
~ New (and developing) strategically important measurements and deployments
After SC2004, Jörg traveled to Indianapolis to perform the Abilene IPLS router instrumentation, a long, difficult, and, most importantly, successful task. Comments from Jörg: "thanks to the heroic efforts of SDSC and Indiana University staff. In the end both Jim's and Caroline Carver's joint efforts really got the critical pieces in place in time. From there it was another eight hours for the four of us at the POP (John Hicks, Chris Small, Caroline, and myself) and lots of intense hard labor."
During the several days there, the TDS-24, which had fallen apart, was also fixed. At present we are waiting for the OC48c -> OC192c link upgrade towards Atlanta in order to have all links in tune. In advance, we worked with I2 and IU folks to prepare for the installation. A huge amount of preparation also went into planning and coordinating all the gear for the installation. Unfortunately, this came nerve-wrackingly close to the wire when the wrong equipment was delivered and had to be replaced (vendor chasing and purchasing), shipped, and received in time. Another glitch was that, upon arrival in Indianapolis, the folks there could not find our box with splitters and cables that had been left as a reserve for this work. This was resolved when John Hicks arrived back from SC2004 and located our box.
At the end of November, there was some worrisome news from the Indy GlobalNOC: apparently Qwest was not very happy with the immediate results of our work at IPLS. The issues appeared to be mostly cosmetic, but we were quite nervous that they might need to touch the fibers and splitters, which would endanger the entire mission and negate the massive efforts just performed as well as weeks of intense preparation. Discussions and further fact finding were taking place at the end of November.
We are preparing an OC3 monitor to send to Malathi Veeraraghavan at the University of Virginia.
Jörg and Hans-Werner had several discussions, in person and by VTC, regarding future directions for PMA. One of the immediate results for the research community was the posting of a one-hour IP header trace of the H.323/H.263 video conference (see Special Traces). They have since been experimenting with various bit rates, ranging from as low as 48 KBits/sec up to 384 KBits/sec, and they both find that higher bit rates do not come close to offering rewards that match the added costs they incur.
For instance, 128 KBits/sec *is* in fact better than 64 KBits/sec, but not by a factor of two (as are the costs). What is surprising is that the audio quality is very good even at very low bit rates, which is perhaps the most important feature.
Development of passive monitoring for a lambda network (first stage prototyping of lambdaMON) ~ We are in dialogue with Endace and Iolon to move forward with the lambdaMON; we may be in a position to implement a field system in early 2005.
~ Upgrades, troubleshooting, and maintenance on the PMA servers and infrastructure
A significant amount of time was spent reworking the daily backups from the old lftp interface to the secure HSI version for access to the HPSS. We received tremendous support from Mike Gleicher. Some of the issues with HSI are real bugs, which is annoying. The new interface is not only more secure, but also much cleaner and leaner. After two days in regular operation Jörg was very pleased with the results, which means that we will shortly be in a position to let ENS turn off the ftp interface to the HPSS for good.
There is a recall on the Endace Dag 6.2 cards we received; apparently there is an overheating problem on some of the chips. We therefore shipped the two Dag 6.2 cards used at SC2004 back to Endace, and returned another 6.2 card in exchange for the 6.1 card that was needed and used in the IPLS installation. We anticipate having the cards back by early December.
Support and troubleshooting, existing PMA measurement sites:
Purdue ~ Jörg cut his time in Pittsburgh a bit short in order to go to Indianapolis and spend the day at Purdue to fix our GIGEMON there. He worked with Scott Ballew, the local network guru, and they fixed the culprit, which was an incorrect wiring of the splitter. The monitor was working "okay," but it still had issues, so we are going to swap it for another system to stabilize the configuration. The replacement GIGEMON was in preparation at the end of November.
Boulder and Denver, CO ~ We fixed the Front Range GigaPOP OC12MON (Denver); it is working again. Strangely, though, the NCAR GIGEMON machine had disappeared and we are not sure where it is. We are pursuing this. As of the end of the month, Scot and Donnie remember that the machine was removed for repairs, but they do not remember receiving it back. As soon as we find the monitor, we will arrange to install it.
Active Measurement Project (AMP)
~ Progress on the reimplementation of AMP and the development of a new testing architecture
The code can be:
s (test of size bytes from server to client)
S (test of size bytes from client to server)
t (test of size milliseconds from server to client)
T (test of size milliseconds from client to server)
p (pause of size ms)
n (establish a new connection)
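As a rough illustration of how such code/size pairs could be interpreted, the following Python sketch parses a test script into a list of steps. The report specifies only the six codes and their meanings; the textual syntax used here (whitespace-separated tokens such as `s1000000`), the assumption that `n` takes no size, and the names `parse_test_script` and `Step` are all hypothetical and chosen for illustration only.

```python
from typing import List, NamedTuple

# Meaning of each code, taken from the list above.
CODES = {
    "s": "test of size bytes from server to client",
    "S": "test of size bytes from client to server",
    "t": "test of size milliseconds from server to client",
    "T": "test of size milliseconds from client to server",
    "p": "pause of size ms",
    "n": "establish a new connection",
}


class Step(NamedTuple):
    """One parsed step (hypothetical type, not from the AMP code)."""
    code: str
    size: int  # bytes or milliseconds, depending on code; 0 for "n"


def parse_test_script(script: str) -> List[Step]:
    """Parse tokens such as 's1000000 p500 n T2000' (assumed syntax)."""
    steps = []
    for token in script.split():
        code, arg = token[0], token[1:]
        if code not in CODES:
            raise ValueError(f"unknown test code: {code!r}")
        if code == "n":
            # Assumption: a new connection carries no size argument.
            if arg:
                raise ValueError("'n' does not take a size")
            steps.append(Step(code, 0))
        else:
            if not arg.isdecimal():
                raise ValueError(f"code {code!r} needs a numeric size")
            steps.append(Step(code, int(arg)))
    return steps


if __name__ == "__main__":
    for step in parse_test_script("s1000000 p500 n T2000"):
        print(step.code, step.size, "-", CODES[step.code])
```

Keeping the code table separate from the parser means a new code would only need one extra dictionary entry, provided it follows the same code-plus-size shape.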
~ New (and developing) strategically important measurements and deployments
Tony had an extended email exchange with Mark Boolootian of UCSC discussing the idea of a campus deployment of AMP. They seem quite interested in pursuing the idea. Mark also suggested it might be worth talking with CENIC about further deployment. One thought is that we might use small single-board solid-state PCs for the nodes (e.g., a Soekris board). Mark also put us in touch with Mike van Norman from UCLA, who may also be interested in deploying a campus AMP.
Tony was contacted by Mark Stavely from ACEnet in Canada about possibly doing some measurements on their proposed network (which seems to be a grid-line connecting St. John's NF, Halifax NS, Antigonish NS, and Fredericton NB). He is following up to see if there is work of interest there.
A site in Singapore let Ronn know that they will check with their technicians and get back to us regarding hosting an AMP machine.
We continued discussions with Charlie Knezevich, Systems Manager for the SDSC Protein Data Bank (PDB) project, regarding embedding AMP software. Charlie is eager to install the deployable AMP software on the global PDB sites. The PDB is an opportunity to install the AMP software on a truly global application.
~ IPMP
In tandem with working on his dissertation, Matthew is working on an IPMP architecture figure that explains the overall operation of the protocol. It is slow work, but he is making good progress. He also worked on developing generic language with which to discuss a protocol for conducting combined path and delay measurements, and how such a thing might be possible.
~ IPv6 and IPv6 Scamper
More progress was made on the further development of IPv6 Scamper.
~ Upgrades, troubleshooting, and maintenance on the AMP servers and infrastructure
As reported previously, the system disk on photon.nlanr.net (the system manager machine) crashed late in October. We tried, to no avail, to resurrect it. We found that there was inadequate backup for this machine, so we undertook a major effort looking for backups and/or sources of the data on the photon disk. We also researched methods of restoring the data on the disk, which included consulting many people at SDSC for suggestions. After this, the AMP team had discussions and decided that the scripts on the disk were quite valuable and therefore worth having a data restoration laboratory try to restore the disk data. (Todd Hansen had a copy of the old, original scripts. They dated from long before the AMPlets and the system manager were updated to the FreeBSD4.6 version, but were still somewhat useful.) With much effort, the current system manager scripts were recovered from the crashed disk. However, the listing scripts in the /root directory were not recoverable.
To begin rebuilding the photon system manager server, we installed the system manager scripts in the original directory structure on the new system disk and worked around the listing scripts to do preliminary testing. We created a test machine to work out the connection problems of the rebuilt system that resulted from the recovered data from the crashed disk. By the end of November, photon had been restored and we were in the process of distributing the machine identity file to the remote AMPlets (150+). We expect system management will be working shortly.
Testing and transition to the new AMP servers (AMP2 and VOLT2) ~ AMP and VOLT remain as the AMP data collectors and servers while the development and testing of the central data collector/server software for AMP2 and VOLT2 continues.
While the software development is continuing, AMP2 will be loaded with FreeBSD5.2.1, running the RAID controller in RAID10 mode. The OS is to be installed on an independent IDE system drive, with six hot-swappable 250-gigabyte drives in the RAID10 configuration. The data disks on the AMP and VOLT data collectors/servers are filling as expected, and they were archived as necessary (with no problems).
Support and troubleshooting, existing AMP measurement sites:
Site outages remain at a very low level. A total of nine remote sites in the AMP Network meshes received attention during this period; most were resolved and the monitors are again collecting data. Only three (plus CNIC, Beijing, see note below*) were still being investigated, or pending site action, at the end of the period. (Outages are considered "open" until the monitor is again collecting data.) 10 problem sites: 6 resolved, 4 open at the end of the period.
Outreach, Collaborations, and Activities supporting Network Research
~ Papers, Presentations, and Conference/Meeting Participation
Supercomputing 2004, Pittsburgh, PA ~ Both Ronn and Jörg attended. SC2004 was very intense and successful. Jörg managed to make our OC192MON at the SCinet showground work, in collaboration with Jon Dugan at NCSA, and with help from Matt Zekauskas. Sadly, we were unable to get any of the three OC192MONs from NCSA working as well, due to technical problems. Klaus D.'s real-time monitoring made big waves with folks, and we were encouraged by a number of people to keep working on it. Performance proved stable as expected; we can manage full duplex 10 Gigabits without any problems, unless some other application is running in the background and occupying substantial CPU time. We also have packet header traces from some of the bandwidth contestants and a record of a superHDTV application using the 1080i frame format between Pittsburgh, Seattle, and Canberra (Australia). Through the week, Klaus D. was making changes and additions to the GUI. Jörg reported that they "turned out to be very useful. Good work!"
While there, Jörg gave two well-received talks, one at NCSA on the OC192MON, the other on the lambdaMON work. The lambdaMON talk went better than expected: the audience was small, but everyone there was intimately aware of our work, so people absorbed all the technical details and the discussions were livelier.
Ronn had a number of conversations with Julio Ibarra, John Hicks, Doug Gatchell, Kevin Thompson, and Peter Arzberger regarding international measurements. He also gave Grant Miller a project update. Rich Carlson gave Ronn an update on his NDT tool. He has code for measurement point resource discovery, but unlike Tony's PathViz tool, his will only discover resources on the direct path.
Also at SC2004, Ronn distributed the lambdaMON posters (including arranging to put one up at the NLR booth) and the current issue of the Network Analysis Times. In addition, Ronn introduced Jörg to several people, including: Joel Mambretti, who has experimented with optical switching over the last few years and is interested in the lambdaMON; Bob Grossman, who focuses on data mining and is very interested in analysis of large PMA traces; and Peter Arzberger of the PRAGMA project. Of note, there was an OC768 connection at the PSC booth.
Back at SDSC, Jörg gave a presentation on PMA status and future plans to CAIDA, attended by most of the NLANR/MNA staff as well. The outcome was somewhat inconclusive: while CAIDA staff pointed out the shortcomings of our approach, we were unable to meet in the middle and understand how CAIDA could leverage our work for their research agenda. Hans-Werner also participated in the presentation.
Tony visited Auckland Uni Electrical and Computer Systems Engineering and gave a talk on measurement and analysis. The main goal was to see if there is anyone there interested in working with us.
Slides for presentation at the CANS meeting in Florida were updated.
~ Collaborations And Activities Supporting Network Research
Jörg had many meetings and discussions during his two weeks of travel to SC2004, IPLS, etc. He had many talks with a large group of people, which overall was very fruitful; the ideas for next generation passive monitoring are taking shape. Most notable was a conversation that he had with Matt Mathis at PSC.
Peter Arzberger, PRAGMA ~ Greg Cole, GLORIAD ~ Steve Corbato, Internet2 ~ Russ Hobby ~ Charlie Knezevich, SDSC Protein Data Bank (PDB) ~ Perry Lorier, WAND ~ David Malone ~
Debbie Montano, NLR ~
Two groups (CNIC site in Beijing, KISTI site in Korea) plan to visit and meet with us in early December. We are involved in some advance planning and coordination for these (separate) visits, and are awaiting confirmation of the dates and times. Both sites participate in the GLORIAD project and are interested in network measurement.
Documentation, Web Work, Utilization Improvement, Publications
The NLANR/MNA International Collaborations white paper is being updated; the New Zealand local AMP mesh was one of the major additions. The update was begun at the end of November and is nearly complete.
In keeping with the original plans to have dynamic content, the NLANR/MNA home page was updated to reflect our activities at SC2004. This was done on the first day of the meeting (Monday), after Jörg sent out an email about the real-time display at SC2004 and the strong positive reaction. Links to the monitor, which was live from the floor, as well as to our lambdaMON work, were included. A sample graph was created to illustrate the real-time OC192 measurements, as well as a new navbar-size image for the lambdaMON (based on the style from the poster).
The Web link for the amp-cnic (Computer Network, Beijing, China) site disappeared from the AMP Web interface page, and performance data was not available. This was investigated so that the site would work properly for the December visit by folks from CNIC. Part of the cause was found to be that the site had been dropped from the international.list file that was distributed to the international mesh. The international.list file at each of the international mesh sites was manually corrected. (The manual correction was necessary because the photon system manager machine was still down at that time.)
The site was dropped from the list due to a recently implemented script which had a lat/lon (latitude/longitude) entry in a WHERE clause to select the link into the list; this was corrected and the site is now properly publishing collected data. http://watt.nlanr.net/active/maps/ampmap_active.php
The reports index Web page was updated. http://moat.nlanr.net/Reports/
A new AMP Web page on "other AMP meshes" was begun. Work also began on a new update to the MNA home page "Latest Pings" section (minor changes were made in the interim). Comments were added to the Web logs project code (to enable better understanding by future developers of the Stats project). We arranged to have additional 11x17 lambdaMON poster handouts printed, as none were left over after SC2004.
Activities of each individual on the project
AMP team
PMA team
MNA, AMP, and PMA Outreach, Documentation, Web work
Management and Administrative
- 30 - 2004 Dec 16