Summary of Research Activities - August 2004
Passive Measurement and Analysis (PMA) Project~ Continuing development of new metrics and real-time analysis for PMA Klaus Mochalski continued his work on real-time applications:
Chris worked on maintaining and adding improvements to his existing real-time code. Much of that time was spent debugging his next_packet function which facilitates the walk through the Dag's memory window. Among other things, he added some reporting functionality and cleaned up many of his command line arguments, keeping in mind the need to be able to open, keep track of, and report about multiple streams from multiple Dag cards. The plan is to make the software the most compatible and portable with any type of Dag card or any configuration of them. He will work on developing an elegant way of handling the different types of legacy records in use. Currently, the user specifies it on the command line. Klaus Degner (an undergraduate student from the University of Leipzig) will be coming here in October for a few months to work on real-time applications. He and Klaus Mochalski exchanged emails discussing ideas for real-time one-way delay calculation. ~ Special Traces A new data set was captured: simultaneous traces at all Indianapolis monitors. The disk space lasted from 13:40 to 18:20 PDT. After the anonymization process, Klaus began using these in his real-time application work (detailed above). A student by the name of Xun Su at CalTech is working as a REU with Jose Fernandez and Julio Ibarra at FIU and looking at the data. We are in dialog to collect some specific longer trace files for them to dive much deeper, and they are also interested in our upcoming real-time work. Jörg can see synergies developing here, reconfirming the positive impression that he already gathered during his visit to Miami, FL in May/June. With regard to the FIU data, a rather sour point was Jörg's discovery of garbage records amongst the ATM headers captured, which are so frequent that you can easily spot them by visual inspection of the trace. He has been thinking what to do about this problem, as there seems little point in complaining to the vendor. 
The cards are now 2 1/2 years old and the Xilinx images are still (or again) not working properly, and we have finally developed an approach which is likely going to fix current, as well as future, occurrences of such issues. Although he did briefly think about renaming the project from Passive Measurement and Analysis to Passive Measurement or Approximation. ;-) ~ New (and developing) strategically important measurements and deployments nai-p-amp ~ We worked to bring the new AMPATH passive monitoring system live with Ernie Rubi in Miami, who installed the new Dell monitor. We have performed the first check and machine configurations. All the signals are there, including CDMA time support, which makes this system an ideal platform for some measurements we are intending to launch with Klaus later this summer. We enabled data collection on the monitor, which took a few hours since the configuration is rather different from what we have in the field so far. Jörg is very pleased with the new machine configuration we have now in place. It has lots of horsepower and disk space at a low price point, is 1U, and also has full CDMA time support for PC and DAG cards. The system was turned into 8x90 collection for now, to have some data for AMPATH with which to work. We received an invitation from Pere Barlet at UPC Barcelona, Catalonia, Spain, to continue the collaboration via placement of an OC48MON inside the Spanish R&D network. Linda Winkler of the Univ. of Illinois, Chicago wants to place a GigE monitor on the connection to UIC and asked if we could convert the previously live, not decommissioned, OC3 monitor to GigE. We told her that the hardware of the OC3 monitor would not support the GigE cards, but that we would discuss the possibility of installing a GigE monitor. While working with us regarding return of the OC3 monitor, Alan Verlo reiterated Linda's desire to implement a GigE monitor at the UIC connection. 
We started an email conversation with Malathi Veeraraghavan, Professor at the University of Virginia, on the placement of an OC3MON at VOA. She responded positively; we now need to find a time to have a phone call as a follow-up. (This was our first positive response from the "cold call" emails.) We have been researching vendors and products in the optical networking space, sending some 40-50 inquiries to various vendors. We have received the first few replies, and also some initial quotes. Jörg had a number of conversations with vendors, most notably Aliathon, Xtellus, and Calient Networks, and a meeting with Ian Graham of Endace. We looked into the various DWDM components we need for a prototype lambdaMON as well as a larger scale rollout. In the process we are making some strategic contacts as well as developing questions which will need to be tackled. Most likely, an early prototype of any sort would be good.
~ Upgrades, troubleshooting, and maintenance on the PMA servers and infrastructure
We continued to work with Michael Gleicher on the installation of the HSI interface to the HPSS storage system. The interface, which Mike helped to fine-tune, is now in place, but the first trial produced errors. We need to have a closer look at all the configurations, especially Kerberos. Early in the month, the pma.nlanr.net data server (AMD64 PMA machine) began behaving unusually, stopping work for no apparent reason. This occurred twice; at the time, we did not think the two incidents were related. We added an additional drive (40GB) for backup purposes and reworked the nightly procedures. Later in the month, we were disturbed by the pma server crashing repeatedly and inexplicably. The problem has since cleared; one likely explanation is issues with SMP and locking support in FreeBSD 5.2, which should be overcome with FreeBSD 5.3. We need to keep watching it closely.
We are now using a symbolic alias ntp.nlanr.net due to the NTP support problem discovered during the collection of the new data set from all monitors at IPLS. We began spreading the configuration among the rest of the PMA infrastructure. We identified a possible need for fiber-optic amplifiers for the PMA project. We spent time researching the specifications, price, and availability of EDFA type amplifiers. Much progress has been made in that field in the past two years, and the number of new sources of amplifiers, as well as amplifier components, is increasing. The EDFA industry has developed in a way that has made it possible for many sources to supply individual EDFA components. This enables customers to design and assemble EDFAs to their own individual requirements, as opposed to having to fit highly standardized products to their needs. We suffered serious network problems due to network firmware upgrades on some routers this period. We worked with Todd Hansen to resolve (and/or rule out) NLANR or HPWREN issues as the cause. As a result, Todd arranged that direct fiber connections be implemented to bypass some unnecessary routing; this will avoid these same problems in the future. We had to send the failed OC48 card back to Endace for warranty repair. The shipping procedures for SDSC are obviously not completely worked out, and the card actually went out days after it was scheduled.
Support and troubleshooting, existing PMA measurement sites: The PMA monitor from Tel Aviv University was returned this period. It is available for deployment to a new site. Linda Winkler and Alan Verlo of the Univ. of Illinois, Chicago informed us that the connection at the SBC site in Chicago was no longer there. Alan is returning the OC3 monitor from the APAN site. Both Tel Aviv Univ. (TAU) and APAN (APN) are decommissioned sites, now listed with the historic deployments on the PMA Sites page.
Active Measurement Project (AMP)~ Progress on the reimplementation of AMP and the development of a new testing architecture The code for dealing with sub-tests was finished, including debugging a tricky little PHP extension problem related to the PHP symbol table. We started using the new version of the graphing tool (see below for detail on dgraph). We started work on the IPv6 code in amp. An annoying inconsistency between how IPv4 and IPv6 are handled from the programming end was corrected. This actually made things easier in IPv6, but also increased the differentiation between the code handling IPv4 and IPv6. We implemented the icmpTest and the traceTest, and tested them using the loopback interface. They still need to be tested on an IPv6 connected machine to determine how much more coding is left. We investigated a problem with the AMPlet code which the CRCnet people discovered when they ran it on their small solid state nodes. It seemed to be running the machine out of memory. After some initial experiments it appeared that the code may indeed have a slow memory leak. We fixed the leak, and also made some other changes to the configuration the CRCnet people were using for their machines. After testing, we did a new release of the AMPlet code - just for CRCnet -with these fixes. The IPMP code was updated to the latest IETF Internet draft status and was included in the above release. This is not a public release because after feedback from other users, there are additional things that we want to look at (and potentially change) before a public release. Great progress with dgraf (finished with it for now). We added the ability to do histograms and the IP hop count graph, as well as their associated html map code. Some of the code was reworked for better portability, which allows Tony to compile it on his machine as well. Memory leaks and pointer issues were also cleaned up. 
While adding an events graph type to the code, a design flaw was discovered; fixing it required rearrangement of some of the code. Time Testing of both workability and graph accuracy took place. The AMPvis code was added and some major restructuring performed, which yielded a little bit better performance. Comments and documentation further developed. Testing was done to make sure that it will compile on various machines; it compiles on Linux, FreeBSD, and Solaris. The eventual other part of this program is currently in dtrace. We need to determine how we are going to implement this functionality. A database management schema (for all site, contact, test meshes, scheduling information, etc.) for the AMP machines was developed. http://byerley.cs.waikato.ac.nz/~tonym/tmp/schema Work began on a Perl extension for amp. Eventually this will be a Perl interface to the C code that underlies the PHP code. The main use of this is for Warren Matthews' Web services interface, but we anticipate that there will be others, mostly learning and a few experiments. A problem with email on the AMPlets was sorted out. Sendmail was unable to resolve DNS names and was sending thousands of syslog messages per AMPlet to amp. It was an odd situation because everything else could resolve names properly. Matthew suggested a reboot which, to our surprise did clear the problem. Therefore, an expect script was written and run to clear the mailq and reboot all the AMPlets. This was sad in a way, as many of them had close to two years uptime. They had not been down since we last upgraded the OS. ~ New (and developing) strategically important measurements and deployments Site amp-cnic (China Computer Network, Beijing) was connected this period. The machine has been on site for several months, but the site was undergoing changes and the machine could not be connected. 
When site people connected the monitor, we walked them through the edits to bring it online, and finalized the connection of the AMP monitor. We followed-up with runs of the system manager process to initialize the monitor, distribute the international list file to all AMP monitors, and start data collection. The monitor is now up and collecting data on the international mesh. (There will be a press release regarding this machine.) 10GigE AMP ~ In preparation for testing the S2io 10Gb cards, new SuperMicro 6013 P-Ts machines were ordered. (They can also be used later for Gigabit PMA monitors.) The cards arrived and were installed, however there were problems. Little information about the S2io 10Gb cards was disclosed before their release, so configuration plans were not accurate. We are currently waiting further instructions and information. ~ IPMP and IPMP cross-traffic-from-trace (ctft) generator Work was done on the BSD kernel implementation, removing unnecessary timers from IPMP flow records. Instead of each flow having a timer associated with it, there is a timer recording when the next flow will expire, and a list of flows in order of expiry. Issues with OpenBSD/NetBSD/Linux portability were also handled. Still to be done: work on the Linux IPMP flow code to sync up the implementations. ~ IPv6 and IPv6 Scamper The bottleneck that caused Scamper to be limited to sending HZ packets per second was removed. HZ is 100 in most operating systems by default, which meant Scamper was limited to sending 100 packets per second. Used dmalloc to track down abuses of malloc in Scamper. dmalloc is a very useful library. One instance was found where there was writing beyond the end of a piece of memory, and there were a handful of mallocs that were not freed. Additional work was done on the file writing code. We think that Scamper is on track to being released before SIGCOMM Network Troubleshooting workshop, which is our target. 
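The single-timer scheme described above for the IPMP flow records (one timer for the next expiry, plus a list of flows kept in expiry order) can be sketched roughly as follows. This is only an illustration in Python; the actual work was done in the BSD kernel. The names FlowTable, touch, next_expiry, and expire are hypothetical, and the sketch assumes a single fixed timeout for all flows and a clock that never runs backwards.

```python
from collections import OrderedDict
from typing import Hashable, List, Optional


class FlowTable:
    """Flows kept in expiry order, so one timer is enough for the whole table.

    Assumption (not from the report): every flow uses the same timeout, so a
    flow that was just seen always belongs at the tail of the list.
    """

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        # flow id -> expiry time; the first entry is always the next to expire
        self._flows: "OrderedDict[Hashable, float]" = OrderedDict()

    def touch(self, flow_id: Hashable, now: float) -> None:
        """Create or refresh a flow and move it to the tail of the list."""
        self._flows[flow_id] = now + self.timeout
        self._flows.move_to_end(flow_id)

    def next_expiry(self) -> Optional[float]:
        """Time at which the single timer should fire, or None if idle."""
        if not self._flows:
            return None
        return next(iter(self._flows.values()))

    def expire(self, now: float) -> List[Hashable]:
        """Remove and return every flow whose expiry time has passed."""
        expired: List[Hashable] = []
        while self._flows:
            flow_id, expiry = next(iter(self._flows.items()))
            if expiry > now:
                break  # everything behind the head expires later
            self._flows.popitem(last=False)
            expired.append(flow_id)
        return expired
```

When the timer fires, a caller would run expire(now) and then re-arm the timer for next_expiry(), so no per-flow timers are needed.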
One of the highlights of the period was adding MTU searching to the PMTU phase. If Scamper does not get a reply for a large packet, then it uses a table of L2 media types and their MTUs to decide on a smaller probe size to attempt. When it does get an answer using one of the known media MTUs, it tries a packet one byte bigger. If it does not get an answer, it has inferred the PMTU to the end host [or the next PMTU bottleneck]. If it does get an answer for the packet, then it does a binary chop to probe the PMTU. Code was added to do a PMTU search to figure out the largest packet that can be sent along a path when an ICMP Fragmentation required message is not returned. A table of fairly common media types to aid Scamper in quickly finding the PMTU was created. This makes it more dynamic, in that it will not try lesser used media types, as learnt by Scamper. For example, when probing from a machine with a 4470 byte interface MTU, Scamper tries 3 or 4 other media MTUs before probing the 1500 byte Ethernet MTU, which is the most likely media case. Code was also written to allow a list of IP addresses to be passed on the command line, so that it can be used more like traceroute on the command line. Code to do a TTL limited binary chop of the segments in question in order to isolate the node that is not returning the ICMP Fragmentation required message was written. Also added was a TTL limited search to isolate the hop[s] where packets are silently dropped. Matthew is quite pleased with the code, as it was not trivial to add, and is something we feel is reasonably original. 
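The MTU search described above can be sketched as follows. This is a simplified illustration in Python, not Scamper's code. The function name search_pmtu, the probe callback, the floor value, and the contents of COMMON_MTUS are assumptions made for the example; the report does not list the table Scamper uses. The sketch also assumes that any packet no larger than the path MTU is answered and any larger packet is not.

```python
from typing import Callable, Sequence

# Illustrative table of link MTUs (an assumption, not Scamper's actual table).
COMMON_MTUS: Sequence[int] = (9000, 4470, 4352, 2002, 1500, 1492, 1480, 1280, 576)


def search_pmtu(
    probe: Callable[[int], bool],
    failed_size: int,
    floor: int = 68,
    table: Sequence[int] = COMMON_MTUS,
) -> int:
    """Infer the largest packet size that gets a reply.

    probe(size) returns True if a packet of that size is answered.
    failed_size is a size already known to get no reply.
    floor is assumed to be deliverable and is never probed itself.
    """
    low, high = floor, failed_size

    # Step 1: try known media MTUs below the failed size, largest first.
    for mtu in sorted(table, reverse=True):
        if not low < mtu < high:
            continue
        if not probe(mtu):
            high = mtu  # still too big, move on to the next smaller media MTU
            continue
        # Step 2: a media MTU answered, so try one byte bigger.
        if mtu + 1 >= high or not probe(mtu + 1):
            return mtu  # no answer one byte up: this media MTU is the PMTU
        low = mtu + 1
        break

    # Step 3: binary chop between the largest answered size and the
    # smallest unanswered size.
    while high - low > 1:
        mid = (low + high) // 2
        if probe(mid):
            low = mid
        else:
            high = mid
    return low
```

For a path limited to 1500 bytes and a first failure at 4470 bytes, this sketch probes 4352, 2002, 1500, and then 1501 before settling on 1500.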
traceroute from 199.109.33.1 to 129.82.201.9
 1  199.109.33.254   0.744 ms  [mtu: 4470]
 2  199.109.5.86    14.041 ms  [mtu: 4470]
 3  199.109.5.53    26.454 ms  [mtu: 4470]
 4  199.109.6.2     22.627 ms  [mtu: 4470]
 5  199.109.2.2     35.079 ms  [mtu: 4470]
 6  198.32.8.77     38.690 ms  [mtu: 4470]
 7  198.32.8.81     47.683 ms  [mtu: 4470]
 8  XX              59.337 ms  [mtu: 4470]
 9  XX              63.797 ms  [mtu: 4470]
10  XX              61.082 ms  [*mtu: 1514]
11  XX              60.909 ms  [mtu: 1500]
12  129.82.201.9    61.541 ms  [mtu: 1500]
The above traceroute with Scamper shows that the path is 4470 bytes all the way to hop 9. Hop 9 should send an ICMP fragmentation required message for 1514 bytes [Ethernet Max] but does not. Hop 10 sends an ICMP fragmentation required message for 1500 bytes. However, Hop 9 shields that message from being sent until a 1501 byte packet is sent.
We talked more with Bill Owens (NYSERNET) about some ideas we have for a paper to submit to the CCR special issue on Internet Vital Statistics. We are currently discussing exactly the data we want to collect, the tables we want to present, etc. Bill also suggested we link the IPv6 NetTs paper to the Internet2 IPv6 WG. He thinks it would provoke some interesting discussion on that list. So, we sent an email to the Internet2 IPv6 WG mailing list outlining the AMP IPv6 project and the paper.
~ Upgrades, troubleshooting, and maintenance on the AMP servers and infrastructure
The plan to store AMP and VOLT data on the RAID array in the new servers using an NFS mounted system moved forward. During preparations, a major problem was discovered with one of the machines' system boards. It was returned to the vendor, Verari Systems, the successor to RackSaver. As it turned out, this vendor is no longer organized to handle our kind of business, and consequently the warranty repair required a week. Following a recompile of the Linux kernel, it was reinstalled and placed online. Both the new and old AMP/VOLT servers are interconnected on the 10.28.5 network.
This allows the RAID array in the new servers to be NFS mounted on the old AMP/VOLT servers. Unfortunately, the AMP transition to the new server array did not go well. We copied AMP's data to the new server and mounted it back onto AMP so that we can continue to use AMP to run the old software while using the new server for the data, until the new software is ready to go 100% live. After that, NFS performed very badly. We think that this is due to the large number of processes writing to the file system (one for each remote site, plus others for the Web pages, etc.). To improve the performance, we tried several things.
At the end of the period, Server AMP2 is collecting data for the AMP server. While these modifications had improved the performance somewhat on the machine itself, unfortunately, NFS performance to AMP is still too poor for it to be used for Web serving. The VOLT server which is already acting as the main data collector, is also providing the Web serving. Therefore, we archived data on VOLT to assure that the am_slave process would not crash during this time. We will continue to watch VOLT carefully during this transition period. This very frustrating time where we tried a whole range of things without real success has brought us to the point where it appears that an IDE RAID cannot handle the load we are putting on it. The main problem is the number of random accesses that are happening. Of note, if we can make it work well: the eventual capacity of the new RAID arrays is expected to be such that archiving to the HPSS should not be needed for perhaps a year. We also tried to install mysql on the calorie server for Warren Matthews, who thinks it will solve his performance problems. Unfortunately, the current ports will not build on calorie and the original ports from 4.7 refer to sources that are not easily available any more. We tried to debug the problem without success and tried several different versions also without success. We may have to upgrade calorie, but would rather not as we hope to retire calorie (and put that functionality on the new servers) in the coming few months. We finally got it to install, but it is not running yet. Support and troubleshooting, existing AMP measurement sites: A total of 13 remote sites in the AMP infrastructure received attention during this period. "Open" means that the site is still being investigated, or pending action by site technicians. Outages are considered "open" until the monitor is again collecting data. Details follow. 13 problem sites: 7 resolved, 6 open - at the end of the period. amp-arizona (U. 
of Arizona) ~ brief outage due to a quote error in the /etc/rc.conf file. It is unclear how the double quote was lost, and also what caused the subsequent unplanned reboot. When the machine came back up, it was in single user mode and was off line, but the site technician edited the file and rebooted; now collecting data.
amp-cwru (Case Western Reserve University) ~ contacted us to schedule a move of the AMP monitor; now collecting data.
amp-fsu (Florida State U) ~ site technician Art Houle disconnected the AMP monitor there in preparation for a move to another network. That move is expected to be finished soon, at which point it will be restarted on the new network.
amp-jhu (Johns Hopkins U, Baltimore, Md) ~ during the process of moving the AMP monitor to a new network, the machine was removed from the network before a new address was assigned. We guided the site technician through booting the monitor in single user mode and editing in the new address and gateway. The monitor was successfully relocated; now collecting data.
amp-korea (KREONet2 in Korea) ~ down due to some confusion at the site. Manhee Lee, our original contact, has left the site and turned his duties over to another person, who, in an attempt to learn about the monitor, booted the machine single user and made changes he thought necessary. We have been working with the new person, but a replacement machine or system disk might become necessary.
amp-ncsa-dca (NCSA Access Center, Arlington) ~ experiencing an outage; site technician is investigating. Site people at NCSA also want to explore a plan to place the monitor on a "fixed" DHCP address. We went through an exercise NCSA believed would allow that to work, but it was not successful. We will work with them as needed in the future.
amp-ou (Oklahoma U.) ~ has apparently installed a port 22 block on that network. The monitor collects data and transfers correctly, but it is unreachable by ssh login and cannot be updated.
Extensive coordination with the site technician has failed to resolve the cause. A replacement machine will probably be the solution.
amp-rice (Rice U., Houston) ~ brief outage; the site technician corrected it with a reboot; now collecting data.
amp-rnpb (RNPnet in Brazil) ~ shipped a replacement monitor, which arrived on site and was installed, replacing the failed unit. The system manager process was started on the photon server and the monitor was initialized; now collecting data.
amp-surf (SURFnet at Amsterdam) ~ the GigE network interface goes down for seemingly no reason, so we prepared, tested, and shipped a replacement to determine if the anomaly is hardware-related. Two short outages occurred after the monitor was shipped; both were corrected through the out-of-band connection. After that, the monitor continued to suffer from the loss of connectivity on a near daily basis. The replacement should be installed soon. In addition, the ACL at the site needs to be updated for the new NLANR server network 198.202.123, which should be done shortly.
amp-unin (UNINet in Thailand) ~ appears to have a block on ICMP echo requests; site people have been requested to investigate and report back.
amp-wisn (U. of Wisconsin Network, Madison) ~ experienced a short outage; the site could not reach other remote sites with ICMP echo requests or UDP traceroute. Corrected by a remote (soft) reboot; now collecting data.
amp-yale (Yale U.) ~ the site could not reach other remote sites with ICMP echo requests or UDP traceroute. The problem was temporarily corrected by a power cycle reboot. The machine was replaced, and after a week of smooth operation, the issue recurred twice. However, those were corrected by a remote (soft) reboot. Site people are researching network components to determine if that is a factor; now collecting data.
Outreach, Collaborations, and Activities supporting Network Research~
~ Collaborations And Activities Supporting Network Research
W.H. Carlisle, Auburn University
Xun Su, CalTech
Jim Dolgonas, CENIC
Aaron Greusel, CENIC
Chris Bruja, CISCO
Greg Cole, GLORIAD
Jim Ferguson, DAST
Klaus Degner, University of Leipzig
Ian Graham, Endace
John Hicks
Rick Summerhill, I2
Srinivas Kota
Warren Matthews
Don Mitchell
Jon Dugan, NCSA
Bill Chang, NSF
Peter Arzberger, PRAGMA
Kevin Thompson, NSF
Bill Owens, NYSERNET
Ville Aikas, Pacific Northwest Gigapop
Bill Cleveland, Purdue
Susan Rathburn, SDSC
Vijay Samalam, SDSC Network Director
Teri Simas, PRAGMA
Felix Hernandez-Campos, UNC
UniLink
Pere Barlet, UPC Barcelona
Malathi Veeraraghavan, University of Virginia
HCI, University of Waikato
David Wetherall, Washington University
~ Papers, Presentations, and Conference/Meeting Participation
Klaus gave us a presentation on his real-time work during one of our weekly meetings to great interest and continued comment afterward, including some long email conversations with Jörg. Initial planning and arrangements were made for Jörg to give a presentation on 10GigE application software at SC2004 in November. We have arranged with DAST/NCSA to have a slot in their booth. Demonstration requests for the SDSC booth open soon; we will be pursuing that as well. Ronn completed a set of presentation slides for the AARNet group in September. Matthew worked on his talk for SIGCOMM Network Troubleshooting. He is also planning an IPv6 paper with Bill Owens of NYSERNET.
Documentation, Web Work, Utilization Improvement, Publications~
Planning for a new issue of the Network Analysis Times began, including establishing the international theme, with the announcement that the AMP machine at CNIC, Beijing, China is now collecting data as the main article. The issue will first be distributed at the PRAGMA7 meeting in September at SDSC. A great new NATimes head banner that retains the previous look (recognition factor), as well as updates it, was created. Potential images and photos were chosen. Excerpts from the International white paper will also be used, as well as text from the forthcoming press release. Using the new layout program, a preliminary layout design was done. We are going for one 11x17 sheet (4 pages). If we print them on the color machine in the SEQ conference room, then have the campus graphics (called Imprints) fold them, not only will we save money, but we will be able to have color on the inside as well. This will make this issue the first to have color on the inside. Traditionally, due to the huge expense of color, we have had only the front and back cover in color (as they are on the same side of one 11x17 sheet), with the rest of the pages in black and white only. (The difference in cost is several hundred dollars for each printing of 200-300.) We discussed a press release and article about the AMP monitor in China being activated and collecting data. We called Dave Hart at NSF/OLPA to coordinate.
Database Back end Project, utilizing PostgreSQL and Perl scripts ~
A DHCP server was added on moat for the new NLANR/198.202.123 network. Enhancements, refinements, and additions continued on the PMA Collection and Use Statistics (Web logs) project, including:
Activities of each individual on the project
AMP team
PMA team
MNA, AMP, and PMA Outreach, Documentation, Web work
Management and Administrative
Note: Lana Kennedy, the student writer, will be leaving NLANR in order to concentrate on her UCSD studies this fall (her last day will be Sept.10). I wish Lana great good luck in school and in the future and will miss her tremendously. She is a very quick learner and a pleasure to work with. We (all) will miss her excellent, thorough, work. [MCC]
- 30 - 2004 Oct 20