
Summary of Research Activities - August 2004


Passive Measurement and Analysis (PMA) Project

~ Continuing development of new metrics and real-time analysis for PMA

Klaus Mochalski continued his work on real-time applications:

  • He worked on the TCP packet loss detection engine and got it working properly. While it performed to the original specs he had in mind, it was insufficient for the end goal he wants to achieve: detecting network congestion. The algorithm worked well in finding two duplicate ACKs as a sign of a packet loss. However, not every packet loss is a sign of serious congestion. It is necessary to differentiate Fast Retransmission (of single packets) from an actual Slow Start caused by a retransmission timeout. Slow Start is much harder to detect, however, not least due to vantage point problems. Apart from this, complete stateful tracking of all TCP connections is probably infeasible in real time. The new plan involves looking for sudden drops in a TCP connection's rate in conjunction with packet retransmission.
  • To look at packet delay and loss data across the Juniper T640 at Indianapolis, Klaus collected a new dataset: simultaneous traces at all IPLS monitors (approximately 4 hours, 40 minutes). After finishing the anonymization, he generated RRD databases for the basic statistics and put up a preliminary RRD-based Web site with graphs that can be generated on request. While doing this, he spotted inconsistencies in the NTP support and with help from Hans-Werner, he and Jörg were able to resolve the problem. It turned out to be another artifact of our recent changes to the new NLANR IP address space. As a result we now use a symbolic alias ntp.nlanr.net.
  • He also did some minor polishing of the flow engine, set up a CVS repository for our daglib2, and finished copying the traces to pma.
  • To solve the problem of obtaining continuous flow statistics with the trace being split up into 10 minute chunks, he wrote a utility to merge a whole set of traces.
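The duplicate-ACK heuristic mentioned in the first item above can be sketched roughly as follows. This is a minimal Python illustration, not Klaus's engine: the function name `detect_loss`, the `(flow_id, ack_number)` input format, and the simplified definition of a duplicate ACK (same acknowledgment number as the previous ACK of the same flow) are assumptions made for the example. A real detector would also have to check payload length, window size, and TCP flags.

```python
from collections import defaultdict

# The report mentions two duplicate ACKs as the loss signal.
DUP_ACK_THRESHOLD = 2

def detect_loss(acks, threshold=DUP_ACK_THRESHOLD):
    """Return (flow_id, ack_number) pairs where a loss is suspected.

    `acks` is an iterable of (flow_id, ack_number) tuples in capture order.
    An ACK counts as a duplicate when it repeats the previous acknowledgment
    number seen on the same flow (a deliberate simplification).
    """
    last_ack = {}                 # flow_id -> most recent ack number
    dup_count = defaultdict(int)  # flow_id -> consecutive duplicates seen
    events = []
    for flow_id, ack in acks:
        if last_ack.get(flow_id) == ack:
            dup_count[flow_id] += 1
            # Report once, when the threshold is first reached.
            if dup_count[flow_id] == threshold:
                events.append((flow_id, ack))
        else:
            last_ack[flow_id] = ack
            dup_count[flow_id] = 0
    return events
```

Only a small amount of state is kept per flow here (one number and one counter), which is far less than the complete stateful TCP tracking the item above considers too expensive for real-time use.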

Chris worked on maintaining and improving his existing real-time code. Much of that time was spent debugging his next_packet function, which facilitates the walk through the Dag's memory window. Among other things, he added some reporting functionality and cleaned up many of his command line arguments, keeping in mind the need to be able to open, keep track of, and report about multiple streams from multiple Dag cards. The plan is to make the software as compatible and portable as possible across any type of Dag card or any configuration of them. He will work on developing an elegant way of handling the different types of legacy records in use. Currently, the user specifies the record type on the command line.

Klaus Degner (an undergraduate student from the University of Leipzig) will be coming here in October for a few months to work on real-time applications. He and Klaus Mochalski exchanged emails discussing ideas for real-time one-way delay calculation.

~ Special Traces

A new data set was captured: simultaneous traces at all Indianapolis monitors. The disk space lasted from 13:40 to 18:20 PDT. After the anonymization process, Klaus began using these in his real-time application work (detailed above).

A student by the name of Xun Su at CalTech is working as an REU student with Jose Fernandez and Julio Ibarra at FIU and looking at the data. We are in discussions about collecting some specific, longer trace files so that they can dig much deeper, and they are also interested in our upcoming real-time work. Jörg can see synergies developing here, reconfirming the positive impression that he already gathered during his visit to Miami, FL in May/June.

With regard to the FIU data, a rather sour point was Jörg's discovery of garbage records amongst the captured ATM headers, which are so frequent that you can easily spot them by visual inspection of the trace. He has been thinking about what to do about this problem, as there seems little point in complaining to the vendor. The cards are now 2 1/2 years old and the Xilinx images are still (or again) not working properly, but we have finally developed an approach which is likely to fix current, as well as future, occurrences of such issues. He did, however, briefly think about renaming the project from Passive Measurement and Analysis to Passive Measurement or Approximation. ;-)

~ New (and developing) strategically important measurements and deployments

nai-p-amp ~ We worked to bring the new AMPATH passive monitoring system live with Ernie Rubi in Miami, who installed the new Dell monitor. We have performed the first check and machine configurations. All the signals are there, including CDMA time support, which makes this system an ideal platform for some measurements we are intending to launch with Klaus later this summer. We enabled data collection on the monitor, which took a few hours since the configuration is rather different from what we have in the field so far. Jörg is very pleased with the new machine configuration we have now in place. It has lots of horsepower and disk space at a low price point, is 1U, and also has full CDMA time support for PC and DAG cards. The system was turned into 8x90 collection for now, to have some data for AMPATH with which to work.

We received an invitation from Pere Barlet at UPC Barcelona, Catalonia, Spain, to continue the collaboration via placement of an OC48MON inside the Spanish R&D network.

Linda Winkler of the Univ. of Illinois, Chicago wants to place a GigE monitor on the connection to UIC and asked if we could convert the previously live, now decommissioned, OC3 monitor to GigE. We told her that the hardware of the OC3 monitor would not support the GigE cards, but that we would discuss the possibility of installing a GigE monitor. While working with us regarding return of the OC3 monitor, Alan Verlo reiterated Linda's desire to implement a GigE monitor at the UIC connection.

We started an email conversation with Malathi Veeraraghavan, Professor at the University of Virginia, on the placement of an OC3MON at VOA. She responded positively; we now need to find a time to have a phone call as a follow-up. (This was our first positive response from the "cold call" emails.)

We have been researching vendors and products in the optical networking space, sending some 40-50 inquiries to various vendors. We have received the first few replies, and also some initial quotes. Jörg had a number of conversations with vendors, most notably Aliathon, Xtellus, and Calient Networks, as well as a meeting with Ian Graham of Endace. We looked into the various DWDM components we need for a prototype lambdaMON as well as a larger scale rollout. In the process we are making some strategic contacts as well as developing questions which will need to be tackled. Most likely, an early prototype of any sort would be good.

~ Upgrades, troubleshooting, and maintenance on the PMA servers and infrastructure

We continued to work with Michael Gleicher on the installation of the HSI interface to the HPSS storage system. The interface, which Mike helped to fine-tune, is now in place, but the first trial produced errors. We need to have a closer look at all the configurations, especially Kerberos.

Early in the month, the pma.nlanr.net data server (AMD64 PMA machine) began behaving unusually, stopping for no apparent reason. This occurred twice; at the time, we did not think the two incidents were related. We added an additional drive (40GB) for backup purposes and reworked the nightly procedures. Later in the month, we were disturbed by the pma server crashing repeatedly and inexplicably. The problem has since cleared. One of the likely explanations is issues with SMP and locking support in FreeBSD 5.2, which should be overcome with FreeBSD 5.3. We need to keep watching it closely.

We are now using a symbolic alias ntp.nlanr.net due to the NTP support problem discovered during the collection of the new data set from all monitors at IPLS. We began spreading the configuration among the rest of the PMA infrastructure.

We identified a possible need for fiberoptic amplifiers for the PMA project. We spent time researching the specifications, price, and availability of the EDFA type amplifiers. Much progress has been made in that field in the past two years, and the number of new sources of amplifiers, as well as amplifier components, is increasing. The EDFA industry has developed in a way that has made it possible for many sources to supply individual EDFA components. This enables customers to design and assemble EDFAs to their own individual requirements, as opposed to having to fit highly standardized products to their needs.

We suffered serious network problems this period due to network firmware upgrades on some routers. We worked with Todd Hansen to identify (or exclude) NLANR or HPWREN issues as the cause. As a result, Todd arranged for direct fiber connections to be implemented to bypass some unnecessary routing; this will avoid these same problems in the future.

We had to send the failed OC48 card back to Endace on warranty repair. The shipping procedures for SDSC are obviously not completely worked out and the card actually went out days after it was scheduled.

Support and troubleshooting, existing PMA measurement sites:

The PMA monitor from Tel Aviv University was returned this period. It is available for deployment to a new site.

Linda Winkler and Alan Verlo of the Univ. of Illinois, Chicago informed us that the connection at the SBC site in Chicago was no longer there. Alan is returning the OC3 monitor from the APAN site.

Both Tel Aviv Univ. (TAU) and APAN (APN) are decommissioned sites, now listed with the historic deployments, on the PMA Sites page.

Active Measurement Project (AMP)

~ Progress on the reimplementation of AMP and the development of a new testing architecture

The code for dealing with sub-tests was finished, including debugging a tricky little PHP extension problem related to the PHP symbol table. We started using the new version of the graphing tool (see below for detail on dgraph).

We started work on the IPv6 code in amp. An annoying inconsistency between how IPv4 and IPv6 are handled from the programming end was corrected. This actually made things easier in IPv6, but also increased the differentiation between the code handling IPv4 and IPv6. We implemented the icmpTest and the traceTest, and tested them using the loopback interface. They still need to be tested on an IPv6 connected machine to determine how much more coding is left.

We investigated a problem with the AMPlet code which the CRCnet people discovered when they ran it on their small solid state nodes. It seemed to be running the machine out of memory. After some initial experiments it appeared that the code may indeed have a slow memory leak. We fixed the leak, and also made some other changes to the configuration the CRCnet people were using for their machines. After testing, we did a new release of the AMPlet code (just for CRCnet) with these fixes.

The IPMP code was updated to the latest IETF Internet draft status and was included in the above release. This is not a public release because after feedback from other users, there are additional things that we want to look at (and potentially change) before a public release.

We made great progress with dgraf and are finished with it for now. We added the ability to do histograms and the IP hop count graph, as well as their associated html map code. Some of the code was reworked for better portability, which allows Tony to compile it on his machine as well. Memory leaks and pointer issues were also cleaned up. While adding an events graph type to the code, a design flaw was discovered; fixing it required rearrangement of some of the code. Time testing of both workability and graph accuracy took place. The AMPvis code was added and some major restructuring performed, which yielded slightly better performance. Comments and documentation were further developed. Testing was done to make sure that it will compile on various machines; it compiles on Linux, FreeBSD, and Solaris. The eventual other part of this program is currently in dtrace. We need to determine how we are going to implement this functionality.

A database management schema (for all site, contact, test meshes, scheduling information, etc.) for the AMP machines was developed.  http://byerley.cs.waikato.ac.nz/~tonym/tmp/schema

Work began on a Perl extension for amp. Eventually this will be a Perl interface to the C code that underlies the PHP code. The main use of this is for Warren Matthews' Web services interface, but we anticipate that there will be others, mostly learning and a few experiments.

A problem with email on the AMPlets was sorted out. Sendmail was unable to resolve DNS names and was sending thousands of syslog messages per AMPlet to amp. It was an odd situation because everything else could resolve names properly. Matthew suggested a reboot which, to our surprise, did clear the problem. Therefore, an expect script was written and run to clear the mailq and reboot all the AMPlets. This was sad in a way, as many of them had close to two years of uptime. They had not been down since we last upgraded the OS.

~ New (and developing) strategically important measurements and deployments

Site amp-cnic (China Computer Network, Beijing) was connected this period. The machine has been on site for several months, but the site was undergoing changes and the machine could not be connected. When site people connected the monitor, we walked them through the edits to bring it online, and finalized the connection of the AMP monitor. We followed up with runs of the system manager process to initialize the monitor, distribute the international list file to all AMP monitors, and start data collection. The monitor is now up and collecting data on the international mesh. (There will be a press release regarding this machine.)

10GigE AMP ~ In preparation for testing the S2io 10Gb cards, new SuperMicro 6013 P-Ts machines were ordered. (They can also be used later for Gigabit PMA monitors.) The cards arrived and were installed; however, there were problems. Little information about the S2io 10Gb cards was disclosed before their release, so configuration plans were not accurate. We are currently awaiting further instructions and information.

~ IPMP and IPMP cross-traffic-from-trace (ctft) generator

Work was done on the BSD kernel implementation, removing unnecessary timers from IPMP flow records. Instead of each flow having a timer associated with it, there is a timer recording when the next flow will expire, and a list of flows in order of expiry. Issues with OpenBSD/NetBSD/Linux portability were also handled. Still to be done: work on the Linux IPMP flow code to sync up the implementations.
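The single-timer arrangement described above can be illustrated with a small sketch. This is Python rather than kernel C, and it is not the IPMP implementation: the class name `FlowTable`, the fixed per-flow lifetime, and the explicit `now` argument are assumptions made for the example. With a fixed lifetime and non-decreasing timestamps, appending to the end of the queue keeps it ordered by expiry.

```python
from collections import deque

class FlowTable:
    """Flows kept in expiry order, with one timer for the whole table."""

    def __init__(self, lifetime):
        self.lifetime = lifetime   # seconds a flow lives (assumed fixed)
        self.queue = deque()       # (expiry_time, flow_id), soonest first
        self.next_expiry = None    # when the single timer should fire

    def add(self, flow_id, now):
        # Later arrivals always expire later, so appending preserves order.
        self.queue.append((now + self.lifetime, flow_id))
        self.next_expiry = self.queue[0][0]

    def on_timer(self, now):
        """Expire every flow that is due, then re-arm the single timer."""
        expired = []
        while self.queue and self.queue[0][0] <= now:
            expired.append(self.queue.popleft()[1])
        self.next_expiry = self.queue[0][0] if self.queue else None
        return expired
```

The cost of timer management then no longer grows with the number of active flows: only the head of the list needs to be inspected when the timer fires.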

~ IPv6 and IPv6 Scamper

The bottleneck that caused Scamper to be limited to sending HZ packets per second was removed. HZ is 100 in most operating systems by default, which meant Scamper was limited to sending 100 packets per second.
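The report does not say how the bottleneck was removed. As a rough illustration of why a sender that emits one packet per timer tick is capped at HZ packets per second, and of one common way around it (sending every packet that has become due on each wakeup), consider the following Python simulation. The function `simulate` and its parameters are hypothetical and are not taken from Scamper.

```python
def simulate(rate_pps, hz, seconds, one_per_tick):
    """Count packets sent by a sender that wakes up `hz` times per second.

    With one_per_tick=True the sender emits at most one packet per wakeup,
    so it can never exceed `hz` packets per second. Otherwise it sends all
    packets that have become due since the previous wakeup.
    """
    sent = 0
    for tick in range(1, hz * seconds + 1):
        due = tick * rate_pps // hz   # packets that should be out by now
        if one_per_tick:
            if due > sent:
                sent += 1
        else:
            sent += due - sent        # burst to catch up with the schedule
    return sent
```

With `hz=100` and a target of 1000 packets per second, the one-per-tick sender manages 200 packets in two seconds, while the catch-up sender reaches the full 2000.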

We used dmalloc to track down abuses of malloc in Scamper. dmalloc is a very useful library. One instance was found where there was writing beyond the end of a piece of memory, and there were a handful of mallocs that were not freed. Additional work was done on the file writing code. We think that Scamper is on track to being released before the SIGCOMM Network Troubleshooting workshop, which is our target.

One of the highlights of the period was adding MTU searching to the PMTU phase. If Scamper does not get a reply for a large packet, then it uses a table of L2 media types and their MTUs to decide on a smaller probe size to attempt. When it does get an answer using one of the known media MTUs, it tries a packet one byte bigger. If it does not get an answer, it has inferred the PMTU to the end host [or the next PMTU bottleneck]. If it does get an answer for the packet, then it does a binary chop to probe the PMTU.

Code was added to do a PMTU search to figure out the largest packet that can be sent along a path when an ICMP Fragmentation required message is not returned. A table of fairly common media types to aid Scamper in quickly finding the PMTU was created. This makes it more dynamic, in that it will not try lesser used media types, as learnt by Scamper. For example, when probing from a machine with a 4470 byte interface MTU, Scamper tries 3 or 4 other media MTUs before probing the 1500 byte Ethernet MTU, which is the most likely media case. Code was also written to allow a list of IP addresses to be passed on the command line, so that it can be used more like traceroute on the command line.
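The search described in the two paragraphs above can be sketched as follows. This is a minimal Python illustration, not Scamper's code: the function name `search_pmtu`, the `probe` callback (a stand-in for sending a packet of a given size and waiting for a reply), and the values in `KNOWN_MTUS` are assumptions made for the example.

```python
# Illustrative table of common media MTUs (not Scamper's actual table).
KNOWN_MTUS = [9000, 4470, 4352, 2002, 1500, 1492, 1480, 1280, 576]

def search_pmtu(probe, start_size, known_mtus=KNOWN_MTUS):
    """Return the largest packet size that gets a reply, or None.

    probe(size) -> bool reports whether a packet of `size` bytes was answered.
    """
    if probe(start_size):
        return start_size
    upper = start_size   # smallest size known to fail
    lower = None         # largest size known to work
    # Step down through the known media MTUs until one is answered.
    for mtu in sorted((m for m in known_mtus if m < start_size), reverse=True):
        if probe(mtu):
            lower = mtu
            break
        upper = mtu
    if lower is None:
        return None
    # Try one byte more: silence means the path MTU is exactly this media MTU.
    if lower + 1 >= upper or not probe(lower + 1):
        return lower
    lower += 1
    # Otherwise do a binary chop between the working and failing sizes.
    while upper - lower > 1:
        mid = (lower + upper) // 2
        if probe(mid):
            lower = mid
        else:
            upper = mid
    return lower
```

For example, `search_pmtu(lambda size: size <= 1500, 4470)` steps down through the table entries, gets an answer at 1500, gets none at 1501, and returns 1500 without any binary chop. A path limited to an uncommon size such as 1400 falls through to the binary chop.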

Code to do a TTL limited binary chop of the segments in question in order to isolate the node that is not returning the ICMP Fragmentation required message was written. Also added was a TTL limited search to isolate the hop[s] where packets are silently dropped. Matthew is quite pleased with the code, as it was not trivial to add, and is something we feel is reasonably original.

	traceroute from 199.109.33.1 to 129.82.201.9

 	 1  199.109.33.254  0.744 ms [mtu: 4470] 
	 2  199.109.5.86   14.041 ms [mtu: 4470] 
	 3  199.109.5.53   26.454 ms [mtu: 4470] 
	 4  199.109.6.2    22.627 ms [mtu: 4470]
	 5  199.109.2.2    35.079 ms [mtu: 4470]
	 6  198.32.8.77    38.690 ms [mtu: 4470]
	 7  198.32.8.81    47.683 ms [mtu: 4470] 
	 8  XX             59.337 ms [mtu: 4470] 
	 9  XX             63.797 ms [mtu: 4470] 
	10  XX             61.082 ms [*mtu: 1514] 
	11  XX             60.909 ms [mtu: 1500] 
	12  129.82.201.9   61.541 ms [mtu: 1500] 

The above traceroute with Scamper shows that the path is 4470 bytes all the way to hop 9. Hop 9 should send an ICMP fragmentation required message for 1514 bytes [Ethernet Max] but does not. Hop 10 sends an ICMP fragmentation required message for 1500 bytes. However, hop 9 shields that message from being sent until a 1501 byte packet is sent.
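The TTL-limited binary chop can be sketched as follows. This is a simplified Python illustration rather than Scamper's implementation: the function `find_silent_hop` and the `responds` callback are hypothetical, and the sketch assumes that a large probe expiring at a given hop is answered for every hop up to the one that silently drops it and for none beyond it.

```python
def find_silent_hop(responds, path_len):
    """Locate the last hop that a large TTL-limited probe can still reach.

    responds(ttl) -> bool reports whether a large probe sent with that TTL
    drew a reply. The caller is assumed to have already seen the large
    probe fail end to end, so TTL `path_len` is treated as unanswered.
    Returns 0 if not even the first hop answers.
    """
    lo, hi = 0, path_len   # lo: known to answer (0 = source); hi: known silent
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if responds(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

Applied to the 12-hop path shown above, where large probes are answered through hop 9, `find_silent_hop(lambda ttl: ttl <= 9, 12)` returns 9 after probing TTLs 6, 9, and 10.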

We talked more with Bill Owens (NYSERNET) about some ideas we have for a paper to submit to the CCR special issue on Internet Vital Statistics. We are currently discussing exactly which data we want to collect, the tables we want to present, etc. Bill also suggested we link the IPv6 NetTs paper to the Internet2 IPv6 WG. He thinks it would provoke some interesting discussion on that list. So, we sent an email to the Internet2 IPv6 WG mailing list outlining the AMP IPv6 project and the paper.
Kenjiro Cho, Matthew Luckie, and Brad Huffaker. Identifying IPv6 Network Problems in the Dual-Stack World. Proceedings of the SIGCOMM Network Troubleshooting Workshop (NetTs).

~ Upgrades, troubleshooting, and maintenance on the AMP servers and infrastructure

The plan to store AMP and VOLT data on the RAID array in the new servers using an NFS-mounted system moved forward. During preparations, a major problem was discovered with the system board of one of the machines. It was returned to the vendor, Verari Systems, the successor to RackSaver. As it turned out, this vendor is no longer organized to handle our kind of business, and consequently the warranty repair required a week. Following a recompile of the Linux kernel, it was reinstalled and placed online. Both the new and old AMP/VOLT servers are interconnected on the 10.28.5 network, allowing the RAID array in the new servers to be NFS mounted on the old AMP/VOLT servers.

Unfortunately, the AMP transition to the new server array did not go well. We copied AMP's data to the new server and mounted it back onto AMP so that we can continue to use AMP to run the old software, while using the new server for the data, until the new software is ready to go 100% live. After that, NFS performed very badly. We think that this is due to the large number of processes writing to the file system (one for each remote site, plus others for the Web pages etc.). To improve the performance, we tried several things.

  • We increased the number of nfsd processes which resulted in the server performing well, but the client was still badly bottlenecked. (A directory listing that is instant on another client takes 20-60s on AMP.)
  • We fired up nfsiod with the maximum number of processes (20) and that helped a bit.
  • We hacked the source to increase the limit, without much apparent success, leading us to believe that the performance is being limited by other resources. We looked at the kernel, tuned the OS, and changed the startup profile of the workload.
  • To deal with the additional load, we doubled the RAM in the machine from 2 GB to 4 GB, then towards the end of the period, increased that again to 6 GB.

At the end of the period, Server AMP2 is collecting data for the AMP server. While these modifications improved the performance somewhat on the machine itself, NFS performance to AMP is unfortunately still too poor for it to be used for Web serving.

The VOLT server, which is already acting as the main data collector, is also providing the Web serving. Therefore, we archived data on VOLT to ensure that the am_slave process would not crash during this time. We will continue to watch VOLT carefully during this transition period.

This very frustrating time where we tried a whole range of things without real success has brought us to the point where it appears that an IDE RAID cannot handle the load we are putting on it. The main problem is the number of random accesses that are happening. Of note, if we can make it work well: the eventual capacity of the new RAID arrays is expected to be such that archiving to the HPSS should not be needed for perhaps a year.

We also tried to install mysql on the calorie server for Warren Matthews, who thinks it will solve his performance problems. Unfortunately, the current ports will not build on calorie and the original ports from 4.7 refer to sources that are not easily available any more. We tried to debug the problem without success and tried several different versions also without success. We may have to upgrade calorie, but would rather not as we hope to retire calorie (and put that functionality on the new servers) in the coming few months. We finally got it to install, but it is not running yet.

Support and troubleshooting, existing AMP measurement sites:

A total of 13 remote sites in the AMP infrastructure received attention during this period. "Open" means that the site is still being investigated, or pending action by site technicians. Outages are considered "open" until the monitor is again collecting data. Details follow.

13 problem sites:  7 resolved, 6 open - at the end of the period.

amp-arizona (U. of Arizona) ~ brief outage due to a quote error in the /etc/rc.conf file. It is unclear how the double quote was lost, and also what caused the subsequent, unplanned reboot. When the machine came back up, it was in single user mode and was off line, but the site technician edited the file and rebooted; now collecting data.

amp-cwru (Case Western Reserve University) ~ contacted us to schedule a move of the AMP monitor; now collecting data.

amp-fsu (Florida State U) ~ site technician Art Houle disconnected the AMP monitor there in preparation for a move to another network. That move is expected to be finished soon, at which point it will be restarted on the new network.

amp-jhu (Johns Hopkins U, Baltimore, Md) ~ during the process of moving the AMP monitor to a new network, the machine was removed from the network before a new address was assigned. We guided the site technician through booting the monitor in single user mode and editing in the new address and gateway. The monitor was successfully relocated; now collecting data.

amp-korea (KREONet2 in Korea) ~ down due to some confusion at the site. Manhee Lee, our original contact, has left the site, and turned his duties over to another person, who, in an attempt to learn about the monitor, booted the machine single user and made changes he thought necessary. Have been working with the new person, but a replacement machine or system disk might become necessary.

amp-ncsa-dca (NCSA Access Center, Arlington) ~ experiencing an outage; site technician is investigating. Site people at NCSA also want to explore a plan to place the monitor on a "fixed" DHCP address. We went through an exercise NCSA believed would allow that to work, but it was not successful. Will work with them as needed in the future.

amp-ou (Oklahoma U.) ~ has apparently installed a port 22 block on that network. It collects data and transfers correctly, but it is unreachable by ssh login and cannot be updated. Extensive coordination with the site technician has failed to resolve the cause. A replacement machine will probably be the solution.

amp-rice (Rice U., Houston) ~ brief outage; the site technician corrected it with a reboot; now collecting data.

amp-rnpb (RNPnet in Brazil) ~ shipped a replacement monitor, which arrived on site and was installed, replacing the failed unit. The system manager process was started on the photon server and the monitor was initialized; now collecting data.

amp-surf (SURFnet at Amsterdam) ~ the GigE network interface goes down for seemingly no reason, so we prepared, tested, and shipped a replacement to determine if the anomaly is hardware-related. Two short outages after the monitor was shipped; both were corrected through the out-of-band connection. After that, the monitor continued to suffer from the loss of connectivity on a near daily basis. The replacement should be installed soon. In addition, the ACL list at the site needs to be edited to the new NLANR server network 198.202.123, which should be done shortly.

amp-unin (UNINet in Thailand) ~ appears to have a block on ICMP echo requests; site people have been requested to investigate and report back.

amp-wisn (U. of Wisconsin Network, Madison) ~ experienced a short outage; the site could not reach other remote sites with ICMP echo requests or UDP traceroute. Corrected by a remote (soft) reboot; now collecting data.

amp-yale (Yale U.) ~ the site could not reach other remote sites with ICMP echo requests or UDP traceroute. The problem was temporarily corrected by a power cycle reboot. The machine was replaced, and after a week of smooth operation, the issue recurred twice. However, those occurrences were corrected by a remote (soft) reboot. Site people are researching network components to determine if that is a factor; now collecting data.

Outreach, Collaborations, and Activities supporting Network Research

~ Collaborations And Activities Supporting Network Research

W.H. Carlisle, Auburn University~
AMP: Received an email from him asking for the tools used in the IPv6 network troubleshooting paper I coauthored with Kenjiro Cho and Brad Huffaker. He is in the process of changing to IPv6 connectivity and wanted to use the tools we created to compare the performance relative to IPv4 before and after the change. Kenjiro supplied the scripts and visualization tools that drive Scamper, and I supplied Scamper itself. [Matthew Luckie]

Xun Su, CalTech~
PMA: An REU student (at CalTech) working with Jose Fernandez and Julio Ibarra at FIU (Miami, FL), looking at the PMA data. We are in dialog to collect some specific longer trace files for them to dive much deeper, and they are also interested in our upcoming real-time work. Jörg can see synergies developing here, reconfirming the positive impression that he already gathered during his visit to Miami, FL in May/June.

Jim Dolgonas, CENIC~
PMA: Had a very good conference call with Jim; he is reviewing information on the lambdaMON and then will respond. Jim is the new CEO of CENIC, now that Tom West has moved full time to his position as the head of NLR. CENIC is very keen to have a stronger PMA involvement, but they only have 10 Gigabit equipment. Interestingly, our new architecture for DWDM monitoring may be an exact match to help solve this problem. I think we have quite a few opportunities during the coming months to strengthen this relationship. [Jörg Micheel, Ronn Ritke]

Aaron Greusel, CENIC~
PMA: Brief exchange as a follow-up to my recent visit. [Jörg Micheel]

Chris Bruja, CISCO~
AMP & PMA: 10GigE AMP and follow up on Jörg's request for technical details. [Ronn Ritke]

Greg Cole, GLORIAD~
AMP: Phone call with Greg, who will be meeting the group from Russia in September and will work to move forward the installation of the AMP in Moscow, Russia. [Ronn Ritke]

Jim Ferguson, DAST~
PMA: We arranged demonstration times for SC2004 in November. They will give us a slot at the NCSA booth. Jörg will give a presentation on 10GigE application software. [Ronn Ritke]

Klaus Degner, University of Leipzig~
PMA: Email exchange discussing ideas for real-time one way delay calculation. [Klaus Mochalski]

Ian Graham, Endace~
PMA: Have been trying to arrange an appointment with him. [Jörg Micheel]

John Hicks~
PMA: Short follow-up with John regarding some results from our CISCO 15454 inspection. [Jörg Micheel]

Rick Summerhill, I2~
AMP & PMA: Sent text to him on OC192mons and the Observatory project. [Ronn Ritke]

Srinivas Kota~
AMP: Srinivas had read our "Network Performance Visualization: Insight Through Animation" paper in IEEE Communications Magazine and asked for a sample Cichlid server. I sent him the code for one of the AMP servers. [Tony McGregor]

Warren Matthews~
AMP: AMP WebServices interface; worked with him on some performance problems.

Don Mitchell~
Several members of the team met with Don while he was in San Diego, giving him an overview of the project status and discussing various aspects of the project. [Hans-Werner Braun, Ronn Ritke, and Bud Hale]

Jon Dugan, NCSA~
PMA: Ronn and I have been chasing demo setups for SC2004 in Pittsburgh. Jon is in charge of the bandwidth challenge, and we had an email exchange with some detailed ideas about what we could arrange. [Jörg Micheel]

Bill Chang, NSF; Peter Arzberger, PRAGMA~
Conversations with Bill and Peter on PRAGMA and international collaborations. The three of us will meet at the next PRAGMA meeting at SDSC. Bill asked me to let him know when the China AMP is online. [Ronn Ritke]

Kevin Thompson, NSF~
PMA: Dialog regarding some of our measurement results in the Q2/2004 report. [Jörg Micheel]

Bill Owens, NYSERNET~
AMP: At Bill's prompting, I sent an email to the Internet2 IPv6 WG mailing list outlining the AMP IPv6 project and the paper authored with Kenjiro and Brad. [Matthew Luckie]

Ville Aikas, Pacific Northwest Gigapop~
AMP: Has been asking for permission to run throughput tests from their management subnet. I gave him permission on the condition that he contact the sites concerned before doing large tests. I had to update the code base to understand address masks. He is now interested in seeing if we can upgrade the monitors at the set of sites he is interested in to GigE. The largest impediment to that is probably getting the sites to provide GigE ports that are, hopefully, also capable of a 9k MTU. I suggested that he make the initial approach. [Tony McGregor]

Bill Cleveland, Purdue~
PMA: Long phone conversation with Bill. We are both keen to get the PMA system there going, but appear to have no luck in encouraging the local sysadmin to tackle the problem with an optical power meter. Bill is looking for a good data source to start teaching in about two weeks, and after some searching we have decided to focus on the Leipzig-I and Leipzig-II data sets. [Jörg Micheel]

Susan Rathburn, SDSC~
AMP & PMA: Requested some text on international collaborations for Fran Berman's meeting with new UCSD Chancellor Marye Anne Fox. I put together and emailed to Susan two different-length summaries of NLANR/MNA international collaborations. [Ronn Ritke]

Vijay Samalam, SDSC Network Director~
PMA: During a meeting with him, Vijay expressed interest in the lambda passive monitor. [Ronn Ritke]

Teri Simas, PRAGMA~
AMP & PMA: Assisted her with PRAGMA meeting preparations at SDSC. [Ronn Ritke]

Felix Hernandez-Campos, UNC~
PMA: Approached me about PMA anonymization procedures. I explained the details and pointed him at the most up-to-date toolset. [Jörg Micheel]

UniLink~
PMA: My last meeting with this group, which handles external funding at Waikato, was very productive. [Ronn Ritke]

Pere Barlet, UPC Barcelona~
PMA: Invitation to continue our collaboration via placement of an OC48MON inside the Spanish R&D network. [Jörg Micheel]

Malathi Veeraraghavan, University of Virginia~
PMA: Started an email conversation on the placement of an OC3MON. She responded positively; we need to find a time to have a phone call as a follow-up. [Jörg Micheel]

HCI, University of Waikato~
AMP: The group is interested in working on visual representations of network data after seeing one of our visualizations. [Tony McGregor]

David Wetherall, Washington University~
AMP: Received an email from him asking for geographic location (latitude and longitude) of AMP machines. He wants a ground-truth database of IP addresses. We have the coordinates on our Web pages, but not the individual IP addresses of monitors. I will write a script to get the data out of the system_info database and send it to Tony to pass on to David. [Matthew Luckie]

~ Papers, Presentations, and Conference/Meeting Participation

Klaus gave us a presentation on his real-time work during one of our weekly meetings to great interest and continued comment afterward, including some long email conversations with Jörg.

Initial planning and arrangements for Jörg to give a presentation on 10GigE application software at SC2004 in November. We have arranged with DAST/NCSA to have a slot in their booth. Demonstration requests for the SDSC booth open soon; we will be pursuing that as well.

Ronn completed a set of presentation slides for the AARNet group in September.

Matthew worked on his talk for SIGCOMM Network Troubleshooting. He is also planning an IPv6 paper with Bill Owens of NYSERNET.

~ Documentation, Web Work, Utilization Improvement, Publications

Planning for a new issue of the Network Analysis Times began, including establishing the international theme and choosing the main article: the announcement that the AMP machine at CNIC, Beijing, China is now collecting data. The issue will first be distributed at the PRAGMA7 meeting in September at SDSC. A new NATimes head banner was created that updates the previous look while retaining its recognition factor. Potential images and photos were chosen. Excerpts from the International white paper will also be used, as well as text from the forthcoming press release. A preliminary layout design was done using the new layout program.

We are going for one 11x17 sheet (4 pages). If we print them on the color machine in the SEQ conference room and then have the campus graphics service (called Imprints) fold them, we will not only save money but also be able to have color on the inside. This will make this issue the first to have color on the inside. Traditionally, due to the huge expense of color, we have had only the front and back cover in color (as they are on the same side of one 11x17 sheet), with the rest of the pages in black and white only. (The difference in cost is several hundred dollars for each printing of 200-300.)

We discussed a press release and article about the AMP monitor in China being activated and collecting data. We called Dave Hart at NSF/OLPA to coordinate.

Database Back end Project, utilizing PostgreSQL and Perl scripts ~

  • We accomplished the initial goal of creating a prototype of each type of Citings page (there are three) and now have three scripts which query two tables. The Perl scripts create the htmlm4 content file (which will then be processed like all Web pages).
  • We have created a simplified sample dataset (about 10 citations, three meetings, and two years) of the Citings info to use for testing. Using this and the scripts, we can generate pages which look just like the current ones.
  • Worked on commenting the Perl query scripts.
  • Lana explained to Maureen in depth how the script interfaces with the database. Discussed what is next, including adding the data on which project is cited (AMP, PMA, BGP, or unknown).
  • Thanks to Klaus, who is sharing his experience in database design and development with us, we plan on creating an Entity Relationship Diagram (ERD) for the database (Citings and collaborations). ERDs are a major building block for database design, and developing one will help immensely with this project.

A DHCP server was added on moat for the new NLANR/198.202.123 network.

Enhancements, refinements, and additions continued on the PMA Collection and Use Statistics (Web logs) project, including:

  • After RRDtool was successfully installed on the new PMA server, the existing scripts were fitted onto the new box, tested, and are now working.
  • We have generated graphs for the HTTP/FTP/Combined traffic dating back to September 2001. We also have a running graph of up-to-the-day totals for the current month. In addition there is a script which updates graphs for the amount of data and the number of files each remote monitor transfers in a day; this script also updates nightly.
  • The team met via conference call to discuss the next phase and developed a list of action items to complete the project.
  • Most of the auto-detection has been implemented and creation of the monitors pages has taken place. The script can detect if one of the current monitors has not returned any traffic for the current day; if so, its graph is not displayed on the Web page.
  • Also, the absence of data is now shown as a gap on the graphs as opposed to a zero (as was the case before).
  • A test script to make a current 2004 yearly graph was run, though this has not yet been made into a cron job to update nightly.
  • Changed the layout of the monitors page so that each monitor's graphs are side by side (there are two for each site: daily number of files and total traffic volume).
  • Began making roll-over cron scripts to update the makefile and create the new pages; these are nearly done.
  • Some issues with taking multiple files on the command line and processing them were fixed.
  • Timing issues were finally resolved, and the updates appear on the graphs at midnight local (San Diego) time.
  • Coding for the auto-detect of the monitors so that the script properly generates the right graphs for the right monitors was finished.
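
The gap handling and monitor auto-detection described in the list above can be illustrated with a small Python sketch. It is not the project's actual scripts (which are built around RRD databases); the function names and the sample data layout (a mapping from date to a daily total) are hypothetical and chosen only to show the idea of keeping "no data" distinct from zero.

```python
from datetime import date, timedelta


def daily_series(samples, start, end):
    """Build one (day, value) pair per day from start to end inclusive.

    Days without a sample get None (a gap on the graph) rather than 0.
    """
    series = []
    day = start
    while day <= end:
        series.append((day, samples.get(day)))  # None when the day is missing
        day += timedelta(days=1)
    return series


def monitors_to_display(per_monitor_samples, today):
    """Return, sorted by name, the monitors that reported traffic today."""
    return sorted(
        name
        for name, samples in per_monitor_samples.items()
        if samples.get(today)  # a missing or zero total hides the graph
    )
```

Keeping the missing days as None lets the plotting step draw a break in the line, so an outage is not mistaken for a day with zero traffic.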

Activities of each individual on the project

AMP team

  • Tony McGregor ~
    AMP reimplementation central code; CRCnet problems, inclusion of updated IPMP, and AMPlet code changes; AMPlet email problems; Perl extension; new AMP server; collabs; comments regarding the proposal to create a New Zealand R&E network; working with Jeremy; Cisco support/10GigeAMP.
  • Jeremy Kallstrom ~
    AMP reimplementation programming (IPv6 code in amp; new version of the graphing tool - dgraf).
  • Matthew Luckie ~
    AMP IPMP; AMP IPv6/Scamper.
  • Bud Hale, Jim Hale ~
    new AMP server and AMP servers RAID array; new deployment machine prep; upgrades, troubleshooting, and maintenance on the AMP servers and existing infrastructure.

PMA team

  • Jörg Micheel ~
    potential new deployments and collabs; Xilinx images/ATM headers garbage records; real-time efforts with Klaus; optical networking prototype prep; CRI proposal; NTP support inconsistencies; HPWREN passive traces.
  • Klaus Mochalski ~
    PMA real-time applications: TCP packet loss detection engine; simultaneous traces at all IPLS sites; NTP support inconsistencies; presentation; sharing his knowledge regarding database design with Maureen and Lana.
  • Chris Gross ~
    Web logs project; real-time application.
  • Jim Hale, Bud Hale ~
    serious network problems; HSI/HPSS; S2io 10Gb cards and machine prep for testing; fiberoptic amplifiers; Tel Aviv machine; OC48 card to Endace; problems with new PMA server; new deployment machine prep; upgrades, troubleshooting, and maintenance on the PMA server and existing infrastructure.

MNA, AMP, and PMA Outreach, Documentation, Web work

  • Maureen Curran ~ NATimes; database back end project; write/edit (July report; CRI proposal; Cisco support; slides); Web logs help.
  • Mike Gannis ~ planning press release and article re AMP in China.
  • Lana Kennedy ~ NATimes; database back end project for Web and print (PostgreSQL, Perl); write/edit/compile information (July report).

Management and Administrative

  • Hans-Werner Braun, Ronn Ritke, Tony McGregor, Jörg Micheel ~
    Weekly NLANR/MNA managers conference calls.
  • Hans-Werner Braun ~ DHCP server; HPWREN measurements; NTP support inconsistencies; meeting with Vijay Samalam.
  • Ronn Ritke ~ planning press release and article re AMP in China (and for Network Analysis Times); collabs; SC2004 prep; Cisco support; CRI proposal; budget.

Note:   Lana Kennedy, the student writer, will be leaving NLANR in order to concentrate on her UCSD studies this fall (her last day will be Sept. 10). I wish Lana great good luck in school and in the future and will miss her tremendously. She is a very quick learner and a pleasure to work with. We (all) will miss her excellent, thorough work. [MCC]

- 30 -


Top       2004 Oct 20       NLANR/MNA home page
