
NLANR/MNA Activities Summary - March/April 2003

Highlights

Work progressed on the new testing architecture for AMP on several fronts, including the creation of a network of four machines to use as a test environment and significant work on the new measurement daemon. An architecture for measured (the measurement daemon) was drawn up and planning for the implementation began. A fair amount of this effort is going into setting up the infrastructure for the whole of the reimplementation project. The ICMP and traceroute tests are nearly complete (the unit tests for them still need to be written).

This period we shipped our first PMA GigE measurement monitor, to the National Center for Atmospheric Research (NCAR) site. We previously had an OC3 monitor deployed there, but NCAR has stopped using the OC3 connection. An OC12 monitor was also shipped and installed at the Front Range GigaPop site in Colorado. At the end of the period, these machines were available for testing and were expected to be collecting traces shortly.

Our extensive efforts in hosting PAM2003, the Passive and Active Measurement Workshop (April 2003), resulted in a very successful and well-received meeting. The impact on the research community was clear, and the meeting provided many excellent opportunities to develop and continue collaborations, as well as to showcase our efforts. It was well attended by the HPC network measurement and research community, including many students. One goal we had set for PAM2003 was to add some support for students and encourage student participation. These efforts were very successful: of 83 PAM attendees, 40 were students, and 12 of the accepted papers listed a student as the first author.

Groundwork began for the development of real-time analysis of Passive Measurement and Analysis (PMA) data. This included planning and stepping out a method for developing a program in C to do real-time analysis, and backporting Dag API support to the Dag3.2 cards. The backporting was done in order to turn the SDA (SDSC Abilene) PMA machine into a real-time monitoring system. It was successful, and a 12 hour (night time) trace file was taken.

Work continued on the IPv6 address list (to trace the forward path to all the IPv6 prefixes). The goal is to compose an address list that is comprehensive in that it will have an IP address from each prefix. Scamper (the traceroute daemon) gained support for stopping a trace to an address when it detects a loop.
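The loop check can be illustrated with a short sketch. This is not Scamper's code, which this report does not show; the function name and the rule for what counts as a loop are assumptions made for illustration. Here a trace is considered to loop once a responding address reappears after some other address has answered in between.

```python
def find_loop(hops):
    """Return the index at which a loop is detected, or None.

    `hops` is a list of responding addresses in TTL order; None marks a
    hop that did not respond. A repeat of the immediately preceding
    responder is NOT treated as a loop (an assumption), because one
    router can answer two consecutive TTLs.
    """
    seen = set()
    prev = None
    for i, addr in enumerate(hops):
        if addr is None:
            continue  # unresponsive hop: nothing to compare
        if addr in seen and addr != prev:
            return i  # address came back after a different hop: loop
        seen.add(addr)
        prev = addr
    return None


# Example with documentation-prefix addresses (hypothetical path).
path = ["2001:db8::1", "2001:db8::2", "2001:db8::3", "2001:db8::2"]
loop_at = find_loop(path)
```

A tracer using such a check would stop probing the destination as soon as `find_loop` returns an index, rather than continuing to the maximum TTL.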

Bill Owens (NYSERNET) and Joe St Sauver (U. of Oregon) continued their close collaboration with us on the IPv6 measurement and analysis mesh that we have created. At IETF 56, Matthew Luckie met with several IPv6 researchers, which he found very helpful. While there, he ran traceroute code on the IPv6 address list that he had constructed prior to the meeting and discovered that the list is a good start, but not comprehensive.

Scamper was run on all IPv6 AMP machines from amp-nysernet. (Scamper runs traceroutes in parallel to a list of IPv4 and IPv6 addresses.) A tool named "groupie" was written that, given a list of IPv4 / IPv6 addresses, identifies which addresses belong to the same router. The plan is to take multiple Scamper views of Abilene's IPv6 network, run the output through groupie, and then generate topological maps dynamically with graphviz (as is done with the other visualizations at http://watt.nlanr.net/active/cgi-bin/v6_portal.cgi).
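A minimal sketch of that pipeline follows, under stated assumptions: groupie's real output format is not described in this report, so a plain dictionary from interface address to router label stands in for it, and the function name `paths_to_dot` is hypothetical. The output is text in Graphviz's DOT language, which a layout tool such as `dot` or `neato` can render.

```python
def paths_to_dot(paths, router_of):
    """Build a Graphviz DOT graph from traceroute paths.

    paths:     list of hop-address lists, one list per trace.
    router_of: dict mapping an interface address to a router label
               (stand-in for groupie's grouping; format assumed).
    Addresses with no entry are treated as their own router.
    """
    edges = set()
    for path in paths:
        routers = [router_of.get(addr, addr) for addr in path]
        for a, b in zip(routers, routers[1:]):
            if a != b:  # consecutive hops on the same router add no link
                edges.add(tuple(sorted((a, b))))
    lines = ["graph topology {"]
    for a, b in sorted(edges):
        lines.append('    "%s" -- "%s";' % (a, b))
    lines.append("}")
    return "\n".join(lines)


# Two views that reach the same router through different interfaces.
views = [["2001:db8::1", "2001:db8::2"], ["2001:db8::a", "2001:db8::2"]]
aliases = {"2001:db8::1": "r1", "2001:db8::a": "r1", "2001:db8::2": "r2"}
dot_text = paths_to_dot(views, aliases)
```

Merging the views before drawing is what keeps one physical router from showing up as several nodes in the map.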

Development of IPMP continued this period with work on ipmp_pathchar.  It is an ncurses application that works to estimate the serialization rate of each link on the forward and reverse paths (shows a list of segments and the speed in kbps that it sees, reporting results in "real time"). By the end of the period it appeared to be working quite well and we experimented with it on the CRCnet wireless network. An investigation into ways to characterize the L2 devices of a capacity limiting link with IPMP was begun.  

Other IPMP activities included additional work on the Linux and FreeBSD kernel implementation and the addition of the ability to specify the 8 TOS bits in the IP header for IPMP echo packets, which at least one AMP site might find useful.

Jon Bennett, who has developed a strong interest in our IPMP protocol, had the discussion moved from the IETF IPPM working group to the TSVWG working group. Matthew attended the session and spoke from our perspective, which differs significantly from Jon's on some points.

As reported previously, eight new AMP monitors were shipped last period. Work took place during this period to install these monitors, bring them online, and start them collecting data. Of particular interest and strategic value are new international sites (amp-hutf: HUT network in Finland, amp-hean: HEANet in Dublin, Ireland, and amp-unin: UNINet Network GigaPop in Bangkok, Thailand) and new deployments at GigaPop sites in the US (amp-gpng: Great Plains Network GigaPop and amp-wisn: U. of Wisconsin Network GigaPop). Helsinki U. of Tech. (HUT network in Finland) is the first site to be in the "international only" mesh (HEANet in Ireland will be the second). Work continues to bring some of these sites online (problems arose, including anomalies with the system manager process, and are being investigated and corrected). The placement of AMP monitors at GigaPops is significant because it allows traffic performance over backbone segments between GigaPOPs to be compared with end-to-end performance between hosts using those same segments.

Requests for passive (PMA) monitors were received from two strategically interesting sites:  KISTI site (Korean Institute of Science and Technology) in Taejon, Korea and the AMPATH GigaPOP connection for the Florida International University (FIU) in Miami.  Passive monitors were prepared (using the newly arrived latest version of the Endace software) and shipped to each site.

A new flag (status) was added to the AMP database which indicates that a monitor is no longer active. Setting status to disabled means that no tests are done to the monitor, but the data is still available in the Web pages.  

Documentation on starting up a new monitor was drafted. As a result, some thought is being given to ways in which most of the process can be automated. While the build_XXX.list script was being debugged, it was updated to create the symbolic link in the system manager from the confirmation directory to the amp-name.

Several new requests for AMP monitors have been received and are in various stages of processing, including:  AMPATH network (Brazil end), University of Mexico, DFN network (Germany).  A number of existing AMP sites received attention during this period, but the mesh of 130+ machines is working very well.

Work progressed with regard to planning for the deployment of an additional OC192MON.  

The effort to modernize and mature the current implementation of the Cichlid 3-D Visualization System continued with work on the design and implementation of a multithreaded network protocol and development of a new class hierarchy. The idea of creating a Cichlid Web browser plug-in was investigated. This may be feasible as an ActiveX control, but probably not as a Netscape plugin; the idea has been shelved for the time being.

The Web server (Moat) was replaced, and now has two 120GB hard disks in about 1/8th of its former chassis size. The IPMP homepage on AMP was updated with the new draft of the protocol. Some user-initiated additions were made to the AMP throughput page (http://watt.nlanr.net/active/cgi-bin/tput-request.cgi). Several bugs were fixed: a bug in the script for weekly AMP graphs, a file name bug in Mozilla, and a bug in the Web interface used to extract data from the HPSS. Discussions were held, and plans made, regarding several Web pages, including those for the PMA data collection and download displays and the Citings and Collaborations pages.

A redesign of the AMP map of remote sites (for both static publication and use on the Web pages) was begun. After selection of the background map style, the map rendering code was written and is being tested. The map is populated from the latitudes and longitudes of site locations, which are overlaid on a static topographical map. An important feature of the new AMP map is that clicking on a site will bring up that site's data page. A Cichlid visualization of PMA site activity was created.
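Placing a site on a static background image comes down to converting its latitude and longitude to pixel coordinates. The sketch below assumes the background is an equirectangular (plate carrée) image whose edges match a known bounding box. The report does not say which projection the AMP map uses, so the projection, the function name and the parameters are all assumptions.

```python
def latlon_to_pixel(lat, lon, width, height,
                    west=-180.0, east=180.0, south=-90.0, north=90.0):
    """Map latitude/longitude (degrees) to (x, y) pixel coordinates.

    Assumes an equirectangular background image covering the given
    bounding box, with pixel (0, 0) at the top-left corner.
    """
    x = (lon - west) / (east - west) * (width - 1)
    y = (north - lat) / (north - south) * (height - 1)
    return round(x), round(y)


# Centre of a whole-world 361 x 181 pixel image.
centre = latlon_to_pixel(0.0, 0.0, 361, 181)
```

The same pixel coordinates could also serve as the clickable region for each site, so that the marker and the link to the site's data page stay aligned.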

"NLANR Holds 'Very Successful' Workshop on Passive and Active Network Measurement" was published in Online, Vol 7 (8), April 16, 2003.  Online is the SDSC/NPACI biweekly newsletter ( http://www.npaci.edu/online/v7.8/pam2003.html).  Both the AMP and PMA posters were updated (and displayed at PAM2003).  

Work continued on the AMP servers, including the creation of Ohm (which we're using to migrate AMP and Volt to FreeBSD 4.7-RELEASE and to take care of some other issues at the same time).

HPWREN measurement and analysis activities included significant improvements to network reports scripts and improved visual representation and enhanced descriptions on the Web pages.  There were additional miscellaneous background system upgrades/support.

The 3-D Anemometer data was packaged and posted with a related README file. It is almost 5GB total, and is available in about 35MB daily sections (http://stat.hpwren.ucsd.edu/MtLaguna_3DAnemometer/).  

A review of the wind data from the Mt. Laguna Observatory was performed because both the mechanical anemometer and the solid state one have been lost. The 10 samples per second data from the solid state instrument revealed times when quite violent wind exceeded the 90 MPH the instrument was built for. These short bursts of excessive wind, combined with significant and heavy ice buildup at times, explain the loss of these instruments.

The NLANR.NET domain has been renewed for several years.

Meetings regarding collaborative efforts (both continuing and developing) were held with many people during the period; please see the details section for more information.



Details

The following details are taken not quite verbatim (a touch of editing is performed) from the NLANR/MNA weekly reports, which are written by each member of the team. Items within each subsection are in chronological order (going forward).


Passive Measurement and Analysis (PMA)

~ ~ ~ Real-Time analysis, development of new metrics

I've mainly been working on backporting DAG API support to the Dag3.2 cards in order to turn the SDA (SDSC Abilene) machine into a realtime monitoring system. Succeeded with that and took a 12 hour trace file during the night from Wednesday to Thursday. Learned that only one of the cards has GPS/PPS support and have asked Bud to have a look at that. [Jörg Micheel]

Worked with Chris on what we want to do, and how we want to proceed, with the realtime analysis work and display. I think it is fair to say that, on both counts, we cleared up a lot of misunderstandings and confusion. [Jörg Micheel]

Today, Friday, I spent time with Joerg understanding what he wants me to do regarding PMA and real time analysis.   [Chris Gross]

I also started thinking once again about the different types of metrics I could obtain from these trace files.  [Chris Gross]

The rest of the week was spent learning about network programming in C. I feel I have gotten quite far, and I know more about both C and network programming in general.  [Chris Gross]

I am also trying to plan and step out a method for developing a program in C to do real time analysis, as I have discussed with Joerg. All in all a very productive week.  [Chris Gross]

This week, I finished up my client/server application program in C. It ran well, but when I changed it from TCP to multicast, I somehow broke the loop and now it runs forever. Hopefully I will be able to work that out.  [Chris Gross]

I have also been practicing File I/O in C to get files in columnar form. All of this is in anticipation of setting up a real time analysis system for PMA.  [Chris Gross]

However, I did, with Todd's help, fix the bug in my program and at the same time learn quite a lot about network programming in general.  [Chris Gross]

~ ~ ~ GigE

This week we shipped a new PMA machine. It is the first GigE machine. It uses the Endace GigE card. The site is nai-p-nca (National Center for Atmospheric Research). It replaces the OC3 monitor there that was taken down because the site simply stopped using the OC3 connection. I expect the machine to be placed online next week. A signal splitter for the GigE connection is on order. <For more information, please see NCAR activities below.>  [Bud Hale]

~ ~ ~

On the plate is work on the strategic positioning of the PMA project. [Jörg Micheel]

Follow-up on the OC192MON purchase and justification with Ronn.  [Jörg Micheel]

On deck are a number of activities to synchronize PMA monitor installation work with Jim and Bud Hale.  [Jörg Micheel]

While Joerg was visiting this week we were able to closely examine the time distribution unit supplied by Endace for the Dag3.x cards.  We determined that it was probably not possible to use the unit for both PMA machines and AMP machines. So we were able to plan some possible "work around" options for that situation. So at this time the nai-p-sda (San Diego Supercomputer Abilene) monitor is receiving the GPS derived timing signal from the time distribution unit. [Bud Hale]

Another important part of Joerg's visit is that Jim now has a later version of the Dag software for new PMA machine installations. [Bud Hale]

Received the new ENDACE software from Joerg. I've already begun installing it in machines for immediate use.  [Jim Hale]

I also spent time with Joerg and Bud in the machine room learning about the monitors, DAG cards, and time stamping that PMA is using.  [Chris Gross]

This week I spent much of my time understanding and playing with tsh files. I took some traces I had locally on my hard drive, ran a few scripts on them, and saw the output.  [Chris Gross]

I also spent time learning more about Perl and writing numerous test programs to become acquainted with various aspects of Perl that I will need in order to process trace files.  [Chris Gross]

I've begun making use of a small Linksys router with NAT to configure machines going to remote locations, so I can configure them using the address they'll be using at their final destination. [Jim Hale]

~ ~ ~ Trace Sampling project

This week I've been going over the data analysis that I collected last week.  I had my scripts run an analysis on all the data for the month of February, and they produced graphs and numerical analysis for each day.  The graphs showed the sampling accuracy for the day, and the numerical analysis was the average and standard deviation of sampling accuracy for that day.  Those numbers, as well as the size of the sample, are stored in a text file.   [Justin Fields]

This week I wrote a program that looks through these daily summaries and plots accuracy and sampling size versus time.  These graphs have two y axes, one describing the accuracy of the sampled analysis, and the other describing the amount of data there was to sample. Both are plotted against time on the x axis.  I haven't looked over many of the graphs I've produced yet, as I'm still working on perfecting the way I want the graph to look, but I think they'll help show exactly when sampling accuracy breaks down. [Justin Fields]

This week I made more progress with my sampling research.  I've got my newest graph looking good now, and I feel that it will help show relations between sampling rate, sample size, and accuracy. I've produced graphs for data sampled every 4th and 8th packet. [Justin Fields]
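The kind of analysis described in these entries can be sketched as follows. The report does not define "sampling accuracy", so the definition used here (the ratio of the sampled mean packet size to the full-trace mean) is an assumption chosen only for illustration, as are the function names.

```python
import statistics


def systematic_sample(packets, n, offset=0):
    """Keep every n-th packet, starting at index `offset`."""
    return packets[offset::n]


def sampling_accuracy(packet_sizes, n):
    """Ratio of the sampled mean packet size to the full-trace mean.

    1.0 means the sample reproduces the full-trace mean exactly.
    This definition of accuracy is an assumption for illustration.
    """
    sample = systematic_sample(packet_sizes, n)
    return statistics.mean(sample) / statistics.mean(packet_sizes)


def summarize(accuracies):
    """Average and (population) standard deviation for one day."""
    return statistics.mean(accuracies), statistics.pstdev(accuracies)


# Hypothetical day of three traces, sampled every 4th packet.
traces = [[40, 1500, 576, 40] * 25, [1500] * 80, [40, 40, 576, 1500] * 10]
daily = [sampling_accuracy(t, 4) for t in traces]
day_mean, day_stdev = summarize(daily)
```

Taking every n-th packet from a fixed starting point can line up with periodic patterns in the traffic, which is one reason to compare several sampling rates and sample sizes rather than relying on one.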

This week I've been reading up on sampling literature.  Right now I'm working through a text book on the subject, one I've found referenced in many of the sampling papers.   [Justin Fields]

I was running another analysis of sampling efficiency for the month of February at a sampling rate of 16 when I ran into a hardware problem.  My power supply burnt out.  Bud replaced it for me on Thursday.  I started up my analysis again, but am waiting for the results.   [Justin Fields]

This week I found a glitch in the way I was collecting data the last couple of weeks.  The different sampling rates that I had collected had different time frames.  I've fixed that, and ran my scripts to analyze the month of February for four different sampling rates.  I think the results from this will give me a clear picture of where sampling rates start to fail statistically, and for what size of data sets.  I plan on collecting all of the graphs together and writing up a small description of my procedures and preliminary analysis of the results.   [Justin Fields]

I also met with Chris and Cooper this week to talk about where we might have overlapped on PMA work.  I believe the main decision of that meeting was to have Chris organize and direct any PMA contributions we might have, as PMA is his main focus.   [Justin Fields]

Justin Fields is now using a fixed set of PMA data for the sampling research; this will simplify things while he determines what analysis to conduct and what data to graph. [Ronn Ritke]

Meeting with Justin about the sampling research. We talked about a new graph he had in mind. He emailed the new graph and we will discuss it next week. [Ronn Ritke]

Justin has copies of some sampling papers (a couple I gave him) and wants to do a literature search for additional papers on network sampling. [Ronn Ritke]

~ ~ ~ new deployments

This week we received the PMA machine request from Manhee Lee for the KISTI connection. That machine is in final preparation and will be shipped early next week.  [Bud Hale]

Also we received the request for the PMA machine to go into the AMPATH GigaPOP connection for the Florida International University. That machine is also in preparation.  [Bud Hale]

New passive monitors are being readied for shipment to the AMPATH GigaPop at the Florida International University (FIU) and the KISTI GigaPop in Korea.  [Bud Hale]

Since we just received the network addressing data we've been waiting for from Manhee Lee at KISTI, this was the first machine to receive the new software and will be ready for delivery early next week. [Jim Hale]

I'm also preparing the machine for AMPATH. It too received the new ENDACE software and will be ready for delivery early next week. [Jim Hale]

This week two new PMA monitors were prepared and are in shipment. One is for the AMPATH GigaPOP in Miami. The other is for the KISTI site (Korean Institute of Science and Technology) in Taejon, Korea. Jim installed the latest version of the Dag software from Endace on both machines. This software was delivered to us by Joerg while visiting for the PAM2003 conference.  [Bud Hale]

Continuing to work on the monitor for the AMPATH GigaPop at the Florida International University. Also working on monitor for the KISTI GigaPop in Korea. [Jim Hale]

Completed and tested the PMA monitor scheduled for AMPATH. KISTI machine completed, tested and ready to ship. In both cases the new Endace software Joerg delivered was used. [Jim Hale]

PMA existing sites, maintenance and troubleshooting:  

A number of conversations with PMA monitor sites, initiated by Bud. Discussions with Indianapolis with respect to moving all or some of the new gear to the new Abilene POP at the Qwest location.  Have yet to answer an email from Matt Zekauskas.  [Jörg Micheel]

We are working on FLA, MAX, TAU, NCA monitors at present.  [Jörg Micheel]

~ ~ ~ BUF

Also worked on the BUF monitor, which was collecting empty trace files. Could not really locate any particular issue, when using the system in manual mode, it would collect properly. Waiting for more data to come through. [Jörg Micheel]

I spent the previous weekend and the beginning of the week on BUF and MRA, which had been producing empty trace files. I learned that there must be some form of trace condition: both systems started working normally after a manual trace run, which completed successfully. [Jörg Micheel]

~ ~ ~ BWY (Columbia University)

Progress on the BWY machine (Columbia University): it turned out to be a rather silly problem, as the fibres were connected to the wrong port on the Dag3.2 cards. Now resolved, and the monitor is back in the flock.   [Jörg Micheel]

BWY (Columbia University) has been fixed after we learned that the site had been using the wrong port on the dual-SC connector. New procedures (key locking) are being considered for upcoming shipments. [Jörg Micheel]

~ ~ ~ FLA nai-p-fla (U. of Fla., Gainesville)

Another PMA site receiving attention this week was the nai-p-fla (U. of Fla., Gainesville) site.  An anomaly occurred when Joerg installed some software upgrade. When he rebooted the machine it failed to restart the sshd. However I worked with site people to get the sshd running and the site is again reachable. Joerg has been notified that it is again reachable and will make the corrections. [Bud Hale]

~ ~ ~ FRG (Front Range GigaPop)

The FRG unit is installed at the gigapop in Denver but the site technician has been unable to travel to the site to troubleshoot the OC connection.  [Bud Hale]

Other news from the NCAR/FRG area is that the site technician there traveled to the nai-p-FRG (Front Range GigaPop) site on Friday to work on the OC connection for that new machine. I expect more to report on that next week.  [Bud Hale]
 
I'm currently working to get 3 recently shipped machines going. Those are the TAU (Tel Aviv U.), FRG (Front Range GigaPOP) and NCA (National Center for Atmospheric Research GigE) machines.  [Bud Hale]

Scot <Colburn at NCA> is also the one to get the Front Range GigaPOP machine connected. He needs to do some power level measurements and check his single mode connections through the signal tap. This work requires an all day trip from Boulder to Denver. He has not been able to work his schedule to get that done as yet. He knows we are anxious. [Bud Hale]

Also I discussed with him the OC12 monitor at the Front Range GigaPOP and his recent power signal level measurements. He said he will schedule a visit to the GigaPOP in Denver early next week to further diagnose the connection. [Bud Hale]

Other PMA news to report is that Scot Colburn has completed connections of the GigE monitor at the NCAR (National Center for Atmospheric Research) and the new machine at the Front Range GigaPop. At this time those two machines will be available for test and are expected to be collecting traces soon. [Bud Hale]

~ ~ ~ National Center for Atmospheric Research (NCAR)

Otherwise on PMA, Jim has prepared a GigE PMA monitor for the National Center for Atmospheric Research (nai-p-nca). As mentioned before the OC3 connection there was removed. All that is needed is some additional information about the connection hardware needed and the unit will be shipped. [Bud Hale]

Put in a couple of days assembling the GigE PMA monitor for the National Center for Atmospheric Research (nai-p-nca). Ronn showed a lot of interest in rolling up his sleeves and digging into this machine with me. With his schedule, he couldn't squeeze out any time.  The hardware is assembled, the software is installed and the configurations are nearly complete. I expect to have this machine shipping out first thing Monday Morning. [Jim Hale]

Finished the last minute configuration of the National Center for Atmospheric Research GigE PMA machine. Got that shipped out early this week. We received the data we needed and ordered the GigE splitter from NetOptics. [Jim Hale]

This week we shipped a new PMA machine. It is the first GigE machine. It uses the Endace GigE card. The site is nai-p-nca (National Center for Atmospheric Research). It replaces the OC3 monitor there that was taken down because the site simply stopped using the OC3 connection. I expect the machine to be placed online next week. A signal splitter for the GigE connection is on order.   [Bud Hale]

The latest PMA machine deployed was a GigE PMA monitor to the National Center for Atmospheric Research. That machine is on site and being installed. I shipped a GigE splitter to the site on Thursday and the site is expected to go online soon.  [Bud Hale]

I'm currently working to get 3 recently shipped machines going. Those are the TAU (Tel Aviv U.), FRG (Front Range GigaPOP) and NCA (National Center for Atmospheric Research GigE) machines.  [Bud Hale]

Scot Colburn at NCA has the new GigE machine in the rack and is getting it connected. I shipped the GigE splitter at the end of last week.  [Bud Hale]

Continuing to work with Scot Colburn at NCAR (National Center for Atmospheric Research) to connect the new GigE machine there. We now have it connected to the Ethernet connection but not yet connected to the fiber GigE connection. [Bud Hale]

Another important development on PMA connections is that Scot Colburn of the National Center for Atmospheric Research was able to make some corrections to the fiber connections to the Front Range GigaPOP OC12 monitor.   [Bud Hale]

Other PMA news to report is that Scot Colburn has completed connections of the GigE monitor at the NCAR (National Center for Atmospheric Research) and the new machine at the Front Range GigaPop. At this time those two machines will be available for test and are expected to be collecting traces soon. [Bud Hale]

~ ~ ~ Rice University

The OC3 monitor at Rice University appears to have an OS hang at this time. I've sent a message to the site technician requesting that it be power cycled. [Bud Hale]

~ ~ ~ SDSC Abilene (SDA) system

Also worked on the SDSC Abilene (SDA) system, which is connected just fine. The goal here is to make it non-standard and prepare for realtime monitoring. I am intending to upgrade it to the latest dag software release with API support, it presently runs under 2.4.4. [Jörg Micheel]

I've mainly been working on backporting DAG API support to the Dag3.2 cards in order to turn the SDA (SDSC Abilene) machine into a realtime monitoring system. Succeeded with that and took a 12 hour trace file during the night from Wednesday to Thursday. Learned that only one of the cards has GPS/PPS support and have asked Bud to have a look at that. [Jörg Micheel]

In the late afternoon I worked with Bud and Chris to understand the problems we are having connecting the PPS signal to the SDA monitor. We got quite a bit further and could prove a few points, but we can't quite get it to work without more effort, and perhaps the help of ENS to hook to their GPS unit. [Jörg Micheel]

~ ~ ~ TAU (Tel Aviv U.)

The TAU machine appears to still be tied up in the Israel Customs Department.  [Bud Hale]

The other site of interest is the nai-p-tau (Tel Aviv U.) site. The replaced unit was received back here at SDSC. This indicates that the replacement unit is out of Israeli Customs and at the Tel Aviv U. site. Hopefully there will be more to report next week.  [Bud Hale]

I received a machine from Tel Aviv University. From that I deduce that the nai-p-tau machine has finally cleared customs, since the box we received once contained the new nai-p-tau machine. [Jim Hale]

I'm currently working to get 3 recently shipped machines going. Those are the TAU (Tel Aviv U.), FRG (Front Range GigaPOP) and NCA (National Center for Atmospheric Research GigE) machines.  [Bud Hale]

Inclusive list of sysadmin tasks

Active Measurement Project (AMP)

~ ~ ~ New testing architecture

I have a network of 4 machines at home that I hope to use as test AMPlets here.  Unfortunately, I'm having trouble getting FreeBSD to boot on them.  After a range of odd errors I discovered that FreeBSD's network driver for the ADMtek 985B kills the MAC address in the chip.  So they were all trying to use the same address 07:00:07:00:07:00.  I loaded Windows and Linux and they both work OK (Windows restores the MAC address even if it's been junked and Linux just leaves it alone).  So I upgraded the BIOS on the boards and attempted to upgrade the EEPROM on the ADM985.  The BIOS upgrades went fine (but didn't help, as was probably to be expected). Unfortunately, the ADM985 update killed the chip on the first machine I tried.  I'm not sure what went wrong (I followed the manufacturer's instructions carefully and know someone who's done a whole lot of other ones).  So it looks like I'll have to get a separate network card for that one anyway.  Not quite sure where to go next (and would appreciate any suggestions). [Tony McGregor]

Lots of setting up Linux and FreeBSD on amplets. Figured out how to make dhcp set the MAC address before requesting an IP address (using dhclient-enter-hooks) to work around the MAC address problem FreeBSD has with these machines.  To swap to Linux from FreeBSD requires a boot to Windows in between.  Swapped drives between two of my test amplets to confirm one was bad.  Sent the bad drive for replacement. Reinstalled Linux and FreeBSD on several of them and checked transferring a FreeBSD and Windows image from and to the amplets. [Tony McGregor]

Drew up an architecture for measured (the measurement daemon) and started to plan the implementation. [Tony McGregor]

I started setting up for the new amplet distribution in the latter part of this week.  I've mostly been learning the GNU tools for making portable distributions (automake and autoconf, and the related components), and also CVS, which I've only used a little in the past. I've been writing code for the measurement daemon as the first component. [Tony McGregor]

I spent most of 2 days this week working on the new measurement daemon.  I have about 870 lines of code at this point.  Progress is slower than I'd have liked, but I think that is mostly because I'm doing a lot of setting up the infrastructure for the whole of the reimplementation project. [Tony McGregor]

I got about a day and a half on the measurement daemon done this week (less than I'd have liked).  Mostly I was working on the unit testing architecture and fixing bugs that it found.  I've just started working on the threads code.  I also did my first major CVS update, checked that out into a new directory, and went through adding all the files I'd failed to cvs add. [Tony McGregor]

I got some more work done on the measurement daemon this week, including on my way home.  I have the thread code done and have been working on the unit tests for it.  I currently have about 2700 lines of code. [Tony McGregor]

My main work this week was on the measurement daemon.  I'm very pleased with my progress, having mostly completed the icmp and traceroute tests.  I still need to write the unit tests for these tests.  [Tony McGregor]

The next step is the icmp test; then the measurement daemon will be mostly done. The major exceptions are the interface to the data transfer code (the replacement for am_master and am_slave), which has still to be written, and the documentation.  I'm guessing another 3 weeks or so should see it done, then I'll move onto the data transfer code. [Tony McGregor]

~ ~ ~ IPMP

I did some more work on ipmp_pathchar.  It is an ncurses application that works to estimate the serialisation rate of each link on the forward and reverse paths. [Matthew Luckie]

I did some work on the way I incrementally update the IPMP checksum, to take account of the fact that it is now a much simpler process than it was before the recent changes to the protocol that were published in our Internet Draft. [Matthew Luckie]
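For readers unfamiliar with incremental checksum updates, here is a minimal sketch. It assumes a standard 16-bit one's complement Internet checksum and uses the update rule from RFC 1624 (equation 3); the report does not give the details of IPMP's checksum handling, so this is an illustration of the general technique rather than the IPMP code.

```python
def _fold(total):
    """Fold carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(data):
    """Full Internet checksum over `data` (bytes)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return ~_fold(total) & 0xFFFF


def incremental_update(old_cksum, old_word, new_word):
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m').

    Recomputes the checksum after one 16-bit word changes from
    `old_word` to `new_word`, without re-summing the packet.
    """
    total = (~old_cksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~_fold(total) & 0xFFFF


# Hypothetical 8-byte message in which the word 0x1234 becomes 0xabcd.
before = bytes([0x45, 0x00, 0x00, 0x1C, 0x12, 0x34, 0x00, 0x01])
after = bytes([0x45, 0x00, 0x00, 0x1C, 0xAB, 0xCD, 0x00, 0x01])
updated = incremental_update(checksum(before), 0x1234, 0xABCD)
```

The appeal for a measurement protocol is that a node which rewrites a small field in a packet only has to touch the words it changed.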

Someone from Korea <Heonkyu Park, see Collaborations> posted to IPPM asking if there was a protocol that would allow them to identify each router in a path without sending a multitude of packets.  We pointed them at IPMP.  Someone else asked about implementation status, I said that we've got Linux and BSD implementations, and that I'd update my webpage with these more recent versions. [Matthew Luckie]

I did some more work on the Linux and FreeBSD kernel implementation of IPMP, and some more work on the ipmp_pathchar.  I have not deployed this kernel in the field yet, but have it deployed amongst a small group of machines in the lab which will be useful to test and debug my pathchar code with. [Matthew Luckie]

Jon Bennett has put IPMP on the agenda for the transport area working group of IETF.  Unfortunately, we didn't find out until late in the week.  Matthew has gone to IETF to make sure our perspective is presented. [Tony McGregor]

Jon Bennett is presenting his draft of IPMP to TSVWG (a transport area working group).  I can't see the relevance of the protocol to that group, but I'm going to IETF next week to raise the profile of IPMP in that group, and to point people at our draft.  I think it is important to communicate the different design goals that we have, mainly our desire to strip anything that is unnecessary from the protocol. [Matthew Luckie]

I worked on ipmp_pathchar while I was idling in some working groups.
sudo ./ipmp_pathchar -4 192.168.2.2
192.168.1.2 -> 192.168.1.1:  90464312
192.168.1.1 -> 192.168.2.2:  90234490
192.168.2.2 -> 192.168.2.1:  67957079
192.168.2.1 -> 192.168.1.2: 101486558
Each link is 100 Mbit Ethernet, connected directly with a crossover cable.  I'm having problems doing math with big numbers, so I'm using a long double to calculate the serialisation rates.  The right hand column presents the serialisation rate that my ipmp_pathchar estimates with two packets (a small one and a large one).  They're in the right ballpark, except for hop 3, which is at the packet turn-around. [Matthew Luckie]
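The two-packet estimate can be sketched as follows. This is an assumption about the arithmetic rather than the ipmp_pathchar source: it supposes that the per-link transit time of each probe is available in nanoseconds (for example from the timestamps at each side of the link) and that the extra time taken by the larger packet is pure serialisation delay. The function name and parameters are hypothetical.

```python
def serialisation_rate_bps(small_bytes, small_ns, large_bytes, large_ns):
    """Estimate a link's serialisation rate in bits per second.

    small_ns and large_ns are the transit times (nanoseconds) of the small
    and large probe across the same link.
    """
    extra_bits = (large_bytes - small_bytes) * 8
    extra_ns = large_ns - small_ns
    if extra_bits <= 0 or extra_ns <= 0:
        raise ValueError("large probe must be bigger and slower than small probe")
    # Multiply before dividing so the integer part stays exact.
    return extra_bits * 1_000_000_000 / extra_ns


# Usage: a 64-byte and a 1500-byte probe across a 100 Mbit/s link.
rate = serialisation_rate_bps(64, 10_000, 1500, 124_880)
```

Keeping the timestamps as integers until the final division avoids most of the precision trouble with large values.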

Work continues on ipmp_pathchar.  I added a warmup stage to the code so that it can send appropriate minimum-sized packets that still provide enough room for every node in the path, and also find the Path MTU for sending appropriate maximum-sized packets. [Matthew Luckie]

I added a significant number of sanity checks to the code to make sure that I can handle the case where there are multiple IP paths that a packet can take from a source to a destination. [Matthew Luckie]

I spent nearly all my time this week getting ipmp_pathchar into a runnable state.  As I've said before, it is an ncurses application that shows a list of segments and the speed in kbps that it sees, reporting results in `real time' even though that might not be appropriate.  Next week's work on the code will be to implement some kind of statistical selection of the mode(s), as pathrate does.  I'm going to give some thought to how to go about dynamically selecting an appropriately sized bin based on the range of bandwidth estimations I've made.  I'm sure there has been work done in this area (perhaps not by network measurement people) so I'll look for that first. [Matthew Luckie]

I led the Waikato Applied Network Dynamics (WAND) group Friday Student Meeting yesterday with a discussion on how the pathrate capacity measurement tool works.  We decided that we will continue on the bandwidth estimation theme for a few more student meetings, where I'll talk about pathload / pathchar, and the applications that IPMP has to bandwidth estimation. [Matthew Luckie]

More work on ipmp_pathchar, it seems to be working quite well.  I wrote some code to automatically select an appropriate bucket size (resolution) of the dataset.  Nothing particularly brilliant - I take the 2nd and 98th percentile, and currently divide the range up into 20 buckets.  I've got a data set that I'm working with to investigate other bucket selection methods.  Late in the week, I got a new kernel implementation out on CRCnet, so I'm going to be getting some data sets from that next week. [Matthew Luckie]
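A minimal sketch of that bucket selection, assuming a nearest-rank percentile and that samples outside the 2nd to 98th percentile range are simply left out of the histogram (the report does not say how outliers are treated). The function names are hypothetical.

```python
def nearest_rank(ordered, pct):
    """Return the pct-th percentile of a sorted list (nearest-rank method)."""
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def select_buckets(samples, low_pct=2, high_pct=98, buckets=20):
    """Return (low, high, width) for a histogram of bandwidth estimates."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    low = nearest_rank(ordered, low_pct)
    high = nearest_rank(ordered, high_pct)
    return low, high, (high - low) / buckets


def histogram(samples, buckets=20):
    """Count the samples that fall into each bucket between the percentiles."""
    low, high, width = select_buckets(samples, buckets=buckets)
    counts = [0] * buckets
    for value in samples:
        if low <= value <= high:
            # The top edge belongs to the last bucket.
            index = 0 if width == 0 else min(int((value - low) / width), buckets - 1)
            counts[index] += 1
    return low, width, counts
```

The modal bucket is then the index of the largest entry in `counts`, and its centre is `low + (index + 0.5) * width`.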

Late in the week I saw an email to various Internet research lists (e2e, IMRG, ippm) about new versions of pathload and pathrate that now automatically select an appropriate bucket size.  I'm going to look at their code to see what they have, after I've experimented this week with my own technique.  Bit of a shame that they beat me to it. [Matthew Luckie]

I had a look at draft-ietf-ccamp-tracereq-01.txt, made a few comments on it, and suggested they look at IPMP.  The draft they've released discussed requirements for a protocol that can traceroute packets encapsulated in a tunnel, like MPLS / GRE / IP-in-IP. [Matthew Luckie]

I experimented with ipmp_pathchar on the CRCnet wireless network. The results are not really what I expected.  I estimated the serialisation rate of each wireless segment (orinoco card to orinoco card) to be 4.1 Mbps, which is not one of the serialisation rates of 802.11b (1, 2, 5.5, 11).  I'm told each link on the network is forced to 11 Mbps, but that 802.11b has a lot of other stuff happening in the same band which acts as overhead.  The bandwidth we get out of these links with iperf is close to 4.1 Mbps. [Matthew Luckie]

CRCnet deployed a proprietary point-to-point link known as a fastbridge, with 18 Mbps capacity (9 Mbps full duplex).  ipmp_pathchar estimated 7.1 on the reverse path, but the forward path was very, very weird: the graph of bandwidth estimation samples seems to be of the form y = 1/x, with a very long tail.  I need to look into this next week. [Matthew Luckie]

I added the ability to specify the 8 TOS bits in the IP header for IPMP echo packets, which at least one AMP site might find useful. [Matthew Luckie]

~ ~ ~

I spent some time thinking about ways to characterise the L2 devices of a capacity limiting link with IPMP.  The motivation for doing this is to come up with an answer to the problem described in the paper The Effect of Layer-2 Switches on Pathchar-like Tools http://www.icir.org/vern/imw-2002/imw2002-papers/145.ps.gz  [Matthew Luckie]

To do this, we need a packet train to be queued back-to-back at the capacity limiting link that is long enough to be split up by each of the L2 devices on a segment, and then to characterise what the packet train saw.  The ability to do this is dependent on the packet train not becoming dispersed due to cross traffic on the segments prior to the capacity limiting link.  One idea I have is to send a large packet first, set to expire at the capacity limiting link, with a train of small packets following it immediately, as nettimer does. [Matthew Luckie]

Hopefully I'll be able to make enough sense of the distribution of the packets at the other end of the link to identify the underlying serialisation rates of L2 devices on that segment.  Because we have timestamps inserted at each side of the segment, we should get a very accurate picture of how the packet train was dispersed by the L2 devices, which is an improvement over current packet-pair techniques. [Matthew Luckie]

~ ~ ~

I didn't get very far into my IPMP intro chapter. [Matthew Luckie]

I did some work on the kernel code again.  I removed all checksum checks on the packet, which should speed things up markedly.   [Matthew Luckie]

At one other of the new sites, amp-gpng (Great Plains Network GigaPop), I've been communicating with Jerry Niebaum and finalizing plans to complete the installation. Jerry is very anxious to move ahead with the AMP monitor at his site. He is also planning to look at how he can utilize the IPMP protocol of the AMP machine to analyze and manage his network connections. [Bud Hale]

~ ~ ~ IPv6

Friday I spent more time composing an address list that is comprehensive in that it will have an IP address from each prefix. I'm confident that I have a record of all the TLAs from the 6bone and the Regional Internet Registries, but I'm probably missing many delegations that are not published in the 6bone.db and will probably have to discover these via other means such as from BGP views and DNS walks. [Matthew Luckie]

I added support to the traceroute daemon (Scamper) to stop tracing to an address if it detects a loop. [Matthew Luckie]
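One way to express such a loop check is sketched below. It is an assumption, not scamper's actual rule: here a loop is declared when an address reappears after at least one other hop has intervened, so that a router answering for two consecutive TTLs is not flagged. The function name is hypothetical.

```python
def first_loop_hop(hops):
    """Return the index of the first hop that revisits an earlier address.

    hops is a list of responding addresses in TTL order, with None for hops
    that did not respond. Returns None if no loop is seen.
    """
    last_seen = {}
    for index, addr in enumerate(hops):
        if addr is None:
            continue
        if addr in last_seen and last_seen[addr] != index - 1:
            return index  # the trace can stop here
        last_seen[addr] = index
    return None


# Usage with documentation-prefix addresses.
path = ["2001:db8::1", "2001:db8::2", "2001:db8::3", "2001:db8::2"]
loop_at = first_loop_hop(path)
```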

I had some help from RIPE people to identify the prefixes that they've allocated, and pointers to the two IPv6 prefixes that LACNIC have assigned.  Bill Owens and Joe St Sauver forwarded me the output of "show ipv6 route" on three of their IPv6 routers.  I'm using the output to compare what they see with what is advertised, and to get an idea of how the Internet2 IPv6 TLA has been split amongst universities. [Matthew Luckie]

I discussed routeviews for IPv6 with Tony.  I asked Joe at U of Oregon to ask around about their plans for routeviews.  He says that it is not on their list of priorities, but that will need more investigation before we decide one way or another. [Matthew Luckie]

This week I attended IETF 56.  I met with David Kessens (Nokia IPv6 and very involved with 6bone) and a few other helpful IPv6 folks to discuss methods to construct an IPv6 address list.  They suggested methods such as the following. [Matthew Luckie]

- probing the ::1 for each /32
- for each /32, try to see subnets along a /48 boundary, e.g. for 2001:618::/32, probe 2001:618:1::1, 2001:618:2::1, 2001:618:ffff::1 (I'm not suggesting a probe to all 65536 subnets that lie between a /32 and a /48, but certainly probe for a few subnets that a site may have allocated or delegated)
- google for ipv6, and then resolve the DNS names returned for AAAA records
- route tables from the likes of the 6bone (David says that he will provide a method for me to query IPv6 routers that he controls)
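The first two suggestions can be sketched with Python's standard ipaddress module. The choice of which /48 subnet numbers to try is an assumption taken from the example above (1, 2 and ffff), and the function name is hypothetical.

```python
import ipaddress


def probe_targets(prefix, subnet_ids=(0x1, 0x2, 0xFFFF)):
    """List ::1 of a /32 plus ::1 of a few /48 subnets inside it."""
    net = ipaddress.ip_network(prefix)
    if net.version != 6 or net.prefixlen != 32:
        raise ValueError("expected an IPv6 /32 prefix")
    base = int(net.network_address)
    targets = [ipaddress.IPv6Address(base + 1)]
    for subnet_id in subnet_ids:
        # The /48 subnet number occupies bits 32..47, i.e. 80 bits from the bottom.
        targets.append(ipaddress.IPv6Address(base + (subnet_id << 80) + 1))
    return targets


# Usage
targets = [str(addr) for addr in probe_targets("2001:618::/32")]
```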

While I was at IETF, I also ran my traceroute code on the IPv6 address list that I'd constructed prior to arriving.  Of the 821 addresses in that file, 628 were unreachable - 289 were address unreachable, 282 were network unreachable, and 57 were unreachable due to a routing loop.  I discovered >300 IPv6 addresses in the process that are "routing addresses".  The conclusion I take from this is that the address list I created out of the 6bone.db was a good start, but it is hardly comprehensive. [Matthew Luckie]

I got an email from Bill Owens about a change in the Abilene IPv6 Infrastructure. The Abilene folks moved the Kansas City-Denver OC-48 from the old GSR to the new Juniper yesterday at about 1000 PST, something I'd been watching for to see the effects on the path. And. . . there was almost nothing. You can see the step in the traceroute, but no change in the RTT. Perhaps something else is dominating, or the path asymmetry is hiding it (return traffic from UO goes through Sunnyvale). It does look like the very small loss we'd been seeing has gone away, so maybe the GSR was causing that. We'll have to see a few days' data to be sure.  [Matthew Luckie]  http://amp.nlanr.net/active/cgi-bin/v6_linkcomparison.cgi?from=amp-nysernet&to=amp-uoregon&date=103.3.27

I wrote a perl script that utilises Google's SOAP API to search for IPv6, as www sites that promote IPv6 will hopefully have an IPv6 address to trace to. [Matthew Luckie]

I wrote another perl script that performs zone transfers from ip6.arpa up, which can be used to find out how all IPv6 prefixes (that I can get a zone transfer for) have been split up.  I haven't run the code yet, as I'm not sure how people feel about using a zone transfer for this purpose.  I'm keen to hear any opinions. The initial code used the `host' command, but I'm working on a new script that uses the Net::DNS library to cache NS's from higher level prefixes and use those explicitly, rather than start at the root for every prefix I find out about.  It also keeps a record of which NS's are likely to permit a zone transfer based on whether they've allowed one before, so I don't bother NS's that won't unless I have to. [Matthew Luckie]
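Two pieces of that bookkeeping can be sketched without touching the network. This is an illustrative sketch in Python rather than the perl script itself: the function names, the shape of the history table and the example name servers are all hypothetical. The first function maps a nibble-aligned prefix to its ip6.arpa zone name; the second orders candidate name servers so that those that allowed a transfer before are tried first and those that refused are tried last.

```python
import ipaddress


def ip6_arpa_zone(prefix):
    """Return the ip6.arpa zone name for a nibble-aligned IPv6 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen % 4:
        raise ValueError("prefix length must be a multiple of 4")
    nibbles = net.network_address.exploded.replace(":", "")[: net.prefixlen // 4]
    return ".".join(list(reversed(nibbles)) + ["ip6", "arpa"])


def order_name_servers(servers, axfr_history):
    """Order servers: previously allowed, then unknown, then previously refused.

    axfr_history maps a server name to True (allowed a transfer before) or
    False (refused before); servers never tried are absent from the map.
    """
    rank = {True: 0, None: 1, False: 2}
    return sorted(servers, key=lambda ns: rank[axfr_history.get(ns)])


# Usage
zone = ip6_arpa_zone("2001:618::/32")
order = order_name_servers(
    ["ns1.example.net", "ns2.example.net", "ns3.example.net"],
    {"ns1.example.net": False, "ns3.example.net": True},
)
```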

I finished the modifications to my dnswalk perl script. I'm still looking for someone to give me the go-ahead to run it from a host (and deal with any 'abuse' complaints). http://voodoo.cs.waikato.ac.nz/~mjl12/dnswalk.pl [Matthew Luckie]

I'm told that Oregon Routeviews for IPv6 is starting up soon, which is a good thing. [Matthew Luckie]

It was suggested by David Kessens (Nokia IPv6 group, who I met at IETF 56) that I should send an email to the 6bone mailing list to let people know about my plans to do a DNS walk of the ip6.arpa zone.  I'm going to write the mail with the intention of not only allowing opt-out, but opt-in as well, so that if people who currently block zone transfers would like to allow me to get a detailed view of their zone for the purposes of mapping it, they can contact me and I'll give them an IP address to allow zone transfers from. [Matthew Luckie]

I also got my IPv6 implementation out of code-rot status by bringing it up to consistency with the draft we released.  I created diffs against linux-2.4.20 and FreeBSD 4.8, got my ipmp_ping source and passed them to Tony to use in his project. [Matthew Luckie]

~ ~ ~ Scamper

I contacted the UNILink office at Waikato to ask them about releasing my traceroute software (which I've called scamper) into the wild. Their initial response was that because I am a student, I can do whatever I want with the code, and that I _must_ distance myself from the WAND group and the University of Waikato.  I've asked them to reconsider as the software could be a good advertisement for the WAND group, at Tony's suggestion. [Matthew Luckie]

I did a scamper of all other IPv6 AMP machines from amp-nysernet and drew the following graph of the output with graphviz: http://voodoo.cs.waikato.ac.nz/~mjl12/nysernet.gif   [Matthew Luckie]

I wrote a tool, `groupie' that, given a list of IPv4 / IPv6 addresses, identifies which addresses belong to the same router.  The ideas of how to do this were taken from http://www.caida.org/tools/measurement/iffinder/ but some of the techniques Ken used aren't available with IPv6 (like Record Route, for instance).  I'm going to take multiple scamper views of Abilene's IPv6 network, run the output through groupie, and look at generating topology maps with graphviz dynamically like I do the other visualisations at http://watt.nlanr.net/active/cgi-bin/v6_portal.cgi [Matthew Luckie]

I sent email to Bill Manning.  He does regular DNS walks of ip6.arpa and ip6.int, so I've asked him if I can have the raw addresses that he collects in this process for doing a scamper of the IPv6 Internet. [Matthew Luckie]

I got route tables from one of the isi.edu IPv6 routers for the past few years from Bill Manning; I'm still waiting to hear about the data he collects in his DNS walks.

~ ~ ~ AMP servers and system disk

There was a new exploit in sendmail discovered this week and I spent quite a lot of time upgrading sendmail on amp.  It was difficult because AMP and VOLT are now running quite an old version of FreeBSD and we can't just use the automated update on it.  I initially tried just using a binary for it, but that gave a bunch of error messages.  Then I tried the ports distribution (I had to load a new version of the ports tree first). Unfortunately there were a couple of documentation errors in the sendmail port which had me confused for quite a while.  I considered installing sendmail from the generic package but it's pretty complicated to compile and install. I eventually battled my way through it, and figured out how to get the ports version running, only to discover yet another release on Wednesday.  Fortunately that was a trivial change.  So we should now have a safe version of sendmail. [Tony McGregor]

Worked the weekend on getting the OHM Server to a preliminary level. I was reluctant to connect it to the 10.25.5 backend interfaces though I had prepared the additional interface.  In Tony's preliminary examination, oversights like the backend interface became immediately apparent.  Connected OHM to the 10.25.5 network switch, configured AMP and VOLT to accept communications from the new machine. Interesting, the difference in the configurations of the machines. Up till now I've not had a chance to look at the AMP project construction.  I'm still configuring the web aspects on OHM and should have that done by Monday. This opportunity has been fascinating. As a result I've begun looking at other possible options, such as storage, that might prove useful.  [Jim Hale]

Jim got the new server ready (which we're using to migrate AMP and volt to FreeBSD 4.7-RELEASE and to clean up a few other things at the same time).  I had an initial look over it and asked Jim to look into some issues, which he's now working on.  The hassle this week with sendmail makes me even keener to get this project completed. [Tony McGregor]

I got the web pages functioning on the OHM server. I consulted a bit with Cooper and Todd on apache functions and starting cgi functions. Some links to cgi pages return a permissions problem, I'm still pursuing the cause.  My progress seems slow, though I'm learning a lot, I'm understanding what I learn and the investment will pay off in the long run. [Jim Hale]

I added a GigE card to the OHM server on the 10.x.x.x backside network.  Upon completion that network connection began functioning erratically.  Within a short time I worked out the problems between OHM and VOLT, and communication began again. The communication with AMP took slightly longer, though communication was restored shortly afterward. I ordered a Netgear 10/100/1000 switch to install on the 10.x.x.x network. I expect to receive it and begin installing it early next week.  [Jim Hale]

Worked with Matt Luckie on installing PHP on OHM. I began having trouble installing the mod_php4 software from the ports tree. Matt Luckie gave it a try and it installed with little trouble. [Jim Hale]

Worked with Cooper Nelson on loading mod_php4 on AMP from the ports tree.  It was thought in the beginning that updating the ports might be a problem, but luckily Matt or Tony had previously updated them. With just a few glitches in the beginning we began the installation. During the install we noticed apache was being upgraded. This required a few reconfigurations of the httpd.conf. But with a few apache restarts, everything including php seems to be running just fine. Since the configuration files are replaced on VOLT every night, next week we'll arrange with HWB to configure WATT to take VOLT out of the mirror so we can do the apache upgrade on VOLT.  Some of the web page CGIs had stopped functioning, such as http://ohm.nlanr.net/active/cgi-bin/tr-request.cgi. Even after all the pages began displaying they still didn't display route information. Tony later informed me about the multiple security layers that prevent it. [Jim Hale]

Installed php on amp. Unfortunately this forced an upgrade of apache as well.  I had to rebuild the httpd.conf file before the new apache would start correctly.  All my amp viz stuff is now available with real time data, check it out at http://amp.nlanr.net/~coop/ampviz.php.   [Cooper Nelson]

Installed the new 10/100/1000 Mb NETGEAR GS 508 switch, replacing the original switch. After replacing the switch I found AMP and VOLT were able to reconnect and remain communicating without a problem. However OHM, with its 10/100/1000 Mb interface card, seems unable to negotiate the slower 100 Mb speed when connected through the new 10/100/1000 Mb NETGEAR switch on the 10.x.x.x backside network. That problem doesn't seem to exist on the front side 198.x.x.x network. At the time I didn't think this would be a problem, seeing that my next project was to replace the 10/100 Mb network interface cards in VOLT with the 3Com 10/100/1000 Mb network cards.  I installed a 3Com 10/100/1000 Mb network interface card in VOLT. My previous practice with installing the 3Com interface cards was on FreeBSD 4.5.  From that point I began running into problems. I searched around looking for the drivers for the card and eventually had to go to 3Com for a driver, only to find out 3Com doesn't support FreeBSD. I then tried to reach Broadcom.  Before I made contact with Broadcom, it was decided that upgrading the network interface cards was going to be more difficult than it was worth at this time. [Jim Hale]

After installing the NETGEAR GS 508 10/100/1000 Mb switch last week, it appeared that the new 3Com 3C996-T 10/100/1000 Mb network interface card installed on the 10.x.x.x backside network on OHM would no longer negotiate the speed difference with the existing 10/100 network interface cards in AMP and VOLT. The connection between AMP and VOLT through the new switch seemed to work as it always had. I took OHM offline and began investigating the problem by replacing the installed cards. One issue I did realize is that the original network interface cards previously purchased were the 3c996B-T, a five volt card. I replaced the original card with the later 3C996-T network interface cards, 3.3 volt with a five volt tolerance. This issue had nothing to do with the problem, though it's an important note as the Tyan 2518 motherboards have one 5 volt slot and one 3.3 volt slot. With some rigorous adjustment of the cards I was finally able to get the connection established, enabling data exchange between all three of the machines. [Jim Hale]

In running a test on OHM by starting ./su_restart_am_slave I got the response /usr/libexec/ld-elf.so.1: Shared object "libc.so.3" not found. On Tony's recommendation I prepared to re-compile am_master. I conferred with Cooper on the plan. I then ran the re-compilation. After moving the new binary to the correct location, I restarted the am_slave and still got the same response. So I'll continue to pursue the problem. [Jim Hale]

I did finally receive the new Gigabit switch. For the price, capability and physical characteristics, the NETGEAR GS 508 seemed to best fit the requirements. [Jim Hale]

Began working on OHM, continued with ownership issues and continued on documentation of OHM. [Jim Hale]

I'm just a few parts short of a new AMP server to replace the second of the upgraded servers. This machine is a product of the experience I got putting OHM together, without the previous mistakes. [Jim Hale]

I've been looking into why AMP and VOLT machines seem to be running so slow and why AMP is no longer collecting mail messages. [Jim Hale]

Mail died on AMP (Jim noted that in his report last week). The problem was a bad flag in the rc.conf file (the documentation error I mentioned two weeks ago).  I had forgotten to update the flags when I sorted the problem (I was sure I'd done that, but the evidence is against me!)  When AMP was rebooted sendmail didn't come back up. [Tony McGregor]

Preparing for the installation of gigabit interfaces in AMP and VOLT early next week. My plan is to install the new high-speed interfaces in VOLT first, early on Monday morning. I should be able to install the new interfaces on AMP on Tuesday. I've practiced the installation on non-production machines, so I suspect I'll have no problems. [Jim Hale]

I made some changes to the status email and web pages (that are mostly used by Bud) so that they function correctly for sites that are not in the HPC mesh (e.g. international mesh only sites). [Tony McGregor]

On Friday the AMP server data collection disk setup (8 disks concatenated into one) reached 90% fill. That is very near the upper limit. I started the archiving process. It will probably run most of the weekend. [Bud Hale]

In my report last week I mentioned I had started the archiving process on AMP. That was successful and brought the disk fill down to 66 percent. Since the completion of the run on Monday the disk fill has reached 67 percent. The non-concatenated data disks 0 through 7 on VOLT are currently in the low 80 percent range, with the highest at 84 percent. [Bud Hale]

Following work on the VOLT server this week it was discovered that I had inadvertently left am_slave off for a time. As a result I have resolved to change my practices and methods of monitoring am_slave on AMP and VOLT. Also OHM as soon as that machine is fully online. I plan to integrate that check with my checking of the server disk fill using df. Failure to catch that the process was not running for some days was my responsibility. At some point those monitoring tasks may be automated with some built in alarms. [Bud Hale]
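A combined check of this kind might look like the sketch below. It is only an illustration of the idea, not an existing AMP tool: it assumes the output of `df -k` and of a process listing are captured as text elsewhere, that a fill of 90 percent is the alarm threshold (the level described as very near the upper limit in an earlier entry), and the function names are hypothetical.

```python
import os


def disk_fill_alerts(df_output, limit_percent=90):
    """Return (mount point, percent) for filesystems at or above the limit."""
    alerts = []
    for line in df_output.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        percent = int(fields[4][:-1])
        if percent >= limit_percent:
            alerts.append((fields[5], percent))
    return alerts


def process_running(ps_output, name="am_slave"):
    """Return True if a command named `name` appears in the process listing."""
    for line in ps_output.splitlines():
        fields = line.split()
        if fields and os.path.basename(fields[0]) == name:
            return True
    return False
```

Run from cron, the two results could be mailed whenever `process_running` is false or `disk_fill_alerts` is non-empty, which would cover the built-in alarms mentioned above.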

As a result of the am_slave process being off, the VOLT data disks quickly filled at the end of this week. The fill was to the point that am_slave would again be halted. I have started the archiving process to archive data off to the HPSS and reduce the VOLT data disk fill. [Bud Hale]

Continue to monitor the creation of the third server, OHM. I wish to be in a position to see that the documentation on the servers and their management is complete. Of course we need to continue to work on our intra-departmental communications in the management of infrastructure upgrade projects. [Bud Hale]

I reported last week that I had started the process to archive AMP data on the VOLT server. That finished early this week and took the data disk fill down to between about 68 and 78 percent over the eight data disks. I am working to do some re-balancing on the disk array as soon as possible.  [Bud Hale]

Ran into many problems upgrading apache on volt. Based on Tony's advice I will update the BSD ports tree and try again next week. [Cooper Nelson]

I talked to Tony about maintenance of AMP servers once ohm is up and running. [Matthew Luckie]

~ ~ ~ Misc.

We got a request from a user to stop the AMP machines sending test traffic to an old AMPlet (Supercomputing 02) which is no longer running.  To do that nicely I added a new flag (status) to the database which indicates that a monitor is no longer active. Setting status to disabled means that no tests are done to the monitor, but the data is still available in the web pages.  I marked sc02 as disabled and distributed the new .list files. [Tony McGregor]

Bud wrote up a (very long) list of what has to happen to start a new monitor.  I went through that and am starting to think about ways of automating most of it.  It's a bigger job than I thought (read: Bud has been working _even_ harder than I realised) [Tony McGregor]

During the staff meeting on Wednesday we resurrected a discussion of a design change on the AMP monitors to utilize solid state disks as opposed to existing rotating disks. Tony indicated that I might do some availability and price investigation. In that investigation I've found low cost solid state ide interfaced disk emulating modules available as follows: 1 GByte: $591, 1.5 GByte: $895 and 2 GByte: $1,179.  Costs continue to come down, indicating that this may be an idea whose time has come. [Bud Hale]

I've continued to fix small AMP problems this week.  I corrected bad ownership on volt:/amp-data/130.217.250.21 which was causing a couple of error messages and looked into the version of the kernel that we were installing on the new monitors for Bud.   [Tony McGregor]

I fixed a bug related to the ownership of the meshes file.  That's bitten us a few times in the past, but I wasn't sure where the problem was happening until Bud and I stepped through the process together. [Tony McGregor]

I worked on the build_XXX.list script.  Bud had had some problems with it.  The problem turned out to be human error (a small typo) but in the process I updated the script to create the symbolic link in the system manager from the confirmation directory to the amp-name. One small job off the list for Bud on the creation of each new monitor.  I also added some error checking to deal with the type of problem that caused the error.  [Tony McGregor]

I also wrote code to create the confirmation directory but had to take it out in the end because it raised so many new issues I decided I don't have time to deal with them all at the moment. [Tony McGregor]

As part of the process I improved the build_amp_data.pl script so that it includes a header from /amp-data/acl.head before it includes the data from the database when building the acl file.  In the end I didn't need that, but I added it because I wasn't originally going to put the site in the database.  In any case, it gave me a chance to add some comments to the start of the file and it may be useful in the future.  [Tony McGregor]

Someone displayed their dislike for our use of storage space in the basement. On a visit to gather parts for a machine, I found someone had pushed through the aisle of equipment, pushing equipment off the shelves to the far side and pushing stacks of sensitive gear onto the basement floor.  I realized we were encroaching on shared space, though I didn't think someone would be so overtly destructive.  I don't think SDSC will notice my redecoration, though I did have to make some room away from areas considered shared and re-locate our equipment there. Now the aisle is clear, I think our gear is safe from battles for elbowroom on the other side of the rack. So if anyone is having trouble finding anything, let me know. I might know where it is. [Jim Hale]

~ ~ ~ new deployments

As reported last period, eight new monitors were shipped; work took place this period to install and bring these monitors online and collecting data.

~ ~ ~ amp-uida (U. of Idaho)

One site, amp-uida (U. of Idaho), is online in the domestic HPC mesh and has been initiated. It is capturing and transferring data to the AMP servers. [Bud Hale]

~ ~ ~ amp-hutf (Helsinki U. of Tech.)

One of the new sites amp-hutf (HUT network in Finland) is installed and online. I'm in the process of moving it into the system manager to initialize it to start data collection in the "international" mesh. [Bud Hale]

The machine at the new site amp-hutf (Helsinki U. of Tech.) was installed this week. I completed the setup in the AMP and VOLT servers and the system manager on the photon machine for the site. This is the first site to be in the "international" mesh only. I ran system manager to start the site. However it is not collecting data. I am troubleshooting the cause of that. [Bud Hale]

This week I've been starting up the amp-hutf (Helsinki U. of Tech) site. This action has served to correct some misconceptions of the build list process as well as provide Tony with some data he used to correct a bug. I believe that will be working smoother now. The amp-hutf site is in the process of being moved back to the "international" mesh only. The data status page is expected to show that site working with only those sites included in that mesh.  [Bud Hale]

As reported last week I was in the process of starting the first of the "international" mesh sites that is "international" only. That site is amp-hutf (Helsinki U. of Technology, Finland). The data status page, status messages and data storage files indicate that the AMP monitor system manager process failed to cause the site to initialize. However Tony pointed out that the main data page indicated data from the site.  Additional research fails to explain why the data storage files remain empty while the main data page indicates proper operation. I'm continuing this investigation. [Bud Hale]

~ ~ ~ amp-gpng (Great Plains Network GigaPop)

One other of the new sites, amp-gpng (Great Plains Network GigaPop), will be delayed for a short time while the site prepares. Some changes occurred at that site when Rick Summerhill moved from GPNG to a new position at Internet2. However Rick put me in touch with his successor, Jerry Niebaum. I've been communicating with Jerry and finalizing plans to complete the installation at the Great Plains site. Jerry is very anxious to move ahead with the AMP monitor at his site. He is also planning to look at how he can utilize the IPMP protocol of the AMP machine to analyze and manage his network connections. [Bud Hale]

Two more of the recently deployed sites, amp-wisn (U. of Wisconsin Network GigaPop) and amp-hean (HEANet in Dublin, Ireland) are now installed and connected. I am in the process of putting the two in the collector/servers and in the system manager. I plan first to initialize the Wisconsin site since it goes only in the HPC mesh. This site is significant since it is another machine going into a GigaPop.  [Bud Hale]

Following that I will initialize the Ireland site while I am continuing to investigate the amp-hutf site. This is related because the Ireland site will be the second site to go only in the "international" mesh.  [Bud Hale]

This week I undertook to bring two more of the newly shipped AMP monitors online. These two were the amp-wisn (U. of Wisconsin Network GigaPop) and the amp-hean (HEANet in Dublin, Ireland). During that process some anomalies with the system manager "update_tree.pl" process were discovered.  Tony is looking at these anomalies and I will complete the process of putting these sites online when he is finished.  [Bud Hale]

Also, following a small correction to the "update_tree.pl" process on the Photon system management machine by Tony, I initiated the new site at amp-wisn (U. of Wisconsin Network GigaPOP). This is one more of the machines to be located at key GigaPOPs. This development provides for research such that traffic performance over backbone segments between GigaPOPs can be studied versus end-to-end performance between hosts using those same segments. [Bud Hale]

Assisted in the assessment of some issues that appeared with "update_tree.pl" in the System Manager process. [Jim Hale]

Two of the new international mesh machines were also initiated this week. They are the amp-hean (HEANet Network in Dublin, Ireland) and the amp-unin (UNINet Network GigaPop in Bangkok, Thailand). The amp-hean net started and is currently collecting and transferring data.  [Bud Hale]

Participated in bringing online two of the newly shipped AMP monitors, amp-hean (HEANet in Ireland) and amp-wisn (University of Wisconsin GigaPop).  [Jim Hale]

However the amp-unin initiation appears to have run correctly, but the site is not collecting and transferring data. That is under examination. [Bud Hale]

Some new AMP sites are still in the startup process. However the amp-hean (HEANet Network site in Ireland) startup was completed this week. It was also delayed by some anomalous events in system manager, but it was completed.  [Bud Hale]

Another "international" mesh site still in startup is the amp-unin (UNINet Network in Thailand) site. Some of the needed files failed to transfer during startup, initially causing an email flood of error messages. That was stopped by halting the cron. The problem is under investigation.  [Bud Hale]

It became necessary to prepare a letter of collaboration to be sent to Prasert Pongluksamana for Dr. Tanakorn Aun-On of Uni-Net expressing the collaboration to assist in freeing an AMP machine tied up in customs in Thailand. [Jim Hale]

Began preparation for the initialization of the IPv6 machine amp-sla6. Investigating what will be required in order to collect data from amp-sla6 on the existing equipment (i.e. AMP and VOLT, or more likely OHM, where IPv6 is already enabled). Communicated a little with Matt Luckie. [Jim Hale]

On the subject of the "international" mesh a request for an AMP monitor for the Brazil end of the AMPATH network was received this week. [Bud Hale]

Tested network configuration on the AMP monitor planned for AMPATH.  [Jim Hale]

Worked on the AMP machine bound for AMPATH. Copied the master disk, updated the rc.conf file, and tested booting it without a keyboard. A hole in the cover above the fan is still needed and then this machine is ready for shipment. [Ronn Ritke]

Also 3 new AMP monitors are under preparation. And another site, University of Mexico, has indicated that they will be requesting participation. [Bud Hale]

Juergen Rauschenbach from the DFN network in Germany has also made an enquiry about hosting an AMP monitor. [Tony McGregor]

Phone call to Tony McGregor for updates on AMP and how to respond to some requests for AMP machines. [Ronn Ritke]

AMP existing remote sites maintenance and troubleshooting:  

A number of existing AMP sites received attention during this period: [Bud Hale]

~ ~ ~ amp-asu (Arizona State)

The site had inadvertently applied an ssh block preventing machine updates. The site removed it following discussions.  [Bud Hale]

The amp-arizona site experienced a short outage for a chassis move on Friday. When site personnel notified us of the impending move, the machine was halted from SDSC, then moved and powered up. It came back on line without incident.  [Bud Hale]

~ ~ ~ amp-bu (Boston University)

AMP monitor has a bad disk. It appears to be a corrupted disk and not a hard failure. A replacement disk was shipped and installed on Thursday. The re-initiation was started late Thursday and appears to have run correctly. However the monitor has not yet started collecting data. That is being investigated now. [Bud Hale]

First, the amp-bu site suffered a failed disk that was diagnosed last week. A replacement disk was shipped and started up this week. Some delay occurred in the startup due to some anomaly events in the system manager process. Those anomalies have not yet been fully investigated and understood.  [Bud Hale]

~ ~ ~ amp-clemson (Clemson U. in South Carolina)

The site monitor has suffered a hardware failure. A replacement machine is in shipment.  [Bud Hale]

The amp-clemson (Clemson U., South Carolina) site machine suffered a hardware failure. A replacement machine was shipped and installed this week. It was restarted and is again transferring data.   [Bud Hale]

The amp-clemson (Clemson U. in South Carolina) was brought back online at the end of last week after a machine failure.  [Bud Hale]

~ ~ ~ amp-colostate (Colorado State)

The site had inadvertently applied an ssh block preventing machine updates. The site removed it following discussions.  [Bud Hale]

~ ~ ~ amp-fnal (Fermi Nat. Lab.)

experienced a minor outage. That outage appears to have been caused by a KVM switch being plugged into the machine and being placed in a scroll lock mode. It appears this hung the OS when the system attempted to write out to the monitor. Removal of the switch allowed the machine to boot up. However I am currently working with the site people to change the default router.  [Bud Hale]

Another site requiring some attention was amp-fnal (Fermi National Lab.). Some network reconfiguration at that site caused it to need a default router switch.  [Bud Hale]

amp-fnal (Fermi Nat. Lab.) is having frequent intermittent outages.  [Bud Hale]

~ ~ ~ Another site, amp-harv (Harvard U) appears to have been brought down by a system manager run. I discovered that the rc.conf file in system manager was missing a double quote. The last system manager run placed that file on the machine and rebooted it causing it to lose connectivity. I'm getting the machine corrected and re-started. [Bud Hale]

~ ~ ~ amp-miami (Miami U.) suffered a network cable problem. [Bud Hale]

~ ~ ~ amp-psc (Pittsburgh Supercomputer Center)

Another site failing this week is amp-psc (Pittsburgh Supercomputer Center). To the local technician it appeared to be a power supply failure. I shipped a tested replacement power supply and it was installed. However this failed to correct the problem, which now appears more likely to be a failed system board. I am now preparing a replacement machine.  [Bud Hale]

I had repaired the failed machine from Clemson and shipped it to PSC as the replacement. I brought it online on Thursday. [Bud Hale]

The last site with a failure was the amp-psc (Pittsburgh Supercomputer Center). A replacement machine was on site late last week and it was initiated over the weekend.  [Bud Hale]

~ ~ ~ The amp-rice (Rice U. at Houston, TX) experienced some network connection problems following some minor cabling changes at the site. [Bud Hale]

~ ~ ~ Another site of interest is the "IPv6 only" site at Stanford Linear Accelerator Center. At this time we are working on making the AMP/VOLT/OHM servers able to work with that site.  [Bud Hale]

~ ~ ~ Other sites needing attention were amp-smu (Southern Methodist U.), which needed a reboot, apparently due to an OS hang.  [Bud Hale]

~ ~ ~ amp-ucf (U. of Central Fla.) is having frequent intermittent outages.  [Bud Hale]

~ ~ ~ Site amp-uconn (U. of Connecticut) is reworking their power systems, so that site has experienced, and will continue to experience, some brief outages from time to time over the next two weeks. [Bud Hale]

~ ~ ~ amp-wayne (Wayne State U. at Detroit) made a default router change and needed to be reconfigured for that. [Bud Hale]

Inclusive list of sysadmin tasks

Additional Performance Measurement and Analysis Activities

Top Network Outages and Peak Network Utilization scripts (Network Reports pages):

This week I made a big improvement to the design of my script, moving the work of html output to the individual router and interface objects themselves, greatly simplifying the mess that was the html output code.  Also I added plain text output code to the objects to make for much easier integration into other scripts. I also cleaned up some particularly nasty constructs I had used that were making my code very confusing. [Bill Gahr]

This week I worked on cleaning up the design of my script and worked on making the changes to make the output match the new style guidelines. [Bill Gahr]

This week I completed a substantial overhaul of my design and a cleanup of my code. The new version offers several advantages over the previous one: a simpler overall design with a smaller code size, straightforward addition of further output formats (including plain text) and changes to the html output, and easier integration into other scripts. While reworking and testing the new code I also discovered a fairly critical bug that sometimes causes the first data point in a log file to be ignored. I am tracking down exactly when it happens and what about the logs at those times triggers it. After I work out this last issue, and receive a link to the new style guidelines to check my new html output against, I plan to submit various example output to Todd and Kim for comments and make changes as necessary. [Bill Gahr]

I was working on the UI checklist, adding additional line items to be checked and documented.  The checklist itself should serve as technical documentation for any code that is produced (e.g. it will  be included with any documentation that comes with the deployment of a script).   [Jeff Baker]

This week I spent a short amount of time tying up small errors and inconsistencies in some of the websites on stat.  This was after doing a final sweep of the pages I had updated in accordance with the web guidelines I had written. [Jeff Baker]

I spent this week working on the Weather Data page on stat.  It's been fairly interesting... the cgi on these interconnected pages is a bit more complicated than on some of the other pages on stat, so it took a bit to get familiar with the way the code works. The front-end of the pages is coming along nicely, and thankfully the code that deals with the data has not been affected.  The pages affected were the main link page, the current weather conditions page, the weather almanac, and parts of the graphs page.  [Jeff Baker]

I also revised some small details in the Web Standards documentation to reflect accurately what is represented on the main pages on stat. [Jeff Baker]

This week I spent my time creating a web-based "template" for coders to use with their scripts.  The purpose for it is to allow coders to simply copy and paste (or follow along as they program) the display portions that are needed when creating HTML pages for their data presentation online.  I also created an explanatory page, summarizing these details while providing a visual guide as well. [Jeff Baker]

I have also been working on updating my interface guidelines and UI checklist documents to reflect some corrections that needed to be implemented. [Jeff Baker]

I packaged up the 3D Anemometer data for anyone to grab. It is almost 5GB all in all, and available in about 35MB daily chunks at http://stat.hpwren.ucsd.edu/MtLaguna_3DAnemometer/ and a README file at http://stat.hpwren.ucsd.edu/MtLaguna_3DAnemometer/0_README.txt  [Hans-Werner Braun]

Spent some time looking at wind data from Mt. Laguna, as by now we have lost both the mechanical anemometer, as well as the solid state one. Looking at the 10 samples per second data from the solid state one, I saw times where the quite violent wind (probably significantly) exceeded the 90MPH the instrument was built for, often for very short amounts of time. This then likely gets compounded with significant and heavy-weight ice buildup at times. No wonder we lost those instruments. [Hans-Werner Braun]

I started building a system disk for stat. This is a 120GB drive, which should really help with not running out of disk space so often. [Hans-Werner Braun]

The HPWREN stat machine was finished last weekend, and now has almost 0.5TB of disk space, half of which is for backup. This also allowed us to transition HPWREN functions off moat (moat.nlanr.net, www.nlanr.net, nlanr.net are hosted there), in preparation for redoing the moat machine, which was still running its 1998 version of pre-3.0 FreeBSD. Moat was replaced yesterday (Saturday), and now has two 120GB hard disks in about 1/8th of its former chassis size. [Hans-Werner Braun]

This week I completed a web form interface to my script.  On Monday, I added the second subroutine to complete the web form / graph; the web interface works but needs some minor corrections.  On Wednesday, I finally found the reason why setting the plot ranges on one graph would cause a blank output: the y-axis range was so wide that all the points were plotted onto the x-axis.  I also reviewed the comments in my scripts and corrected them.  On Friday, I used the w3c.org html validator to check the cgi output of my script and made corrections to it.  I also tested a cron job script and worked on a presentation for the next student meeting.  [Zhao Li]
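
The blank plot described above arises when the y-axis limits are far wider than the spread of the data. One way to avoid it is to derive default limits from the data itself. The following is a minimal sketch only; the helper name `padded_range` and the 5% margin are assumptions for illustration and are not taken from the script discussed here.

```python
def padded_range(values, pad_fraction=0.05):
    """Return (low, high) axis limits that hug the data with a small margin."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # All values identical: base the margin on the magnitude instead,
        # falling back to 1.0 when every value is zero.
        span = abs(low) or 1.0
    margin = span * pad_fraction
    return low - margin, high + margin
```

Limits computed this way keep the plotted points spread over the visible area instead of collapsing them onto one axis.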

This will be my last official day here at HPWREN. I have enjoyed working here and have gained more useful skills.  On Wednesday, I worked on writing a few notes about the first and second versions of the Tsunami scripts.  My cron job test from last week was successful, and there is now a rough version of a working cron script that can be used for initial testing.  On Friday, I continued writing notes and made a few changes to my web page, changed the cron script to run from the web directory, and added a script that allows the median calculation scripts to be run from the command line without being limited to 10 minutes of run time.   [Zhao Li]

Documentation, Networked Data, and Tools

~ ~ ~ Web pages, server

I also changed the AMP page background and added some info links to the throughput page (http://watt.nlanr.net/active/cgi-bin/tput-request.cgi) as a consequence of a suggestion from a user. [Tony McGregor]

I also fixed a bug in weekly.cgi that meant under a few circumstances the weekly graphs had lines crossing times when the monitor was down, rather than gaps.  This only occurred when the monitor was down for a bit over 24 hours (the detection code was fooled by the change of day). [Tony McGregor]
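
As an illustration of the kind of outage detection involved, the sketch below splits a series of samples into segments wherever consecutive samples are too far apart, so that a graph can leave a gap instead of drawing a line across the outage. It is not the weekly.cgi code; the function name and the ten-minute threshold are assumptions. Comparing full timestamps (date and time) rather than time of day alone means an outage that spans a change of day, or lasts longer than 24 hours, is still detected.

```python
from datetime import datetime, timedelta

def split_at_gaps(samples, max_gap=timedelta(minutes=10)):
    """Split (timestamp, value) samples into segments separated by outages."""
    segments = []
    current = []
    previous_time = None
    for timestamp, value in samples:
        # A jump larger than max_gap closes the current segment.
        if previous_time is not None and timestamp - previous_time > max_gap:
            segments.append(current)
            current = []
        current.append((timestamp, value))
        previous_time = timestamp
    if current:
        segments.append(current)
    return segments
```

Each returned segment can then be drawn as its own line, which leaves the outage periods blank.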

I spent some time removing .gz. from inside the names of images we produce to avoid a Mozilla bug (Mozilla assumed an image was compressed if it had .gz anywhere in its name). [Tony McGregor]

I also updated the IPMP homepage on AMP with the new draft of the protocol, mostly so I could refer to it in the PAM student handout.  [Tony McGregor]

I also fixed a bug in the web interface to extract data from the HPSS.  The fault was because the web user didn't have permission to fetch the data.  I fixed that with a C suid wrapper to tar.  This was pointed out by a user. [Tony McGregor]

Walked with Cooper through web pages we want to generate for the PMA data collection and download display. [Jörg Micheel]

Had an email discussion with Tony about AMP users and the Citings page (which I'm going to incorporate into the rolling reports/home page system that Dave and I are working on). [Maureen Curran]

For the rolling reports/dynamic front page management project, I set up a system with a NoteTab outline file with which I can quickly create and update a Web page with design info and needs for Dave (so he doesn't have to troll through emails).  Worked on the fields, did subject categories and date fields, and posted for Dave. [Maureen Curran]

Worked on the design and formatting of the front page/rolling reports management system project; began and made significant progress on the html template for Dave to use for the scripts.  Rethought some of the design elements as a result of a conversation Ronn had with Greg Monaco. [Maureen Curran]

Began working on formats and basic elements for the AMP web pages. [Jim Hale]

Prepared AMP web page elements.  [Jim Hale]

~ ~ ~ Cichlid 3-D Visualization System

Monday March 3: I worked on designing a network protocol, writing some routines that would help me explore the nature of UDP communication.  [Ben Reesman]

Tuesday March 4: I worked on the socket class that I wrote, I was having difficulty with a couple of things and have ironed out a few bugs. In addition, I wrote a short test harness for the socket class and one of my protocol routines.  [Ben Reesman]

Wednesday March 5: I came into the office for the student meeting and AMP/PMA meetings. I also met with Todd to learn more from him about how to do the protocol and overall design.   [Ben Reesman]

Thursday March 6: I simply worked on the protocol. It is not terribly complicated, but I have never done anything like this before and I want this to work well with both TCP and UDP. It is beginning to come together on paper.  [Ben Reesman]

Friday March 7: I wrote several routines for building and dismantling frame blocks. They do not quite work yet. I will continue to work on these and get them right.   [Ben Reesman]

Next week: I will simply continue to work on this piece of software. I underestimated the complexity of designing this network protocol. However, I am also very pleased with a design change that Todd and I came up with:  we will isolate all serialization of datasets into one class, allowing the Protocol class to work for any kind of dataset. [Ben Reesman]
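
The design change described above, where one class owns all serialization and the protocol layer only builds and dismantles frames, can be sketched as follows. This is an illustration in Python under stated assumptions, not the Cichlid code: the class names, the length-prefixed frame layout, and the point-list dataset are all hypothetical.

```python
import struct

class PointSetSerializer:
    """Hypothetical serializer: the only class that knows the dataset layout."""

    def dumps(self, points):
        # Point count, then each (x, y) pair as two network-order doubles.
        body = b"".join(struct.pack("!dd", x, y) for x, y in points)
        return struct.pack("!I", len(points)) + body

    def loads(self, payload):
        (count,) = struct.unpack_from("!I", payload, 0)
        return [struct.unpack_from("!dd", payload, 4 + 16 * i)
                for i in range(count)]

class Protocol:
    """Builds and dismantles length-prefixed frames for any serializer."""

    HEADER = struct.Struct("!I")  # payload length in bytes

    def __init__(self, serializer):
        self.serializer = serializer

    def build_frame(self, dataset):
        payload = self.serializer.dumps(dataset)
        return self.HEADER.pack(len(payload)) + payload

    def parse_frame(self, frame):
        (length,) = self.HEADER.unpack_from(frame, 0)
        payload = frame[self.HEADER.size:self.HEADER.size + length]
        if len(payload) != length:
            raise ValueError("truncated frame")
        return self.serializer.loads(payload)
```

Because `Protocol` never inspects the payload, supporting another kind of dataset only requires another serializer with the same `dumps`/`loads` methods, and the resulting frames can be carried over either TCP or UDP.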

Monday March 10: Spent time in the Cichlid networking code to understand a bit better how things are done and how to multithread. Spent some time writing simple multithreaded programs to learn a little more about pthreads.  [Ben Reesman]

Tuesday March 11: Spent two hours working on my socket class, I am trying to iron out a few bugs that keep cropping up. I spent a little while developing a buffered socket class, but I think that it will be overkill.   [Ben Reesman]

I got to participate in a back and forth discussion about cichlid with Ben. This is the second week in a row and it is really fun and I feel productive. [Todd Hansen]

Friday March 14: Spent 4 hours working on my protocol. All of the hard parts are now done.  [Ben Reesman]

Also, met with Cooper to talk about designing a visualization with him for PMA trace downloads.  [Ben Reesman]

Tuesday March 18: I worked on learning the Cichlid bchart server interface for creating the PMA visualization. I also worked more on the DataSet class.  [Ben Reesman]

Monday March 17: I spent 3 hours working on the networking routines. The protocol layer basically works at this point, but is somewhat unreliable and I am still testing it. [Ben Reesman]

I also spent some time testing my new Cichlid code.  [Ben Reesman]

Also spent time learning more about the Standard Template Library.  [Ben Reesman]

Monday April 7: I worked on Cichlid networking code. I also worked on the new class hierarchy.  [Ben Reesman]

Tuesday April 8: I was in the office for 3 hours. I worked on code and also on learning how to build makefiles. I wrote a simple makefile for my project. At Bill's suggestion Todd and I decided to add more robust behavior to the framebuilding routines.  [Ben Reesman]

Thursday April 10: I was in the office 4 hours. I worked on code and met briefly with Todd. Tony has suggested that we look into making Cichlid a web browser plugin. I expect this to be feasible but I am not yet sure.  [Ben Reesman]

Friday April 11: I looked into the web browser plugin issue briefly. I think that it will be feasible as an ActiveX control but probably not as a Netscape plugin.  [Ben Reesman]

Monday April 14: I worked for one hour learning more about networking routines.  [Ben Reesman]

Tuesday April 15: I worked on the transport code.  [Ben Reesman]

Ben Reesman and I talked about Cichlid development plans. We will meet next week to discuss this further. [Ronn Ritke]

Met with Ben Reesman to review his Cichlid plans and timeline. [Ronn Ritke]

Meeting with Ben Reesman to go over Cichlid plans and Tony's question about a Cichlid plugin for a browser. [Ronn Ritke]

Meeting with Ben Reesman and Todd Hansen about Cichlid plans. [Ronn Ritke]

~ ~ ~ Other visualization work

Spent more time working on the mouse manipulation of my wind applet, though the datastream is down so I cannot use actual data at present.   [Ben Reesman]

Arranged for Ben to spiff up the static AMP map, and forwarded to Ben an idea of ours for a spiffy IPMP image. We are waiting for a graph from Matthew; after receiving it, Ben will have all the parts for an interesting collage. [Maureen Curran]

Worked w/Ben and Tony on a cool AMP map (US only, static, close but not precise) to use in print and an IPMP collage image. [Maureen Curran]

Thursday March 18: I worked for one hour on the AMP map image. [Ben Reesman]

Monday March 24: I worked on images for AMP. I am redesigning the map for publication. Several drafts were considered. We have settled on one and now it is time to begin populating it with locations. Preliminary work on logos for AMP.  [Ben Reesman]

Tuesday March 25: I continued to work on logos and also spent more time on the map.  [Ben Reesman]

Reimplemented the amp site map in PHP with proper Mercator projections. This gave different results than the current amp map, which leads me to believe that there is a problem with the rendering, not the site locations.  [Cooper Nelson]
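
For reference, the standard spherical Mercator mapping from latitude and longitude to image coordinates can be sketched as below. This is an illustrative Python version, not the PHP code mentioned above; the function name is hypothetical, and it assumes the map image spans the full 360 degrees of longitude with the equator at mid-height.

```python
import math

def mercator_xy(lat_deg, lon_deg, width, height):
    """Project latitude/longitude (degrees) to pixel (x, y) on a Mercator map.

    Assumes the image covers longitudes -180..180 across its full width and
    that the equator lies at height / 2. y grows downward, as in image space.
    """
    lat = math.radians(lat_deg)
    x = (lon_deg + 180.0) / 360.0 * width
    # Mercator northing on the unit sphere: ln(tan(pi/4 + lat/2)).
    northing = math.log(math.tan(math.pi / 4.0 + lat / 2.0))
    y = height / 2.0 - width * northing / (2.0 * math.pi)
    return x, y
```

The projection is undefined at the poles (latitude of exactly 90 or -90 degrees), so site latitudes should be kept strictly inside that range.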

While I was working on this I put together a little tool for Todd to render a dot for any lat-long based on HTTP parameters.  Try it at http://mercali.hpwren.ucsd.edu/~coop/globe.php?lat=0&long=0  [Cooper Nelson]

Spent a lot of time chasing down some tricky bugs with my new map rendering code.  Managed to fix them all, currently busy updating and validating the locations of all the amp sites.  Preliminary version available @ http://mercali.hpwren.ucsd.edu/~coop/ampviz.php.  [Cooper Nelson]

Reviewed first test of Amp viz project with the amp group.  Collected lots of good ideas, including creating a dynamic zoomable map and using transparent overlays for site performance.  Investigated using tables for the overlays and javascript imagemaps for site info pop-ups. [Cooper Nelson]

Finished implementation of overlay based rendering of amp sites over static map.  Worked on performance analysis of RTT data and javascript for pop-ups of site info.  [Cooper Nelson]

Implemented prototypes of map resizing and site performance code on dev. box due to problems with development on amp.  Will integrate with amp code next week.  Found a bug with my overlay map: for some reason it doesn't line up correctly. Currently working on a fix.  Talked with Ben about collaborating on amp map based on my code.  I showed him what I have done to date and he is currently investigating using PHP generated javascript for site info rollovers.  [Cooper Nelson]

I also worked with Cooper on learning PHP imaging in order to create a better looking and more sophisticated AMP map.  [Ben Reesman]

Finished modifying realtime amp performance map to render site connectivity performance via colors.  So far I've gotten interesting results: the current performance was almost always faster than the previous day's average performance (at least when I looked at it in the afternoon, PST), which leads me to think that either short periods of high latency are skewing the averages, or there is a day/night cycle for RTT.  [Cooper Nelson]
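
The first hypothesis above, that brief latency spikes pull the daily average up, can be checked by comparing the mean with the median, which is much less sensitive to outliers. The sketch below uses made-up RTT values purely for illustration; the function name is hypothetical.

```python
import statistics

def summarize_rtt(rtts_ms):
    """Return (mean, median) of a list of round-trip times in milliseconds."""
    return statistics.mean(rtts_ms), statistics.median(rtts_ms)

# Illustrative data only: nine typical samples and one short spike.
example = [40] * 9 + [400]
mean_rtt, median_rtt = summarize_rtt(example)
```

With these illustrative values the single spike raises the mean well above the typical sample while the median stays at the typical value. If a day's median is close to the current RTT while the mean is not, spikes are the likelier explanation; if both are well above the current value, a day/night cycle is more plausible.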

I scratched my dynamic resizing code (too much overhead) in favor of using precalculated maps of the US and world. I'm working on integrating these maps with the code on amp and will finish next week.   [Cooper Nelson]

Fixed lat-lon information for most of the amp sites, only a few international ones missing now.  Investigated installing php on amp, it looks straightforward, will take care of early next week.   [Cooper Nelson]

Finished cichlid visualization of amp site activity with Ben's help.  [Cooper Nelson]

Had a productive discussion with Maureen regarding my PMA viz project.  I will work with her next week to make my project more palatable to end users. [Cooper Nelson]

Met with Cooper and Ben about Ben doing some Cichlid images of the data that Cooper's working on.   [Maureen Curran]

Met with Ben to discuss some potential viz methods for trace activity project.  Settled on using Cichlid, so I started putting some scripts together to generate some data files Ben could use to generate animations.   [Cooper Nelson]

Wrote scripts to collate PMA site activity by analyzing file size of trace downloads.  Forwarded results to Ben R. for graphing via Cichlid.  [Cooper Nelson]

Talked with some of the students about their PMA projects to see if there is anything we can work together on.  Ben is done with finals so he is resuming work on the Cichlid viz. of my trace analysis.   [Cooper Nelson]

Worked with Ben R. to create cichlid visualization of PMA site activity.    [Cooper Nelson]

Met with Joerg to discuss formatting of my web stats and site activity web projects.   [Cooper Nelson]

Met with Joerg to discuss various PMA projects.  I now have detailed examples of page layout for trace download viz.  Worked on modifying existing code to do this.  [Cooper Nelson]

Modified my log analysis code to populate an RRD database. Still working on the graphing.  [Cooper Nelson]

It turns out the root of my data graphing problems was a mistake I made in how I thought data was stored in an RRD database.  For counters like http and ftp transactions, storing timestamps is useless without saving totals as well; otherwise all averages will reduce to 1.  I am going to have to write new code to compute totals for each day and store that as one datapoint, then graph that.  I'm currently in the process of doing this and hope to finish next week.   [Cooper Nelson]
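
The daily-total step described above can be sketched as follows: transaction timestamps are reduced to one count per day, and each count is then stored as a single data point. This is a minimal illustration rather than the actual log analysis code; the function name is hypothetical, and it assumes the timestamps are Unix times and that days are taken in UTC.

```python
from collections import Counter
from datetime import datetime, timezone

def daily_totals(timestamps):
    """Count transactions per UTC day from a list of Unix timestamps.

    Returns a list of (date, count) pairs sorted by date.
    """
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).date()
        for ts in timestamps
    )
    return sorted(counts.items())
```

Each resulting pair supplies one value per day to store and graph, so the graph shows transaction volume rather than a constant.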

Cooper Nelson showed me the AMP visualizations that he has been working on. The AMP sites appear to map correctly to the US sites. [Ronn Ritke]

Cooper Nelson showed me the latest AMP site visualization and some of the PMA stats he has displayed on a web page. Bud and I gave him some additional information on the AMP site locations. [Ronn Ritke]

Cooper let me know that a list of AMP site short names and long names (the full site name) would be helpful. I provided a list of over 100 AMP site long names. The US map he is using for the AMP visualization is very nice. A drop down menu lists the AMP sites, and a "select all sites" option is in place. [Ronn Ritke]

Met with Cooper. He showed me the current AMP map that displays the router hops. [Ronn Ritke]

~ ~ ~

Thursday March 27: I worked on the PMA visualization preliminaries.  [Ben Reesman]

Friday March 28: I continued to work on the PMA visualization. It is nearly workable.  [Ben Reesman]

April 2nd: I worked on the visualization briefly. I discovered that the way I had previously structured the program didn't work adequately, and decided to rewrite it.  [Ben Reesman]

April 3rd: I rebuilt the entire visualization program from the top down today. I wrote the whole thing from scratch and got it working with the data in time to show Cooper. He told me he would rewrite the data script to sort the sites before they were displayed.  [Ben Reesman]

April 4th: Met briefly with Ronn to show him the progress of the visualization that I created for PMA filesizes.  [Ben Reesman]

Thursday April 17: I worked on the network code and also learned more PHP.   [Ben Reesman]

~ ~ ~

Worked on the documentation for construction of the Ohm Server. [Jim Hale]

~ ~ ~ Reports

We were working on "nuggets" contributions for NSF's GPRA. Tony, Joerg, Maureen, and Ronn sent me materials for NLANR that I collated and submitted to NSF. [Hans-Werner Braun]

The first part of the week was mostly taken up with admin jobs.  I read through the documentation on the GPRA nuggets and put together a couple of AMP related outlines, along with suggested images.   [Tony McGregor]

I spent last week preparing a large portion of text for the nuggets contribution that had been requested by Greg Monaco from NSF. [Jörg Micheel]

Worked with Ronn re nuggets for NSF.  Mid-week Ronn forwarded the NSF guidelines/instructions for the nuggets.  After reading them, I scrapped the idea of the preliminary report (although I do need that info for the annual report and will be re-tasking those folks who haven't done theirs yet).  (Cheers, one less report to do.) [Maureen Curran]

Wrote a nugget on Cichlid, tying it into our history of strong student contributions (retrospective and prospective), sent to HWB to send with the others.  Helped Tony with the images for his AMP nuggets for NSF: sent him a resized version of the Cichlid tristrip.  [Maureen Curran]

Finished the September and October monthlies: finished the condensing of info and mapping to the proper section(s), revising/reordering for logical flow, the html formatting (per templates I previously set up), posted for review, received a few comments, made changes accordingly, posted to Web site, updated the Reports page, sent to Greg Monaco at NSF, created the PDF (and sent that as well). [Maureen Curran]

Am working primarily on catching up with the reports and made significant progress on the next round, which are the November and December monthlies.  Received comments from Tony on the last two; will use in the quarterly for that period. [Maureen Curran]

Mostly worked on reports this week, primarily on the November and December monthlies.  Also did some work on a possible preliminary report with regard to our activities January through now.  Used the todo task system to assign it to everyone.  As a result of doing this, I came up with some ideas that I forwarded to Tony for Perry to make this kind of "mass-tasking" easier (than filling in the same info 15+ times, once for each person).  Double checked the cooperative agreement to be sure a program plan is required to be submitted with the annual report (yes); sent out a reminder email to masg. [Maureen Curran]

Completed and posted the November and December monthly reports.  Received and incorporated comments; sent email to Greg Monaco.  Also had an interesting email exchange with Tony about the format of the monthlies; plan to develop a suggestion of his for the next one. Began preliminary work on January and February's reports. [Maureen Curran]

Discussed the reports in detail with Ronn, including time lines. He will be doing the final report for the last cooperative agreement and I will continue on with the monthlies, (only one more two-month monthly report to complete in order to be up to date). Went over the FastLane sections with Ronn filling him in on what material I've already included and what needs to be done.  Went over what maps to which section and which just need review and additions, and which sections need to be written.  Sent him some material that I already had set aside to use in the final report. [Maureen Curran]

Very busy week: reports (monthly, final, annual, etc.) and some PAM tasks.  Worked with Ronn re the final report for the last cooperative agreement.  Helped by sending loads of resource material, text, and ideas that I'd compiled.  Went over the various FastLane sections and their requirements.  Many already had text that I'd done earlier, so I went over which needed just a look over and which needed to be written anew, pointed out which resource material mapped to which section. Edited Ronn's nearly final, just missing PMA section, draft of the final report.  For time expediency did just a light editing, making sure MOAT was used vs. MNA and deleting unnecessary detail only when it was easily accomplished.  Wrote a couple paragraphs to use in lieu of some detailed sections. [Maureen Curran]

Made good progress with the January and February monthlies; will be done early next week.  These are the last outstanding monthly reports! [Maureen Curran]

Monthly reports:  these are now up to date, with the completion of the January and February monthlies. I completed, posted for comments, made a change or two, then added to the Reports page, and made a pdf for Greg Monaco (and sent off).  While working on these two monthlies, I continued to make notes on the design and layout for the front page/rolling reports management project that Dave and I are working on.  Dave's been ready and waiting for me for a bit and I'm really sorry that I haven't had the time to get ahead of him, so it's high on my list of things to do. [Maureen Curran]

Thanks to an email conversation with Tony and resultant ideas, the process of the January and February monthlies went much faster (until I got to the formatting).  While keeping the same look in general, I did much less editing on the individual weeklies and used them to form the long details section.  Then, I wrote the highlights section (previously had just done a bulleted list) for the beginning.  I'll be able to use the paragraphs from the highlights section in the quarterlies and annuals.  Even though the previous monthlies didn't look like it, they really had a lot of editing. This became quite apparent when I realized that this two month report is twice as long as the previous two monthers.  I'm really looking forward to getting a prototype going with Dave because while it'll still be a time intensive process, it should also really streamline things and make them much easier. [Maureen Curran]

Read monthly reports and sent comments to Maureen. [Tony McGregor]

Suggested changes to Maureen for the monthly report. [Matthew Luckie]

I also read the Nov/Dec monthly report Maureen circulated and gave her some comments on it and some thoughts on an alternative approach to generating the monthlies.   [Tony McGregor]

Discussed the program plan requirement with Ronn re the annual report.  Put some information together on what's needed (by NSF) in order to help the development of it, which I put into the TODO task (for Tony, Joerg, and Ronn). (Actually put it in twice; there was a bug the first time.  Tony was in the task at the same time as I was and my changes didn't "take"; Tony's asked Perry to look into this.)  Asked for and received HPWREN's information for the program plan. [Maureen Curran]

Began work on the first quarterly (which will become part of the annual).  Received the draft of the program plan from Ronn. Arranged with him to take care of the participants (Mary has input and he has reviewed), collaborations, and pubs sections of FastLane for the annual report. [Maureen Curran]

Had previously put in some additional detail into the TODO re the program plan requirements, but found out that it didn't "take" (a second time) so I put it back in and emailed Perry (who has since fixed the problem I think).  Worked with Ronn on how to format the slides he received from Tony and Joerg into a program plan. [Maureen Curran]

Discussed with Mary the new coop agreement requirements re budgets relative to the annual reports and what they really wanted (it appears that they are requiring some new financial reports, not needed previously).  She sent an email to NSF inquiring and apparently they aren't sure either.  They will be getting back to her. [Maureen Curran]

Some preparation for the Program Plan for the upcoming Annual Report for the current award. [Ronn Ritke]

Progress on the annual report for the current MNA award - I completed the Program Plan draft and reviewed the Project Participants section.  [Ronn Ritke]

Continued to work on sections of the MNA Annual Report: the Program Plan, the 5 contributions sections, the publications section, and the collaborations section. The Program Plan and collaborations sections are drafts. Text for the other sections is now in Fastlane. [Ronn Ritke]

Met with Maureen about the Program Plan, contributions sections and publications section for the Annual Report. [Ronn Ritke]

I logged onto the NSF Fastlane site to check the current status of the Final Report for the last NLANR Cooperative Agreement. Reviewed some existing text, completed some edits, updated the publications section, and collected information and text for this report. Printouts of the 4 annual reports for this agreement will be used to create text for the Project Activities, Findings and Outreach sections. [Ronn Ritke]

A major focus this week was work on the Final Report for MOAT. A number of report sections were entered (AMP, PMA, BGP, etc.). All sections are now in Fastlane. The next step was for Maureen to do a final review of the 20+ pages. On Friday she sent me 9 of the pages with some edits. Those edits are now complete. Once I have the rest of the pages reviewed I will do those edits and submit the report. [Ronn Ritke]

Met with Neil Cortofana to review the BGP and SNMP text. [Ronn Ritke]

Over last weekend, continued to work with Ronn, helping him with the last couple of parts for the final report from the last cooperative agreement.  Edited the PMA section (had already done the others). Ronn submitted on Monday.

I wrote a draft of the AMP text for the final report (last coop agreement). [Tony McGregor]

Meetings with Ronn Ritke and light editing of several documents.  [Mike Gannis]

I'm also editing Ronn's draft of a HPIIS workshop report. [Mike Gannis]

Met several times with Ronn Ritke for discussions and guidance.  Currently reviewing report on HPIIS activities.   [Mike Gannis]

I've been preparing a write-up on the HPIIS workshop for Ronn Ritke.  [Mike Gannis]

Met with Maureen Curran to go over 2 monthly reports. [Ronn Ritke]

Papers, Publications, Presentations, and Conference/Meeting Participation

~ ~ ~ Presentations and Meetings

~ ~ ~ PAM2003: the Passive and Active Measurement Workshop, April 2003

From Sunday to Tuesday I spent all my time at PAM2003. It is very positive to note that, in our impression, things went really well and to the satisfaction of the audience. The costs for the NLANR team have been substantial, with all the little things that people had to attend to, but the impact to the research community has made it more than worth our while. I don't say this very often, but I am proud of what we achieved with this year's PAM, and it is important to note that having Ronn attend to all the local arrangements was the crucial component in the success of the workshop. Well done, everyone. [Jörg Micheel]

This was the week of the PAM conference at SDSC. It was a very big success and, most of all, it afforded an opportunity to do some face to face talking with many leaders in the network measurement and research community. It is very helpful to be face to face with the people I communicate with by email and phone to elicit the support I need at the remote sites to keep the NLANR infrastructure alive and well. Also it was an opportunity to talk with newer members of the measurement community as well as students. Contributions by students this year were excellent. Besides the members of the general measurement community, it afforded an opportunity for a visit by both Tony McGregor and Joerg Micheel. I was able to spend time with both and discuss many critical issues. Much of the discussion with Tony included his planning as the "international" mesh continues to grow. I was able to talk with Tony and Bruce Morgan of the AARNet about the planning for the AARNet AMP mesh. Also talked with Matt Zekauskas about his planning to install the I2 PMA monitor at his new location in Ann Arbor, MI. [Bud Hale]

This week afforded me the opportunity to participate in the PAM2003 conference. It was very motivating to listen to the presentations and talk to the participants. I got the chance to speak up close with those collaborating with NLANR in our measurement activities. [Jim Hale]

This week I spent time attending the two days of PAM2003. Due to classes I was unable to attend the entire conference but I did catch a number of papers on Monday afternoon and Tuesday morning. [Chris Gross]

PAM2003 took place at the beginning of the week and from all indications was a great success.   [Maureen Curran]

NLANR/MNA hosted PAM2003 Sunday - Tuesday. One goal we set for PAM2003 was to add some support for students and encourage student participation. Those efforts paid off. Of 83 PAM attendees, 40 were students. 12 accepted papers listed a student as the first author. NLANR/MNA and ENDACE provided partial sponsorship for 13 student authors. [Ronn Ritke]

~ ~ ~ 56th IETF Meeting, San Francisco, CA, March 16-21, 2003

This week I attended IETF 56.  I met with David Kessens (nokia IPv6 and very involved with 6bone) and a few other helpful IPv6 folks to discuss methods to construct an IPv6 address list.   [Matthew Luckie]

I attended quite a few working groups. The most interesting were the IPv6 working groups (3 meetings over the course of the week) and the 6bone decommissioning WG, where they seemed to come to consensus that the 4th of April, 2004 (4/4/4) would be the cut-off date for obtaining an IPv6 allocation from the 3ffe:: space, and that the 6th of June, 2006 (6/6/6) would be when the 6bone would be turned off. [Matthew Luckie]

Jon's talk about IPMP to TSVWG (a transport area working group) was kind of a let down.  He basically presented the same content that Tony & I presented at IMW 2001, but without credit.  TSVWG appears to want to come up with a requirements document that specifies the problem that IPMP solves, and then debate protocols. [Matthew Luckie]

I took the microphone to basically say "I'm co-author of the draft from which Jon's is derived.  We support the idea of IPMP, but we want our draft to be considered alongside Jon's as we have a number of technical differences that we believe are very important.  I presume the technical discussion is off-topic at this stage?".  The answer was yes.  I pine for the days when the IETF didn't have a requirements draft phase for new topics. [Matthew Luckie]

~ ~ ~

Attended a presentation by Andrew Odlyzko (Minnesota) in the afternoon; Andrew is visiting here at SDSC. [Jörg Micheel]

I also put together some slides on my vision for AMP for the management group meeting.   [Tony McGregor]

I also began work on my student presentation.  [Ben Reesman]

Wednesday, March 12: Finished and presented my student presentation.  [Ben Reesman]

~ ~ ~ Papers and Publications

"NLANR Holds 'Very Successful' Workshop on Passive and Active Network Measurement" was published in Online, Vol 7 (8), April 16, 2003.  Online is the SDSC/NPACI biweekly newsletter.
  
Finished up article on PAM2003 conference (see http://www.npaci.edu/online/v7.8/pam2003.html).   [Mike Gannis]

I worked with Tony, Joerg and Gail to update both the AMP and PMA posters.  Those arrived and are ready to display at PAM2003. [Ronn Ritke]

Bounced some words around with Ronn and Bud for the updated AMP poster. [Tony McGregor]

I had to write a progress report for my PhD.  I'm getting to the write-up stage now, but still have a lot of implementation/experimentation to do.  I'm using Vern's thesis as a guide to how I should lay out the thesis. [Matthew Luckie]

Collaborations and Student Involvement

Che-nan Yang (Taiwan) is making progress with their local AMP mesh. I sent him the international.list mesh file, which he's promised not to redistribute or use for running other tests.  He has one amplet and a data collector testing to our international mesh.  I still need to add his machine into the mesh from our end. [Tony McGregor]

I've been working with Che-nan Yang from Taiwan this week to get his monitor integrated into our AMP mesh.  He wants the data stored on amp and volt but they own and operate the monitor.  I've set it up in the database and created amp and volt directories and symlinks for the data. Currently the data is arriving as if it were in the HPC mesh but I've given him instructions as to how to change that. Once this stage is stable, I'll add them to the international mesh so that our monitors test to his. [Tony McGregor]

Today I am in Berlin, meeting with Klaus Mochalski (Leipzig) later in the morning, also with a friend of his, to see if there is a chance to find more collaborators on passive measurement research and tool development. [Jörg Micheel]

Between PAM2003 and Easter Saturday completed my travel with visits to Europe and the US East Coast. In Berlin had a good meeting with Klaus Mochalski (Leipzig) and Sven Hessler, a colleague of his, talking about alignments for joint work on passive measurement research. There are some followup actions from here. [Jörg Micheel]

I transferred all the data from January and February (about 60Gb worth) to Teng Fei from the University of Massachusetts. They are: "studying the properties of link delay and packet loss in the global Internet using end-to-end measurements approach.  We attempted to characterize the link delays according to temporal and spatial criteria." [Tony McGregor]

I worked with Heonkyu Park of South Korea to get IPMP going on his FreeBSD 4.3 machine.  He seems to have it all under control and knows what he is doing. [Matthew Luckie]

I had some help from RIPE people to identify the prefixes that they've allocated, and pointers to the two IPv6 prefixes that LACNIC have assigned.  Bill Owens and Joe St Sauver forwarded me the output of "show ipv6 route" on three of their IPv6 routers.  I'm using the output to compare what they see with what is advertised, and to get an idea of how the Internet2 IPv6 TLA has been split amongst universities. [Matthew Luckie]

I sent 3 days of traceroute data (~450Mb) to Anukool Lakhina, a CS PhD student at Boston University.  I'm not sure what he wants the data for at this stage (I've asked him to send us a description of his project) but he seemed very happy to get it! [Tony McGregor]

I wrote a draft of the reply to Dave Lien (U Idaho) about ICMP shaping and passed it on to Matthew who will add stuff about what he's found using IPMP. [Tony McGregor]

I've been talking with three different amp host sites this week. The most interesting talks have been with Dave Lien from the University of Idaho about the possibility of having multiple test streams from a monitor which have different QoS parameters.  I'd like to make progress on that before too long because I can see it being useful in a number of places. [Tony McGregor]

Replied to a few email messages from Che-nan Yang (Taiwan) about them setting up an AMP mesh. [Tony McGregor]

I've been discussing hop counts from AMP data with Xiaoming Zhou (Delft University of Technology, The Netherlands) who has a project in that area. [Tony McGregor]

Kevin Walsh asked me to contact Jim Ferguson from DAST so they could talk about NPACI E2E measurements. [Ronn Ritke]

I arranged a conference call with Jim Ferguson and Kevin Walsh. Kevin is interested in modifying iperf to meet some NPACI site requests for performance measurements. [Ronn Ritke]

On the NLANR call, Jim let me know more about the Global Grid Forum work on common definitions for measurement terms. The conference call with Kevin may result in collaboration on iperf development. [Ronn Ritke]

Called Thomas Ndousse to get further information on the request for slides for the LSN Meeting, to let him know that PAM will preclude me from attending and the proposed meeting next Sun