Summary of Research Activities - Nov. 2003
~ Continuing development of new metrics and real-time analysis for PMA
- The following abstracts on our real-time efforts were submitted to PAM2004:
- A Real-Time Packet Burst Metric. [Klaus Mochalski, Jörg Micheel, Maureen C. Curran (ed.)]
- Design and Implementation of a Scalable Real-Time Network Sensor. [Chris Gross, Maureen C. Curran (ed.), Jörg Micheel]
- Continued development of the real-time tool using MAX this period, refining my code to improve pointer management and make better use of dynamic memory. [Chris Gross]
- Started work again on the Flow engine. Once SDA comes back online at Gigabit speed, I will have two fast monitors to develop my sensor on. [Chris Gross]
~ Progress on the reimplementation of AMP and the development of a new testing architecture
- Working on the IPMP code, including the information exchange. I think it is close to done, although right now I am stuck on an odd bug where sendto complains about invalid argument on some of my packets, even though the arguments appear to be identical. [Tony McGregor]
- I have had a programmer here (Xing Deng) working on putting pathchar (a bandwidth estimation tool) into the new AMP code. Along the way he has come up with quite a few issues and this period I finally managed to get through the mail he has been sending me and check out his problems and suggestions. He has caught a couple of important bugs as well as a few more, which is good. I may ask him to do a complete code review when I have finished. [Tony McGregor]
- I am going to work with Tony on designing and developing new AMP code. [Ben Reesman]
~ IPMP
- I wrote up some proposed experiments to do on the WAND emulation network that are IPMP related so that I can change my ISMA abstract slightly and resubmit to PAM2004. Then worked on producing the data to include in the PAM submission and extended the abstract to include the data. The data is used to empirically show the difficulties of measuring the capacity of a link that is much faster than the link that fed it (e.g., going from a 10 Mbps link to a 100 Mbps link). I have included a brief discussion in the paper about it. This is also good stuff to show at the ISMA workshop, I think. [Matthew Luckie]
- Submitted to PAM2004: Segmentation of Internet Paths for Capacity Estimation.
http://voodoo.cs.waikato.ac.nz/~mjl12/pam2004.pdf [Matthew Luckie, Maureen C. Curran (ed.)]
- I am also a coauthor on another PAM submission "Identifying IPv4/IPv6 Path Differences in the Dual-Stack World". Kenjiro Cho (WIDE) is the primary author on that paper and Bradley Huffaker (CAIDA) is also an author. http://www.csl.sony.co.jp/~kjc/tmp/draft.ps [Matthew Luckie]
- Development work continued on the cross-traffic-from-trace (ctft) generator; fixed up some performance limitations. It is now done, subject to performance tests. [Matthew Luckie]
- Worked with Maureen on the IPMP Internet-Draft; she has reviewed the draft and has many suggestions on things to change that will improve the readability and correctness of the draft. Made the edits and submitted the draft to the IETF Internet-Drafts editor. [Matthew Luckie, Maureen C. Curran (ed.)]
- Matthew submitted the new IPMP draft and it made it into the IETF draft archive. Updated the AMP pages to include it. [Tony McGregor]
- Spent nearly two full days involved in IPMP discussion on the IETF Internet Measurement Research Group (IMRG). That was unexpected but is, all in all, I think a good thing (even though there were some ongoing `issues' with Jon Bennett). A couple of good ideas have come out of the discussion. The IPMP discussion on IMRG petered out the following week. I think Jon Bennett scared everyone off. I had a couple more exchanges with Mark Allman, and noticed that he posted our new draft to the list. [Tony McGregor]
~ A new tool for Path Display is under development. The purpose is to display a path under investigation and the paths from other nearby monitors (in order to localize problem paths).
- I will be writing a program which will help locate and display blocks in the network with a graphic output showing the network topology and similarities in paths between AMP machines. Downloaded a dataset from the AMP Web site and unzipped that. Coded a method to read in the traces from a file, and am now writing a couple classes to handle the time stamps and to contain the path data for later comparison. [Lana Kennedy]
- Lana will be working to produce a new graphic that I think may form the core of our approach to divide and conquer work with AMP. [Tony McGregor]
Submitted to PAM2004: Flow Clustering Using the EM Machine Learning Algorithm. [Tony McGregor, Maureen C. Curran (ed.)]
~ Cichlid 3-D Visualization System Activities
- It was decided that I will work on Cichlid only one day a week for a while, and report that progress directly to Tony for planning and review. At Tony's suggestion I am drafting a number of design documents, particularly high-level design diagrams. I am also trying to break down the process of testing the current code into pieces that can be verified individually. Wrote unit tests for some of the Cichlid code. Tested one of the data structures that is important to Cichlid. [Ben Reesman]
- Progress on the animated display of earthquake related data: Finished the code that generates the vertex arrays and vertex normals with smoothing. Worked on the terrain map rendering code; I am having some trouble getting textures to render properly onto it. I do not quite have this working yet but I think it is close. [Ben Reesman]
~ New (and developing) strategically important measurements & deployments
Followed up on the discussions with Rick Summerhill and I2 people regarding the implementation of AMP sites at the major I2 GigaPops. This implementation had a number of requirements, including the need to power AMPlets from the -48 VDC power source normally found in GigaPops, and to stay within 1 RU of rack space. We were able to locate a source for -48 VDC power supplies that fit the 1 RU chassis just up the coast in San Clemente. [Bud Hale]
Communicated with Chris Small of I2 re the Atlanta I2 GigaPop site, prepared and shipped the first of the AMPlets for the I2 sites. However, Chris needed an 8 ft. length power cable. Jim was able to get the company in San Clemente to ship a new cable immediately. Shortly afterwards, Chris reported he had it installed at the Atlanta location. (As it developed he used the three foot power connection instead of the new eight foot cable.) Site initialization took place (adding the site to the database and system manager and starting data collection); amp-i2at (Internet2, Atlanta) is now up and collecting data. [Bud Hale, Jim Hale]
Bud's outstanding work in getting the first of the Abilene node AMPs done in time (despite supply problems) is also good news in this area. [Tony McGregor]
I am happy to report that the AMP monitor we configured and installed at Supercomputing 2003 (amp-sc03) collected data all week. I wish to thank Kevin Walsh of NOC for his help at SC03 with the AMP monitor. Jim had installed the monitor at SC03 and it was connecting correctly. Jim returned home and I started the startup process on the system manager. However, I allowed an error in the uploaded rc.conf file. I called on Kevin to help me and he corrected the problem and got the monitor back online for me. Matt Zekauskas also offered to help. [Bud Hale, Jim Hale]
The TransPAC connection is moving from the Pacific Northwest (PNW) GigaPop in Seattle to a location in the Los Angeles area. Our AMP monitor at the PNW GigaPop was on the TransPAC connection, therefore it must be moved, or associated with another network. Discussions with Bill Mar and Jan Eveleth of PNW GigaPop led me to Jim Gagliardi and Lisa Erspamer at ESNet. In discussions regarding co-locating the AMP monitor at PNW GigaPop with ESNet, I was asked to provide them with a realistic calculation of the inbound and outbound traffic load of an AMP monitor at a site. My calculations indicate this to be less than 20 kilobits per second assuming a full mesh of 150 AMP sites and an average of 10 hops per path between sites. The plan regarding the new location for the TransPAC connection (Los Angeles area) is to communicate with John Hicks and install an additional AMPlet in Los Angeles to cover the TransPAC connection. [Bud Hale]
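A back-of-the-envelope version of that load estimate can be sketched as follows. Only the 150 sites and the 10 hops per path come from the entry above; the probe size, the one-minute round-trip interval, the ten-minute traceroute interval, and the function name are illustrative assumptions, not the figures actually used in the calculation.

```python
def amp_load_bps(sites=150, hops=10, probe_bytes=100,
                 ping_interval_s=60, trace_interval_s=600):
    """Rough per-monitor traffic estimate in bits per second.

    All timing and size parameters are hypothetical defaults. Replies are
    assumed to be the same size as the probes that triggered them.
    """
    peers = sites - 1                      # full mesh: every other site
    bits = probe_bytes * 8

    ping = peers * bits / ping_interval_s            # one probe per peer
    trace = peers * hops * bits / trace_interval_s   # one probe per hop
    # Of each peer's traceroute, only the final-hop probe reaches this monitor.
    trace_final = peers * bits / trace_interval_s

    # Outbound: own probes plus replies to probes sent by peers.
    outbound = ping + trace + ping + trace_final
    # Inbound: replies to own probes plus probes arriving from peers.
    inbound = ping + trace + ping + trace_final
    return inbound, outbound


inbound, outbound = amp_load_bps()
print(f"inbound {inbound / 1000:.1f} kbit/s, outbound {outbound / 1000:.1f} kbit/s")
```

With these assumed values each direction comes out well under the 20 kilobits per second quoted above; changing the intervals or probe size scales the result linearly.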
The MAX monitor is now the first fully functional OC48MON in some time, and we are just waiting for Chris to finish his experiments before running a longer collection there. We should also have word from Kathy Benninger shortly; she has been in contact with us with a range of questions regarding operation of the new PSC OC48MON (an outcome of our meeting at IMC2003 in Miami in late October). [Jörg Micheel]
The OC48 machine planned for the Pittsburgh Supercomputer Center (PSC) was completed and shipped. [Jim Hale, Bud Hale]
I have ordered the equipment for the new nai-p-sda machine using the new Dag 4.3GE cards (just arrived). I am anxious to get that machine up and installed. I expect that unit to yield some interesting results. [Jim Hale]
~ OC192Mon installation at SC2003
- I worked very hard after Stephen Donnelly's visit to get the two working OC192 monitors connected to the DTF Traffic. Once again if you let it sit for a moment it gets reabsorbed. I had hoped we could collect traces on the OC192 monitors before the machines went to Phoenix (SC2003). Configured the OC192 machines for Phoenix, packed them, and arranged to include them in the SDSC shipment. [Jim Hale]
- Traveled to Phoenix the Friday before SC2003 opened. Candy Adams, the booth facilitator, was extremely helpful. Found the SCinet NOC and introduced myself to Matt Zekauskas and Jon Dugan. The NOC was bustling, with everyone performing very determined tasks. No one had time to spare. Matt immediately went to work determining my needs and setting out to fulfill them. You could tell he was trying to get my equipment installed and get me on my way. I do believe next year we might want to schedule better with him (maybe a day earlier or a day later). I spent a lot of time trying to stay out of his way while he was putting out fires. Let him pick what day we should arrive. [Jim Hale]
- The OC192MONs were installed and connected by Jim, but when I was trying to collect data some other equipment was using the tap and so it took another loop and help from Matt Zekauskas to get things in place and working. Great support by Matt, who double checked the light readings to make the second system work. I started a contiguous trace instantly after having the boxes working, and then did a first check on the data collected. For some reason the payload content does not look like IP data at all. I double checked all the PHY parameters, and the Khatanga looked fine, so I was going to check with Stephen Donnelly to understand whether there is anything wrong with the Xilinx images we are using, or something else. By the time we got around to doing this I had a call from Matt for graceful shutdown of the system in Phoenix. So the only remaining chance is to see if the data on the systems is any good for publishing. We'll check that once they are back in San Diego. [Jörg Micheel]
- Had tremendous concerns about the OC192 Monitors at SC03. I was not informed how long the machines would be disconnected from the network for the Bandwidth Challenge. It looked as though they were reconnected just shortly before the conference ended. Also, after SC2003 I see how difficult it can be to do something as simple as get a monitor and keyboard on a piece of equipment at a collaborating facility. [Jim Hale]
- The OC192c's are back from SC2003 and Jim was quick to put the first one online for me to have a look at the data collected (thanks Jim!). [Jörg Micheel]
- After their return (with the other SDSC equipment), I reconfigured the OC192 machines from the SC03 conference. This will enable Jörg to retrieve the data collected at the conference. However, I was only able to connect one machine at a time; will work with SDSC ENS to get both machines working on the network at the same time. [Jim Hale]
~ IPv6 and IPv6 Scamper
I started doing some analysis on the weekly Scamper traces that William Maton is generating for me. http://voodoo.cs.waikato.ac.nz/~mjl12/ipv6-ryouko/ [Matthew Luckie]
Examined how to efficiently implement PMTU discovery in Scamper, for the purpose of finding tunnels. The original idea to keep a table of links does not seem as if it will work, as the same link may have different routes to it that will be revealed from the host running Scamper. A tree of the paths seen so far seems to be my favoured solution. Then implemented Path MTU discovery in Scamper. I have set out the data structures I am going to use, and written some code to select an appropriate initial MTU using the routing socket. The routing socket is standard on BSDs and MacOSX; sadly, Linux does things differently. [Matthew Luckie]
I exchanged some rather useful emails with Lorenzo Colitti (RIPE) who has implemented path MTU discovery through the RIPE NCC TTM boxes and uses it to find IPv6 tunnels (which is the same idea behind what we are doing). He pointed me at his code and a great technical report they have written on it. [Matthew Luckie]
Michael Swoboda (RIPE NCC) sent me the output from running Scamper on what seems to be all the RIPE TTM boxes that are IPv6'd. He sent me the output from 20 Scamper runs. I wonder if we now have the bulk of the European IPv6 Internet mapped thanks to Michael. [Matthew Luckie]
Also did some planning for a paper I would like to do with some Scamper data. [Matthew Luckie]
~ Papers
Worked on several PAM2004 abstracts (with me contributing to varying degrees). I was rather stunned to notice that they seem to have had about 200 abstracts registered. [Tony McGregor]
Great progress on the papers for PAM2004. Locally, I have been working with Koryn Grant from Endace Technology on a paper which looks at comparing a range of Gigabit network interface cards and their performance with each other. Chris has prepared his version of our work on the remote network sensor application. With Klaus Mochalski in Leipzig we are submitting the results of his work during his stay at SDSC during the summer, concerning fine grained analysis of packet data on the link. [Jörg Micheel]
Talked with Maureen several times about my paper for PAM2004. Wrote, edited, and reviewed the submission, in addition to learning LaTeX (had a bit of trouble with graphics at first). [Chris Gross]
Worked with Klaus, Matthew, and Tony editing their drafts for PAM2004. Worked with Chris on his submission from beginning through to final draft (helping him develop the overall paper structure and nature of individual sections). (This was his first research paper submission.) One of the big issues for everyone was the limitation to just two pages for the abstract. The short abstract requirement apparently led to a huge number of submissions: about 180 submissions were received, and only 30 will be accepted. [Maureen C. Curran]
The following abstracts were submitted to PAM2004:
- Gross, Christopher W., Jörg B. Micheel, and Hans-Werner Braun. Design and Implementation of a Scalable Real-Time Network Sensor.
- Luckie, Matthew and Tony McGregor. Segmentation of Internet Paths for Capacity Estimation.
- McGregor, Anthony J., Perry Lorier, and Mark Hall. Flow Clustering Using the EM Machine Learning Algorithm.
- Mochalski, Klaus and Jörg Micheel. A Real-time Packet Burst Metric.
~ Presentations and Conference/Meeting Participation
SC2003, Phoenix, AZ, Nov. 15-21:
- In the SCinet NOC - Network Operations Center, we installed an AMP machine and two OC192 monitors. <See above for greater detail on deployment activities.> [Jim Hale, Bud Hale]
- Attended SC2003. In preparation, worked with Dave Hart at NSF and Mike Gannis to modify text on NLANR/MNA's activities at SC2003. Helped Jim coordinate the machine installations. Worked with Kevin Walsh (SDSC ENS) on the Bandwidth Challenge (served as a judge). Had a number of meetings and talks with people while there. [Ronn Ritke]
- Assisted Ronn with preparations for SC2003 and with responses to requests for information on NLANR from Dave Hart of NSF. In Phoenix, assisted Ronn, Bud, and Jim with SCinet activities and other arrangements at SC2003. [Mike Gannis]
- 150 copies of the NATimes were sent to Mike in Phoenix for distribution at SC2003; all were picked up.
WIDE-CAIDA-RIPE-NLNET workshop at ISI:
- Gave a talk about the Scamper 'project.' http://voodoo.cs.waikato.ac.nz/~mjl12/caida-wide-workshop-scamper.pdf [Matthew Luckie]
- Attended the workshop; the topics included IPv6, DNS, and BGP measurement. [Tony McGregor]
I found out with brief notice that I had a 10 minute slot at the TSVWG meeting at IETF to talk about IPMP. While it went well, they want a requirements draft before they will take it on. This is rather frustrating because the requirements are pretty basic and do not really need an excessive amount of debate. [Matthew Luckie]
While attending IETF in Minneapolis, sat in on the IPv6 working groups, IPPM, and TSVWG. IEPG was the highlight for me. A number of good things resulted from this IETF meeting. [Matthew Luckie]
I gave a talk to some CAIDA people about the IPMP bandwidth estimation techniques. They provided useful feedback for when I give the talk to the ISMA workshop (in December). I found a bug in my only formula in that paper. I started writing my presentation to give to the CAIDA ISMA Bandwidth Estimation (BEst) workshop. http://voodoo.cs.waikato.ac.nz/~mjl12/lastsummer/ [Matthew Luckie]
Had visitors from the University of Waikato, a professor from Finland, and student visitors from Finland and Sweden. I gave them a 90-minute overview presentation on passive measurement research and technology. [Jörg Micheel]
Attended most of the JET meeting. Rick Summerhill was happy to see the first AMP go into the Abilene backbone as part of the Observatory project. [Ronn Ritke]
~ Collaborations and activities supporting network research
Received a note from Pere Barlet from UPC Barcelona that he can join us for two months in Hamilton to work on porting his Netmeter application to other platforms and on performance tuning. http://www.ccaba.upc.es/netmeter/ I had visited Pere and Josep Pereta at the end of August following SIGCOMM2003. We are intending to build a stronger research relationship between CCABA and NLANR. Pere has offered me one of his longer traces at the Catalonian R&D network for publishing via PMA. [Jörg Micheel]
Exchanged several emails with Bill Cleveland from Bell Labs; he is still keen on support for the instrumentation at a major provider on the East Coast. With the support from everyone at Knox Street <Endace>, I have managed to sneak out one of the only two new DAG3.6EP (dual port 10/100s) for instrumentation of a link in New Jersey. [Jörg Micheel]
I have been discussing data with Weidong Chi, a new PhD student at UC Berkeley. He has fetched all our online data (via the Web get interface). He wanted historical data from around the code red time, so I wrote some code to extract that from the HPSS. He also requested to be enabled for the on-demand throughput tests. Continued sending more data during November, including the 5 months' data he requested. [Tony McGregor]
I met George Michaelson, from APNIC. He is very enthusiastic (about most things networking related, it seems). He was volunteering to tcpdump one of APNIC's DNS servers while I kicked off a DNS walk, but we realized that this was not needed because I only talk to the roots for their authoritative (root) zone. He is going to be running Scamper from some interesting locations within APNIC. He also offered suggestions as to how to handle the Jon Bennett situation. He also runs gaim on NetBSD, for which I am the maintainer, and he has some patches he would like to see make it into NetBSD's pkgsrc. [Matthew Luckie]
I participated in the NLANR managers call this week because using AMP data for DAST's advisor system was a major topic. We also discussed the thinking we have been doing about implementing the GGF NMWG XML schema for measurement data. There are still some hurdles for us to get over, but it is clear this is an important direction. [Tony McGregor]
Reviewed a document that John Towns sent out on an upcoming project; met with him to discuss the nextinet project and NLANR collaborations. [Ronn Ritke]
Spoke with Eric Boyd (also Guy Almes and Russ Hobby) from the I2 Pipes project. He would like to meet with Tony at the SDSC measurement workshop in December. [Ronn Ritke]
Several transactions with Ian Pratt regarding support for PAM2004. [Jörg Micheel]
Sent Peter Arzberger some text for the PRAGMA handout for SC2003. [Ronn Ritke]
While in San Diego, Tony and Matthew met with folks at CAIDA to discuss various measurement and analysis activities.
~ Documentation, networked data, publications
Completed and produced a new issue of the Network Analysis Times, the theme of which is our AMP IPv6 efforts. Edited primary articles, wrote fill in ones. Worked with Jim to create an updated AMP IPv6 map with all 13 of the current IPv6 sites to use on the cover of the issue. A huge thank you to Lana for her excellent, detailed work on the layout and other help. FedExed 150 copies to Mike in Phoenix for distribution at SC2003. http://moat.nlanr.net/NATimes/NAT.4.1.pdf [Maureen C. Curran]
We approached Bill Owens and Joe St Sauver about articles for the NA Times that Maureen is writing. They replied with some excellent articles that are great publicity for us / the AMP IPv6 project. At the suggestion of Joe, Maureen also asked Larry Blunt, who sent copies of his IPv6 slides, which she excerpted into a short article. [Matthew Luckie]
I worked on the new AMP splash page and the associated scripts. This project is just about completed, everything looks and works extremely well. The template that Maureen sent me has been integrated with the rest of the code and the look/feel is consistent with the rest of the new Web page design. [Ben Reesman]
I talked over a bunch of things with Ben, but in particular the architecture of the new component based Web pages for the re-implemented AMP central site software. [Tony McGregor]
For the AMP pages, spent some time learning more general PHP because I am going to be using it very extensively in the coming future. My knowledge of PHP to this point has been very narrow, focusing only on the specific tasks that have been useful. I would like very much to simply be generally proficient in the language. Worked with PHP and its capabilities for handling cookies as a means of providing persistent content. I developed several little demos that use PHP to generate different cookies and retrieve information. I also spent time trying different design ideas for the new AMP pages that will allow different users to view AMP data differently based on individual preference. [Ben Reesman]
Worked on code that makes the new AMP pages remember users with cookies and with PHP's session id facility. I wrote new pages to allow users to select which pair of amplets they want to query, using different colored stars for the selection. I made a couple more refinements to existing code. [Ben Reesman]
Worked with Ben on the AMP splash pages, helped him with the html of the internal table index of sites. Cleaned up the html of the site indices, validated it, and checked across browsers. Created a special version of the regular template to be used with the extra large AMP map images. [Maureen C. Curran]
Met with Tony to discuss page designs that I have developed for regular pages and data pages for the PMA Web pages (with helpful input from Klaus). Though similar to ones I showed Tony last May for AMP, they have changed quite a bit (for the better!). As a result of a great suggestion of Tony's, there is a change in page templates: for pages with the full navbar, the two 100 pixel width "feature" images, which used to be together at the bottom/end are now split with one above the navbar links, and one below. (This will be great as not only does it look better, but will really take advantage of the fact that these images will be dynamic, i.e., rotated every two weeks, or sooner, as needs arise.) We also discussed the C preprocess/make file system for making changes to the Web page templates, not just format/style changes, but significant changes in the navbar, for example. Tony is going to write a sample for me to use in learning this. [Maureen C. Curran]
I met with Maureen for a number of reasons, including discussion of an approach that will allow her to create Web pages that use her template without replicating the template within every page. What we plan to do is to write the pages in what we are calling htmlcpp. These files are html but can use the C preprocessor command to #define and #include components. We will also make a make file that runs cpp on the files whenever the appropriate components have changed. [Tony McGregor]
Put together the system for managing the Web page template. In the end I used m4 rather than cpp after a suggestion from someone here at Waikato. I built a sample file and set up one of my pages to use the system (including a global makefile). It was pretty simple and I think it will work well for what we need. [Tony McGregor]
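The template approach described in the two entries above comes down to macro expansion: shared components are kept in one place and pulled into each page at build time. The sketch below is a Python stand-in for that idea only, not the cpp or m4 setup actually used; the file names, the TITLE macro, and the expand function are all hypothetical.

```python
import re

INCLUDE = re.compile(r'\s*#include\s+"([^"]+)"\s*$')
DEFINE = re.compile(r'\s*#define\s+(\w+)\s+(.*)$')


def expand(name, files, defines=None, depth=0):
    """Expand cpp-style #include and #define lines in files[name].

    `files` maps a file name to its text, standing in for files on disk.
    """
    if depth > 10:
        raise RecursionError("include nesting too deep")
    defines = {} if defines is None else defines
    out = []
    for line in files[name].splitlines():
        inc = INCLUDE.match(line)
        dfn = DEFINE.match(line)
        if inc:
            # Pull in the shared component, expanded with the same macros.
            out.append(expand(inc.group(1), files, defines, depth + 1))
        elif dfn:
            defines[dfn.group(1)] = dfn.group(2)
        else:
            # Replace each defined macro name where it appears as a whole word.
            for key, value in defines.items():
                line = re.sub(r'\b%s\b' % re.escape(key),
                              lambda _m, v=value: v, line)
            out.append(line)
    return "\n".join(out)


files = {
    "navbar.inc": "<div>navbar links</div>",
    "page.htmlcpp": '#define TITLE AMP\n#include "navbar.inc"\n<h1>TITLE</h1>',
}
print(expand("page.htmlcpp", files))
```

In a setup like the one described, a makefile would rerun the expansion whenever a shared component changes, so the template only has to be edited in one file.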
Quickly created a new feature image, IPv6, and linked it to the actual page for the navbar. Also added the international link to the current front page. [Maureen C. Curran]
When I posted the final draft of the quarterly, I updated the index page for reports to the latest version of the MNA template and made a small change to the formatting, creating two columns for the current reports. [Maureen C. Curran]
Worked on the PMA pages, including with Jim to get set up with the proper permissions on the PMA server. [Maureen C. Curran]
As reported elsewhere regarding the AMP monitor at Old Dominion U., the PMA machine at ODU was reconnected. It is now working again and collecting traces. That disconnect was a result of the blaster worm and the security concerns. That points up the need for more and better NLANR discussions on our Web pages. I am working on that for the AMP infrastructure. Probably need the same for PMA. [Bud Hale]
~ AMP servers and system disk, upgrades
This period, we had Tony and Matt here for nearly a week. Discussions and planning included the new AMP and VOLT servers, new AMPlet deployments at I2 GigaPops and new and exciting methods in AMP implementation - to name just a few. [Bud Hale]
Jim and I are in the process of testing a new system board for use in the AMPlets. We need to resolve that issue very soon. That is because we will need to acquire AMPlet machines for the I2 locations discussed previously. Since the Pentium III system boards are going out of manufacture, it will be necessary to go to Pentium IV boards. At first there appeared to be some FreeBSD incompatibilities but at this time those items seem to have been resolved. The test is running successfully. We are finalizing the requirements for the new AMP and VOLT servers and expect to have them on purchase early next period. [Bud Hale, Jim Hale]
I met with Bud, including going over the specs for the new AMP and VOLT servers. [Tony McGregor]
For most of the month, the AMP and VOLT data disk fill progressed at a good balance until the upper limit of approximately 91 percent was reached on the AMP server several days before anticipated. This resulted in the am_slave process on the AMP server being halted for a half day while the archive process brought the data disk fill down to a level that would allow the am_slave process to be restarted. As mentioned previously, archiving is required more frequently and the disk fill is reduced less. Of course this is caused by a steady growth in the number of AMP sites causing the volume of data to grow proportionately. This situation emphasized the need to move as quickly as possible on the new replacement servers we are acquiring for AMP and VOLT. [Bud Hale]
~ PMA server, upgrades, changes
Purchased additional hard drives and chassis from Dell to increase the storage for trace collection on the OC192A machine (the original machine we purchased about a year ago). I purchased four 146 GB drives for a total of 584 GB; after pulling them all together in a RAID 0, the array has an available capacity of 573390988 KB with a 110 MB/s write rate and a 124 MB/s read rate. Stephen Donnelly has been a great help this week. [Jim Hale]
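For planning trace collections, the capacity and write rate reported above give a rough upper bound on how long a capture could run at full write speed before the array fills. The sketch below is illustrative only; it assumes the reported KB are 1024-byte units and the MB/s figure uses 10^6 bytes, neither of which the entry states, and the function name is hypothetical.

```python
def fill_time_minutes(capacity_kb, write_mb_per_s,
                      kb_bytes=1024, mb_bytes=10**6):
    """Minutes needed to fill the array when writing at the sustained rate.

    The unit sizes are assumptions; pass mb_bytes=2**20 if the rate was
    measured in binary megabytes.
    """
    capacity_bytes = capacity_kb * kb_bytes
    seconds = capacity_bytes / (write_mb_per_s * mb_bytes)
    return seconds / 60.0


minutes = fill_time_minutes(573390988, 110)
print(f"about {minutes:.0f} minutes of capture at the full write rate")
```

Under these assumptions the array would fill in roughly an hour and a half of continuous writing at the maximum rate.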
Existing measurement sites maintenance and troubleshooting:
A total of 21 remote sites in the NAI infrastructure received attention during this period: 13 have been resolved and the monitors are again collecting data. 8 were still being investigated, or pending site action, at the end of the period. (Outages are considered "open" until the monitor is again collecting data.)
AMP - 15 problem sites: 10 resolved, 5 open
PMA - 6 problem sites: 3 resolved, 3 open
~ AMP machines
The amp site amp-pngs (Pacific Northwest GigaPop, Seattle) is still down due to the UPS issues there. And since the TransPAC router will be moving soon, that site may not come back up until it is relocated to ESnet. <For further information regarding the TransPAC router change, please see the "Activities Extending the NAI" section, above.> [Bud Hale]
Last week I reported site nai-a-aarn (Australia Ed. and Res. Network) with an outage. A power failure had shut down the site. Bruce Morgan got the power back up but when the site came back online the network router was blocking ICMP echo requests. However, Bruce was able to get it corrected shortly. [Bud Hale]
As previously reported, some sites continued to block ICMP echo requests. They were: amp-ksu (Kansas State U), amp-rpi (Rensselaer Poly. Inst.), amp-wayne (Wayne U. in Detroit), amp-odu (Old Dominion U.), amp-nmnu (New Mex. State U), and amp-sdsu (San Diego State U). Working with site techs (Graham Doig, Carlo Musante, Shelia Beilsmith, Dave Hyatt & Byron Hicks, and Skip Austin, respectively) I have been able to have them create and implement a router hole for ICMP echo requests to and from the AMP monitor. Finally, all of the AMP sites blocking ICMP echo requests have been fixed. [Bud Hale]
amp-mit (Mass. Inst. of Tech) has been disconnected for a machine room re-structuring at the site, and the AMP machine seems to have the lowest priority for re-connect. Sent a couple more emails to Jeffery Schiller, the site technician. The monitor is still disconnected and Jeff reports we are low on his priority list; however, I have been assured we have not been forgotten. I am asking him how we could get bumped up on that list. It has been a long time, much of six months, but I am confident it will happen soon and I will be increasing my efforts to get that monitor reconnected. [Bud Hale]
The amp-dartmouth (Dartmouth U) site had an outage this week, due to a failing fan. The site technician, Steve Campbell, reported it was causing a loud scraping sound. However Jim sent them a replacement fan and Steve installed it the next day and the site was back online right away. I will express my appreciation to Steve Campbell for his quick and capable help on that matter. [Bud Hale]
Site amp-ncar went down early in the week. It turns out that severe wind storms in the Boulder area caused widespread power outages. During this time power backup anomalies caused some equipment to be powered down, including the AMP monitor. However, after a time, it was powered back up and is fine now. [Bud Hale]
Site amp-ampath-mia (AMPATH GigaPop in Miami) had an outage caused by a failed machine which will need to be replaced. We prepared and shipped a replacement machine. It is on site and expected to be installed shortly. [Bud Hale]
An outage at amp-umbc (U. of Maryland, Baltimore County) was caused by a widespread power outage due to a wind storm in the DC area. At this time, the amp-umbc site is back up and functional. [Bud Hale]
Site amp-fsu (Florida State U.) turned the AMP monitor there off for a time. When contacted they reported they had an indication that the monitor had been hacked. It appears to be another case where the traceroute function was reported by a firewall as port scans. I worked with the site to carefully check the machine and it was turned back on. [Bud Hale]
Site amp-surf (SURFnet in Amsterdam, Holland) is down. I have asked Wim Biemolt of SURFnet to have his technician in Amsterdam diagnose it. [Bud Hale]
Site amp-columbia (Columbia U. in New York) indicated some outage. That appears transient, but I will be following it. [Bud Hale]
~ PMA machines
The machines for the Front Range GigaPop and Pittsburgh Supercomputing Center shipped this period. Until these two, I had not shipped any machines in whose functionality I had so much confidence. [Jim Hale]
We talked about some of the PMA sites and actions taken to bring them back online. Two of the PMA sites are waiting for GigE cards. [Ronn Ritke, Jörg Micheel]
We lost the connection to the GigE PMA machine at the NCAR (National Center for Atmospheric Research, Boulder) site (nai-p-nca) again. The site technician, Scot Coburn, is investigating to determine if it is a machine problem or a network problem. [Bud Hale]
While in DC this period I arranged a visit to the nai-p-max (Mid-Atlantic Crossroads [MAX] GigaPop) to evaluate the OC48 monitor we have there. I was escorted and assisted by Quang Bock of the MAX group. As indicated previously, the monitor has been detecting and collecting outbound traces. However, it was not able to collect traces on the inbound traffic. Some cursory signal level measurements and some fiber tracing revealed the cause of the problem to be an incorrectly placed attenuator of excessive value. Measurements and calculations showed that, without the attenuator, the signal level to the router would be more acceptable and the Dag card signal input would fall into the center of its range, which is approximately -21 dBm. The incorrectly placed attenuator was a 10 dB attenuator on the source side of the 90/10 attenuator. It was dropping the signal to the Dag4.2 card by 20 dB. We were able to make that change during the daytime, since the path break would be no more than a second. However, another visit was required at midnight the following night to obtain a measurement of the actual signal seen by the network router and switch. During that visit the actual network signal level was measured to be -11.65 dBm inbound and -11.45 dBm outbound. In my analysis that is just about ideal for the network equipment. The level going to the Dag4.2 cards is right at -21 dBm, which is also ideal. During the visits Jim and Chris were on hand at SDSC verifying the monitor's ability to collect the traces. In conclusion, the MAX site is completely functional and appears to be quite stable. [Bud Hale]
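The level arithmetic in the MAX report above can be checked with a short sketch. This is an illustration under stated assumptions, not the procedure used on site: it assumes the 90/10 device behaves as an ideal optical tap that passes 90% of the power to the router and 10% to the Dag card, with no connector or excess loss. All function and variable names are hypothetical; the only measured input is the -11.65 dBm inbound level from the report.

```python
import math


def split_loss_db(fraction):
    """Loss in dB at a port that receives `fraction` of the input power."""
    return -10.0 * math.log10(fraction)


# Measured inbound level at the router, taken from the report.
ROUTER_LEVEL_DBM = -11.65

# Assumption: ideal 90/10 tap (90% to the router, 10% to the Dag card).
THROUGH_LOSS_DB = split_loss_db(0.90)  # about 0.46 dB
TAP_LOSS_DB = split_loss_db(0.10)      # 10 dB

# The misplaced attenuator sat on the source side of the tap,
# so it reduced the level on both output ports.
ATTENUATOR_DB = 10.0

# Estimated level entering the tap, worked back from the router reading.
source_dbm = ROUTER_LEVEL_DBM + THROUGH_LOSS_DB

# Levels with the attenuator removed (the current configuration).
dag_without_dbm = source_dbm - TAP_LOSS_DB

# Levels with the attenuator still in the path (the faulty configuration).
router_with_dbm = source_dbm - ATTENUATOR_DB - THROUGH_LOSS_DB
dag_with_dbm = source_dbm - ATTENUATOR_DB - TAP_LOSS_DB

# Total drop from the source to the Dag card in the faulty configuration.
total_drop_db = ATTENUATOR_DB + TAP_LOSS_DB

print(f"Dag level without attenuator: {dag_without_dbm:.2f} dBm")
print(f"Dag level with attenuator:    {dag_with_dbm:.2f} dBm")
print(f"Router level with attenuator: {router_with_dbm:.2f} dBm")
print(f"Source-to-Dag drop with attenuator: {total_drop_db:.1f} dB")
```

Under these assumptions the Dag input comes out at about -21.2 dBm with the attenuator removed, in line with the reported -21 dBm. The 10 dB attenuator plus the 10 dB tap loss accounts for the 20 dB drop described in the report.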
With Chris Gross's help we had looked pretty extensively at the performance of the MAX machine and concluded that one card was receiving sufficient signal and one was not. Beyond that, the performance of the cards was within expected functionality. I figured there was a pretty good chance that the problem did not lie in our machine. Just in case, Bud did pack two additional Dag cards to take with him. Bud's appointment to gain entry into the GigaPop was on Wednesday morning. Chris and I stood by waiting for his call to read the activity of the cards during his diagnosis. [Jim Hale]
Jim completed and shipped the replacement PMA machine to the Front Range GigaPop in Denver this week. I will be working with Scot Coburn there to get him to the Denver site and get the replacement machine installed. [Bud Hale, Jim Hale]
The nai-p-mem (U. of Memphis) PMA monitor went down this week. The machine would respond to pings but was not reachable with ssh, and traces were not being collected. The problem was corrected when the technician at the site power cycled the machine. [Bud Hale]
This week I was in contact with Matt Grover at U. of Florida, Gainesville. He is now back to working on getting the OC12 PMA machine there back online. [Bud Hale]
~ In addition
I spent a week in the DC area and had a great visit with Kevin Thompson. We talked about some of the new thinking discussed in the last weekly staff meeting when Tony McGregor was in San Diego. This was especially related to the use of the NLANR AMP and PMA infrastructure in the "divide and conquer" methods of network diagnosis. We briefly touched on Tony's ideas about new "path" graphics. Kevin is interested in the new Web page ideas and any new "look and feel" developments. He expressed a desire to be kept up to date on that and anything he can share in the JETnet meetings. [Bud Hale]
Met with Kevin Thompson to review the Quarterly report and go over current NLANR/MNA activities. [Ronn Ritke]
Had a number of NLANR-related discussions with Tony and Jörg. [Hans-Werner Braun]
Met with Tony when he was in town to discuss various things. [Hans-Werner Braun]
Tony and Matthew visited SDSC the first week in November. While here they met with most of us multiple times regarding AMP issues, servers, papers, Web pages, etc. Everyone found their visit to be very useful and productive.
Working with Maureen and Lee Dolan (SDSC HR), I updated Hans-Werner's academic file biography for the UCSD personnel files and converted it into the required (by Jacobs School of Engineering/UCSD) new format (which has several new categories). This process was a good learning experience for me: I gained a better understanding of the differences between refereed and non-refereed conferences, and of the different types of academic works that are produced in the field. [Lana Kennedy]
Weekly NLANR/MNA managers conference calls. [Hans-Werner Braun, Ronn Ritke, Tony McGregor, Jörg Micheel]
- 30 -