For all three of my regular readers, sorry for the slow updates. I've decided to make the site a little less newsy and more about what I'm learning on a day-to-day basis.
And I've been learning quite a bit, because about three weeks ago, our internet usage spiked incredibly - saturating our T1s out to the internet and slowing traffic down to a crawl.
The chart below shows the traffic flowing into our district from our ISP. Note the sharp increase on Wednesday in week 15 - that's when the trouble started.
I first became aware of the problem when our AP and payroll clerks came to ask why their terminal screens were taking 3-5 seconds to display every letter they typed. I checked the normal things - CPU utilization on my router and ATM switch (they would be high in a virus outbreak), my Fluke One Touch for broadcast storms or excessive errors, and my mail server for signs of spamming. Nothing seemed out of the ordinary - low CPU utilization, no network problems, no spam.
I called the Lead Tech at my ISP to see if he saw anything unusual going on. He checked for P2P traffic, spy/adware traffic and streaming media and found just standard web traffic - his traffic shaping box didn't see anything out of the ordinary. I decided to give it a day or two and see if it settled down on its own - while figuring out what my options were if it didn't.
Monday found the traffic still heavy and me scrambling for solutions. I decided to set up Squid on the SuSE 9 Enterprise Server I had recently built for testing the possibility of migrating my web server to Linux. The install went fairly smoothly with YaST, and I had Squid up and running in a matter of minutes. The only problem was that it comes configured to disallow all connections by default. I just needed to edit the config files and allow the machines in my network to talk to it.
At this point, I decided to install Webmin, as it simplifies this type of task tremendously. Unfortunately, it isn't included in the base SuSE packages, but I was able to download an RPM from the site and get it up and running in no time. After a bit of ACL tweaking, I was able to point my workstation to the proxy and hit the web.
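For anyone wondering what that ACL tweaking boils down to: Squid's access rules (the `acl` and `http_access` directives) are checked in order, and the first rule that matches decides whether a client gets through. Here is a rough Python sketch of that first-match idea. It is not Squid's code or its config syntax, and the 10.0.0.0/8 range, the `RULES` list and the `is_allowed` name are all made up for illustration rather than taken from the real config.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule list, checked top to bottom like a list of access rules.
# The address ranges are assumptions, not the district's real network.
RULES = [
    ("allow", ip_network("10.0.0.0/8")),    # assumed internal network
    ("allow", ip_network("127.0.0.0/8")),   # the proxy host itself
    ("deny", ip_network("0.0.0.0/0")),      # everyone else
]


def is_allowed(client, rules=RULES):
    """Return True if the first rule matching the client address is an allow."""
    addr = ip_address(client)
    for action, network in rules:
        if addr in network:
            return action == "allow"
    # Simplification: nothing matched, so refuse the connection.
    return False


if __name__ == "__main__":
    for client in ("10.1.2.3", "192.0.2.10"):
        print(client, "allowed" if is_allowed(client) else "denied")
```

A default install behaves as if only the final deny rule were present, which is why nothing could connect until an allow rule for the local network was placed ahead of it.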
I used Windows group policies to set a couple of grades to use the proxy Monday afternoon, and let them run Tuesday to make sure the server didn't choke on a fair-sized load. Tuesday also kept me busy setting up the Fluke OptiView on loan from my ISP. Initial tests with the OptiView showed nothing out of the ordinary. I enabled the proxy in all Windows Group Policies later Tuesday afternoon.
I was rewarded with a greatly reduced traffic load on Wednesday (week 16). Telnet traffic was smooth again, and the web lost the slow-as-molasses feeling it had developed. I wasn't quite back to pre-spike levels, but I had seen a lot of streaming video from ESPN, and hoped that blocking it would bring me where I wanted to be. I patted myself on the back and figured all I needed to do was decide whether to use the Linux/Squid combo going forward or pick up a commercial appliance that would do the same.
How wrong I was... The gory details are forthcoming in my next installment, in which we see bandwidth continue to rise and the real culprit unmasked.