Load Testing the New Server (Take 2)

Load testing the new boxes is going well: I've actually flushed out a pair of really bizarre bugs. If you're up for more load testing, feel free to visit the new box, but let me remind you that we're trying to simulate normal activity - just lots of it - so writing a script that tries to submit the same comment a few thousand times in a few minutes just isn't gonna help (for that matter, just reloading the same page doesn't really help either). And thanks to the guy from 209.80.X.X (I'm sure you know who you are) who must have been running something really smart, 'cuz (except for the fact that it was all coming from one IP) it was pulling down like 20 different pages a second! Anyway, the new setup was handling ~3x the usual Slashdot load for a while there. I wanna try to get to 4x and see how that goes. Then we just gotta change the DNS, and there shall be faster Slashdot for everyone!
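As a concrete illustration of what "normal activity - just lots of it" means, here is a minimal sketch of a varied-page load generator, assuming lynx and a Bourne shell; it pulls the front page of the new box once, collects the links that point back at the site, and then cycles through them instead of hammering one URL. The link-extraction pattern is only an approximation.

    #!/bin/sh
    # Sketch only: fetch a spread of pages rather than one URL repeatedly.
    # Assumes lynx is installed; 209.207.224.40 is the new box from this story.
    SITE=http://209.207.224.40/

    # lynx -dump lists every link in a numbered "References" section;
    # keep only the ones that point back at the new box.
    URLS=`lynx -dump "$SITE" | grep 'http://209.207.224.40' | awk '{print $2}'`

    while true
    do
        for u in $URLS
        do
            lynx -dump "$u" > /dev/null 2>&1
            sleep 1    # pace requests roughly like a fast human reader
        done
    done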
This discussion has been archived. No new comments can be posted.

  • Would you be interested in sharing your new webserver configuration along with some load stats (maybe an MRTG traffic graph)? I am curious to see what kind of load you are getting that is prompting you to go to a cluster system.
  • I can't even reach the new site [209.207.224.40]! It responds to ping though.

    Is Slashdot being slashdotted? :-)

  • I tried the new site last night and again a few moments ago, and both times the load times for pages were _very_ slow, often taking minutes for the table portions to show up. Accessing the regular slashdot during these times was as snappy/sluggish as usual, so it wasn't my connection.

    Hopefully this is just a result of over-aggressive stress testing, but I haven't seen any speed benefits from the new site yet.

  • Not to be foolish, but it seems to work fine, so why not go live now? There are lots of times I can't even reach Slashdot because it's being slashdotted.... I, for one, can't wait!
    "I have no respect for a man who can only spell a word one way." - Mark Twain
  • 'Cuz I really hate having to wait a second for the pages to load on my end. I've got better things to do than wait one friggin' second. I want to be able to hit the bookmark, and BLAM, there it is. ;>
    Gorfin
  • Does that person care to comment on how he did his "scripting"?
  • by Blewit ( 3281 )
    Hmm, curiously enough, and slightly off-topic I admit, but the Slashdot code still seems to have the odd bug. This afternoon (31/08/99 1:18 PM BST) Slashdot gave me the following: Error: Illegal division by zero at /home/slashdot/Slash.pm line 1832. Where exactly do we submit bugs? 8)
  • I don't know whether you have the same expression where you come from, but here in Australia we call that attitude "the cowboy mentality" (making do with what you've got).

    Rob obviously ain't no cowboy!

    He plans to do everything possible to do accurate load testing. That's why he said scripts posting the same shit over and over are no good.

    I applaud this approach. There's no way you can accurately simulate a REAL production load, but you can come pretty damn close. That's what he's aiming for.

    What he really needs to do is get a lively flamewar happening over there. Maybe he should post a coupla vi/Open Source/Gnome/RedHat/sendmail vs. emacs/Free Software/KDE/Debian/Qmail flamebait stories over there.

    Now that's load testing!


  • :Will the new Slashdot automatically kill "First comment" postings? :-)

    No, just the posters.
    s/poster/poseur/g

    Oh, and btw...You're not.
  • Hey! Do you have an actual software package that did that abuse? I'd LOVE to get a copy (or buy it) for testing some of my own web sites/web apps.

    I'd REALLY appreciate hearing about it.


    Really.
  • by Anonymous Coward
    I think it might have been me. I have a box with an ATM connection, and the proxy server is on this box. I put up a script to download the site onto all 80 PCs at the same time to test out the ATM connection. Since Slashdot was testing this server, I figured: what better way to test it? Here is the preview: on 45 PCs there was absolutely no lag. On the rest I got an internal server error, and at that point I restarted the script to see whether the problem was on my end or the Slashdot end. The great thing was, Slashdot had a 98.5% successful load rate. I then pushed a random load on all 70 PCs, and there was a 100% success rate. I only reloaded the pages as they finished loading on the assigned PCs. The browsers used were Netscape 4.51 and IE 5.0x; the load times were between 1 and 10 seconds. I disabled the cache and forced a server reload rather than use the cache. I will try the load test again later today, if it is OK with Slashdot.org.
  • While I am not that person, I am just using a simple script on a fat pipe to load pages from all over the new site.

    #!/bin/sh
    # Loop forever, pulling a spread of pages from the new site.
    while true
    do
        lynx -dump http://... > /dev/null
        lynx -dump http://... > /dev/null
        lynx -dump http://... > /dev/null
        # etc.. etc..
    done

    Of course you wouldn't use the same address over and over. You want to pick different articles and links and such throughout the new site.

    I am adding at least a page or two a second by doing this..

    Mike


  • I seem to recall that the current server was installed just 6-9 months ago. I'd be curious to see a plot of the number of pages served per day as a function of time. If the old server only lasted this long before getting beaten to a pulp, I can hardly wait to see what you'll need to upgrade to a year from now :)

  • What I did was make a directory with 20 subdirs in it. Then I did:

    will@servername:/home/will/tmp/slash_test > ls
    st_1 st_11 st_13 st_15 st_17 st_19 st_20 st_4 st_6 st_8
    st_10 st_12 st_14 st_16 st_18 st_2 st_3 st_5 st_7 st_9
    will@servername:/home/will/tmp/slash_test > foreach file ( st_* )
    foreach? cd $file
    foreach? wget -b -r -l 4 http://209.207.224.40
    foreach? cd ..
    foreach? end
    Continuing in background.
    Output will be written to `wget-log.2'.
    Continuing in background.
    will@servername:/home/will/tmp/slash_test > ps ax | grep wget | wc -l
    20
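    For anyone without csh handy, a hypothetical Bourne-shell equivalent of that loop (same idea: one backgrounded recursive wget per subdirectory, with a subshell keeping each cd local):

    #!/bin/sh
    # One recursive wget per subdirectory, each backgrounded by wget's -b.
    for dir in st_*
    do
        ( cd "$dir" && wget -b -r -l 4 http://209.207.224.40 )
    done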
  • Does this mean we'll finally get our stats back??
    It's about damn time! :))

    Floris
  • Somebody asked this question earlier but I don't think there was any response, not even speculation.
  • Now to use this in the same manner as the previous generation.
  • Yes, why not at least link some of the articles to the new hardware? That'll give you incremental load for testing.
  • Ooooh! A "Black Ice" function!

    victim = isFirstPoster(getPoster());
    hang(victim);
    draw(victim);
    quarter(victim);

  • OK, here is some Perl to run against the site... it does not do much except try to load all the links (from the site you are contacting, not external ones) several times. This may get formatted funny (use view source to get around this)... maybe some stats can be added to this...

    #!/usr/bin/perl

    #$site = "209.207.224.40";
    $site = "http://209.207.224.40/";

    $num_of_forks = 0;
    $MAXFORKS = 20;

    # Reap finished children and free up a fork slot.
    $SIG{CHLD} = sub { wait; $num_of_forks--; };

    $me = `basename $0`;
    $me =~ s/\n//;

    if ($ARGV[0] eq "-k") {
        print "Killing all $me processes\n";
        system "kill -9 `ps axuw | grep $me | grep -v -e -k | grep -v grep | awk '{print \$2}'` > /dev/null 2>&1";
        exit 0;
    } elsif ($ARGV[0] eq "-h" || $ARGV[0] eq "-help" || $ARGV[0] eq "--help") {
        print "usage: $0 [-k|-h|-help|--help|url]\n";
        print "\t-k kills a running version of this script\n";
        print "\t-h|-help|--help gives this, duh\n";
        print "\tanything else should be a url\n";
        exit 0;
    } elsif ($ARGV[0] ne "") {
        if ($ARGV[0] =~ /^-/) {
            print "Invalid option: $ARGV[0]\n";
            print "Try $0 -h\n";
            exit 1;
        }
        $site = $ARGV[0];
    }

    # Pull the page once and extract every http link that points back at the site.
    $getslash = "lynx -dump $site | grep -e $site | grep http | awk -Fhttp '{print \"http\"\$2}'";
    $iterations = 5;

    ### HERE IS WHERE WE REALLY GET GOING
    print "getting urls from $site\n";

    @urls = `$getslash`;

    print "starting the good stuff\n";
    print "Running $iterations iterations\n";
    print "Making $MAXFORKS max connections at a time\n";
    while ($iterations) {
        &ProcessURLs;
        $iterations--;
    }

    sub ProcessURLs
    {
        foreach $foo (@urls) {
            unless ($child = fork) {
                $parentpid = getppid();
                &GetUrl($foo);
                exit 0;
            }
            $num_of_forks++;
            while ($num_of_forks > $MAXFORKS) { } # busy-wait until SIGCHLD frees a slot
        }
    }

    sub GetUrl
    {
        local($foo) = @_;

        $foo =~ s/\n//;
        ($foo) = split(/ /, $foo);
        # Skip the site root itself; fetch everything else.
        if ($foo ne $site && $foo ne "$site/") {
            system "lynx -dump \"$foo\" > /dev/null 2>&1";
            print "retrieved $foo\n";
        }
    }
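    A quick usage note (the script name slashload.pl is made up; use whatever you saved it as):

    # default target is the new box; makes 5 passes over its links
    perl slashload.pl

    # or point it at an explicit URL
    perl slashload.pl http://209.207.224.40/

    # kill all running copies
    perl slashload.pl -k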
  • That's not a function of a normal MAC-level switch.

    There are boxes that perform this kind of function (we use 'em), like the Cisco Local Director (works like a modified bridge) and various layer 4 switches too. That's why I was asking. What is Slashdot using?
  • Or in IE5 make the page available offline to a depth of n pages, where n is any suitably large number (but not so large as to overfill your hard disk... :-)

    angelos
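  • The command-line analogue of that IE5 trick is wget's recursive mode with a depth limit (depth 4 here is an arbitrary choice):

    # mirror the new box down to a fixed link depth
    wget -r -l 4 http://209.207.224.40/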
  • Have you looked into using Akamai to host your images? They disperse them over their servers around the world so that they are very near your users. They do charge though...
  • Tried to log in to the new server, and it gave me an error logging in. Is login not yet available on the new server? (This may be a stupid question, I'm not sure.)
  • It's Open Source... fix it yourself and submit the patch to Rob !! ;-)
  • I am very curious about this load balancing. I've recently taken over a sysadmin (yuk)/head programmer (yipee) job, and we've got about 350-450 websites that we would like to use this kind of scheme on. I haven't found any documentation on it though. HELP????

  • I just tried to log in (several times) to the new server, and nothing really happened (perhaps I didn't wait long enough.. darn my impatience)

    just wanted to let you know, and wonder if anyone else was having probs like that...
  • I can see how the signal-to-noise ratio would drop from such articles, but would this traffic really test the system? I have not seen any comments about the different traffic patterns and how they impact the system. Are short "First Post" comments more of a strain than long dissertations? I can see why Rob might not want to talk about the weak areas of the system, but maybe he could suggest areas he wants to see heavily exercised.


    Aside from that, the best an individual can do is use the new system in the same manner as the previous one and hope other people do likewise ;-)

  • A lot of places are using LinuxDirector - very similar (and, actually, much superior) to the Cisco LocalDirector. I believe the web page is down, but you can find links to it on freshmeat, or email me for more information.
    I personally use it for a high-availability proxy cluster. If one goes down, my users never know.
    Comics:
    Sluggy.com [sluggy.com] - Poing!
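    For the poster asking where to start with this: LinuxDirector is the balancer half of the Linux Virtual Server (LVS) project, managed with the ipvsadm tool. A minimal hypothetical setup (all addresses invented for illustration) looks roughly like:

    # 10.0.0.1 is an invented virtual IP; .2 and .3 are invented real servers
    ipvsadm -A -t 10.0.0.1:80 -s rr              # virtual service, round-robin
    ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.2:80 -m  # real server 1, NAT mode
    ipvsadm -a -t 10.0.0.1:80 -r 10.0.0.3:80 -m  # real server 2, NAT mode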
  • I've been able to log in just fine. However, I have run into not being able to log in *at all*. One of Rob's passwd scripts or something. Anyway, I had to create another account...


  • Not him either, but in one window

    while true; do wget -r 209.207.224.40; done

    and the other window, same dir

    while true; do rm -rf 209.207.224.40; sleep 5; done

    Simply to keep from filling my /tmp :)
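    Those two loops can race (the rm may delete files out from under a running wget). A single-loop variant avoids that by cleaning up only between fetches:

    # fetch, then delete, in one loop - no race with a parallel rm
    while true
    do
        wget -r -q 209.207.224.40
        rm -rf 209.207.224.40
    done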
  • I monitored the stats while I downloaded the /. beta a couple of times and saw that it is compressible by more than 10:1. Since that data is compressed by high-performance Symplex compressor equipment (http://www.symplex.com) on the hops between the Net gateway and my place, it may not be usable in real life. But there seems to be boatloads of room to push more data over a given amount of bandwidth. Or, as is more relevant in this case, to further optimize the HTML/graphics to serve (more) clients in less time. Maybe I missed an apparent reason why this data is so compressible. Wasn't there talk of compression support in Netscape a few years back?
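    That 10:1 figure is easy to sanity-check from a shell: grab the page source, then compare its raw size with its gzipped size.

    # rough compressibility check: raw bytes vs. gzipped bytes
    lynx -source http://209.207.224.40/ > page.html
    wc -c < page.html
    gzip -c page.html | wc -c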
  • Actually, the code that's out there for download isn't the code that's running the site. The downloadable code is generations behind, thus it would probably be hard to find the bug. See for yourself at here [slashdot.org].

    .Laz
    --
    My car is orange, my sig is not.
