Slashdot.org News

We Are Experiencing Technical Difficulties 63

So something is blowing up over here. I haven't figured out what yet. Nothing has changed in weeks outside of little niggly changes here and there- we had 3 weeks of almost perfect uptime, yet now suddenly SQL queries are randomly failing all over the place. I'm irritated and sleep deprived and over-caffeinated but still looking- hopefully we'll resolve this soon. In the meantime, hang in there, and you don't need to keep sending me email telling me- believe me, I know. It's all I've been doing since last night.
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    Feel free to post this kind of info earlier Rob... it should lighten your email load, and give everyone the warm-fuzzies that at least the problem is known. :-)

    Good luck!
  • by Anonymous Coward
    Is this why I keep getting "broken pipe" errors and Netscape keeps popping up message boxes saying, "Alert! Could not find decoder or plugin!" or something like that?
  • by Anonymous Coward on Friday February 05, 1999 @10:49AM (#2022576)
    [2:45pm] /home/deicide> /usr/local/mysql/bin/perror 24
    Too many open files


    "perror" gives explanations of MySQL errors. Should've been included
    with your MySQL..

    --Vitaliy.
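
A minimal sketch of the same lookup without the `perror` binary, assuming the "Errcode" numbers in the MySQL log are ordinary operating-system errno values and that the box uses Linux numbering (where 24 is `EMFILE`). The helper name `describe_errno` is made up for illustration.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as "EMFILE";
    # unknown numbers fall back to a placeholder instead of raising.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror asks the C library for its human-readable message.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(24)
    print("Errcode 24:", name, "-", message)
```

The exact message text comes from the C library, so it can differ between systems; the symbolic name is the stable part.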
  • I hope nothing is really blowing up ;)...

    ---
  • Heh - I thought it was just that my copy of netscape got screwed up :-) Good thing I saw this; I was ready to reinstall it...
  • The servers are in a temperature controlled datacenter being fed good clean power. Server cases are pretty dustpuppy free too. Just have to keep looking..
  • Just wonder if Rob could afford the tons of new hardware that'd be necessary to handle the new load (Mysql may have limited features, but it's a speed demon). It'd suck to buy tons of new hardware to give it a try and have it not end up being the problem.
  • Hate to burst your bubble, but the code is available and can be freely modified. Go get it at ftp.slashdot.org/pub/slash/
  • Geez, what are you blind? THE CODE IS OPEN, whether or not it's version 3.
  • Better start watching your back! Rob needs another machine pretty bad! ;)
  • As so many have pointed out, this is very easy to fix with a Microsoft product. A cluster of 8-way Xeon servers running NT, a couple gig of RAM in each server, fibre channel to dual-ported RAID-5 disks, 20 gig of them. NT will stand up to slashdot with that kind of system, no uptime problems.

    Personally I think a Sun Ultra Enterprise 10000 about a quarter full will cost about the same, and be a lot more fun, and hold the load just as well, and it would really chew through DES keys when the next contest is released. To each their own though.

  • Read what I wrote again. I proposed a serious system that would handle the /. load. It would, no doubt. It would also cost upwards of 3/4 million dollars or more. Throw enough hardware at a problem and software doesn't have to be good. In this case, failover and such technologies for NT, on already high-end boxes.

    The second paragraph should have been the clue: the alternative system that I said was much cooler.

    I don't use NT. I know how to make it work if I have to, and I'm well aware that doing so is more expensive than a simple UNIX solution in many cases.

  • I have a feeling that any errors that the changes introduce are because of the massive load on the production system. Testing in development wouldn't help then.
  • Use Junkbuster, and you won't see any of the stupid banner-ads. :)
  • I suspect that what's going on is what I see on
    my site - even though Mysql is kind on resources
    while it runs, I've had it just crash
    randomly, which may or may not take down the rest
    of the system. The only common features of these
    3 or so crashes are that mysql had been running for
    a well-extended period of time (weeks), and that
    it's not related to the mysql load at that time.
  • You probably ought to run that as root if you really want a crash. Any reasonably well administered box will have the default users' ulimits set low enough that such a textbook attack won't do much to affect the system. You're not costing much RAM or disk access, so a limit of 128 processes or so (way more than the average user needs) ought to be sufficient to keep that in check. On my system this would make a slightly noticeable drop in response, and cause the account to be revoked.
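
The per-user process cap described above can be sketched with Python's `resource` module. This is only an illustration, assuming a Unix system that exposes `RLIMIT_NPROC`; the function names and the default of 128 (taken from the comment) are not anything the poster actually ran.

```python
import resource


def capped_nproc(cap=128):
    """Compute a (soft, hard) RLIMIT_NPROC pair with the soft limit at most `cap`."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    # RLIM_INFINITY means "no limit", so it must not take part in min().
    candidates = [cap]
    if soft != resource.RLIM_INFINITY:
        candidates.append(soft)
    if hard != resource.RLIM_INFINITY:
        candidates.append(hard)
    return min(candidates), hard


def apply_nproc_cap(cap=128):
    """Lower this process's soft limit; child processes inherit it."""
    resource.setrlimit(resource.RLIMIT_NPROC, capped_nproc(cap))


if __name__ == "__main__":
    # Only report the values; call apply_nproc_cap() deliberately.
    print("would set RLIMIT_NPROC to", capped_nproc())
```

An unprivileged process can lower its own soft limit but cannot raise the hard limit, which is why only the soft value is changed here.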
  • I think he knew it was a joke. Maybe you should
    go back and read his post. He equates the cost of an NT box that will run Slashdot with a UE10k; I guarantee you Rob doesn't own a UE10k. If he did, Slashdot would not have a key rate of a measly
    511128.99 keys/second.
  • kill $(ps aux | awk '{if($1=="username"){print $2}}')
    kill -9 $(ps aux | awk '{if($1=="username"){print $2}}')
    cp /etc/passwd password.temp ;
    awk 'BEGIN{FS=":"; OFS=":"} {if($1=="username"){$2="*"; $7="/bin/false"} print $0}' <password.temp >/etc/passwd ;

    And for those of you who think I won't be able to run this because of the system load: these fork bombs are only going to get to run 32 instances (probably fewer, because of the shell and login) because of process limits, and I assure you that won't be enough to really hit my system. Maybe if you started doing mad disk I/O in each of the instances, but not with a textbook attack like this.

  • "a Microsoft product. A cluster of 8-way Xeon servers running NT, a couple gig of RAM in each server, fibre channel to dual-ported RAID-5 disks, 20 gig of them."

    . . . and that would play a mean game of freecell too!
  • Maybe I'm behind the curve here, but I downloaded slash 0.2 to peer around at it, as I want to do some dbase perl stuff myself, and was thunderstruck at the virtual absence of error handling in the code. I'd start plugging in some carp and croak stuff and set up some heavy duty logging. Perhaps the code has progressed some since the 0.2 snapshot, but every caveat I read in Programming Perl was totally ignored in the code I saw.
    Mind, I'm a total newbie to perl, and even *I* noticed this.
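
The kind of checking and logging suggested above might look like the following sketch. It is not Slash's code and not Perl: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the `checked_query` helper, the `dbcheck` logger name, and the `stories` table are all hypothetical.

```python
import logging
import sqlite3

log = logging.getLogger("dbcheck")


def checked_query(conn, sql, params=(), fatal=True):
    """Run a query, logging any database error instead of ignoring it.

    With fatal=True the error is re-raised after logging (croak-style);
    with fatal=False the function returns None and carries on (carp-style).
    """
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        log.error("query failed: %s | params=%r | %s", sql, params, exc)
        if fatal:
            raise
        return None


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stories (id INTEGER, title TEXT)")
    # The table below does not exist, so this is logged and returns None.
    print(checked_query(conn, "SELECT * FROM missing", fatal=False))
```

Logging the statement and its parameters alongside the error is what makes randomly failing queries traceable after the fact.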
  • echo 16383 > /proc/sys/kernel/file-max
    echo 32767 > /proc/sys/kernel/inode-max

    or /proc/sys/fs/... if on a 2.2 kernel.
    --
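
Before raising those values it can help to see what the limits currently are. A minimal sketch, assuming a Linux 2.2-or-later layout where the system-wide cap is under `/proc/sys/fs`; the function name `open_file_limits` is made up, and the sketch only reads the limits, it does not change them.

```python
import resource
from pathlib import Path

# Assumption: 2.2+ kernel layout; 2.0 kernels used /proc/sys/kernel instead.
FILE_MAX = Path("/proc/sys/fs/file-max")


def open_file_limits():
    """Return the per-process and system-wide open-file limits."""
    # Per-process limit: what a single mysqld can hold open.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # System-wide cap on open file handles across all processes.
        system_wide = int(FILE_MAX.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        system_wide = None  # not Linux, or /proc is unavailable
    return {"soft": soft, "hard": hard, "system_wide": system_wide}


if __name__ == "__main__":
    print(open_file_limits())
```

"Too many open files" can come from either limit, so both are worth checking before editing `/proc`.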
  • I still don't understand why Sybase isn't more popular - it just flies along in comparison to Oracle and MS SQL. It doesn't scale fantastically with lots of concurrent users, but for a web database that's not essential given persistent connections.

    I thought Sybase had announced plans for an ASE port? I saw that on linuxworld.com.
    --
  • Has errors in forking/threading use... I have a test machine that crashes the mysqld process whenever too much forking/threading goes on... works great in light use (less than 4 threads).

    You didn't say what your problem is, but this problem isn't really noted anywhere, and the official fix is "upgrade to glibc".

    3.21.x works great w/ libc5, and the static 3.22.x rpms are supposed to work fine too. (and obviously 3.22.x runs just fine on glibc).
  • I've had weird problems like that with MySQL... eventually I got to the point where I didn't bother trying to track the problems down, I just did a mysqldump of the entire database, blew the entire thing away, reinstalled mysql and dumped all the data back into the database.

    Worked like a charm. Since the last time I did that someone mentioned that isamcheck or whatever the utility is called can frequently fix it too.

    *shrug* maybe it would work for Slashdot.
  • killall probably won't be fast enough (it can't find a process and kill it in a single atomic operation).

    su -c "kill -9 -1"
    should be quite effective though (untested).

  • Just for laughs, you might want to have a hardware-type check the quality of the power going to all the boxes in the signal chain.
    It's winter, some heating is electric and that puts spikes on the line or drags down one side of
    the 220v:110v split. Ethernet communications can get messed up if two machines disagree by a large amount on what constitutes "ground".
    It might also be time to vacuum out the dust-puppies in the servers.
  • while (!fork()) fork();

    See, this is cool, because the parent process keeps on changing its PID... :)
    ---
  • He said "niggling," not "niggardly." Somewhat of a difference, there.
    ---
  • if anyone knows what "Errcode: 24" is in MySql please email me... I'm getting a lot of them in the error log...
  • Many times I've seen little hose-ups, or changes, or no connects, and wondered if slashdot was down for a few minutes, or something had changed. A status page would be useful. Combine that with comments to report problems. Keep the last 5 days worth of comments (this would be a special case).

    --
  • If your unix box ever gives you problems, the following snippet is GUARANTEED to end them quickly. =)

    #include
    #include
    main(){fork();main();}
  • oops. looks like the HTML parser foobared my includes, they should be unistd.h and stdlib.h
    But then again, none of you are going to try that are you?
  • Since when was mySQL not open source?

    Last time I looked, the source downloads were most certainly available ...

    D
  • just as an office 'will' kind of thing, with hundreds-of-thousands of witnesses:

    if for some reason i die, you can have my box for /.. (heh, ending a sentence with /. doesn't work, eh?)

    okay. anyone else going to donate their boxes for a beowulf-style /. ? heh.

  • Perhaps the hard drive is developing some bad sectors?
