Using the top tool - my server overloading
My server is overloading with very few users. When running #top, what do you guys see on your system? Can you post an example of what you see for comparison?
Code:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

Do you see similar values in the VIRT column to the ones I have above (for both NAMED and HTTPD)? Does a typical httpd process take 0.2% of memory on your system as well? I'm hosting single static pages. No PHP. Yes, I know this depends on the server config; I am just looking for examples of what is within normal ranges. |
Quote:
What makes you say your server is overloaded? What does "free -m" show? What is your load average (from uptime or top)? Are you using swap? That is shown in top as well. Also, I always run "top -c" rather than "top". It's much more helpful. :thumbsup |
On a side note, the site in your sig has been suspended.
|
Sometimes when you have high CPU processing on httpd processes and nothing else it could be indicative of poorly structured rewrite rules in one of your .htaccess files.
You may want to take a look at that. |
About every 12 hours, or 18, or whenever, my server gets "stuck". uptime goes to load average: 150.26, 150.37, 151.32 and everything stops. I do an httpd stop, wait, then an httpd start, and it runs fine: the load average goes back to 0.01 or so... but only for another X hours, then it sticks again. Completely random. Could be in 20 minutes, or 18 hours.
I have been looking for it for about a month, and can't find what the deal is. I have shut down MySQL, suspended (almost) all accounts except for the account that gets the most visitors, and it still gets stuck. I have a SINGLE static page, and bam, it still crashes. No PHP, no nothing. I say "stuck" because it is still running, and I can shell into it; it's just that httpd is no longer processing requests, and doesn't flush out the old ones. It's stuck. I have hired 2 admins to help out, but both look at the server and are stumped after about an hour... |
Is that dedicated or shared hosting?
|
So technical for me here. hehh
|
that top output looks fine :2 cents:
Every 12 or 18 hours? Do you have anything set up in cron that is processor intensive? Have you checked the logs for bots? Sometimes they hammer a site and bring it to a crawl. Are you running Apache? What version? What OS? |
Quote:
Quote:
I will shut down all my cron jobs to make sure. I forgot about that! But shutting down MySQL should shut down any cron jobs that use MySQL data. But good idea!

I don't think it's bots or traffic, because if it was just too much traffic, then after restarting Apache it would get stuck straight away again. It happens at odd times.

Here are some interesting notes from when I managed to "catch it in the act" of getting stuck. When it is starting to get stuck, I notice uptime will start to report higher and higher values. During this time I can get the Apache status. Once it's stuck, I can't. But I did manage to catch it in the act, and here is what uptime showed me:

0.12 0.08 0.09
5.12 0.12 0.08
30.2 5.12 0.12
90.0 30.2 5.12
150.1 150.2 149.0 (server is now toast)

In a very short time, it's as if it gets "stuck". Once it's stuck, I shut down Apache, restart it, then everything runs fine for 13 more hours, then it happens again.

lynx --dump localhost/whm-server-status

So, tell me what you make of these Apache status reports:

http://www.photofight.com/xx.txt (started to get stuck)
http://www.photofight.com/xy.txt (a little more stuck)
http://www.photofight.com/xz.txt (now it's really stuck)

Notice the jump halfway down, where the milliseconds for a response becomes 1921913212 ?!?!? Search for 9068 and you will see q2growth.com GET /Printing-Supplies/Printing-Supplies.html HTTP/1.1

Shut down Apache and restarted:

http://www.photofight.com/xq.txt (yay it's not stuck)

If it was simply too much traffic, then restarting Apache would cause it to overload immediately. So it can't be just a matter of too much traffic. There is something fishy and I can't figure it out. I can't work on ANY more websites until I figure this out, since I don't have any money coming in while it's 'stuck' and it is top priority. I love the word stuck. It's the best way to describe it.

Ideas? Notice how requests build up, but don't get out! They just stack on top of each other! |
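For anyone comparing their own numbers: the three values uptime prints are the 1-, 5-, and 15-minute load averages, so a 1-minute value far above the 15-minute one means the load is climbing fast. A rough Python sketch of that check, assuming the usual "load average: a, b, c" text. The function names and the 10x ratio are made up for illustration, not taken from any tool:

```python
import re

def parse_load_averages(uptime_line):
    """Return the (1-min, 5-min, 15-min) load averages from an uptime line."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load average found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())

def is_spiking(averages, ratio=10.0):
    """True when the 1-minute average is at least `ratio` times the 15-minute one.

    The ratio is an arbitrary illustration value, not a recommended threshold.
    """
    one_min, _five_min, fifteen_min = averages
    # Floor the 15-minute value so an idle box (0.00) does not divide the test to nothing.
    return one_min >= ratio * max(fifteen_min, 0.01)
```

With the readings above, (5.12, 0.12, 0.08) counts as spiking, while (150.1, 150.2, 149.0) does not, because by then all three averages have caught up.
|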
why/how is your apache instance serving up both HTTP/1.0 and HTTP/1.1 ???
Everything on srv1.ryd0.com is 1.0 whereas all the rest are 1.1 --edit nm, it's client based |
Hmm... just thinking about possible issues, and this one jumped out... why do you have mysql on the server if all pages are static? Do you run a script that rebuilds pages?
|
What's weird is all your apache requests are pretty much "sending reply"
Normally, you have a big mix of "Reading Request", "Closing Connection", "Waiting for Connection", and just a few "Sending Reply". Yours are all "Sending Reply". Here's the output from one of my servers that is handling ~60 requests/sec to your 7 requests/sec. Code:
Parent Server Generation: 0

In any case, all those "Sending Reply" entries are not normal. |
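To put a number on "all sending reply", the scoreboard string from the Apache status page can be tallied. This is only a sketch: the key below is the legend mod_status prints on its status page, and the two function names are hypothetical helpers, not part of Apache:

```python
from collections import Counter

# Scoreboard key as printed on Apache's mod_status page.
SCOREBOARD_KEY = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}

def summarize_scoreboard(scoreboard):
    """Count workers per state, ignoring whitespace and unknown characters."""
    counts = Counter(ch for ch in scoreboard if ch in SCOREBOARD_KEY)
    return {SCOREBOARD_KEY[ch]: n for ch, n in counts.items()}

def sending_reply_share(scoreboard):
    """Fraction of workers (open slots excluded) that are in Sending Reply."""
    workers = [ch for ch in scoreboard if ch in SCOREBOARD_KEY and ch != "."]
    return workers.count("W") / len(workers) if workers else 0.0
```

A share that creeps toward 1.0 between snapshots would match the "requests build up but don't get out" description.
|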
Quote:
I put up the static page to rule out the PHP too. Basically I removed everything to get it stable, then I would add items and find out what is causing it. Still no luck as even with a single static page, it is still fucking up. I am a good PHP programmer, but a lousy admin. :-( |
It's interesting, I have the same issue as you described. The server load is normally totally OK, around 0.01 or so. But at some point during the day it gets very high and Apache gets stuck. Until I restart it, or let it recover by itself, HTTP pretty much doesn't work at all.
I tried to look into it but couldn't find anything that would cause this. It's really annoying ;( |
Holy shit I see a pattern in my hosts down!!!
4:30am, 12:30pm 8:30pm. Give or take 1 minute!!!! Wow a clue in a clueless world! That means a backup, a cron job. Backup or something. I predict a crash at 8:30 tonight! |
Quote:
Could be a backup, but the thing is that Apache gets overloaded, not some other process; that's the problem. |
Do you have a shitty host? Why aren't they looking into this?
|
Quote:
Under FreeBSD when you make a snapshot of a drive that has a zillion files it can sometimes freeze the OS for a minute or two, you can type at the cmdline but as soon as it needs to access a HD it just waits... I know you're on a completely different OS, just saying that backups are not always 100% seamless. :) |
Quote:
If you need help decoding when stuff is scheduled to run I can help. |
interesting infos
|
For every username listed by
ls /var/spool/cron
run (as root)
crontab -u <username> -l
to see which crontab is the culprit. |
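The same loop can be sketched in Python. It assumes, as the post above does, that each file in /var/spool/cron is named after a user; `crontab_commands` is a made-up helper name:

```python
import os

def crontab_commands(spool_dir="/var/spool/cron"):
    """Build one `crontab -u <username> -l` command per file in the spool directory."""
    usernames = sorted(
        name
        for name in os.listdir(spool_dir)
        if os.path.isfile(os.path.join(spool_dir, name))
    )
    # Each command is an argument list, ready to hand to subprocess.run().
    return [["crontab", "-u", name, "-l"] for name in usernames]
```

Run as root, looping over the result with `subprocess.run(cmd)` prints every user's crontab in turn.
|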
I tell you one thing though, I've learned A LOT about my server while trying to track this down. I have focused my skills on PHP and development, not so much admin, but I am actually getting confident in handling some admin functions now.
|
I recommend setting up a cron job that runs every minute and logs the Apache server status, so you can see which pages/scripts are requested during the peaks. This will help you determine which script is causing the server overload.
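A minimal sketch of such a logger in Python, under a few assumptions: the URL is the one from the lynx command earlier in the thread, the log path is a made-up example, and the page is stored as raw HTML rather than the rendered text that lynx --dump gives:

```python
import time
import urllib.request

# URL taken from the lynx command earlier in the thread; the log path is a made-up example.
STATUS_URL = "http://localhost/whm-server-status"
LOG_PATH = "/root/apache-status.log"

def fetch_status(url=STATUS_URL, timeout=10):
    """Download the raw server-status page (HTML) and return it as a string."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

def append_snapshot(log_path=LOG_PATH, fetch=fetch_status):
    """Append one timestamped snapshot; record the error text if the fetch fails."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    try:
        body = fetch()
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        body = "FETCH FAILED: %s" % exc
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("===== %s =====\n%s\n" % (stamp, body))
```

Calling `append_snapshot()` at the bottom of a script and running that script from a `* * * * *` crontab entry gives one snapshot per minute. A failed fetch is logged too, which marks the minute Apache stopped answering.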
|
Quote:
ps -aux gives me this. Code:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

Server got stuck again this morning at exactly 4:41am. Still didn't find what cron job it is... Code:
# crontab -l |
30 */4 * * * /usr/bin/test -x /scripts/update_db_cache && /scripts/update_db_cache
Comment out this one. It runs at 30 minutes past the hour, every 4 hours :thumbsup Or, run it yourself and see how long it takes to run and if it causes your server to halt. |
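To double-check when a schedule like that fires, the minute and hour fields can be expanded by hand. A small Python sketch that only understands "*", "*/step", and a single number (real cron accepts more); both function names are made up for this example:

```python
def expand_field(field, low, high):
    """Expand a cron field that is '*', '*/step', or a single number."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        return list(range(low, high + 1, int(field[2:])))
    return [int(field)]

def daily_run_times(minute_field, hour_field):
    """Return the HH:MM times at which the minute/hour pair fires each day."""
    minutes = expand_field(minute_field, 0, 59)
    hours = expand_field(hour_field, 0, 23)
    return ["%02d:%02d" % (h, m) for h in hours for m in minutes]
```

`daily_run_times("30", "*/4")` returns `['00:30', '04:30', '08:30', '12:30', '16:30', '20:30']`. The reported outages at 4:30am, 12:30pm and 8:30pm are all in that list, but they are 8 hours apart while this job fires every 4.
|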
choopa would knock this out for u
|
Quote:
I commented out both lines:
30 */4 * * * /usr/bin/test -x /scripts/update_db_cache && /scripts/update_db_cache
45 */8 * * * /usr/bin/test -x /usr/local/cpanel/bin/optimizefs && /usr/local/cpanel
And I will see what happens... |
Or it is possible that the first task takes more than 4 hours to run, so the second run halts the server until the first is done, causing the halt every 8 hours (two runs, with the second waiting for the first to finish).
Hmmm... |
It's unlikely the first task took that long, but it is possible. I didn't check your times properly (I skimmed the thread, so sue me :P ), so it is more than likely the second task that runs every 8 hours which is causing issues.
Let us know how it goes :thumbsup |
Well, that wasn't it. The load average is 150.0 150.0 150.0 at 11:17am... Both cron jobs were removed.
Back to square one. Having a server that doesn't work, with so few visitors sucks. |
Try checking MaxClients in Apache. I just found entries in some of the error logs saying MaxClients has been reached, so try increasing this value. I increased it, and now I'm monitoring to see what happens.
|
The problem seems to be that apache threads are not exiting. They stay in the top of the output buffer.
Here check it out. Here are some outputs as the server is slowly crashing... http://www.photofight.com/xx.txt http://www.photofight.com/xy.txt http://www.photofight.com/xz.txt See how the top thread is never leaving? |
Check if you're getting botted:
netstat -anp | grep 'tcp\|udp' | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n |
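The same tally can be done in Python on saved netstat output. This is a sketch, not a drop-in replacement: it checks the protocol column instead of grepping the whole line, it strips the port the same way `cut -d: -f1` does (so IPv6 addresses come out mangled here as well), and the function name is hypothetical:

```python
from collections import Counter

def connections_per_ip(netstat_output):
    """Count tcp/udp lines per foreign address (column 5, port stripped)."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a tcp/udp row.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        counts[fields[4].split(":")[0]] += 1
    # Ascending by count, like the trailing `sort -n`.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))
```

The busiest remote addresses end up at the bottom of the returned list, same as with the shell pipeline.
|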
Quote:
Are you sure you don't have a rogue js somewhere? I know I can visit a site, and my browser will be forever "receiving item 19 of 20" or something similar. Means some js or remote DNS call is taking forever to complete. But it can bring down a server if that server is having enough hits at that time... |
you need to tweak your apache...
|
What do you have set for the following values and which version of Apache?
MaxRequestsPerChild
KeepAlive
KeepAliveTimeout
MaxKeepAliveRequests |
MaxClients 250
MaxRequestsPerChild 100
KeepAlive On
KeepAliveTimeout 3
MaxKeepAliveRequests (not set) |
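One quick sanity check on MaxClients is to multiply it by the per-process %MEM figure from top. A back-of-the-envelope Python sketch: the 0.2% is the number quoted in the first post, the function name is made up, and since top's %MEM is based on RES (which includes pages shared between httpd children) the result is an upper bound, not a measurement:

```python
def worst_case_mem_percent(max_clients, mem_percent_per_child):
    """Upper-bound %MEM if every allowed httpd child were running at once."""
    # Rounded to one decimal because the %MEM input from top is no more precise.
    return round(max_clients * mem_percent_per_child, 1)
```

With the values above, `worst_case_mem_percent(250, 0.2)` gives 50.0, so even a full set of children would stay at or under roughly half of RAM by this estimate.
|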
Server overloading - that's not good. Good luck.
|