Server Optimization Help


WotC_Tech
Wed 14th Jan '04, 3:43pm
Hi,

We have a fairly large vB install (54K active members / 1.4 million posts). Performance has declined significantly as our boards have grown. The slowness doesn't always correspond to peak usage hours, so it may not be related to concurrent users.

I would appreciate it if I could get some help in determining the bottlenecks and finding a solution to alleviate the problem.

1. Dedicated server

2. * 2 Servers with same hardware config (1 Linux/Apache/PHP and 1 Linux MySQL)
* Dell PowerEdge 2650 Dual 2.8GHz/512K Cache Xeon
* 2GB DDR 200MHz Memory
* 2 x 36GB 15K RPM RAID 1 for system
* 3 x 36GB 15K RPM RAID 5 for data
* RedHat Linux 7.3. Kernel 2.4.20-7smp
* Apache version 1.3.27
* PHP version 4.3.1 installed as Apache module
* MySQL version 4.0.12
* vBulletin version 2.3.0

3. No innodb

4. MySQL compiled from source. I don't remember which options.

5. Top stats on Apache server:

2:14pm up 204 days, 26 min, 1 user, load average: 2.97, 4.66, 6.18
199 processes: 181 sleeping, 12 running, 6 zombie, 0 stopped
CPU0 states: 61.0% user, 6.1% system, 0.0% nice, 32.2% idle
CPU1 states: 55.1% user, 7.0% system, 0.0% nice, 37.2% idle
Mem: 2064832K av, 1490380K used, 574452K free, 0K shrd, 155548K buff
Swap: 2040244K av, 35140K used, 2005104K free 485228K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
11452 nobody 10 0 7600 6736 2084 S 27.5 0.3 0:01 httpd
22748 nobody 9 0 9276 8428 2180 S 10.7 0.4 24:06 httpd
11326 nobody 9 0 6100 5236 2104 S 8.1 0.2 0:00 httpd
11324 nobody 19 0 6312 5456 2104 R 7.7 0.2 0:00 httpd
21140 nobody 20 0 9004 8140 2124 R 5.9 0.3 49:59 httpd
11385 nobody 20 0 6196 5328 2056 R 5.7 0.2 0:00 httpd
4241 nobody 11 0 8992 8128 2120 S 5.3 0.3 28:33 httpd
32485 nobody 20 0 8940 8016 2104 R 4.1 0.3 55:43 httpd
26430 nobody 18 0 8352 7500 2132 R 4.1 0.3 26:03 httpd
11317 nobody 11 0 6100 5236 2088 S 3.3 0.2 0:00 httpd
24944 nobody 17 0 8632 7752 2116 R 2.7 0.3 30:37 httpd
11439 nobody 10 0 6056 5184 2088 S 2.7 0.2 0:00 httpd
6 root 15 0 0 0 0 SW 2.5 0.0 6510m kscand
11378 nobody 10 0 5524 4652 2072 S 2.5 0.2 0:00 httpd
18730 nobody 9 0 8704 7780 2144 S 2.1 0.3 58:45 httpd
12951 nobody 9 0 8960 8088 2172 S 2.1 0.3 26:29 httpd
22883 nobody 15 0 8780 7948 2192 S 2.1 0.3 2:47 httpd
13933 nobody 9 0 8320 7456 2116 S 1.9 0.3 28:57 httpd
26462 nobody 9 0 8192 7360 2188 S 1.9 0.3 2:26 httpd


Top stats on MySQL server:

9:30am up 107 days, 2:02, 1 user, load average: 0.50, 0.77, 1.07
293 processes: 292 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 1.1% user, 63.2% system, 0.0% nice, 34.1% idle
CPU1 states: 2.0% user, 2.0% system, 0.0% nice, 95.2% idle
Mem: 2064716K av, 2052296K used, 12420K free, 0K shrd, 5372K buff
Swap: 2040244K av, 105784K used, 1934460K free 1755448K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
824 root 25 0 1188 1188 840 R 63.7 0.0 0:03 top
416 mysql 16 0 280M 205M 29368 S 1.0 10.2 0:12 mysqld
31970 mysql 15 0 280M 205M 29368 S 0.7 10.2 2:02 mysqld
417 mysql 15 0 280M 205M 29368 S 0.7 10.2 0:15 mysqld
26155 mysql 15 0 280M 205M 29368 S 0.3 10.2 5:06 mysqld
26930 mysql 15 0 280M 205M 29368 S 0.3 10.2 3:26 mysqld
32406 mysql 15 0 280M 205M 29368 S 0.3 10.2 1:07 mysqld
421 mysql 15 0 280M 205M 29368 S 0.3 10.2 0:13 mysqld
516 mysql 15 0 280M 205M 29368 S 0.3 10.2 0:08 mysqld
587 mysql 15 0 280M 205M 29368 S 0.3 10.2 0:11 mysqld
1 root 15 0 448 408 396 S 0.0 0.0 1:20 init
2 root 0K 0 0 0 0 SW 0.0 0.0 0:00 migration_CPU0
3 root 0K 0 0 0 0 SW 0.0 0.0 0:00 migration_CPU1
4 root 15 0 0 0 0 SW 0.0 0.0 0:03 keventd
5 root 34 19 0 0 0 SWN 0.0 0.0 0:49 ksoftirqd_CPU0
6 root 34 19 0 0 0 SWN 0.0 0.0 0:47 ksoftirqd_CPU1
7 root 15 0 0 0 0 SW 0.0 0.0 19:21 kswapd
8 root 15 0 0 0 0 SW 0.0 0.0 62:44 bdflush
9 root 15 0 0 0 0 SW 0.0 0.0 1:17 kupdated
10 root 25 0 0 0 0 SW 0.0 0.0 0:00 mdrecoveryd
16 root 15 0 0 0 0 SW 0.0 0.0 0:00 aacraid
17 root 25 0 0 0 0 SW 0.0 0.0 0:00 scsi_eh_0
20 root 15 0 0 0 0 SW 0.0 0.0 1:34 kjournald


6. /etc/my.cnf

[client]
port = 3306
socket = /data/mysql.sock
[mysqld]
port = 3306
socket = /data/mysql.sock
skip-innodb
skip-locking
max_connections=650
key_buffer=16M
max_allowed_packet=16M
table_cache=1024
join_buffer=2M
sort_buffer_size=2M
read_buffer_size=2M
myisam_sort_buffer_size=64M
thread_cache_size=256
wait_timeout=14400
connect_timeout=10
query_cache_limit = 2M
query_cache_size = 32M
query_cache_type = 1
[mysqldump]
quick
max_allowed_packet=16M
[isamchk]
key_buffer=128M
sort_buffer=128M
read_buffer=2M
write_buffer=2M
[myisamchk]
key_buffer=128M
sort_buffer=128M
read_buffer=2M
[mysqlhotcopy]
interactive-timeout


7. mysqlinfo.php output on web server:

Wed Jan 14 14:21:28 EST 2004

2:21pm up 204 days, 33 min, 1 user, load average: 1.50, 2.12, 4.41
137 processes: 129 sleeping, 3 running, 5 zombie, 0 stopped
CPU0 states: 78.0% user, 17.0% system, 0.0% nice, 3.0% idle
CPU1 states: 95.0% user, 4.0% system, 0.0% nice, 0.0% idle
Mem: 2064832K av, 1368584K used, 696248K free, 0K shrd, 155616K buff
Swap: 2040244K av, 35140K used, 2005104K free 485384K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
21642 nobody 14 0 10856 9976 2208 R 54.0 0.4 54:23 httpd
8830 nobody 9 0 9068 8204 2212 R 8.8 0.3 27:43 httpd
12014 nobody 10 0 1072 1072 828 R 0.9 0.0 0:00 top

Http processes currently running = 99
Mysql processes currently running = 2

Netstat information summary
1 FIN_WAIT2
6 CLOSE_WAIT
7 LISTEN
11 SYN_RECV
13 FIN_WAIT1
50 ESTABLISHED
2675 TIME_WAIT

+---------------------------+-----------------+
| Variable_name | Value |
+---------------------------+-----------------+
| Aborted_clients | 155 |
| Aborted_connects | 2 |
| Bytes_received | 4064747740 |
| Bytes_sent | 4171155203 |
| Com_admin_commands | 0 |
| Com_alter_table | 8 |
| Com_analyze | 0 |
| Com_backup_table | 0 |
| Com_begin | 0 |
| Com_change_db | 38203714 |
| Com_change_master | 0 |
| Com_check | 0 |
| Com_commit | 0 |
| Com_create_db | 1 |
| Com_create_function | 0 |
| Com_create_index | 0 |
| Com_create_table | 80 |
| Com_delete | 4138860 |
| Com_delete_multi | 0 |
| Com_drop_db | 1 |
| Com_drop_function | 0 |
| Com_drop_index | 0 |
| Com_drop_table | 40 |
| Com_flush | 0 |
| Com_grant | 2 |
| Com_ha_close | 0 |
| Com_ha_open | 0 |
| Com_ha_read | 0 |
| Com_insert | 19331176 |
| Com_insert_select | 231150 |
| Com_kill | 0 |
| Com_load | 0 |
| Com_load_master_data | 0 |
| Com_load_master_table | 0 |
| Com_lock_tables | 104 |
| Com_optimize | 0 |
| Com_purge | 0 |
| Com_rename_table | 0 |
| Com_repair | 0 |
| Com_replace | 1067866 |
| Com_replace_select | 4196 |
| Com_reset | 0 |
| Com_restore_table | 0 |
| Com_revoke | 0 |
| Com_rollback | 0 |
| Com_select | 330090088 |
| Com_set_option | 2496 |
| Com_show_binlog_events | 0 |
| Com_show_binlogs | 0 |
| Com_show_create | 2496 |
| Com_show_databases | 29 |
| Com_show_fields | 2524 |
| Com_show_grants | 6 |
| Com_show_keys | 10 |
| Com_show_logs | 0 |
| Com_show_master_status | 0 |
| Com_show_new_master | 0 |
| Com_show_open_tables | 0 |
| Com_show_processlist | 1 |
| Com_show_slave_hosts | 0 |
| Com_show_slave_status | 0 |
| Com_show_status | 3 |
| Com_show_innodb_status | 0 |
| Com_show_tables | 214 |
| Com_show_variables | 1 |
| Com_slave_start | 0 |
| Com_slave_stop | 0 |
| Com_truncate | 0 |
| Com_unlock_tables | 0 |
| Com_update | 75538022 |
| Connections | 38203706 |
| Created_tmp_disk_tables | 419937 |
| Created_tmp_tables | 19162064 |
| Created_tmp_files | 114 |
| Delayed_insert_threads | 0 |
| Delayed_writes | 0 |
| Delayed_errors | 0 |
| Flush_commands | 1 |
| Handler_commit | 0 |
| Handler_delete | 29483894 |
| Handler_read_first | 9126061 |
| Handler_read_key | 3242813902 |
| Handler_read_next | 1508159600 |
| Handler_read_prev | 329962039 |
| Handler_read_rnd | 930340180 |
| Handler_read_rnd_next | 3670659208 |
| Handler_rollback | 0 |
| Handler_update | 90873447 |
| Handler_write | 2984830271 |
| Key_blocks_used | 15586 |
| Key_read_requests | 729071866 |
| Key_reads | 334915954 |
| Key_write_requests | 103553476 |
| Key_writes | 99865328 |
| Max_used_connections | 484 |
| Not_flushed_key_blocks | 0 |
| Not_flushed_delayed_rows | 0 |
| Open_tables | 1024 | 100% of table_cache in use
| Open_files | 1058 |
| Open_streams | 0 |
| Opened_tables | 194291 |
| Questions | 981673245 |
| Qcache_queries_in_cache | 6221 |
| Qcache_inserts | 329428688 |
| Qcache_hits | 474856074 |
| Qcache_lowmem_prunes | 14490691 |
| Qcache_not_cached | 662958 |
| Qcache_free_memory | 21267584 |
| Qcache_free_blocks | 3166 |
| Qcache_total_blocks | 15806 |
| Rpl_status | NULL |
| Select_full_join | 3648 |
| Select_full_range_join | 1543472 |
| Select_range | 60243060 |
| Select_range_check | 0 |
| Select_scan | 20276355 |
| Slave_open_temp_tables | 0 |
| Slave_running | OFF |
| Slow_launch_threads | 0 |
| Slow_queries | 14288 | (execution time > 10 secs)
| Sort_merge_passes | 57 |
| Sort_range | 51225003 |
| Sort_rows | 1272238228 |
| Sort_scan | 20193788 |
| Table_locks_immediate | 597909933 |
| Table_locks_waited | 5605761 |
| Threads_cached | 236 |
| Threads_created | 4990 |
| Threads_connected | 21 |
| Threads_running | 4 |
| Uptime | 9251127 | 107 days 1 hr 45 mins 27 secs
+---------------------------+-----------------+

Key Reads/Key Read Requests = 0.459373 (Cache hit = 99.540627%)
Key Writes/Key Write Requests = 0.964384
Connections/second = 4.130 (/hour = 14866.658)
KB received/second = 0.227 (/hour = 816.089)
KB sent/second = 0.227 (/hour = 816.089)
Temporary Tables Created/second = 2.071 (/hour = 7456.760)
Opened Tables/second = 0.021 (/hour = 75.607)
Slow Queries/second = 0.002 (/hour = 5.560)
% of slow queries = 0.001%
Queries/second = 106.114 (/hour = 382010.071)

Output of extended-status from MySQL server:

+--------------------------+------------+
| Variable_name | Value |
+--------------------------+------------+
| Aborted_clients | 155 |
| Aborted_connects | 2 |
| Bytes_received | 4068739274 |
| Bytes_sent | 4249566731 |
| Com_admin_commands | 0 |
| Com_alter_table | 8 |
| Com_analyze | 0 |
| Com_backup_table | 0 |
| Com_begin | 0 |
| Com_change_db | 38204530 |
| Com_change_master | 0 |
| Com_check | 0 |
| Com_commit | 0 |
| Com_create_db | 1 |
| Com_create_function | 0 |
| Com_create_index | 0 |
| Com_create_table | 80 |
| Com_delete | 4138925 |
| Com_delete_multi | 0 |
| Com_drop_db | 1 |
| Com_drop_function | 0 |
| Com_drop_index | 0 |
| Com_drop_table | 40 |
| Com_flush | 0 |
| Com_grant | 2 |
| Com_ha_close | 0 |
| Com_ha_open | 0 |
| Com_ha_read | 0 |
| Com_insert | 19331602 |
| Com_insert_select | 231158 |
| Com_kill | 0 |
| Com_load | 0 |
| Com_load_master_data | 0 |
| Com_load_master_table | 0 |
| Com_lock_tables | 104 |
| Com_optimize | 0 |
| Com_purge | 0 |
| Com_rename_table | 0 |
| Com_repair | 0 |
| Com_replace | 1067893 |
| Com_replace_select | 4196 |
| Com_reset | 0 |
| Com_restore_table | 0 |
| Com_revoke | 0 |
| Com_rollback | 0 |
| Com_select | 330096743 |
| Com_set_option | 2496 |
| Com_show_binlog_events | 0 |
| Com_show_binlogs | 0 |
| Com_show_create | 2496 |
| Com_show_databases | 29 |
| Com_show_fields | 2524 |
| Com_show_grants | 6 |
| Com_show_keys | 10 |
| Com_show_logs | 0 |
| Com_show_master_status | 0 |
| Com_show_new_master | 0 |
| Com_show_open_tables | 0 |
| Com_show_processlist | 1 |
| Com_show_slave_hosts | 0 |
| Com_show_slave_status | 0 |
| Com_show_status | 4 |
| Com_show_innodb_status | 0 |
| Com_show_tables | 214 |
| Com_show_variables | 1 |
| Com_slave_start | 0 |
| Com_slave_stop | 0 |
| Com_truncate | 0 |
| Com_unlock_tables | 0 |
| Com_update | 75539831 |
| Connections | 38204523 |
| Created_tmp_disk_tables | 419941 |
| Created_tmp_tables | 19162439 |
| Created_tmp_files | 114 |
| Delayed_insert_threads | 0 |
| Delayed_writes | 0 |
| Delayed_errors | 0 |
| Flush_commands | 1 |
| Handler_commit | 0 |
| Handler_delete | 29484274 |
| Handler_read_first | 9126248 |
| Handler_read_key | 3242998046 |
| Handler_read_next | 1508906868 |
| Handler_read_prev | 329967645 |
| Handler_read_rnd | 930462872 |
| Handler_read_rnd_next | 3671896141 |
| Handler_rollback | 0 |
| Handler_update | 90875423 |
| Handler_write | 2984902206 |
| Key_blocks_used | 15586 |
| Key_read_requests | 729734036 |
| Key_reads | 334932127 |
| Key_write_requests | 103555273 |
| Key_writes | 99867095 |
| Max_used_connections | 484 |
| Not_flushed_key_blocks | 0 |
| Not_flushed_delayed_rows | 0 |
| Open_tables | 1024 |
| Open_files | 1058 |
| Open_streams | 0 |
| Opened_tables | 194291 |
| Questions | 981693689 |
| Qcache_queries_in_cache | 6373 |
| Qcache_inserts | 329435337 |
| Qcache_hits | 474865887 |
| Qcache_lowmem_prunes | 14490691 |
| Qcache_not_cached | 662964 |
| Qcache_free_memory | 21675504 |
| Qcache_free_blocks | 3191 |
| Qcache_total_blocks | 16093 |
| Rpl_status | NULL |
| Select_full_join | 3648 |
| Select_full_range_join | 1543505 |
| Select_range | 60244354 |
| Select_range_check | 0 |
| Select_scan | 20276723 |
| Slave_open_temp_tables | 0 |
| Slave_running | OFF |
| Slow_launch_threads | 0 |
| Slow_queries | 14288 |
| Sort_merge_passes | 57 |
| Sort_range | 51226146 |
| Sort_rows | 1272370720 |
| Sort_scan | 20194177 |
| Table_locks_immediate | 597922507 |
| Table_locks_waited | 5605896 |
| Threads_cached | 243 |
| Threads_created | 4990 |
| Threads_connected | 14 |
| Threads_running | 1 |
| Uptime | 9251281 |
+--------------------------+------------+


8. vB is the only thing on the server. There is a script which dumps the database nightly for backup.

9. On an average day we peak at about 1000 concurrent users. The most users ever online was 1734.

10. http://boards1.wizards.com/phpinfo.php

11. KeepAlive = Off
MaxKeepAliveRequests = 100
KeepAliveTimeout = 15
MinSpareServers = 50
MaxSpareServers = 10
StartServers = 50
MaxClients = 256

12. vB version 2.3.0


Thanks,
Mike

eva2000
Fri 16th Jan '04, 2:42am
in this order

1. update apache to 1.3.29
2. update mysql to version 4.0.17
3. update php to 4.3.4
4. update vB to 2.3.4

5. now from mysqlinfo.php your

Qcache_lowmem_prunes | 14490691 |


along with your max_used_connections of 484, suggests you're

1. lacking memory
2. probably borderline on the need for splitting from 1 server serving apache + mysql, to 2 servers where 1 is for apache and 1 is for mysql
3. what's your cookie timeout for 1,000 vB users online ?
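The counters behind this diagnosis can be turned into ratios that are easier to compare between snapshots. The sketch below is only an illustration: the values are copied from the extended-status output posted above, the helper name `ratio` is made up, and the query cache hit rate uses the common approximation hits / (hits + Com_select). By this arithmetic roughly 46% of key read requests go to disk, which would put the key cache hit rate near 54% rather than the 99.5% that mysqlinfo.php prints; treat that as a reading of the numbers, not something stated in the thread.

```python
# Counters copied from the extended-status output posted earlier in the thread.
status = {
    "Key_read_requests": 729734036,
    "Key_reads": 334932127,
    "Com_select": 330096743,
    "Qcache_hits": 474865887,
    "Qcache_lowmem_prunes": 14490691,
    "Table_locks_immediate": 597922507,
    "Table_locks_waited": 5605896,
    "Created_tmp_tables": 19162439,
    "Created_tmp_disk_tables": 419941,
    "Uptime": 9251281,
}


def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Fraction of index block requests that had to be read from disk.
key_miss = ratio(status["Key_reads"], status["Key_read_requests"])
# Approximate share of SELECTs answered from the query cache.
qcache_hit = ratio(status["Qcache_hits"],
                   status["Qcache_hits"] + status["Com_select"])
# Query cache entries evicted for lack of memory, per second of uptime.
prunes_per_sec = ratio(status["Qcache_lowmem_prunes"], status["Uptime"])
# Fraction of table lock requests that had to wait.
lock_wait = ratio(status["Table_locks_waited"],
                  status["Table_locks_waited"] + status["Table_locks_immediate"])
# Fraction of temporary tables that spilled to disk.
tmp_on_disk = ratio(status["Created_tmp_disk_tables"],
                    status["Created_tmp_tables"])

print(f"key cache miss rate:     {key_miss:.1%}")
print(f"query cache hit rate:    {qcache_hit:.1%}")
print(f"lowmem prunes per sec:   {prunes_per_sec:.2f}")
print(f"table lock wait rate:    {lock_wait:.2%}")
print(f"tmp tables on disk:      {tmp_on_disk:.2%}")
```

Running the same calculation on two snapshots taken some minutes apart shows the current rates rather than averages over 107 days of uptime.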

i'd also change your my.cnf to the below to see if it helps


[client]
port = 3306
socket = /data/mysql.sock

[mysqld]
port = 3306
socket = /data/mysql.sock
skip-innodb
skip-locking
max_connections = 800
key_buffer = 16M
myisam_sort_buffer_size = 64M
join_buffer_size = 1M
read_buffer_size = 1M
sort_buffer_size = 2M
table_cache = 1024
thread_cache_size = 64
wait_timeout = 1800
connect_timeout = 10
max_allowed_packet = 16M
max_connect_errors = 10
query_cache_limit = 1M
query_cache_size = 32M
query_cache_type = 1

[mysqld_safe]
open_files_limit = 8192

[mysqldump]
quick
max_allowed_packet=16M

[isamchk]
key_buffer=128M
sort_buffer=128M
read_buffer=2M
write_buffer=2M

[myisamchk]
key_buffer=128M
sort_buffer=128M
read_buffer=2M

[mysqlhotcopy]
interactive-timeout
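As a rough sanity check on settings like these, a worst-case memory bound can be worked out by hand. The sketch below assumes the usual rule of thumb (global buffers plus per-thread buffers times max_connections); which buffers to count and the function name `worst_case_mb` are assumptions for illustration. Per-thread buffers are only allocated when a query needs them, so real usage is normally well below this bound.

```python
def worst_case_mb(global_mb, per_thread_mb, connections):
    """Upper bound in MB: global buffers plus per-thread buffers for every connection."""
    return sum(global_mb.values()) + sum(per_thread_mb.values()) * connections


# Global buffers, identical in both configurations posted in the thread.
global_buffers = {"key_buffer": 16, "query_cache_size": 32}

# Per-thread buffers from the original my.cnf and from the suggested one.
original_threads = {"join_buffer": 2, "sort_buffer_size": 2, "read_buffer_size": 2}
suggested_threads = {"join_buffer_size": 1, "sort_buffer_size": 2, "read_buffer_size": 1}

original_total = worst_case_mb(global_buffers, original_threads, 650)
suggested_total = worst_case_mb(global_buffers, suggested_threads, 800)
# Same suggested buffers, but at the observed Max_used_connections of 484.
at_observed_peak = worst_case_mb(global_buffers, suggested_threads, 484)

print(f"original, 650 connections:  {original_total} MB")
print(f"suggested, 800 connections: {suggested_total} MB")
print(f"suggested, 484 connections: {at_observed_peak} MB")
```

Both theoretical maximums exceed the 2GB in the MySQL server, while the figure at the observed connection peak lands just under it.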



maybe try lowering maxclients in httpd.conf from 256 to 150, 180, or 200 and restart apache after each edit
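A rough memory estimate can help pick between those MaxClients values. The sketch below assumes the unshared memory of one httpd child is about RSS minus SHARE as reported by top, which is a simplification, and uses the figures for the long-running child PID 22748 from the top output posted earlier. The function name is made up for illustration.

```python
def estimate_apache_memory_mb(max_clients, rss_kb, share_kb):
    """Estimate total unshared memory in MB if every child slot is in use."""
    per_child_kb = rss_kb - share_kb  # rough unshared size of one child
    return max_clients * per_child_kb / 1024


# RSS and SHARE (in KB) of PID 22748 from the web server's top output.
RSS_KB = 8428
SHARE_KB = 2180

for max_clients in (256, 200, 180, 150):
    total = estimate_apache_memory_mb(max_clients, RSS_KB, SHARE_KB)
    print(f"MaxClients {max_clients}: about {total:.0f} MB")
```

If the estimate for a given MaxClients approaches the 2GB of physical memory, the box is likely to start swapping under a full load of children.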

WotC_Tech
Fri 16th Jan '04, 4:17pm
Thanks for the pointers. I will schedule upgrades of those components over the next couple of weeks and let you know the results.
along with your max_used_connections of 484, suggests you're

1. lacking memory
2. probably borderline on the need for splitting from 1 server serving apache + mysql, to 2 servers where 1 is for apache and 1 is for mysql
3. what's your cookie timeout for 1,000 vB users online ?

1. There is 2GB in each of the servers. It looks like the Apache server is the one that has CPU load issues, but it shows there is 500MB free. The MySQL server is using all 2GB. I am guessing that MySQL will use all the memory you can give it. Is that a correct assumption?

2. We are already split onto 2 servers. I posted the top outputs for both of the servers. It seems to me that the Apache server is the one with the performance issues.

3. The cookie timeout is set at 1200.


maybe try lowering maxclients in httpd.conf from 256 to 150, 180, or 200 and restart apache after each edit

I was running out of client connections when I had this set at a lower value. I even had to turn off keep-alives in order to keep the number of connections down. Should I lower maxclients, and play with the keep-alive timeout?

Thanks for your help.
-Mike

WotC_Tech
Thu 5th Feb '04, 7:34pm
Well, I've updated Apache, MySQL, PHP, and vB as suggested. I've also ditched PHPaccelerator and replaced it with Turck MMCache 2.4.6.

We are still seeing intermittent high load spikes on the web server. Occasionally the load on the web server will shoot up as high as 50 and not drop below 10 for 15 minutes or so. During the same period, the MySQL server load remains fairly low at around 2 or 3.

I noticed that sometimes during the spikes, Apache server-status reports that there are zero idle servers (MaxClients = 200). It takes a very long time to load a vB page when the server is in this state.
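To catch those moments without watching the page by hand, the machine-readable form of server-status (the `?auto` variant from mod_status) could be polled and the idle count logged. The sketch below only parses such text; the sample string is invented, and the field names `IdleServers` (Apache 1.3) and `IdleWorkers` (Apache 2.x) are stated from memory and should be checked against the real output.

```python
def parse_server_status(text):
    """Parse 'Key: value' lines from mod_status machine-readable output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def idle_servers(fields):
    """Return the idle child count, or None if neither known field is present."""
    for name in ("IdleServers", "IdleWorkers"):
        if name in fields:
            return int(fields[name])
    return None


# Invented sample resembling a saturated server with MaxClients = 200.
sample = """Total Accesses: 123456
BusyServers: 200
IdleServers: 0
Scoreboard: WWWWWWWW"""

fields = parse_server_status(sample)
if idle_servers(fields) == 0:
    print("no idle servers: MaxClients is saturated")
```

Logging this value alongside the load average every minute or so would show whether the spikes begin before or after the idle count reaches zero.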

I enabled MySQL slow query logging, but haven't seen anything show up in the log yet.

I have also had reports that posting on our forums can be painfully slow. I believe this may be due to the email notifications discussed in this thread:
http://www.vbulletin.com/forum/showthread.php?t=50241
Today I modified the configuration of sendmail so that the email is queued instead of PHP having to wait for it to send right away. Hopefully this has some impact on the slow posting.

What else should I be looking at?

alexi
Thu 5th Feb '04, 9:32pm
Since you are still running VB2, have you tried the MySQL search hack? It seems to have made quite a difference for us. If you want to see whether searching is causing the problem, there is a small, simple hack that turns search off when the server hits a certain load. Have you tried the deferred threadviews hack? That also makes quite a difference.

WotC_Tech
Mon 9th Feb '04, 7:47pm
Since you are still running VB2, have you tried the MySQL search hack?
That sounds worthwhile. Does it have an impact on the front end server, or just the database server? Do you happen to have a link to the hack you are talking about?


Have you tried the deferred threadviews hack? That also makes quite a difference.
Yup, we have already applied the deferred threadview hack.

alexi
Sat 14th Feb '04, 10:58pm
http://www.vbulletin.org/forum/showthread.php?t=51716&page=1&pp=15

It seems to help the overall system. People are working on a VB3 version and I am holding off on upgrading to VB3 just because I can't afford to have search down!

Daijoubu
Sun 15th Feb '04, 1:42am
Like I've suggested in a thread below this one, try offloading static files to a lighter HTTP server?
There's TUX, Boa, thttpd, LiteSpeed, etc...
So let Apache handle vB (with keep-alive off) and host the images via one of those servers (and perhaps try keep-alive on...)

mistwang
Sun 22nd Feb '04, 1:45pm
Apache does not scale very well because of its pre-fork architecture. To reach higher scalability with Apache, you would probably have to run a cluster.
However, as Daijoubu suggested, you can try another web server with higher scalability, like Zeus, or our LiteSpeed Web Server. ;)
My suggestion is to replace Apache completely, as the performance of PHP scripting is better on those servers as well.

If you need any help with LiteSpeed, I will be more than happy to help.

George
=====================
http://www.litespeedtech.com/