View RSS Feed

IBxAnders

Part 1 vB4mance: Helping communities grow, performance data model changes in vB 4.0

Rating: 12 votes, 4.67 average.
by IBxAnders on Mon 26th Oct '09 at 11:26pm (4976 Views)
The new vBulletin 4.0 boasts a lot of new changes such as CSS-friendly styles, multi-content search and other various new features. The biggest improvement in vBulletin 4.0, however – is well hidden under the hood and it is the actual changes to the database storage mechanism. While there are exceptions, this change improves the way vBulletin sites scale upwards to serve more visitors and more content without sacrificing performance. As a preface to my blog post, allow me to please explain its purpose. I'd like to very simply describe the database processes and the problems that the average forum owner is facing. I'd also like to explain the basic solutions to these problems. Furthermore, I'd like to discuss this in the context of exciting changes coming in vB 4.0. I also further understand the problem, and I realize that this change is not a complete solution to the problems that large forums are facing, but it is a really good and exciting step in the right direction towards helping vBulletin perform better.

Let's talk Simple Database Server Mechanics:
In an average installation, vBulletin typically stores data in only one database, while it uses many different tables inside that database to store different sets of data. For example, user information is stored in a table called “user”, and post text is accordingly stored in a separate table called “post”. When a new thread is added, or a reply to a post is made, a new “row” is created in these database tables to store the additional information. When a visitor wants to read a thread, they click on a link sending the database server a command to query the various database tables and show the specified results. An average active discussion forum could be seeing hundreds of these queries per second.

Defining the General Database Problem and Errors:
As your discussion forum grows, you will notice that your hosting requirements may change drastically. As the user activity on your forum increases, so is the amount of data collected in your database. In turn, as the physical size of the database increases the server uses more and more hardware resources to service each individual request, such as displaying a thread or adding a new reply. When the server becomes overwhelmed with requests it starts taking a longer time to service each query, this starts a chain reaction where even the simplest of queries aren't being serviced; this condition is called “Locked”. The best way to visualize the “Locking” issue is to imagine a road traffic jam. This problematic condition manifests itself via MySQL error messages and general site slowness. It's believed that upgrading to a better web host (ie. Dedicated server) would resolve the performance issues; and while it's generally good advice – the reality is that upgrading to a better host only increases the safety buffer before the errors will occur again. In our example of a traffic jam, a server upgrade will increase the amount of travel lanes – but the traffic jam is still possible, even though more cars are able to travel. This traffic jam condition is not exclusive to vBulletin, but rather to a default implementation of MySQL; let's further examine the problem by talking about the Search function.

Taking a Closer Look at “Search”
Let's drill down and examine the search functionality in vBulletin 3. Earlier, we talked about the “thread” and “post” tables in the forum database. When a forum user executes a search, a query is issued that checks the “post” and “thread” tables for availability of the specified keywords. This process is very intensive and it uses an internal MySQL function called “fulltext” database search. Both the “thread” and “post” tables have an index that speeds up this process, something that resembles a table of contents in a book and it's called a “fulltext index”. While this search process is happening, the database server has much less resources dedicated to serving other queries (reading and especially writing). Depending on server capacity, activity level and the size of your post table, a single simple search can cause the aforementioned traffic jam condition and bring your forum down.

Examining the Search Issue and Solutions
Currently, the most common approach to scaling a vBulletin forum to handle the load is to avoid the traffic jam condition entirely. Lets evaluate the root cause of the problem.

Issues can be found in the mechanism storing and retrieving the data inside the database tables; this storage engine is called “MyISAM”. When a database request is made on a specific table (specifically the “write” query), the MyISAM engine temporarily “locks” the specified table briefly making it unavailable to all other requests. This means that given the typical conditions while it's searching through the larger tables such as “post” all other requests are quickly queuing as well, creating the traffic jam condition, in turn spewing database errors and slowing down the site.

The solution to “table level locking” in MyISAM is as simple as converting the database table engine to type “InnoDB”. At a first glance, InnoDB is very similar to MyISAM; the main difference being that unlike MyISAM engine, InnoDB uses “row level” locking, whereas the specific table is never locked completely and is always available to service all other requests. This means that when an expensive search query is executed and and/or writing occurs, the “post” table does not lock and delay all other requests from successfully completing.

InnoDB - The important difference between vB 3.x and 4.0
The main problem is that InnoDB does not support “fulltext” searching; this means that converting to InnoDB will break the default search engine in vBulletin 3. This is due to InnoDB requiring that “fulltext index” be dropped when the conversion occurs. No full text index, no search; it's as simple as that. Currently, the solution is to use a third-party search engine such as Sphinx. vBulletin 4 data model tackles this problem head on by redesigning the search engine functionality by no longer relying on a fulltext index in the “post” and “thread” tables and giving the flexibility of optimization without sacrificing the search.

Database performance changes in vB 4.0
As I've mentioned in the beginning of my blog post – what's really cool in vBulletin 4.0 is not necessarily apparent to the untrained eye. Previously, vBulletin utilized the fulltext index on both “thread” and “post” tables making optimization and server problem resolution very expensive and complicated. vBulletin 4.0 no longer needs the fulltext index to be present in either of the “thread” or “post” tables. This means that vBulletin 4.0 database can utilize the InnoDB table engine and prevent a lot of common database issues associated with locked database queries. This is great news for small to medium board owners who will be able to resolve some performance overhead problems without having unnecessary expenditures for expensive hosting or complicated third-party search engines.


What else is new?
Another cool update in vBulletin 4.0 are the new indicies that it builds on tables such as thread and post that further optimize database performance. This is especially beneficial to forums with “long” threads where replies could reach well into the thousands.

Stay tuned for a detailed guide to InnoDB converting vBulletin 3.8 and 4.0.
Submit "Part 1 vB4mance: Helping communities grow, performance data model changes in vB 4.0" to Digg Submit "Part 1 vB4mance: Helping communities grow, performance data model changes in vB 4.0" to del.icio.us Submit "Part 1 vB4mance: Helping communities grow, performance data model changes in vB 4.0" to StumbleUpon Submit "Part 1 vB4mance: Helping communities grow, performance data model changes in vB 4.0" to Google Email Blog Entry

Updated Tue 27th Oct '09 at 5:11pm by IBxAnders

Tags
innodb, myisam
Categories
vBulletin

Comments

Page 1 of 3
FirstFirst 1 2 3 ... LastLast
  1. David Grove -
    David Grove's Avatar
    Thanks for the post. (Note- there's an extraneous "n" at the beginning of the post)
  2. deuterium -
    deuterium's Avatar
    Nice post, very informative
  3. Michael Biddle -
    Michael Biddle's Avatar
    Thanks Anders
  4. Alien -
    Alien's Avatar
    Appreciate the details, we're looking forward to the changes...
  5. veenuisthebest -
    veenuisthebest's Avatar
    Thanks for the cool write up. Looking forward to Part 2.
  6. Ryan Ashbrook -
    Ryan Ashbrook's Avatar
    Awesome stuff!
  7. ThorstenA -
    ThorstenA's Avatar
    Great blog post!
  8. Dody -
    Dody's Avatar
    Waiting for the conversion guide
    Well done Anders
  9. nuno -
    nuno's Avatar
  10. Alfa1 -
    Alfa1's Avatar
    How does this relate to large, busy boards? Could you please address this in part 2?
  11. Allan -
    Allan's Avatar
    interesting
  12. Dean C -
    Dean C's Avatar
    You're extremely naive if you think that switching to InnoDB will be the cure for all the performance related problems in vBulletin. Certain tables in vBulletin lend themselves to using MyISAM, but far more needs to be done to the database to help. For one get rid of the comma seperated lists for usergroup memberships, and switch to a properly normalised (and indexed) table for identifying usergroup memberships.

    I don't doubt your knowledge on the subject matter Anders, as you administrate some of the largest vBulletin forums, but could we also have some benchmarks on misc pages within vBulletin pre and post vB4
  13. IBxAnders -
    IBxAnders's Avatar
    Quote Originally Posted by Dean C
    You're extremely naive if you think that switching to InnoDB will be the cure for all the performance related problems in vBulletin. Certain tables in vBulletin lend themselves to using MyISAM, but far more needs to be done to the database to help. For one get rid of the comma seperated lists for usergroup memberships, and switch to a properly normalised (and indexed) table for identifying usergroup memberships.

    I don't doubt your knowledge on the subject matter Anders, as you administrate some of the largest vBulletin forums, but could we also have some benchmarks on misc pages within vBulletin pre and post vB4
    Hey Dean, I don't think I am making a bold claim anywhere in there that 4.0 will be the absolute solution to every problem. Rather, I am describing a change that takes vB 4 into the right direction of improvement.
  14. IBxAnders -
    IBxAnders's Avatar
    Quote Originally Posted by Alfa1
    How does this relate to large, busy boards? Could you please address this in part 2?
    Good question. Will add to the next post. Out of curiosity, what is your definition of "large" and "busy"; how do you personally qualify that?
  15. bahbah -
    bahbah's Avatar
    Thank you for your post Anders. This post has been the most compelling thing I've read/seen for months to make me want to upgrade to 4.x Not because I expect the solutions proposed in the post to be the "silver bullet" but because it's comforting to read that someone is actually doing something about the problems and taking them seriously.
  16. IBxAnders -
    IBxAnders's Avatar
    Quote Originally Posted by bahbah
    Thank you for your post Anders. This post has been the most compelling thing I've read/seen for months to make me want to upgrade to 4.x Not because I expect the solutions proposed in the post to be the "silver bullet" but because it's comforting to read that someone is actually doing something about the problems and taking them seriously.
    Thanks for reading. Additionally, we do run various stress tests on functionality such as search to optimize and improve the software. Other knowledge is derived from testing and tuning InternetBrands' own big-boards, for example - Honda-Tech, that has 36,000,000 posts and thousands of users online. That knowledge is starting to see its way into vBulletin software with the upcoming releases, starting with 4.0.
  17. dutchbb -
    dutchbb's Avatar
    Nice blog post, we need more of these, I like to read that stuff
  18. ---MAD--- -
    ---MAD---'s Avatar
    Very nice blog post, your examples should help those that don't understand how mysql works .

    Will sphinx search be easier to integrate now?

    You say these changes will help small to medium boards, what about larger boards with millions of posts and hundreds/thousands active users?
  19. Trevster -
    Trevster's Avatar
    Thanks Anders - and thanks especially for putting it in plain English! Too many times have I looked at Blog posts around the web and found nothing but 'Technobabble'...
  20. kmike -
    kmike's Avatar
    Alas, your post is probably too light on details for me.

    You talk mainly about the MySQL fulltext search as utilized by vBulletin, but what about the "other" vBulletin search type which used to be the default? Is it now the default? Was it reworked to be faster and more efficient?
Page 1 of 3
FirstFirst 1 2 3 ... LastLast
Total Trackbacks 0

Trackbacks

Trackback URL: