"Using filesort" on queries: a survey.

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • min:o)
    New Member
    • Oct 2002
    • 4

    "Using filesort" on queries: a survey.

    Hi,
    I'm working on a board with 8.5M posts, 400k threads and 52k members.

    I'm using several webservers behind an hardware loadbalancer and a DB server with multiple replicas to distribute readonly queries. I run VB 3.6.2 without any plugin/hacking and mysql 5.0.22

    I'm noticing major performance problems due to the filesort-based "ORDER BY" approach of mysql triggered by some queries in showthread.php.

    On our forums there are dozens of threads with more that 5k posts, ranging up to 35k replies per thread. Users are browsing these threads generate high load on the DB servers, because mysql is using "filesort" to order the result set.

    In particular, the 45% of the queries in our "slow queries log" are like the following:
    Code:
    SELECT postid 
    FROM forum_post AS post
    WHERE threadid = XX
    AND visible = 1                      
    ORDER BY dateline                      
    LIMIT XX, YY;
    that reveals to be in showthread.php at lines 849 and 875.

    With an EXPLAIN of that query, I get:
    Code:
             id: 1
      select_type: SIMPLE
            table: post
             type: ref
    possible_keys: threadid
              key: threadid
          key_len: 4
              ref: const
             rows: 59644
            Extra: Using where; Using filesort
    1 row in set (0.07 sec)
    As you can see, mysql is invoking filesort. I believe this is the cause of the slowness and that there should be a way to fix it by modifying the table indexes.

    Currently, they are the default ones:
    Code:
    mysql> SHOW INDEXES from forum_post;
    *************************** 1. row ***************************
           Table: forum_post
      Non_unique: 0
        Key_name: PRIMARY
    Seq_in_index: 1
     Column_name: postid
       Collation: A
     Cardinality: 8973231
        Sub_part: NULL
          Packed: NULL
            Null:
      Index_type: BTREE
         Comment:
    *************************** 2. row ***************************
           Table: forum_post
      Non_unique: 1
        Key_name: userid
    Seq_in_index: 1
     Column_name: userid
       Collation: A
     Cardinality: 38347
        Sub_part: NULL
          Packed: NULL
            Null:
      Index_type: BTREE
         Comment:
    *************************** 3. row ***************************
           Table: forum_post
      Non_unique: 1
        Key_name: threadid
    Seq_in_index: 1
     Column_name: threadid
       Collation: A
     Cardinality: 1281890
        Sub_part: NULL
          Packed: NULL
            Null:
      Index_type: BTREE
         Comment:
    *************************** 4. row ***************************
           Table: forum_post
      Non_unique: 1
        Key_name: threadid
    Seq_in_index: 2
     Column_name: userid
       Collation: A
     Cardinality: 8973231
        Sub_part: NULL
          Packed: NULL
            Null:
      Index_type: BTREE
         Comment:
    *************************** 5. row ***************************
           Table: forum_post
      Non_unique: 1
        Key_name: threadid
    Seq_in_index: 3
     Column_name: dateline
       Collation: A
     Cardinality: 8973231
        Sub_part: NULL
          Packed: NULL
            Null:
      Index_type: BTREE
         Comment:
    5 rows in set (0.32 sec)
    How could we optimize this table to solve the problems?
  • min:o)
    New Member
    • Oct 2002
    • 4

    #2
    I temporarily fixed that problem by creating another index "threadid2" which refers to the columns:
    - threadid
    - visible
    - dateline
    now, with an EXPLAIN I correctly get a "Using indexes".

    Anyway, this is not an optimal solution as:
    1. the column "threadid" is in more than one index.
    2. other queries in the same table, like:
    Code:
    SELECT postid, visible, userid
    FROM forum_post AS post
    WHERE threadid = XXX
    AND visible IN (1,2,0)
    ORDER BY dateline;
    still get to use the filesort.

    Any better idea, please?

    Comment

    widgetinstance 262 (Related Topics) skipped due to lack of content & hide_module_if_empty option.
    Working...