We've switched to a new bug tracking system!
Project Tools has been set to read-only. Please use the new tracker going forward.
Attachments cleanup is SLOW Issue Tools
issueid=4716 Wed 14th Apr '04 2:37pm
Senior Member
Attachments cleanup is SLOW

Attachment cleanup code in cleanup2.php is scanning whole attachment table in this query:
$attachments = $DB_site->query("
SELECT attachmentid, userid
FROM " . TABLE_PREFIX . "attachment
WHERE posthash <> '' AND
dateline < " . (TIMENOW - 3600)
);
For our 9G attachment table with 200+k records, it is painfully slow. Slow query log shows times up to 200 seconds per query.
EXPLAIN shows that posthash index is not being used at all (Mysql version 4.0.18).
Issue Details
Project vBulletin
Category Attachments
Status Fixed (Closed)
Priority 3
Affected Version 3.0.x
Fixed Version 3.0.x
Users able to reproduce bug 0
Users unable to reproduce bug 0
Assigned Users (none)
Tags (none)

Wed 14th Apr '04 4:13pm
vBulletin Developer
 
Change

WHERE posthash <> ''

to

WHERE postid = 0
Reply
Thu 15th Apr '04 6:32pm
Senior Member
 
This was not updated in vB 3.0.1. I see that the announcement for vB 3.0.1 was only about 5 hours or so after you fixed this, so maybe it didn’t get in cause I just downloaded it and the query is still:

Code:
	$DB_site->query("
		DELETE FROM " . TABLE_PREFIX . "attachment
		WHERE posthash <> '' AND
		dateline < " . (TIMENOW - 3600)
This query is killing my server every hour when this cron is running. This query is shooting iowait through the roof and all other queries get locked and things get unresponsive.

Couple questions:

1) Are “all” the indices on the attachment table necessary? DELETE is going to take longer the more indices there are on the table (Not implying you guys don’t know that, just offering a suggestion). Just wondering cause it would take a while for me to look at the entire app and determine if they’re all necessary.
2) is the dateline clause necessary?
3) In addition to # 2, if I’m to understand the purpose of this query in the first place … what is this query doing? I’m assuming it is deleting attachments that do not have a parent post they’re linked to, because someone deleted a post without deleting the attachment(s). If that’s the case, why even have the functionality to delete a post and keep attachments if they’re going to be deleted every hour anyway.

For now I’m commenting this block of code from my cron script. I suggest you guys seriously look at this for the sake of your users that run very large boards, cause it is a big problem and most people will have no idea something this hidden would be affecting their site.
Reply
Thu 15th Apr '04 6:34pm
Former vBulletin Developer
 
The fix definately made it in to 3.0.1.
Reply
Thu 15th Apr '04 6:40pm
Senior Member
 
O shoot, I apologize, I just noticed that it was changed for attachments that are stored as files on disk. The query for attachments stored in the db was not changed, which is my situation. Look at the else statement query for removing orphaned attachments in cleanup2.php
Reply
Thu 15th Apr '04 6:56pm
Former Senior Developer, vBulletin
 
Yes, the dateline part is necessary. It's not for deleted posts -- it's for people who attach files to a post, but then never actually make the post. Thus, you want the attachments to live for a while, before you assume they've been abandoned.

It should be the same fix for the else part, but I'm going to move this back to confirmed as Freddie knows the system a bit better than I do.

I will look into the index thing in a little while. Selects should be much more common than deletes and (to a certain extent) inserts, so it may be justified. (There's a good possibility that this query won't actually do any deletes, so the indexes can even help it then.)
Reply
Thu 15th Apr '04 7:08pm
Senior Member
 
OK, I see, I thought I was wrong on the intention of the query, just wasn’t making sense. Anyway, makes sense now, because yea, you have to add attachments before you actually submit the post. Gotcha.

If the query can be sped up that would be great, although, because this isn't important enough to run "every hour" maybe it belongs in it's own weekly cron late at night. I’m glad you guys thought of this and have a means to deal with it but I don’t think it happens often enough to run a check every hour. I was already thinking about just adding it to the weekly digest script. Just a suggestion because it’s such a hogger query.
Reply
Thu 15th Apr '04 9:27pm
vBulletin Developer
 
Oh shoot I missed that part, just change both parts to be postid = 0 and you'll be as optimized as you can be.

It runs every hour as originally there was no way for users to delete abandoned attachments so we needed to clean them up hourly. Now they can do it through their attachment manager so we don't need the hourly cleanup.

I will modify it so that it doesn't run so often.
Reply
Wed 21st Apr '04 12:01am
Senior Member
 
No wonder! This was killing my server every hour. I better fix this.
Reply
Wed 21st Apr '04 8:55pm
Senior Member
 
Ahmm uhm, if this is really a *known* yet not fully fixed server slow-downer, I think it deserves to be mentioned in the vBulletin Announcement section!

Now they can do it through their attachment manager so we don't need the hourly cleanup.

I will modify it so that it doesn't run so often.
What exactly is 'not so often'? Every 1 day? 1 week?
Reply
Wed 21st Apr '04 11:43pm
Senior Member
 
no wonder mysql process hits high server load... i also better fix it
Reply
Thu 22nd Apr '04 8:19am
Senior Member
 
I agree that this is an urgent fix for larger sites. My site has over 3.5 million posts now, and this was really bringing the server down on its knees.
Reply
Thu 22nd Apr '04 1:14pm
vBulletin Developer
 
Doesn't matter how many posts you have, just how MANY attachments as this query cleans up the attachment table.

http://www.vbulletin.com/forum/showt...556#post658556
Reply
Fri 23rd Apr '04 12:08am
Senior Member
 
I have a dedicated forum for photo attachments which is very popular. :) So it was very relevant for my site.
Reply
Reply