Hello! I'd like to let users search my vB using my Excite for Web Servers ("EWS") search engine. EWS searches can be made in plain English, the results include "more like this one" links, and it has other features that I like better than those in the default vB search engine.
EWS wants local HTML files to read, and it digests them once a night via a cron job.
So I need to convert vB threads to local HTML files (basically mirroring my site as it exists at a particular time in the middle of each night). Then I'll have EWS read the HTML files.
Here's the code I came up with, based on a nice hack posted elsewhere (http://vbulletin.com/forum/showthread.php?threadid=1092) on these boards. I call it "all_threads_with_mirror.php" and I run it by typing http://mysitename.net/all_threads_with_mirror.php into a browser. I wonder if anybody could suggest a more efficient way of getting at the threads in vB and of writing the resulting data to individual HTML files on my site's hard drive. Thanks!!
This way I get about two HTML files written per second, averaging about 20 KB each. With 7,400 threads, that works out to about an hour (7,400 / 2 = 3,700 seconds). Server load on my Linux box typically goes up from about 0.2 to a shade under 2.0.
Maybe there's PHP magic I don't know about, or I'm doing it inefficiently?
[Edited by Dave Baker on 10-13-2000 at 08:42 PM]
Code:
<?
require("global.php");

// Directory where the mirrored HTML files are written
$MIRROR_DIR = "/www/vhosts/mysitename/mirror";

// Fetch every visible thread, most recently active first
$threads = $DB_site->query("SELECT threadid,title FROM thread WHERE visible=1 ORDER BY lastpost DESC");

while ($threadarray = $DB_site->fetch_array($threads)) {
    $threadid = $threadarray["threadid"];
    $title = $threadarray["title"];
    $title_in_html = htmlspecialchars($title);
    print "<a href=\"search/$threadid.php\">$title_in_html</a><br>\n";

    set_time_limit(60); // restart the script timeout for each thread
    $thread_text = "";

    // Read the rendered thread page over HTTP, one character at a time
    if (!$file = fopen("http://mysitename.net/showthread.php?threadid=$threadid", "r")) {
        // fopen() returns false if it couldn't open the URL
        echo("Could not open http://mysitename.net/showthread.php?threadid=$threadid");
    } else {
        while (!feof($file)) { // Continue until feof() is true
            $thread_text .= fgetc($file);
        }
        fclose($file);
    }

    // Write the page out as a static HTML file
    if (!$filetowrite = fopen("$MIRROR_DIR/$threadid.html", "w")) {
        // fopen() returns false if it couldn't open the file
        echo("Could not open $MIRROR_DIR/$threadid.html for writing");
    } else {
        fputs($filetowrite, $thread_text);
        fclose($filetowrite);
    }
}
?>
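For comparison, here is a minimal sketch of the same copy step done in blocks with fread() instead of one character at a time with fgetc(). The helper name copy_stream_in_blocks and the 8192-byte block size are hypothetical and not part of the script above. The demo uses temporary files rather than the real showthread.php URL, so whether it actually speeds up the mirror on this server is untested.

```php
<?php
// Hypothetical helper: copy everything from an open read handle to an open
// write handle in fixed-size blocks. Returns the number of bytes written,
// or false if a write fails.
function copy_stream_in_blocks($source, $dest, $block_size = 8192)
{
    $bytes_copied = 0;
    while (!feof($source)) {
        $block = fread($source, $block_size);
        if ($block === false || $block === '') {
            break; // read error, or nothing left to read
        }
        $written = fwrite($dest, $block);
        if ($written === false) {
            return false;
        }
        $bytes_copied += $written;
    }
    return $bytes_copied;
}

// Demo with temporary files standing in for the thread page and the mirror
// file; 20000 bytes is roughly the average page size mentioned in the post.
$source = tmpfile();
fwrite($source, str_repeat("x", 20000));
rewind($source);

$dest = tmpfile();
$copied = copy_stream_in_blocks($source, $dest);
echo "Copied $copied bytes\n";
```

In the mirror loop, $source would be the handle returned by fopen() on the thread URL and $dest the handle opened on "$MIRROR_DIR/$threadid.html", which also avoids building the whole page in $thread_text first.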