SMF Encoding problem
djilou
Wed 25th Jul '07, 2:17pm
Hi
My database is using UTF-8 encoding. I want to migrate SMF forum data that contains Arabic and French characters to a vBulletin forum.
Here is how I configured ImpEx:
$impexconfig['target']['charset'] = 'utf8';
$impexconfig['source']['charset'] = 'utf8';
define('use_utf8_encode', true);
Language Manager => Edit Settings => HTML Character Set set to UTF-8
Result :
I get Arabic Characters displayed like this:
ف?? ?*ر?ة طا?با? ???ة جد?دة ت?ت?? ا?ساعة ا??ا?*دة بعد ??تصف ا???? با?ت???ت ا???*?? ?ت?ف?ذ ?طا?ب?ا, ?إ?ا فإ??ا ست?ت? ?ز?دا ?? ا?ر?ائ?.
And the French characters are displayed like this: déj�* après
instead of: déjà après
Now if I change the ImpEx config to:
$impexconfig['target']['charset'] = '';
$impexconfig['source']['charset'] = '';
define('use_utf8_encode', true);
Result :
I get Arabic Characters displayed like this: ????????
How can I fix this, please?
Jerry
Wed 25th Jul '07, 8:13pm
And are both databases set to use utf8?
djilou
Wed 25th Jul '07, 8:50pm
Yes, both are set to utf8_general_ci.
During the import process I see:
44.44% Forum -> طالبان
The Arabic characters are displayed correctly during the import,
but when I go to my new board, I see طالبان instead of طالبان
Jerry
Thu 26th Jul '07, 3:34pm
Can you check the target database after the import to see what the characters look like?
On line 121 of impex/systems/smf/007.php there is this :
$try->set_value('nonmandatory', 'pagetext', $this->smf_html($this->html_2_bb($post_details['body'])));
You can change that to this to remove all the parsing of the content :
$try->set_value('nonmandatory', 'pagetext', $post_details['body']);
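When checking what the stored characters look like, it can help to look at the raw bytes rather than at what a browser or phpMyAdmin renders. The following is only a rough sketch: `looks_double_encoded` is a hypothetical helper (not part of ImpEx or vBulletin), it needs PHP's mbstring extension, and the check is a heuristic that assumes the damage is UTF-8 text that was run through a Latin-1 to UTF-8 conversion a second time.

```php
<?php
// Hypothetical helper, not part of ImpEx or vBulletin.
// Heuristic: returns true when $s is valid UTF-8 that only uses code points
// up to U+00FF and, once converted back to ISO-8859-1, is again valid UTF-8.
// That pattern is typical of UTF-8 text that was encoded a second time.
function looks_double_encoded(string $s): bool
{
    if (!mb_check_encoding($s, 'UTF-8')) {
        return false;                       // not UTF-8 at all
    }
    if (preg_match('/[^\x{0}-\x{FF}]/u', $s)) {
        return false;                       // real Arabic etc. lives above U+00FF
    }
    if (!preg_match('/[\x{80}-\x{FF}]/u', $s)) {
        return false;                       // plain ASCII, nothing to judge
    }
    $undone = mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
    return mb_check_encoding($undone, 'UTF-8');
}

// Sample values written as byte escapes so the file's own encoding does not matter.
$clean  = "d\xC3\xA9j\xC3\xA0";                   // "deja" with accents, correct UTF-8
$broken = "d\xC3\x83\xC2\xA9j\xC3\x83\xC2\xA0";   // the same text encoded twice
$arabic = "\xD8\xB7";                             // one Arabic letter, correct UTF-8

foreach ([$clean, $broken, $arabic] as $value) {
    echo bin2hex($value), ' => ',
        looks_double_encoded($value) ? 'double encoded?' : 'looks fine', "\n";
}
```

The hex dump is the part to trust. The true/false answer is only a hint, since legitimate text can occasionally match the same pattern.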
djilou
Thu 26th Jul '07, 8:50pm
Quote: Can you check the database of the target after the import to see what the chars are like.
The characters in the vB database look like this:
ف *رة طابا
ة جددة تت اساعة اا*دة بعد
The characters in the SMF database look like this:
كابل تؤكد مقتل رهينة كوري وطالبان تحدد مهلة جديدة
Quote: On line 121 of impex/systems/smf/007.php there is this :
$try->set_value('nonmandatory', 'pagetext', $this->smf_html($this->html_2_bb($post_details['body'])));
You can change that to this to remove all the parsing of the content :
$try->set_value('nonmandatory', 'pagetext', $post_details['body']);
I made that change and still get the same problem.
djilou
Thu 26th Jul '07, 8:52pm
OK, the Arabic characters are imported correctly with this config:
// Advanced Target
$impexconfig['target']['databasetype'] = 'mysql'; // currently mysql only
$impexconfig['target']['charset'] = '';
$impexconfig['target']['persistent'] = false; // (true/false) use mysql_pconnect
// Advanced Source
$impexconfig['source']['charset'] = 'utf8';
$impexconfig['source']['persistent'] = false;
# pagespeed is the second(s) wait before the page refreshes.
$impexconfig['system']['language'] = '/impex_language.php';
$impexconfig['system']['pagespeed'] = 1;
define('impexdebug', false);
define('emailcasesensitive', false);
define('forcesqlmode', false);
define('skipparentids', false);
define('shortoutput', false);
define('do_mysql_fetch_assoc', false);
define('step_through', false);
define('lowercase_table_names', false);
define('use_utf8_encode', false);
Now I need some help getting the old SMF URLs redirected to vB's URLs :)
http://www.vbulletin.com/forum/showpost.php?p=1396143&postcount=11
Thanks
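A possible reason the config above works with use_utf8_encode set to false: if that option ends up calling PHP's utf8_encode() on text that is already UTF-8 (an assumption, the thread does not show that code), every multi-byte character gets encoded a second time. The sketch below illustrates the effect. `latin1_to_utf8` is a hypothetical helper that does the same conversion as utf8_encode() through mbstring, since utf8_encode() itself is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: same conversion utf8_encode() performs, i.e. every
// input byte is treated as one ISO-8859-1 character and re-encoded as UTF-8.
function latin1_to_utf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

// "deja apres" with accents, written as UTF-8 byte escapes.
$original = "d\xC3\xA9j\xC3\xA0 apr\xC3\xA8s";

// Encoding text that is already UTF-8 doubles up each accented character.
$double = latin1_to_utf8($original);

echo bin2hex("\xC3\xA9"), "\n";                  // c3a9     (one e-acute)
echo bin2hex(latin1_to_utf8("\xC3\xA9")), "\n";  // c383c2a9 (now two characters)

// Undoing the extra pass recovers the original bytes.
$repaired = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $original);               // bool(true)
```

If already imported posts turn out to be damaged this way, the last conversion shows the general idea of a repair, but re-running the import with the corrected config is the safer route.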
HadiK
Sun 14th Oct '07, 9:42am
Hi..
I have the same problem. I changed the config file, but are any other changes required?
Jerry
Mon 15th Oct '07, 9:06pm
Quote: I have the same problem ..i changed the config file but any other changes required ??
Check that both databases are using the same character set and that it's also set in vBulletin.
HadiK
Sat 20th Oct '07, 8:20am
Hello All,
Many thanks for your help. Here are the steps I used to solve it :D
1- When creating the database using cPanel or phpMyAdmin, before anything else, set the collation to utf8_general_ci.
After creating the forums and before starting to import the data, you should take a backup of the source database (for SMF users, I would recommend MySQL Administrator).
2- Edit the ImpEx config file as mentioned above; for more detail please check http://www.vbulletin.com/docs/html/impex
3- Follow the on-screen steps.
You still need to change the encoding of the language files to UTF-8 and upload them.
See!! Easy!! :D