I saw in Wayne's October article on mysql best practices that he recommends a database character set of utf8mb4 and a collation of utf8mb4_general_ci. From what I read online, most people recommend utf8mb4_unicode_ci; apparently this is a newer version. Is there a reason you recommend general, or can we use either?
utf8mb4_general_ci or utf8mb4_unicode_ci for collation
Collapse
X
-
The collation determines how characters are sorted, nothing more. This would be used in alphabetical sorts and unstructured data lookups in MySQL. They do not determine what is stored as that is the role of the character set. Either will work for general purposes. If your site has a lot of multi-byte characters (Arabic, Cyrillic, Hebrew, or Asian languages) then utf8mb_unicode_ci will probably suit you better. It is mostly Latin language characters, you probably won't see a difference.
You can read this thread on StackExchange for more information:
https://stackoverflow.com/questions/...tf8-unicode-ci
The character set of UTFMB4 is the most important part though. This provides support for the majority of characters in languages around the world as well as some speciality characters like the UTF-8 Emoji Standard.Translations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 API -
Wayne - Thanks for the info.
Regarding utf8mb8, my database is currently in latin1 which is creating random junk characters in various posts (this was discovered in my other post about ipad issues). I have no idea why the database was setup in latin1, but I know I need to change it. I plan to modify the database from latin1 to utf8mb4 this evening. Other than the standard myslq commands to modify the database char set and collation (for all tables), is there anything else that I should be mindful of when moving to utf8mb4? My biggest concern is data loss or truncation. I'm assuming I can alter the database, change the language settings in VB to UTF-8, clear the cache and everything will be good, or are there any other specific steps or cautions that I need to be aware of??
Admin for hobbysquawk.com
VB 5.74
PHP 8.2
Maria DB 10.6Comment
-
latin1 was the default format for MySQL until 5.7.something. The default for new databases is UTF8MB4 in MySQL 8.0 (there is no MySQL 6 or 7). If vBulletin's installer doesn't create the database, it will use the MySQL Server's default. In current versions of vBulletin's installer will create the database with the UTF8MB4 character set and the collation will be utf8mb4_general_ci (hence my recommendation). The installer only affects new installations though. Upgrades do not change the character set and collation.
Attached is a set of experimental tools to convert your database to UTF8MB4 with a utf8mb4_general_ci collation. The documentation is in the readme.md text file. These should be run on a copy of your database for testing purposes before trying to convert your live database. They are experimental but I have used them on several dozen databases without issue. However, they have always been English databases. These tools will only look at standard vB5 tables in their present configuration. If you have other tables, you'll have to manually change them.Attached FilesTranslations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 APIComment
-
Wayne,
I just finished running the scripts on a backup of our vbulletin database. A couple of questions:
1. A few tables remain as MyISAM. They include: tagcontent, phrase, language, impexerror, userfield and mybb_attachments. Is this OK or should I attempt to change the type?
2. See attached image. About 30 or so tables did not convert, they remain in latin1. Is this ok, or do I need to change these manually?Last edited by mark.hs; Mon 12 Nov '18, 1:24pm.Admin for hobbysquawk.com
VB 5.74
PHP 8.2
Maria DB 10.6Comment
-
1. language, phrase and userfield are left as MyISAM because our minimum required version of MySQL is 5.5.8. You need to be running MySQL 5.7 to change them and we haven't tested those changes. tagcontent and mybb_attachments are not vBulletin 5 tables.
2. The tables that weren't converted in the image do not appear to be vBulletin 5 tables. The upgrade scripts should have removed most of these if your database user has permission to delete tables.Translations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 APIComment
-
I just completed the conversion. Everything went smoothly. Two quick follow-up questions:
1. Do I need to anything with search, such as re-index, or will that sort itself out over time?
2. I seem to have 20 or so tables in the database that don't belong, at least they didnt convert to utf8mb4 and they little or no data within them. Many appear to be related to VB4. Do you have a list of all valid tables in VB5.x so I can compare the master list to what's inside my database?
Admin for hobbysquawk.com
VB 5.74
PHP 8.2
Maria DB 10.6Comment
-
1. Search tables should have been converted and not emptied. If you emptied the search, then you should rebuild. If you use Sphinx Search, you should rebuild the index.
2.
vb_ad vb_adcriteria vb_adminhelp vb_administrator vb_adminlog vb_adminmessage vb_adminutil vb_announcement vb_announcementread vb_apiclient vb_apiclient_devicetoken vb_apilog vb_attach vb_attachmentcategory vb_attachmentcategoryuser vb_attachmentpermission vb_attachmenttype vb_attachmentviews vb_autosavetext vb_avatar vb_bbcode vb_bbcode_video vb_cache vb_cacheevent vb_calendar vb_calendarcustomfield vb_calendarmoderator vb_calendarpermission vb_channel vb_channelprefixset vb_closure vb_contentpriority vb_contenttype vb_cpsession vb_cron vb_cronlog vb_customavatar vb_customprofile vb_customprofilepic vb_datastore vb_deletionlog vb_discussion vb_editlog vb_event vb_externalcache vb_faq vb_fcmessage vb_fcmessage_offload vb_fcmessage_queue vb_filedata vb_filedataresize vb_forumpermission vb_forumrunner_attachment vb_forumrunner_push_data vb_forumrunner_push_users vb_gallery vb_groupintopic vb_holiday vb_hook vb_humanverify vb_hvanswer vb_hvquestion vb_icon vb_imagecategory vb_imagecategorypermission vb_infraction vb_infractionban vb_infractiongroup vb_infractionlevel vb_ipaddressinfo vb_language vb_legacyevent vb_link vb_loginlibrary vb_mailqueue vb_mapiposthash vb_messagefolder vb_moderation vb_moderator vb_moderatorlog vb_node vb_nodehash vb_noderead vb_nodeview vb_notice vb_noticecriteria vb_noticedismissed vb_notification vb_notificationevent vb_notificationtype vb_package vb_page vb_pagetemplate vb_passwordhistory vb_paymentapi vb_paymentinfo vb_paymenttransaction vb_permission vb_photo vb_phrase vb_phrasetype vb_picturecomment vb_picturecomment_hash vb_poll vb_polloption vb_pollvote vb_postedithistory vb_prefix vb_prefixpermission vb_prefixset vb_privacyconsent vb_privatemessage vb_product vb_productcode vb_productdependency vb_profilefield vb_profilefieldcategory vb_profilevisitor vb_ranks vb_redirect vb_report vb_reputation vb_reputationlevel vb_routenew vb_rssfeed vb_rsslog vb_screenlayout vb_searchlog vb_searchtowords_a vb_searchtowords_b vb_searchtowords_c vb_searchtowords_d vb_searchtowords_e vb_searchtowords_f vb_searchtowords_g vb_searchtowords_h vb_searchtowords_i vb_searchtowords_j vb_searchtowords_k vb_searchtowords_l vb_searchtowords_m vb_searchtowords_n vb_searchtowords_o vb_searchtowords_other vb_searchtowords_p vb_searchtowords_q vb_searchtowords_r vb_searchtowords_s vb_searchtowords_t vb_searchtowords_u vb_searchtowords_v vb_searchtowords_w vb_searchtowords_x vb_searchtowords_y vb_searchtowords_z vb_sentto vb_session vb_sessionauth vb_setting vb_settinggroup vb_sigparsed vb_sigpic vb_sigpicnew vb_site vb_smilie vb_spamlog vb_stats vb_strikes vb_style vb_stylevar vb_stylevardfn vb_subscribediscussion vb_subscribeevent vb_subscription vb_subscriptionlog vb_subscriptionpermission vb_tag vb_tagnode vb_tagsearch vb_template vb_templatehistory vb_templatemerge vb_text vb_thread_post vb_trending vb_upgradelog vb_user vb_useractivation vb_userauth vb_userban vb_userchangelog vb_userfield vb_usergroup vb_usergroupleader vb_usergrouprequest vb_userlist vb_userloginmfa vb_usernote vb_userpromotion vb_userstylevar vb_usertextfield vb_usertitle vb_video vb_videoitem vb_widget vb_widgetchannelconfig vb_widgetdefinition vb_widgetinstance vb_widgetuserconfig vb_words Translations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 APIComment
-
The forumrunner tables won't have been updated and if you do not have the forumrunner package installed under Hooks & Products, you can disregard them.Translations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 APIComment
-
We are using Sphinx. Can we rebuild the search index by going to Admin CP > Maintenance > General Update Tools > Rebuild Search Index, or do we need to do it via the command line with the searchindex.php file?
If we can go through the Admin CP, do we need to Empty the search index first or just click Rebuild the Search Index?
Admin for hobbysquawk.com
VB 5.74
PHP 8.2
Maria DB 10.6Comment
-
Using searchindex.php will be slightly quicker as you're removing the HTML intefrace. Also it won't time_out and needs less babysitting than the AdminCP rebuild.
Deleting the Sphinx index files and restarting the server should also trigger a reindex.Translations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 APIComment
-
Picking up on this rather old thread. But is this still the case?
I see when I create a new database on my server running MySQL 8.0.27 that the collation defaults to utf8mb4_0900_ai_ci for that new database. But I read this collation is 'is accent-insensitive' so I suspect for my forum which uses Dutch and French as languages this isn't the way to go as for example an username Rene and René would clash.
Is my understanding correct on this? And should I still use utf8mb4_general_ci instead?Comment
-
Your collation determines how MySQL sorts query results. That is all it does. It does not determine whether a character exists or not, the character set determines which characters are available. 0900 will sort better than general, in most cases. 0900 follows the latest unicode standards whereas general may not.
The ai in the collation means accent insensitive. The ci means case insensitive. These are recommended for most languages. It means that Cafe, Café, cafe and café will all be sorted together.
As for your username example. Both would be allowed be e and é are different characters. Again the collation doesn't limit the characters that can be used. The accent just wouldn't be accounted for in sorting a list of usernames so both names should appear together (one after another) in the list instead of having René appear after Renz.Translations provided by Google.
Wayne Luke
The Rabid Badger - a vBulletin Cloud demonstration site.
vBulletin 5 API👍 2Comment
-
Related Topics
Collapse
-
by imagerI've just notice this error at the bottom ofg the AdminCP options page for Message Posting and Editing Options:
What do I need to do to troubleshoot this? It may be unconnected...-
Channel: Support Issues & Questions
-
-
I have tables created with vb3 that is Latin1-swedish and new tables created with vb5 that is UTF8
Is there any guide how to convert everything to UTF8? I have big problems using Tapatalk...-
Channel: Support Issues & Questions
-
-
by fpostmaFile /core/includes/config.php file includes an example line like this;
// $config['Mysqli']['charset'] = 'utf8';
and the readme from the utf8 toolset also says to set that...-
Channel: vBulletin 5 Connect Feedback
-
-
All of my tables inside MySQL are showing a collation type of "latin1_swedish_ci". Is this correct, or should it be utf8_general_ci?
-
Channel: Support Issues & Questions
-
-
by amonlineIs the Safari issue being worked on yet?
This happened a year or so ago and was repaired. It’s been back for a couple/few months now. Tired of members complaining about it....-
Channel: Support Issues & Questions
-
Comment