Saturday, 2 November 2024

UTF-8 All the Way Through: Ensuring Full UTF-8 Support in Your Web Application

 Setting up full UTF-8 support in a web application is essential for handling multilingual content reliably. This guide covers all the key areas—MySQL, PHP, Apache, and HTML—to help you achieve a seamless UTF-8 experience across your stack. Here’s a checklist to ensure UTF-8 is correctly set up at every layer of your web application.

1. Configuring MySQL for UTF-8

To support a full range of Unicode characters, including emojis, configure MySQL to use utf8mb4 rather than utf8, as MySQL’s utf8 only supports up to three bytes (limited to basic multilingual characters).

  • Database and Table Configuration:

    CREATE DATABASE your_database CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;
    ALTER TABLE your_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    
  • Column Configuration:
    Set each text column to utf8mb4 to ensure character data is stored correctly:

    ALTER TABLE your_table MODIFY column_name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    
  • Connection Settings:
    Set the character set for connections to utf8mb4. This way, data exchanged between MySQL and your application retains its UTF-8 encoding. Use the following configuration depending on your PHP extension:

Read more »

Labels: