The character encoding (sometimes incorrectly referred to as the 'charset') determines how a user agent interprets the 1s and 0s as 'characters'. UTF-8 is one popular encoding, which uses between 1 and 4 octets (8-bit bytes) to represent any given code point in the Unicode repertoire. (The original design allowed sequences of up to 6 octets, but RFC 3629 restricted UTF-8 to 4.)
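To make the octet counts concrete, here is a minimal sketch in Python (not part of the original post; the helper name `utf8_octets` and the four sample characters are chosen purely for illustration). It encodes one code point from each UTF-8 length class and reports how many octets each one needs.

```python
# Illustrative samples, written as escapes so the script itself is pure ASCII.
SAMPLES = [
    "\u0041",      # LATIN CAPITAL LETTER A
    "\u00e9",      # LATIN SMALL LETTER E WITH ACUTE
    "\u65e5",      # CJK ideograph meaning "day/sun"
    "\U0001F600",  # GRINNING FACE emoji, outside the Basic Multilingual Plane
]


def utf8_octets(ch: str) -> int:
    """Return the number of octets in the UTF-8 encoding of one character."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for ch in SAMPLES:
        encoded = ch.encode("utf-8")
        # bytes.hex() with a separator needs Python 3.8 or later.
        print(f"U+{ord(ch):04X}: {utf8_octets(ch)} octet(s) -> {encoded.hex(' ')}")
```

Run as a script, this should report 1, 2, 3 and 4 octets for the four samples respectively.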
An HTML page has a given character encoding, and you don't want to change that in the middle of the document.
An HTML page is also written in a primary natural language, e.g. Japanese. This is declared via the 'lang' attribute in the <html> tag:
HTML Code:
<html lang="ja">
If you include words or phrases in another language in your page, you should mark these up using another 'lang' attribute, e.g.
HTML Code:
<span lang="sv">det här är svenska</span>
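It used to be common to have specific character encodings for different parts of the world, e.g. Big5 in Taiwan and Hong Kong, GB 2312 in mainland China, EUC-JP in Japan, ISO 8859-1 in Western Europe. These encodings are mutually incompatible: the same octets mean different characters depending on which encoding is assumed. In order to get away from these problems, we now have UTF-8 and UTF-16, which both cover the whole Unicode repertoire.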
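The incompatibility is easy to reproduce. The following Python sketch (not part of the original post; the function and variable names are illustrative) encodes a Japanese word as EUC-JP and then decodes the same octets as ISO 8859-1, which is roughly what a user agent does when it assumes the wrong encoding.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the octets with another."""
    return text.encode(written_as).decode(read_as)


# The word "Japanese (language)" in Japanese, written as escapes.
WORD = "\u65e5\u672c\u8a9e"

# Correct round trip: the reader uses the encoding the writer used.
round_trip = reinterpret(WORD, "euc_jp", "euc_jp")

# Mismatch: ISO 8859-1 maps every octet to some character, so decoding
# never fails. It silently produces the wrong text instead.
garbled = reinterpret(WORD, "euc_jp", "iso-8859-1")

if __name__ == "__main__":
    print("original :", WORD)
    print("correct  :", round_trip)
    print("garbled  :", garbled)
    print("octets   :", WORD.encode("euc_jp").hex(" "))
```

The garbled result is twice as long as the original word, because each two-octet EUC-JP character is read as two separate ISO 8859-1 characters.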
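A page written mainly in a Western language, with some parts in an East Asian language, should probably use UTF-8. A page written mainly in an East Asian language might be more compact in UTF-16. UTF-8 would still work, but it might make the document larger, since most CJK ideographs require 3 octets in UTF-8, whereas UTF-16 uses 2 octets for any code point in the Basic Multilingual Plane (and 4 octets, as a surrogate pair, for code points beyond it). Bear in mind that the markup itself is ASCII, which takes 1 octet per character in UTF-8 but 2 in UTF-16, so the saving on a real page is smaller than the text alone suggests.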
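The size trade-off can be checked directly. This is a rough Python sketch (not from the original post; `encoded_sizes` and the sample strings are made up for illustration) that compares the encoded length of the same text in UTF-8 and UTF-16. It uses the `utf-16-le` codec so that the 2-octet byte order mark is not counted.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in octets of text under UTF-8 and UTF-16."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}


SAMPLES = {
    # 12 ASCII characters
    "english": "Hello, world",
    # 7 characters: five hiragana and two kanji, all in the BMP
    "japanese": "\u3053\u3093\u306b\u3061\u306f\u4e16\u754c",
    # 1 ideograph from CJK Extension B, outside the BMP
    "supplementary": "\U00020000",
    # Japanese text wrapped in ASCII markup
    "markup": '<span lang="ja">\u65e5\u672c\u8a9e</span>',
}

if __name__ == "__main__":
    for name, text in SAMPLES.items():
        sizes = encoded_sizes(text)
        print(f"{name:14} utf-8={sizes['utf-8']:3}  utf-16={sizes['utf-16-le']:3}")
```

The English sample should come out at 12 versus 24 octets, the Japanese sample at 21 versus 14, and the supplementary ideograph at 4 octets in both encodings. The markup sample shows how quickly ASCII tags can tip the balance back towards UTF-8.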