Chapter 6. Unicode and Non-ASCII Support
Quoted-Printable Format
AspEmail encodes the message body in the Quoted-Printable format
automatically if the ContentTransferEncoding property is set to
the string "Quoted-Printable" (letter case is immaterial).
You may also set the Charset property
to the appropriate character set. The following code snippet sends
a message in Russian:
<% @codepage=1251 %>
The directive <% @codepage=1251 %> instructs
the ASP interpreter to treat the hard-coded characters in the script
as Russian symbols (1251 is the Windows Cyrillic code page). As a result,
the Body property will receive a Unicode string containing the Russian text.
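To make the encoding concrete, the short Python sketch below shows what Quoted-Printable does to Russian text in the windows-1251 character set. Python is used here only for illustration and is not part of AspEmail; the sample word and the variable names are assumptions chosen for the example.

```python
import quopri

# Hypothetical sample text; any Russian string would do.
text = "Привет"

# Convert the Unicode string to bytes in the windows-1251 character set.
raw = text.encode("cp1251")

# Quoted-Printable writes every non-ASCII byte as "=" plus two hex digits.
encoded = quopri.encodestring(raw)
print(encoded.decode("ascii"))

# Decoding reverses both steps and recovers the original text.
decoded = quopri.decodestring(encoded).decode("cp1251")
```

The encoded form contains only 7-bit ASCII characters, which is why a message body in this format can pass safely through mail systems that do not handle 8-bit data.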
Non-ASCII Characters in Headers
<% @codepage=1251 %>
<%
Unicode and UTF-8
From Unicode.org: "Computers ... store letters and other characters by
assigning a number for each one. Before Unicode was invented, there were
hundreds of different encoding systems for assigning these numbers.
No single encoding could contain enough characters...
Unicode provides a unique number for every character,
no matter what the platform, no matter what the program, no matter what the language."
For example, the basic Latin letter "A" has the code Hex 0041 (65), the Russian
letter "Ж" has the code Hex 0416 (1046), and the Chinese character
"㊥" has the code Hex 32A5 (12965).
UTF-8 (Unicode Transformation Format, 8-bit encoding form) is the recommended
format to be used to send Unicode-based data across networks, in particular the Internet.
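UTF-8 represents a Unicode value as a sequence of 1 to 4 bytes. Values in the range Hex 0000 to FFFF, which covers all the examples in this chapter, take 1, 2, or 3 bytes.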
Unicode characters in the range Hex 0000 to 007F are encoded simply as bytes
00 to 7F. This means that files and strings which contain only 7-bit ASCII
characters have the same encoding under both ASCII and UTF-8.
Therefore, the Unicode 0041 ("A") in UTF-8 is Hex 41.
Unicode characters in the range Hex 0080 to 07FF are encoded as a sequence of two bytes.
For example, the Unicode 0416 ("Ж")
is encoded as Hex D0 96. Unicode characters in the range Hex 0800 to FFFF are encoded
as a sequence of three bytes. For example, the Unicode 32A5 ("㊥")
is encoded as Hex E3 8A A5.
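The byte ranges described above can be checked with a small hand-written encoder. This is a minimal Python sketch for illustration only: the function name utf8_encode is hypothetical, and the sketch deliberately ignores details such as rejecting surrogate code points.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (illustrative, no validation)."""
    if code_point < 0x80:
        # Hex 0000 to 007F: a single byte, identical to ASCII.
        return bytes([code_point])
    if code_point < 0x800:
        # Hex 0080 to 07FF: two bytes, 110xxxxx 10xxxxxx.
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Hex 0800 to FFFF: three bytes, 1110xxxx 10xxxxxx 10xxxxxx.
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Hex 10000 to 10FFFF: four bytes, 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# The three characters used as examples in this chapter.
for cp in (0x0041, 0x0416, 0x32A5):
    print(f"U+{cp:04X} -> {utf8_encode(cp).hex(' ').upper()}")
```

Comparing the result with Python's built-in encoder (chr(cp).encode("utf-8")) is a quick way to confirm that the hand-written version follows the same rules.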
UTF-8 Support in AspEmail
The following code sample demonstrates the UTF-8 usage:
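' Enable UTF-8 -> Unicode translation for form items
Session.CodePage = 65001
If Request("Send") <> "" Then
Mail.From = "info@aspemail.com" ' From address
' message subject
Mail.Subject = Mail.EncodeHeader( Request("Subject"), "utf-8")
' message body
' UTF-8 parameters
Mail.CharSet = "UTF-8"
Mail.ContentTransferEncoding = "Quoted-Printable"
<HTML>
<META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
<FORM METHOD="POST" ACTION="Unicode.asp">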
This code sample has several important elements you must not overlook:
<META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
This META tag specifies the character set for this page to be UTF-8.
This, among other things, instructs the browser to UTF8-encode all form items
when the form is submitted.
Session.CodePage = 65001
This line instructs our ASP script to convert UTF8-encoded form items
(returned by the Request.Form collection) back to regular Unicode strings. The number
65001 is the UTF-8 code page.
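Why the code page matters can be seen by decoding the same submitted bytes in two different ways. The Python sketch below is only an analogy for the conversion that ASP performs; the variable names are assumptions, and the bytes are the UTF-8 form of the letter "Ж" from the earlier example.

```python
# Bytes a browser would submit (after URL-decoding) for "Ж" from a UTF-8 page.
form_bytes = b"\xd0\x96"

# Interpreting the bytes as UTF-8 (code page 65001) recovers the letter.
as_utf8 = form_bytes.decode("utf-8")

# Interpreting the same bytes as windows-1251 yields two unrelated characters.
as_cp1251 = form_bytes.decode("cp1251")

print(repr(as_utf8), repr(as_cp1251))
```

With the wrong code page, one submitted character turns into two incorrect ones, which is the typical symptom of a mismatch between the page's character set and the script's code page.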
Mail.Subject = Mail.EncodeHeader( Request("Subject"), "utf-8")
The optional second argument, "utf-8" (letter case is immaterial), names the character set used to encode the header.
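Mail headers may contain only ASCII, so non-ASCII header text is usually wrapped in an RFC 2047 "encoded-word" that names its character set. The Python sketch below illustrates that general mechanism with the standard email.header module. It is not AspEmail code, the subject text is a made-up example, and the exact string that EncodeHeader produces may differ in form.

```python
from email.header import Header, decode_header

subject = "Привет, мир"  # hypothetical subject text

# Wrap the non-ASCII text in an RFC 2047 encoded-word labelled as UTF-8.
encoded = Header(subject, "utf-8").encode()
print(encoded)

# A receiving mail client reverses the process using the charset label.
decoded = "".join(
    chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
    for chunk, charset in decode_header(encoded)
)
```

The encoded value consists entirely of ASCII characters, and the charset label inside it is what lets the recipient's mail client restore the original text.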
Mail.CharSet = "UTF-8"
Mail.ContentTransferEncoding = "Quoted-Printable"
These two lines ensure proper UTF-8 encoding of the message body.
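The effect of a character set plus a transfer encoding on a message body can be sketched with Python's standard email package. This is an analogy under stated assumptions, not a description of what AspEmail emits byte for byte: the one-letter body is hypothetical, and set_content is a Python API rather than an AspEmail one.

```python
from email.message import EmailMessage

body = "Ж"  # hypothetical one-letter message body

msg = EmailMessage()
# Label the body as UTF-8 and ask for Quoted-Printable transfer encoding.
msg.set_content(body, charset="utf-8", cte="quoted-printable")

# The two headers a mail client reads to decode the body.
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])

# The body as transmitted: the UTF-8 bytes D0 96 in Quoted-Printable form.
print(msg.get_payload())
```

The character set tells the recipient how to turn bytes back into characters, and the transfer encoding tells it how those bytes were made safe for transport; both are needed for the body to display correctly.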
Click the link below to run this code sample:
http://localhost/aspemail/NonAscii/Unicode.asp
Valid CharSet Values
Copyright © 1999 - 2003 Persits Software, Inc. All Rights Reserved