On 2007-05-30 00:00, cc96ai <calvin.chan.@gmail.com> wrote:
> I got UTF8 value %C3%A9

That's not UTF-8. That's URL-encoded UTF-8.

> how could I encode it become é ?

You have to *decode* it to get é. And since it is encoded twice, you have
to decode it twice.

First decode the URL-encoding (hex digits in a %XX escape may be upper
or lower case, so match both):

    $s = "%C3%A9";
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

(URI::Escape, which ships with the URI distribution on CPAN, provides
uri_unescape to do that, but it's a simple one-liner)

Now you have UTF-8, which you can decode to a "perl character string"
with the Encode module:

    use Encode;
    $s = decode('utf-8', $s);

Now you have a string with a single character "é".
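
Putting the two steps together, a minimal sketch might look like this
(url_utf8_to_chars is a made-up name, not something from a module, and
it assumes the input really is URL-encoded UTF-8):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: undo the URL-encoding, then decode the UTF-8 bytes.
sub url_utf8_to_chars {
    my ($s) = @_;
    # Step 1: replace each %XX escape with the byte it stands for.
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    # Step 2: interpret the resulting bytes as UTF-8.
    return decode('UTF-8', $s);
}

my $chars = url_utf8_to_chars("%C3%A9");
printf "length=%d codepoint=U+%04X\n", length($chars), ord($chars);
# prints "length=1 codepoint=U+00E9"
```

The length of 1 is the quick check that you ended up with characters
rather than bytes; the undecoded byte string would have length 2.
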
Now, how does MIME get into it?
For MIME, you again have to decide on a specific character encoding
(e.g., UTF-8, or ISO-8859-1, or whatever), and then possibly on a
specific transport encoding (base64 or quoted-printable).
So you have to encode it in your character encoding first, and then
possibly encode the result again with the transport encoding.
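
As a rough sketch of those two layers for a piece of body text, using
the Encode, MIME::Base64 and MIME::QuotedPrint modules that ship with
perl (the variable names are only for illustration, UTF-8 is just one
possible choice of charset, and this neither handles header fields nor
builds a complete message):

```perl
use strict;
use warnings;
use Encode qw(encode);
use MIME::Base64 qw(encode_base64);
use MIME::QuotedPrint qw(encode_qp);

my $chars = "\x{E9}";                  # the character é (U+00E9)

# Layer 1: character encoding (characters -> bytes).
my $bytes = encode('UTF-8', $chars);   # two bytes: 0xC3 0xA9

# Layer 2: transport encoding (bytes -> 7-bit safe text).
# The empty second argument suppresses the line ending / soft line breaks.
my $b64 = encode_base64($bytes, '');
my $qp  = encode_qp($bytes, '');

print "base64:           $b64\n";      # prints "base64:           w6k="
print "quoted-printable: $qp\n";       # prints "quoted-printable: =C3=A9"
```

Whichever pair you pick has to be declared in the matching
Content-Type (charset) and Content-Transfer-Encoding headers, otherwise
the receiver cannot reverse the two steps.
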
Note that MIME is quite a complex format (especially the encoding of
header fields described in RFC 2047 and RFC 2231), so I won't go into
more detail unless you tell us exactly what you need. Any advice I can
give (except "use existing modules" and "read the RFCs") is almost
certainly incomplete and will cause you to produce ill-formed messages
if you follow it blindly.

   _  | Peter J. Holzer    | I know I'd be respectful of a pirate
|_|_) | Sysadmin WSR       | with an emu on his shoulder.
| |   | email@example.com  |
__/   | http://www.hjp.at/ |   -- Sam in "Freefall"