Decoding "quoted-printable" -- Help needed


JJS via use-livecode
Even after a lot of research and comparing functions in C# and JavaScript, I
do not understand it yet.

In e-mail bodies, the content parts are often base64-encoded, which is no
problem, but there is also another encoding called "quoted-printable". This
is text that in my case needs to be converted to UTF-8.

Now, here every byte that is not pure ASCII is marked with an equal sign "="
(similar to the "%" in a URL-encoded string) and the following two
characters define the byte value in Hex notation. A non-ASCII character
encoded in UTF-8 takes two, three, or even four such byte values.

Example: "F=C3=BCr". The "=C3=BC" part translates to the German umlaut "ü",
so the whole thing renders as the string "für". The "ü" is not part of pure
ASCII and therefore it is encoded this way. The byte pair C3 BC is specific
to UTF-8.
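
The rule described above can be sketched in a few lines. Python is used here only for illustration (it is not LiveCode), and the helper name qp_to_bytes is made up for this example. The sketch ignores soft line breaks and handles only the "=XX" escapes.

```python
import re

def qp_to_bytes(text: str) -> bytes:
    """Replace each "=XX" escape with the single byte it names; copy the rest."""
    raw = text.encode("ascii")  # quoted-printable text itself is pure ASCII
    return re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),  # hex pair -> one byte
        raw,
    )

decoded_bytes = qp_to_bytes("F=C3=BCr")       # 4 bytes: F, 0xC3, 0xBC, r
decoded_text = decoded_bytes.decode("utf-8")  # interpret the bytes as UTF-8
print(decoded_bytes, decoded_text)
```

The key point is the order: first collect all the bytes, then interpret the whole byte string as UTF-8 in one step.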

Now, as you can see, there is not just one byte represented with "=C3".
There are actually two bytes, "=C3=BC", represented in Hex by "C3" and "BC",
each individually converted to decimal notation as 195 and 188. If you
URL-decode the single bytes using "%" instead of "=", such as "%C3", each
gives its own character, which will be "À". The URL-decoding of "%BC" gives
"Ä". So, this does not help. I have to somehow look at the two bytes
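
A small sketch of why decoding each byte on its own fails, and how the two bytes combine into one code point. It assumes ISO-8859-1 as the single-byte encoding; which stray characters appear depends on the single-byte encoding the decoder actually uses, so they may differ from the ones quoted above. Python is used only for illustration.

```python
pair = bytes([0xC3, 0xBC])  # 195 and 188 in decimal

# Each byte decoded on its own (assumed ISO-8859-1): two unrelated characters.
separately = "".join(bytes([b]).decode("latin-1") for b in pair)

# Both bytes decoded together as UTF-8: one character.
together = pair.decode("utf-8")

# The same result by hand: a two-byte UTF-8 sequence has the bit layout
# 110xxxxx 10yyyyyy, and the code point is the payload bits xxxxxyyyyyy.
by_hand = ((pair[0] & 0x1F) << 6) | (pair[1] & 0x3F)

print(len(separately), repr(together), by_hand)
```

Here by_hand works out to (3 << 6) | 60 = 252, which is the code point of "ü".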

Converting text to Hex gives the correct result in other programs:
-- Link:
-- Enter: "ü"
-- Result: "C3,BC" --- what we are looking for when encoding: two separate
byte representations.
-- But it only works when the character encoding is UTF-8.
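
The dependence on the character encoding can be reproduced with a short sketch (Python for illustration only; the variable names are made up). The same character yields different hex bytes depending on the encoding chosen.

```python
char = "\u00fc"  # the character "ü"

# Hex bytes of the character in UTF-8 and, for comparison, in ISO-8859-1.
utf8_hex = ",".join(f"{b:02X}" for b in char.encode("utf-8"))
latin1_hex = ",".join(f"{b:02X}" for b in char.encode("latin-1"))

print(utf8_hex)    # C3,BC
print(latin1_hex)  # FC
```

Note that the single ISO-8859-1 byte FC is 252 in decimal, the same number as the Unicode code point of "ü".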

How do I get from "=C3=BC" to codepoint("ü") = 252? What do I need to
How do we decode such a "quoted-printable" encoded string to UTF-8?
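
One possible answer, sketched in Python rather than LiveCode: the standard library module quopri turns the "=XX" escapes (and soft line breaks, a trailing "=" at the end of a line) back into raw bytes, and a UTF-8 decode of those bytes gives the text. The function name decode_qp_body and the sample string are made up for this example, and the charset is assumed to be UTF-8; in a real message it should come from the part's Content-Type header.

```python
import quopri

def decode_qp_body(body: str, charset: str = "utf-8") -> str:
    """Decode a quoted-printable body, then interpret the bytes in `charset`."""
    raw = quopri.decodestring(body.encode("ascii"))  # "=XX" -> raw bytes
    return raw.decode(charset)                       # bytes -> text

# Hypothetical sample with one soft line break ("=" followed by CRLF).
sample = "F=C3=BCr dich: ein weicher Zeilenum=\r\nbruch"
text = decode_qp_body(sample)
codepoint = ord(decode_qp_body("=C3=BC"))

print(text)
print(codepoint)
```

In LiveCode the same two steps would presumably apply: build the binary string from the hex pairs first, then convert it in one go with textDecode(theBytes, "UTF-8"), rather than converting byte by byte.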

Thanks in advance...)