encoding woes!?

Stephen Barncard via use-livecode
Hi all,

macOS 10.14.6, LC 9.5

I have a file created with BBEdit with this content:
------------------------------------------
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>-</title>
</head>
<body>
<p>äÄüÜöÖßßß</p>
</body></html>
-------------------------------------------
Then I set the url of a browser widget to that file.
All fine so far...

Now I want to add some more text with umlauts to that file:
...
put the htmltext of widget "browser" into tText
## Looks exactly like the above in the debugger!

put "<p>ööääüü</p>" & CR before line -1 of tText
## Last line is the footer -> </body></html>
## And when I set the HTMLtext of widget "browser" to tText, all is fine, too
set the htmltext of widget "browser" to tText

## Then I do:
put textencode(tTExt,"UTF8") into url("file:" & specialfolderpath("desktop") & "/test.html")
...

When I now load that file into the widget:
...
set the url of widget "browser" to (specialfolderpath("desktop") & "/test.html")
...
Umlauts are gone like -> äÄüÜöÖßßß or worse

If I paste the same content (of tText copied from LC) in BBEdit and save the file it looks great
in the browser widget and Safari.

What am I missing or doing wrong?
Clueless... :-/

Thanks a lot in advance!


Best

Klaus
--
Klaus Major
https://www.major-k.de
[hidden email]


_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: encoding woes!?

Stephen Barncard via use-livecode


> On 29.10.2019 at 18:24, Klaus major-k via use-livecode <[hidden email]> wrote:
>
> Hi all,
>
> macOS 10.14.6, LC 9.5
>
> I have a file created with BBEdit with this content:
> ------------------------------------------
> <html><head>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <title>-</title>
> </head>
> <body>
> <p>äÄüÜöÖßßß</p>
> </body></html>
> -------------------------------------------
> Then I set the url of a browser widget to that file.
> All fine so far...
>
> Now I want to add some more text with umlauts to that file:
> ...
> ## put the htmltext of widget "browser" into tText
> ## Looks exactly like the above in the debugger!

## it makes NO difference if I first do this instead:
put textdecode( the htmltext of widget "browser","Native") into tText
...

--
Klaus Major
https://www.major-k.de
[hidden email]



Re: encoding woes!?

Stephen Barncard via use-livecode
Binfile?

Thanks,
Brian
On Oct 29, 2019, 1:33 PM -0400, Klaus major-k via use-livecode <[hidden email]>, wrote:

>
>
> > On 29.10.2019 at 18:24, Klaus major-k via use-livecode <[hidden email]> wrote:
> >
> > Hi all,
> >
> > macOS 10.14.6, LC 9.5
> >
> > I have a file created with BBEdit with this content:
> > ------------------------------------------
> > <html><head>
> > <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> > <title>-</title>
> > </head>
> > <body>
> > <p>äÄüÜöÖßßß</p>
> > </body></html>
> > -------------------------------------------
> > Then I set the url of a browser widget to that file.
> > All fine so far...
> >
> > Now I want to add some more text with umlauts to that file:
> > ...
> > ## put the htmltext of widget "browser" into tText
> > ## Looks exactly like the above in the debugger!
>
> ## it makes NO difference if I first do this instead:
> put textdecode( the htmltext of widget "browser","Native") into tText
> ...
>
> --
> Klaus Major
> https://www.major-k.de
> [hidden email]
>
>

Re: encoding woes!?

Stephen Barncard via use-livecode
Hi Brian,

> On 29.10.2019 at 18:44, Brian Milby via use-livecode <[hidden email]> wrote:
>
> Binfile?

Tried that, no difference.
The dictionary also uses just FILE in the examples.

> Thanks,
> Brian
> On Oct 29, 2019, 1:33 PM -0400, Klaus major-k via use-livecode <[hidden email]>, wrote:
>>
>>
>>> On 29.10.2019 at 18:24, Klaus major-k via use-livecode <[hidden email]> wrote:
>>>
>>> Hi all,
>>>
>>> macOS 10.14.6, LC 9.5
>>>
>>> I have a file created with BBEdit with this content:
>>> ------------------------------------------
>>> <html><head>
>>> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>>> <title>-</title>
>>> </head>
>>> <body>
>>> <p>äÄüÜöÖßßß</p>
>>> </body></html>
>>> -------------------------------------------
>>> Then I set the url of a browser widget to that file.
>>> All fine so far...
>>>
>>> Now I want to add some more text with umlauts to that file:
>>> ...
>>> ## put the htmltext of widget "browser" into tText
>>> ## Looks exactly like the above in the debugger!
>>
>> ## it makes NO difference if I first do this instead:
>> put textdecode( the htmltext of widget "browser","Native") into tText
>> ...

Best

Klaus

--
Klaus Major
https://www.major-k.de
[hidden email]



Re: encoding woes!?

Stephen Barncard via use-livecode
Did you check the encoding of test.html
with BBEdit (should be Unicode (UTF-8))?




Re: encoding woes!?

Stephen Barncard via use-livecode
Hello Hermann,

> On 29.10.2019 at 19:14, hh via use-livecode <[hidden email]> wrote:
>
> Did you check the encoding of test.html
> with BBEdit (should be Unicode (UTF-8))?

yes, BBEdit shows "Unicode (UTF-8)" in the bottom bar.
That is what's puzzling me...


Best

Klaus
--
Klaus Major
https://www.major-k.de
[hidden email]



Re: encoding woes!?

Stephen Barncard via use-livecode
Just saw it now, overlooked that first:

Your script says
put textencode(tTExt,"UTF8") into url ("file:" & ...)

This should read
put textDecode(tTExt,"UTF8") into url ("file:" & ...)

=====

But why use files?
If you don't want to type directly in the browser widget,
you can make it mirror a text field.

Example of a primitive HTML editor:

1. Write into fld "HTML"
<html><head><meta charset="utf-8"></head>
<body>[[ht]]</body></html>
(You can now hide fld "HTML").

2. Then make a field "Tippse" and script it:

on textchanged
  put the htmltext of me into ht
  replace "</p>" with "<br>" in ht
  replace "<p>" with empty in ht
  set htmltext of widget "browser" to merge(fld "HTML")
end textchanged

on enterInField
  textchanged
end enterInField

Then you can do whatever your text field allows:
set the textfont (which has some problems), textsize, styles etc.,
and the widget will display it (well, approximately...)

[Another variant is to write directly into the DOM structure
of the current HTMLtext.]


Re: encoding woes!?

Stephen Barncard via use-livecode
Hello Hermann,

> On 29.10.2019 at 20:02, hh via use-livecode <[hidden email]> wrote:
>
> Just saw it now, overlooked that first:
>
> Your script says
> put textencode(tTExt,"UTF8") into url ("file:" & ...)
>
> This should read
> put textDecode(tTExt,"UTF8") into url ("file:" & ...)

mhhh, when I have this in fld 1:
########################################
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>-</title>
</head>
<body>
<p>äÄüÜöÖßßß</p>
</body></html>
###########################################

And do
...
put textDecode(tTExt,"UTF8") into url ("file:" & ...)
...

The resulting file looks like:
########################################
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>-</title>
</head>
<body>
<p></p>
</body></html>
###########################################
All umlauts are gone!?

With:
...
put textencode(tText,"UTF8") into url ("file:" & ...)
...
It looks fine in Safari and a browser widget!?

> But why use files?

That's none of your business! :-D

Joke aside, the file gets saved in my Dropbox and I share it
with another user of LC!

> If you don't want to type directly in the browser widget,
> you can make it mirror a text field.
>
> Example of a primitive HTML editor:
>
> 1. Write into fld "HTML"
> <html><head><meta charset="utf-8"></head>
> <body>[[ht]]</body></html>
> (You can now hide fld "HTML").
>
> 2. Then make a field "Tippse" and script it:
>
> on textchanged
>  put the htmltext of me into ht
>  replace "</p>" with "<br>" in ht
>  replace "<p>" with empty in ht
>  set htmltext of widget "browser" to merge(fld "HTML")
> end textchanged
>
> on enterInField
>  textchanged
> end enterInField
>
> Then you can do what your textfield allows
> set textfont (bears some problems), textsize, styles etc.
> and the widget will display it (well, approximately...)
>
> [Another variant is to write directly into the DOM structure
> of the current HTMLtext.]

Thanks, stored for possible future use. :-)


Best

Klaus
--
Klaus Major
https://www.major-k.de
[hidden email]



Re: encoding woes!?

Stephen Barncard via use-livecode
Hi all,

I don't really understand this, but this finally does the job:
...
put "</body></html>" after tText
## NO encoding before setting the htmltext!?
set the htmltext of widget "source" to tText

put textencode(tText,"UTF-8") into tNewText
put tNewText into url("file:" & Kommunikations_ordner() & "kommunikation.html")
...
Now, when I load that file into the widget again, it looks as expected, phew! :-)
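
The sequence that worked here can be wrapped in a small handler. This is only a minimal sketch: the handler name saveHtmlAsUtf8 and its parameters are made up for illustration and do not come from the thread. It writes through "binfile:" rather than "file:", on the assumption that already-encoded UTF-8 bytes are best written without any further text conversion. Klaus reported no difference between the two in his test, so treat that choice as a precaution rather than the confirmed fix.

```livecode
-- Hypothetical helper (name and parameters are assumptions, not from the thread)
command saveHtmlAsUtf8 pWidgetName, pFilePath
   local tText, tBytes
   -- Take the htmlText as it is; no encoding or decoding at this point
   put the htmlText of widget pWidgetName into tText
   -- Convert the text to UTF-8 bytes only at the moment of writing
   put textEncode(tText, "UTF-8") into tBytes
   -- Assumption: binfile: writes the bytes unchanged
   put tBytes into URL ("binfile:" & pFilePath)
   -- The result is empty if the write succeeded
   return the result
end saveHtmlAsUtf8
```

It could be called like this, for example:
saveHtmlAsUtf8 "browser", specialFolderPath("desktop") & "/test.html"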

> On 29.10.2019 at 19:44, Klaus major-k via use-livecode <[hidden email]> wrote:
>
> Hello Hermann,
>
>> On 29.10.2019 at 19:14, hh via use-livecode <[hidden email]> wrote:
>>
>> Did you check the encoding of test.html
>> with BBEdit (should be Unicode (UTF-8))?
> yes, BBEdit shows "Unicode (UTF-8)" in the bottom bar.
> That is what's puzzling me...

Best and thanks to all

Klaus

--
Klaus Major
https://www.major-k.de
[hidden email]



Re: encoding woes!?

Stephen Barncard via use-livecode
Hi Klaus,

this is dangerous because the code could end up with "mixed" encodings
if you (or your partner) edit the code in text editors with
different encoding settings.

I looked again carefully into your first post.
You use the htmltext of the widget, so your original code is correct.
It works here as you would like.

So delete the file test.html and write it afresh from LC using
put textencode(the htmltext of widget "browser","UTF8") \
  into url("file:" & specialfolderpath("desktop") & "/test.html")
 
Possibly there was some problem when you created the file yourself.
I set the default encoding in BBEdit to UTF-8.

Also, LiveCode uses a BOM. If there is no UTF-8 BOM, it uses
Mac OS Roman. That is what you see when you type into msg
put textEncode(the htmltext of widget "Browser", "UTF-8").

If you later use text from the DOM structure of your HTML, you
have to decode it before using it in LC or writing it to a file.

I use this always together with base64Encoding (because javascript
handlers need ONE line of code in quotes).
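
One way to read the base64 remark is sketched below for the direction from LC into the page. It is an illustration under assumptions, not hh's actual code: the handler name sendTextToPage is hypothetical, the widget is assumed to be named "browser", and the JavaScript side uses the common atob/escape/decodeURIComponent idiom to turn the UTF-8 bytes back into text.

```livecode
-- Hypothetical helper (not from the thread): passes text to the page as ONE quoted line
command sendTextToPage pText
   local tB64, tJS
   -- Encode to UTF-8 bytes first, then to base64 (plain ASCII, safe inside quotes)
   put base64Encode(textEncode(pText, "UTF-8")) into tB64
   -- base64Encode may wrap long output; strip line breaks to keep a single line
   replace return with empty in tB64
   -- atob() yields a byte string; escape + decodeURIComponent reads it as UTF-8
   put "document.body.insertAdjacentHTML('beforeend', " & \
         "decodeURIComponent(escape(atob('" & tB64 & "'))));" into tJS
   do tJS in widget "browser"
end sendTextToPage
```

For the opposite direction (text coming out of the DOM), the same idea applies in reverse: base64-encode in JavaScript, then base64Decode and textDecode(..., "UTF-8") in LC.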


Re: encoding woes!?

Stephen Barncard via use-livecode
Hello Hermann,

> On 29.10.2019 at 22:06, hh via use-livecode <[hidden email]> wrote:
>
> Hi Klaus,
>
> this is dangerous because the code could end up with "mixed" encodings
> if you (or your partner) edit the code in text editors with
> different encoding settings.

we both use the same stack, so no text editor involved.

> I looked again carefully into your first post.
> You use the htmltext of the widget, so your original code is correct.
> It works here as you would like.
>
> So delete the file test.html and write it new by LC using
> put textencode(the htmltext of widget "browser","UTF8") \
>  into url("file:" & specialfolderpath("desktop") & "/test.html")
>
> Possibly there was some problem when you self-created the file.
> I set the default encoding in BBEdit to UTF-8.

I used BBEdit with encoding set to UTF-8 for the "skeletal structure"
of the HTML file, HEADER and a little inline CSS.

> Also LiveCode uses a BOM. If there is no UTF-8 BOM it uses
> Mac OS Roman. That is what you see when you type into msg
> put textEncode(the htmltext of widget "Browser", "UTF-8").

Which shows exactly the same (with umlauts and stuff) as simply:
put the htmltext of widget "Browser"

> If you later use text from the DOM structure of your HTML, you
> have to decode it before using it in LC or writing it to a file.
> I use this always together with base64Encoding (because javascript
> handlers need ONE line of code in quotes).

Well, you have a much deeper knowledge of all this stuff than I will ever have,
and I hardly understand what you are trying to tell me with these last two sentences. :-D

Thanks for your help, much appreciated!


Best

Klaus
--
Klaus Major
https://www.major-k.de
[hidden email]



Re: encoding woes!?

Stephen Barncard via use-livecode
>> Also LiveCode uses a BOM. If there is no UTF-8 BOM it uses
>> Mac OS Roman. That is what you see when you type into msg
>> put textEncode(the htmltext of widget "Browser", "UTF-8").
>
> Which shows exactly the same (with umlauts and stuff) as simply:
> put the htmltext of widget "Browser"

I don't mean the htmltext of the widget that has the file as url.

Set the htmltext of the widget by script to

<html><head><meta charset="utf-8"></head><body><p>
äöüß
</p></body></html>

Then do
put textencode(the htmltext of widget "browser","UTF-8")

What I then get here is the Mac OS Roman rendering:

<html><head><meta charset="utf-8"></head><body><p>
üā
</p></body></html>

That describes (somehow) the problem you had and others will have.
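
The difference between the text and its encoded bytes can be made visible with a small test. This is a minimal sketch; the handler name showUtf8RoundTrip is made up for illustration. The point is that textEncode returns bytes, not text, so displaying its result directly shows each byte as a separate native character, while decoding the bytes again gives back the original text.

```livecode
-- Illustrative only (handler name is an assumption): text versus its UTF-8 bytes
command showUtf8RoundTrip
   local tText, tBytes
   -- Build "äöüß" from code points so that the script itself stays plain ASCII
   put numToCodepoint(228) & numToCodepoint(246) & \
         numToCodepoint(252) & numToCodepoint(223) into tText
   put textEncode(tText, "UTF-8") into tBytes
   -- Four characters, but each of these letters takes two bytes in UTF-8
   put the number of chars in tText && the number of bytes in tBytes & return into msg
   -- Decoding the bytes with the same encoding restores the original text
   put (textDecode(tBytes, "UTF-8") is tText) after msg
end showUtf8RoundTrip
```

So the encoded value is meant to be written to a file or sent over the wire, not displayed or processed further as text.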

Re: encoding woes!?

Stephen Barncard via use-livecode
Hello Hermann,

> On 29.10.2019 at 23:01, hh via use-livecode <[hidden email]> wrote:
>
>>> Also LiveCode uses a BOM. If there is no UTF-8 BOM it uses
>>> Mac OS Roman. That is what you see when you type into msg
>>> put textEncode(the htmltext of widget "Browser", "UTF-8").
>>
>> Which shows exactly the same (with umlauts and stuff) as simply:
>> put the htmltext of widget "Browser"
> I don't mean the htmltext of the widget that has the file as url.
> Set the htmltext of the widget by script to
> <html><head><meta charset="utf-8"></head><body><p>
> äöüß
> </p></body></html>
> Then do
> put textencode(the htmltext of widget browser,"UTF-8")
> I get then here the MacOS Roman encoding:
> <html><head><meta charset="utf-8"></head><body><p>
> üā
> </p></body></html>
>
> That describes (somehow) the problem you had and others will have.

thank you, will try to digest this. :-)

At least I got the stack working as I need it!


Best

Klaus

--
Klaus Major
https://www.major-k.de
[hidden email]

