LC9 and Windows Unicode bug?

Paul Dupuis via use-livecode
I have found a bug in LC9.0.0 under Windows 8.1 that does not appear
under OSX (Mavericks).

When reading a UTF-8 file via Open File, Read From File, Close File (vs
say a Put URL ...into tVar; put textDecode(tVar,"UTF-8") into ...) I am
seeing a problem decoding Traditional Chinese and Japanese characters.

The bug report below has a ZIP file with a test stack and a UTF-8 file
to show the problem

https://quality.livecode.com/show_bug.cgi?id=21316

Can anyone on this list with a Windows 8.1 or Windows 10 system and
LC9.0.0 (stable) spare a few minutes and verify this bug for me?

Thank you in advance.


_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: LC9 and Windows Unicode bug?

Brian Milby via use-livecode
I'm seeing the same results on Mac Sierra / Win 10.  But you can open the
file as binary.

...
open file theFile for binary read
...

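-- Read one line from a file opened for binary read and decode it as UTF-8
Function Readln FILE_NAME
   Read from file FILE_NAME until Return
   Delete last char of it -- drop the trailing Return delimiter
   Return textDecode(it,"UTF-8")
End Readln

-- Read one block, delimited by numToChar(1), and decode it as UTF-8
Function Readbk FILE_NAME
   Read from file FILE_NAME until numToChar(1)
   Delete last char of it -- drop the trailing numToChar(1) delimiter
   Return textDecode(it,"UTF-8")
End Readbk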

When I change to reading the file as binary, I get the same results (as
far as I can tell) in the left and right fields.
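
For anyone who wants the whole file rather than one line or block at a
time, the same binary-read-then-decode idea can be sketched roughly as
below. This is only a sketch under assumptions: readUtf8File and pPath are
made-up names for illustration, the file is assumed to be small enough to
hold in memory, and it has not been run against the test stack in the bug
report.

```livecode
-- Sketch only: readUtf8File and pPath are hypothetical names, not from the test stack.
function readUtf8File pPath
   local tBytes
   -- Open in binary mode so the engine hands back the raw bytes untouched
   open file pPath for binary read
   if the result is not empty then
      return empty -- the file could not be opened
   end if
   read from file pPath until EOF
   put it into tBytes
   close file pPath
   -- Decode once, after all the bytes have been read
   return textDecode(tBytes, "UTF-8")
end readUtf8File
```

It could be called as: put readUtf8File(tPath) into field "Right", where
tPath and the field name are again placeholders.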

On Tue, May 29, 2018 at 4:35 PM, Paul Dupuis via use-livecode <
[hidden email]> wrote:

> I have found a bug in LC9.0.0 under Windows 8.1 that does not appear
> under OSX (Mavericks).
>
> When reading a UTF-8 file via Open File, Read From File, Close File (vs
> say a Put URL ...into tVar; put textDecode(tVar,"UTF-8") into ...) I am
> seeing a problem decoding Traditional Chinese and Japanese characters.
>
> The bug report below has a ZIP file with a test stack and a UTF-8 file
> to show the problem
>
> https://quality.livecode.com/show_bug.cgi?id=21316
>
> Can anyone on this list with a Windows 8.1 or Windows 10 system and
> LC9.0.0 (stable) spare a few minutes and verify this bug for me?
>
> Thank you in advance.

Null character in text fields

Peter Bogdanoff via use-livecode
Hi,

I have a problem with “funny” characters showing in fields of Chinese text in LC projects on PCs with a Chinese version of Windows (7 & 10) installed. Non-Chinese, standard versions of Windows don’t show this. The character displays as a rectangular box. LiveCode 8.3 & 8.9.

When in a script I do:

put nativeCharToNum(char 22 of line 66 of field "Contents")

it returns “0” (a zero).

The htmlText of the field seems to show an entity:

&#0;

On the Web, I see that char 0 is the null character. A search for &#0; returns no results.

So, I’m going to write a script to replace &#0; with empty in the htmlText. Hopefully this will remove it. Our translator in China will have to test the script to see if it works.
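
A minimal sketch of such a script, assuming the entity really does appear
in the htmlText exactly as "&#0;" as shown above. The handler name
stripNullEntities and its parameter are hypothetical, and it has not been
tested on a Chinese Windows system:

```livecode
-- Sketch only: stripNullEntities and pFieldName are hypothetical names.
command stripNullEntities pFieldName
   local tHtml
   put the htmlText of field pFieldName into tHtml
   -- Remove every null-character entity, leaving all other markup alone
   replace "&#0;" with empty in tHtml
   set the htmlText of field pFieldName to tHtml
end stripNullEntities
```

It would be called as: stripNullEntities "Contents"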

Has anyone seen this before? Or know how it could have sneaked into the text? The original text was brought in from English Word files of 1990s vintage, and modified and translated since then.

It is interesting that LC/Windows renders this character this way only on Chinese Windows systems, and gives it this HTML entity.

Peter Bogdanoff
ArtsInteractive




Re: Null character in text fields

Richmond via use-livecode
Right . . .

On 30/5/2018 5:14 am, Peter Bogdanoff via use-livecode wrote:
> Hi,
>
> I have a problem with “funny” characters showing in fields of Chinese text in LC projects on PCs with a Chinese version of Windows (7 & 10) installed. Non-Chinese, standard versions of Windows don’t show this. The character displays as a rectangular box. LiveCode 8.3 & 8.9.
>
> When in a script I do:
>
> put nativeCharToNum(char 22 of line 66 of field "Contents")

#1. Why are you using nativeCharToNum instead of codePointToNum?

Presumably (?) your field "Contents" contains Chinese text.

I am assuming that the codePointToNum of your "0" is 48.

It would help in a big way if you could post "char 22" here.
>
> it returns “0” (a zero).
>
> The htmlText of the field seems to show an entity:
>
> &#0;

When I do this sort of thing:

set the htmlText of fld "htmlF" to numToCodePoint(48)

the field shows a "0"

and

set the htmlText of fld "htmlF" to the text of fld "textF"

produces exactly the same result.
>
> On the Web, I see that char 0 is the null character. A search for &#0; returns no results.
>
> So, I’m going to write a script to replace &#0; with empty in the htmlText. Hopefully this will remove it. Our translator in China will have to test the script to see if it works.

I really wonder what this "&#0;" is.
>
> Has anyone seen this before? Or know how it could have sneaked into the text? The original text was brought in from English Word files of 1990s vintage, and modified and translated since then.
>
> It is interesting that LC/Windows renders this character this way only on Chinese Windows systems, and gives it this HTML entity.
>
> Peter Bogdanoff
> ArtsInteractive

I shall be wandering around the town most of the day doing silly admin
guff for my school, but will try to check back here about every 60 mins!

Best, Richmond.

Re: LC9 and Windows Unicode bug?

Paul Dupuis via use-livecode
Brian,

Thank you for confirming the bug ... and a super thank you for the
workaround! It is brilliant.

On 5/29/2018 8:04 PM, Brian Milby via use-livecode wrote:

> I'm seeing the same results on Mac Sierra / Win 10.  But you can open the
> file as binary.
>
> ...
> open file theFile for binary read
> ...
>
> Function Readln FILE_NAME
>    Read from file FILE_NAME until Return
>    Delete last char of it
>    Return textDecode(it,"UTF-8")
> End Readln
>
> Function Readbk FILE_NAME
>    Read from file FILE_NAME until numToChar(1)
>    Delete last char of it
>    Return textDecode(it,"UTF-8")
> End Readbk
>
> When I change to reading the file as binary, I get the same results (as
> far as I can tell) in the left and right fields.
>
> On Tue, May 29, 2018 at 4:35 PM, Paul Dupuis via use-livecode <
> [hidden email]> wrote:
>
>> I have found a bug in LC9.0.0 under Windows 8.1 that does not appear
>> under OSX (Mavericks).
>>
>> When reading a UTF-8 file via Open File, Read From File, Close File (vs
>> say a Put URL ...into tVar; put textDecode(tVar,"UTF-8") into ...) I am
>> seeing a problem decoding Traditional Chinese and Japanese characters.
>>
>> The bug report below has a ZIP file with a test stack and a UTF-8 file
>> to show the problem
>>
>> https://quality.livecode.com/show_bug.cgi?id=21316
>>
>> Can anyone on this list with a Windows 8.1 or Windows 10 system and
>> LC9.0.0 (stable) spare a few minutes and verify this bug for me?
>>
>> Thank you in advance.