Unicode woes


Unicode woes

Richmond Mathewson-2
Anybody know why this doesn't work:

on mouseUp
        set the useUnicode to true
        set the unicodeText of fld "XXXX" to the first char of the unicodeText of fld "ZZZZ"
end mouseUp

???

_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: Unicode woes

xtalkprogrammer
Hi Richmond,

It doesn't work because the unicodeText is double-byte (UTF-16) data, while "the first char" actually gets only the first byte, so you get just the first half of the first character.

This might work, depending on the textFont setting and maybe writing direction:

on mouseUp
        -- char 1 to 2 = the two bytes that make up the first character
        set the unicodetext of fld "xxx" to char 1 to 2 of the unicodetext of fld "zzz"
end mouseUp
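
To make the byte-versus-character point concrete, here is a minimal sketch in Python rather than LiveCode. It assumes the unicodeText is UTF-16 with two bytes per character; the sample string and variable names are made up for illustration.

```python
# Sketch only: mimics a byte-oriented "char" chunk applied to UTF-16 data.
text = "Жук"                       # hypothetical field contents
utf16 = text.encode("utf-16-le")   # two bytes per character for this text

first_byte = utf16[0:1]   # what "the first char" picks up: half a character
first_pair = utf16[0:2]   # what "char 1 to 2" picks up: one whole character

print(len(utf16))                       # 6
print(first_pair.decode("utf-16-le"))   # Ж
```

The single byte on its own is not valid UTF-16, which would explain why the original script puts nothing useful into the target field.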

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
KvK: 50277553


On 13 aug 2011, at 15:43, Richmond Mathewson wrote:

> Anybody know why this doesn't work:
>
> on mouseUp
> set the useUnicode to true
> set the unicodeText of fld "XXXX" to the first char of the unicodeText of fld "ZZZZ"
> end mouseUp
>
> ???




Re: Unicode woes

Richmond Mathewson-2
Thank you very much, I'll give it a go.


Re: Unicode woes

Richmond Mathewson-2
In reply to this post by xtalkprogrammer
On 08/13/2011 05:04 PM, Mark Schonewille wrote:
> Hi Richmond,
>
> It doesn't work because the first char actually gets the first byte and thus you only get the first half of the first character.
>
> This might work, depending on the textFont setting and maybe writing direction:
>
> on mouseUp
> set the unicodetext of fld "xxx" to char 1 to 2 of the unicodetext of fld "zzz"
> end mouseUp

That works very well indeed.

HOWEVER . . . my textField contains mixed Unicode and ASCII text [spaces,
Latin letters] . . . so I am obviously going to have a jolly time
detecting whether each char falls below the 255 single-byte limit and then, if
it does, treating it as a single-byte char, and if it doesn't, lining it up
with the next char to give a double-byte char.
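
One thing that may simplify this: if the unicodeText really is UTF-16, plain ASCII characters are stored as two bytes as well, so stepping through the data two bytes at a time should work without any per-character detection. A minimal Python sketch of that idea follows; the helper name `utf16_chars` and the sample text are hypothetical, and characters outside the Basic Multilingual Plane (which need four bytes) are deliberately ignored.

```python
def utf16_chars(data: bytes) -> list[str]:
    """Split little-endian UTF-16 bytes into characters, two bytes at a time.

    Assumes every character fits in one 16-bit unit (no surrogate pairs).
    """
    return [data[i:i + 2].decode("utf-16-le") for i in range(0, len(data), 2)]


mixed = "a Жb".encode("utf-16-le")   # hypothetical mixed Latin/Cyrillic text
print(len(mixed))                    # 8: two bytes for every character
print(utf16_chars(mixed))            # ['a', ' ', 'Ж', 'b']
```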


Re: Unicode woes

Malte Brill
In reply to this post by Richmond Mathewson-2

Hey Richmond,

maybe these might come in handy:

-- Sets the text of a field or button from a UTF-8 encoded string
setprop cUTF8Text pUTF8String
    if word 1 of the name of the target <> "field" and word 1 of the name of the target <> "button" then
        -- only complain in the development environment
        if "dev" is in the environment then
            throw "cUTF8Text: Target is not a field or button"
        end if
        exit cUTF8Text
    end if
    if word 1 of the name of the target = "field" then
        set the unicodeText of the target to uniEncode(pUTF8String,"UTF8")
    else
        -- button: store the encoded text and mark the font as Unicode
        set the text of the target to uniEncode(pUTF8String,"UTF8")
        set the textFont of the target to ",UNICODE"
    end if
end cUTF8Text

-- Returns the text of a field or button as a UTF-8 encoded string
getprop cUTF8Text
    if word 1 of the name of the target <> "field" and word 1 of the name of the target <> "button" then
        if "dev" is in the environment then
            throw "cUTF8Text: Target is not a field or button"
        end if
        exit cUTF8Text
    end if
    if word 1 of the name of the target = "field" then
        return uniDecode(the unicodeText of the target,"UTF8")
    else
        -- button: only decode when the font has been marked as Unicode
        if ",UNICODE" is in the textFont of the target then
            return uniDecode(the text of the target,"UTF8")
        end if
    end if
end cUTF8Text

Keep them at stack level or in a library stack. Use them the following way:

put the cUTF8Text of field "ContainsMixedChars" into tUTF8
--> tUTF8 now holds a correctly UTF-8 encoded string
--> UTF-8 uses more than one byte per character only when needed, and things
--> like the itemDel do work as expected on UTF-8 text

put "aoi" after tUTF8

set the cUTF8Text of field "ContainsMixedChars" to tUTF8
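
The reason delimiters keep working on UTF-8 text can be sketched in Python (not LiveCode). The sample string is invented; the only assumption is that the data is valid UTF-8.

```python
utf8 = "Straße,割,abc".encode("utf-8")   # hypothetical mixed text

# A comma (byte 0x2C) never occurs inside a multi-byte UTF-8 sequence,
# so splitting the raw bytes on it cannot cut a character in half.
items = [part.decode("utf-8") for part in utf8.split(b",")]
print(items)   # ['Straße', '割', 'abc']

# ASCII characters stay one byte each; other characters take two to four.
print([len(ch.encode("utf-8")) for ch in "aß割"])   # [1, 2, 3]
```

Appending plain ASCII such as "aoi" to the UTF-8 string is safe for the same reason: ASCII bytes are already valid UTF-8.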

I have taken these scripts from the Unicode lib that I am still trying to set up. If there is any interest, I'll upload it to revOnline and announce it on the list.

Cheers,

Malte



Ideal Unicode?

kee nethery
In reply to this post by Richmond Mathewson-2
In my perfect programming world ...

I'd want every character, everywhere characters are displayed or entered, to be treated as a Unicode character and represented as UTF8 bytes.

If the display version has "割劥" I'd want the language to recognize those as two characters and as 6 bytes.

I want UTF8 instead of UTF16 because UTF8 is the same byte stream regardless of processor endian-ness and more importantly, the entire web uses UTF8.
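
For what it's worth, the behaviour described here can be sketched in Python, which is used only as an illustration and says nothing about how LiveCode stores text. The variable names are made up.

```python
s = "割劥"

print(len(s))                    # 2 characters
print(len(s.encode("utf-8")))    # 6 bytes

# UTF-8 has one byte order; UTF-16 comes in little- and big-endian flavours,
# so the same text yields two different byte streams.
le = s.encode("utf-16-le")
be = s.encode("utf-16-be")
print(le != be)                  # True
```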

Is this crazy talk or would this be your ideal programming system for unicode?

Kee Nethery

Re: Ideal Unicode?

xtalkprogrammer
Hi Kee,

No, this isn't crazy at all. In fact, this is pretty standard among today's software products, particularly text editors. Most programming environments are capable of doing this. The availability of both char and byte in the LiveCode language indicates that RunRev plans to do the same for LiveCode, but so far it hasn't happened.

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
KvK: 50277553


On 15 aug 2011, at 23:32, Kee Nethery wrote:

> In my perfect programming world ...
>
> I'd want all characters all the time for any place characters are displayed to be displayed and entered as unicode characters and represented as UTF8 bytes.
>
> If the display version has "割劥" I'd want the language to recognize those as two characters and as 6 bytes.
>
> I want UTF8 instead of UTF16 because UTF8 is the same byte stream regardless of processor endian-ness and more importantly, the entire web uses UTF8.
>
> Is this crazy talk or would this be your ideal programming system for unicode?
>
> Kee Nethery



Re: Ideal Unicode?

Jeffrey Massung
In reply to this post by kee nethery
On Mon, Aug 15, 2011 at 3:32 PM, Kee Nethery <[hidden email]> wrote:

> In my perfect programming world ...
>
> I'd want all characters all the time for any place characters are displayed
> to be displayed and entered as unicode characters and represented as UTF8
> bytes.
>
> If the display version has "割劥" I'd want the language to recognize those as
> two characters and as 6 bytes.
>
> I want UTF8 instead of UTF16 because UTF8 is the same byte stream
> regardless of processor endian-ness and more importantly, the entire web
> uses UTF8.
>
> Is this crazy talk or would this be your ideal programming system for
> unicode?
>
>
You had me up until UTF8 for everything. While I understand the sentiment,
this has the potential to absolutely *suck* the performance out of LC apps.
UTF8 is great because plain ASCII text is already valid UTF8. Other than that,
it's an absolute PITA to work with because you can't just grab data. You
can't say "give me the 104th character" of a string w/o traversing the 103
characters preceding it, because they may be 1, 2, or more bytes long each.

Now, think about all the LC scripts out there that do things like "replace the
second character of the fourth word of the fifth line of myString with ...."
There are a lot. Similarly, getting the length of a string would require going
through the whole string to do so.
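
The traversal cost can be sketched in Python. This is an illustration under assumptions, not how any LiveCode engine works: the function names are hypothetical and the input is assumed to be valid UTF-8.

```python
def utf8_seq_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def utf8_char_at(data: bytes, n: int) -> str:
    """Return the n-th character (1-based); has to walk past the first n-1."""
    pos = 0
    for _ in range(n - 1):
        pos += utf8_seq_len(data[pos])
    return data[pos:pos + utf8_seq_len(data[pos])].decode("utf-8")


def utf8_length(data: bytes) -> int:
    """Count characters by scanning every byte and skipping continuation bytes."""
    return sum(1 for b in data if b & 0xC0 != 0x80)


data = "aß割z".encode("utf-8")   # 1 + 2 + 3 + 1 = 7 bytes
print(utf8_char_at(data, 3))     # 割
print(utf8_length(data))         # 4
```

Both helpers are linear in the amount of data before (or in) the string, which is the cost being described.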

Now, there are ways around this performance hit, but they all require using
more memory. And if you are already using more memory, why not just use wide
characters for everything anyway? Your end-user will never know the
difference, and you can write data out using UTF8 if it's more convenient,
and read it in that way. But, internally, I'd prefer using a fixed 16 or
32 bits for each character and knowing that when I ask for character 103843
of a very large buffer, I get it back in O(1) time.
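
The fixed-width alternative can be sketched the same way in Python, here with 32 bits per character. The helper name `utf32_char_at` is hypothetical; the sketch only shows the offset arithmetic and the memory trade-off.

```python
def utf32_char_at(data: bytes, n: int) -> str:
    """Return the n-th character (1-based) by computing its offset directly."""
    start = 4 * (n - 1)          # every character occupies exactly 4 bytes
    return data[start:start + 4].decode("utf-32-le")


text = "aß割z"
wide = text.encode("utf-32-le")

print(utf32_char_at(wide, 3))       # 割
print(len(wide))                    # 16 bytes, fixed width
print(len(text.encode("utf-8")))    # 7 bytes, variable width
```

No scanning is needed to find a character, at the price of more than double the storage for this sample.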

But that's just me. ;-)

Of course, this all assumes Rev is moving towards 5.0 and the new field
which is 100% unicode. But over the past month or so several questions have
been asked about progress on this front with zero feedback - not even an
acknowledgement that the question was asked.

/shrug

Jeff M.

Re: Ideal Unicode?

Richmond Mathewson-2
"Ideal Unicode"

Will never exist for a few simple reasons:

1. Unicode is a moving target (the Unicode Consortium keeps producing
updated versions).

2. Computers and computer operating systems are moving targets.

If one is prepared to invest one hell of a lot of time (and I have)
poking around in LiveCode one can do almost everything with Unicode,
admittedly often in a cumbersome and counterintuitive sort of way.


Re: Unicode woes

Richmond Mathewson-2
In reply to this post by Malte Brill
On 08/15/2011 10:51 PM, Malte Brill wrote:

> Hey Richmond,
>
> maybe these might come in handy:
>

Why?


Re: Unicode woes

Paul Dupuis
In reply to this post by Malte Brill
On 8/15/2011 3:51 PM, Malte Brill wrote:
> I have taken these scripts from the unicode lib that I am still trying to set up. If there is any interested, I'll upload it to revOnline and announce it on the list.
>
> Cheers,
>
> Malte

Malte,

Thank you. I for one would love any Unicode library anyone cares to
post. Currently Unicode in LiveCode is one of the technology items that
requires a LOT of fussing with to get anything to work.

I'd also ask Kevin (RunRev) to please consider posting an update on the
new Field object and "Unicode that just works" as we approach the end of
Q3 2011 next month. The guesstimated time frame for 5.0, the version that
will hopefully contain the new field and Unicode support, was suggested
as late Q4 2011 or Q1 2012. As Q3 2011 comes to an end in September,
some update would really help those of us who have product development
plans tied to that next LiveCode version (or at least the features we
hope will be in it).

--
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/



Re: Unicode woes

Stephen Barncard-4
Roadmaps shouldn't be discussed on the use-list...  nda ...



On 16 August 2011 11:06, Paul Dupuis <[hidden email]> wrote:

> I'd also ask Kevin (RunRev) to please consider posting an update on the new
> Field object and "Unicode that just works" as we approach the end of Q3 2011
> next month. The guesstimated time frame for 5.0, the version that will
> hopefully contain the new field and Unicode support, was suggested as late
> Q4 2011 or Q1 2012. As Q3 2011 comes to an end in September, some update
> would really help those of us who have product development plans tied to
> that next LiveCode version (or at least the features we hope will be in it).
>
--



Stephen Barncard
San Francisco Ca. USA

more about sqb  <http://www.google.com/profiles/sbarncar>

Re: Unicode woes

Paul Dupuis
Err - my bad. Posted to the wrong list! Arrgh!

On 8/16/2011 2:24 PM, stephen barncard wrote:

> Roadmaps shouldn't be discussed on the use-list...  nda ...

--
Paul Dupuis
Cofounder
Researchware, Inc.
http://www.researchware.com/

