the mouseText and Unicode

Slava Paperno
Hi, List! It's me again with my Unicode problems in bilingual fields. I just
spent four hours on this problem and gave up.

I'm trying to retrieve the text of the word the user clicked on in a locked
field. As far as I can tell, the mouseChunk and the mouseText are useless
with Unicode, and especially with mixed Unicode and Roman text.

I tried using the mouseCharChunk, and then to decrement the char position
until I hit a word delimiter and to increment the char position until I hit
a word delimiter, but the problem is that a Roman character (including
punctuation) is one byte long while a non-Roman character is two bytes, and
I cannot interpret the two positions that the mouseCharChunk returns unless
I know which characters in the field are Roman, and which are non-Roman--and
of course I can't know that.

Is there an equivalent to the mouseText, or the mouseChunk, or the
mouseCharChunk that is Unicode-aware? Or some ingenious way to retrieve the
word that the user clicked?

I see a similar problem coming up when I start using hyperlinks in bilingual
texts: retrieving the text of a chunk whose textStyle is set to "link"
involves a very similar issue.

Slava



_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: the mouseText and Unicode

J. Landman Gay
On 6/17/11 11:07 PM, Slava Paperno wrote:

> Is there an equivalent to the mouseText, or the mouseChunk, or the
> mouseCharChunk that is Unicode-aware? Or some ingenious way to retrieve the
> word that the user clicked?

Does the "word" delimiter work? If so, you could try this in the field:

on mouseUp
   get word 2 of the mouseCharChunk
   put the number of words in char 1 to it of me into tWordNum
   put word tWordNum of me into tWordClicked
   -- do something with tWordClicked
end mouseUp

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com

the mouseText and Unicode

Slava Paperno (Bridge)
Aha! The idea of using "the number of words" is brilliant.

Unfortunately, the word boundaries don't "quite" work, and I think it's for
the same reason: I think the mouseCharChunk reports bytes, not characters.
It is Unicode-smart in some ways: when you click a non-Roman letter, it
returns "char N to N+1 of field X." When you click a Roman letter in a
double-byte field it returns "char N to N of field X."
 
In this string, for example:

Саша, Наташа, Митя, Роберт, Robert, Jeffrey, and Соня Петрова, Слава
Паперно, Лора Баглай, Макс Паперно

clicking on the first letter in word 4 returns 3 for the number of words.

This is what I have in the mouseUp handler of the field:

put word 2 of the mouseCharChunk into locStart
put locStart && the number of words in char 1 to locStart of me

Clicking on the space between word 1 and word 2 (the space after the first
comma) displays 10 for locStart. It is indeed byte 10, but if the entire
field were treated as double-byte text, it should be 11. So each Russian
letter is counted as two bytes and two characters, but the comma is counted
as one byte and one character. This screws up my math.
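
The arithmetic above can be sketched outside LiveCode. The following Python snippet is only an illustration of the storage model described in this message (one unit per ASCII character, two per non-ASCII character); the function names are hypothetical and it does not call any LiveCode API.

```python
def unit_width(ch: str) -> int:
    """Width in the mixed model described above: 1 for ASCII, 2 otherwise."""
    return 1 if ord(ch) < 128 else 2


def mixed_offset_to_char_index(text: str, offset: int) -> int:
    """Map a 1-based mixed-width offset to a 1-based character index."""
    position = 0
    for index, ch in enumerate(text, start=1):
        position += unit_width(ch)
        if offset <= position:
            return index
    raise ValueError("offset is past the end of the text")


sample = "Саша, Наташа, Митя"

# Offset 10 is the space after the first comma: 4 letters * 2 + 1 comma + 1.
space_index = mixed_offset_to_char_index(sample, 10)

# If every character took two bytes, the same space would start at byte 11.
uniform_offset = (space_index - 1) * 2 + 1
print(space_index, uniform_offset)
```

Under that assumption, an offset reported by the field has to be converted like this before it can be compared with character or word positions.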

Strangely, clicking anywhere other than the first letter of the fourth word
in that sample string returns the correct word position.

So I'm still in the dark.

Thanks, Jacqueline!

Slava

Re: the mouseText and Unicode

Richmond Mathewson-2
On 06/18/2011 09:07 AM, Slava Paperno wrote:
> Aha! The idea of using "the number of words" is brilliant.
>
> Unfortunately, the word boundaries don't "quite" work, and I think it's for
> the same reason: I think the mouseCharChunk reports bytes, not characters.
> It is Unicode-smart in some ways: when you click a non-Roman letter, it
> returns "char N to N+1 of field X." When you click a Roman letter in a
> double-byte field it returns "char N to N of field X."

I think you may be right about the double-byte stuff; or that LiveCode looks
for things encoded in the ASCII set, and treats everything "else" as space.

However (not having any Russian, I have used Bulgarian) I set up a stack
just now with 2 flds, "fSTART" and "fEND", and typed
'Всичко е глупости или лайно.' into fld "fSTART".

I then typed this into a button:

on mouseUp
   set the useUnicode to true
   set the unicodeText of fld "fEND" to the unicodeText of word 3 of fld "fSTART"
end mouseUp

and ended up with 'глупости' in fld "fEND".

Try it . . .  :)

Love, Richmond.


Re: the mouseText and Unicode

Richmond Mathewson-2
In reply to this post by Slava Paperno (Bridge)
http://andregarzia.on-rev.com/richmond/BOX/Paperno.rev.zip

Re: the mouseText and Unicode

xtalkprogrammer
In reply to this post by J. Landman Gay
Hi Jacque,

That's wrong for Unicode, because it will catch one NULL too many, either at the beginning or the end of the string, depending on whether it is little or big endian (which depends on the processor that LiveCode is running on).

If you're really desperate, you might use this approach and delete that extra NULL.

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
KvK: 50277553

New: Download the Installer Maker Plugin 1.6 for LiveCode here http://qery.us/ce

Re: the mouseText and Unicode

xtalkprogrammer
In reply to this post by Slava Paperno (Bridge)
Hi Slava,

Instead of simply pasting the text into LiveCode, do the following:

set the unicodeText of fld x to the clipboarddata["unicode"]

Now you can do this:

on mouseDown
  select the clickText
  set the unicodeText of fld 2 to the unicodeText of the selection
end mouseDown

or whatever you would like to do with the clickText. This works fine on Intel processors; I'm not sure what happens if you try this on a PPC processor.

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
KvK: 50277553

New: Download the Installer Maker Plugin 1.6 for LiveCode here http://qery.us/ce

Re: the mouseText and Unicode

Richmond Mathewson-2
In reply to this post by xtalkprogrammer

>>> Is there an equivalent to the mouseText, or the mouseChunk, or the
>>> mouseCharChunk that is Unicode-aware? Or some ingenious way to retrieve the
>>> word that the user clicked?

doing THIS:

on mouseDown
   set the useUnicode to true
   set the unicodeText of fld "fEND" to the unicodeText of the clickText
end mouseDown

only works with the FIRST word!

changing  clickText  to  clickChunk  gets exactly the same result.
--------------------------------------------------------

so, here I am mucking around with the itemDelimiter...

the Documentation states that I can write this:

set the itemDelimiter to numToChar(32)

but this throws an error message: "execution error . . . (Chunk: source is
not a character) near " ", char 4"

which would suggest the Docs are wrong?

these:  set the itemDelimiter to " "  / set the itemDelimiter to space

also result in NO JOY.

Re: the mouseText and Unicode

xtalkprogrammer
Hi Richmond,

The itemDel can be only one char/byte long. If the useUnicode is true, then numToChar(32) returns 2 bytes. Hence the error. The docs are correct.
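
As an illustration of the size mismatch (in Python rather than LiveCode, and assuming the two-byte value is UTF-16, which this thread does not state outright), a space is one byte as ASCII but two bytes once encoded as UTF-16:

```python
space = " "

# One byte when stored as plain ASCII.
ascii_bytes = space.encode("ascii")

# Two bytes (0x20 followed by 0x00) when stored as little-endian UTF-16.
utf16_bytes = space.encode("utf-16-le")

print(len(ascii_bytes), len(utf16_bytes))
```

A delimiter property that accepts exactly one byte would reject the second form.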

--
Best regards,

Mark Schonewille

Economy-x-Talk Consulting and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
KvK: 50277553

New: Download the Installer Maker Plugin 1.6 for LiveCode here http://qery.us/ce

RE: the mouseText and Unicode

Slava Paperno (Bridge)
In reply to this post by xtalkprogrammer
A very interesting discussion! Thanks, Richmond--I tried all the same
tricks, with the same baffling results you're reporting. The first word
always works.

Mark--I've used your suggestion about "the unicodeText of selection" to great
advantage in a different context (when the user has selected some text
manually), but I can't figure out how to use it here. "select the clickText"
and "select the mouseText" work correctly only "half the time": In my sample
string, clicking words 1, 2, 3, or 4 selects word 1. Clicking words 5, 7, 9,
and 11 selects the word I click, but clicking other words doesn't.

Maybe that's because I haven't followed your first direction, "set the
unicodeText of fld x to the clipboarddata["unicode"]". I'm not sure how I
would do that. My texts never come from the clipboard. They are either typed
in the field or retrieved from a database or read from text files on the
local disc.

You also said "if you're desperate, delete that extra null byte." That's
intriguing. I don't care about PPC machines. But I can't find any null bytes
in my fields. I expected a null byte to be added to each Roman character,
e.g. 0x00AF. But when I use "put byte N of field X" for the Roman
characters, I can't find a null byte anywhere. Roman characters seem to be
represented by one byte in my two-byte fields. I know this doesn't sound
right, so I must be missing something.

Slava

RE: the mouseText and Unicode

BNig
Hi Slava,

I tried your example of mixed Unicode and ASCII words. Using the word technique and the htmlText I did this:

-----------------------------
on mouseUp
   get word 2 of the clickCharChunk
   put the number of words in char 1 to it of me into tWordNum
   put word 1 to (the number of words in char 1 to it of me) of me into tWords
   put the htmlText of  word tWordNum of me into tWordClicked
   put tWordClicked into field 3
   set the htmlText of field 2 to tWordClicked
end mouseUp
-------------------------------
There are 3 fields. The first contains the Unicode/ASCII mix of your example; I pasted it into the field from the browser, and it looked good.
The second field is where your clicked word goes.
The third field is where the html of the clicked word goes.

It worked quite well except for Роберт; he must have Chinese ancestors :)

All other words came over as they should. The html of the word even makes it easy to parse out the comma.

Maybe this is a lead to your problem? Or I am way off; I don't know anything about Unicode.

Kind regards

Bernd

RE: the mouseText and Unicode

Slava Paperno (Bridge)
Thanks, Bernd! Using the html is something I didn't try, but otherwise your results are exactly the same as mine: The Russian Robert (Роберт) is the fourth word, yet clicking its first letter reports it as word 3.

Yes, Robert's Chinese ancestors are the culprit here, of course :)

The Chinese characters are displayed whenever you get the bytes wrong, e.g. when you try to display char 10 to char 11 while the actual double-byte character is char 11 to 12. When all text is non-Roman in a double-byte field, it's easy not to make that mistake (always start with an odd number), but when some characters are Roman (like that first comma in my example), the mouseCharChunk fails to account for the null byte in front of it, and reports the next character (space) incorrectly. That's my theory at this point... it may be wrong.

The exasperating thing, for me, is that getting word N of a string is not a problem, and neither is locating the position of a word (once you know its characters). It's identifying the word position of the mouse click that is screwed up.

Thanks again... If we ever find a sure-fire way to do this, I'll post the solution here.

Enjoy your weekend,

Slava

RE: the mouseText and Unicode

BNig
Slava,

Роберт gave up:
-----------------------
on mouseUp
   get word 4 of the clickCharChunk
   put the number of words in char 1 to it of me into tWordNum
   select word tWordNum of me
   put the htmlText of  the selectedtext into tWordClicked
   put tWordClicked into field 3
   set the htmlText of field 2 to tWordClicked
end mouseUp
----------------------

Note: get word 4 of the clickCharChunk (not word 2).

Without selecting the word first, it did not work.

This works, but don't ask me why.
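
One possible reading of why word 4 helps, sketched in Python: assuming the chunk expression has the form reported earlier in this thread ("char N to N+1 of field X"), word 2 is the start offset and word 4 is the end offset, so for a two-byte letter word 4 points at its second byte and "char 1 to it" covers the whole letter. The helper name below is hypothetical and the chunk string is a made-up example.

```python
def chunk_bounds(chunk: str) -> tuple:
    """Split a chunk expression such as 'char 11 to 12 of field 1'
    into its start offset (word 2) and end offset (word 4)."""
    words = chunk.split()
    return int(words[1]), int(words[3])


# A click on a two-byte letter, in the format reported in this thread.
start, end = chunk_bounds("char 11 to 12 of field 1")
print(start, end)
```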

Kind regards
Bernd

Re: the mouseText and Unicode

J. Landman Gay
In reply to this post by Slava Paperno (Bridge)
On 6/18/11 7:32 AM, Slava Paperno wrote:

> You also said "if you're desperate, delete that extra null byte." That's
> intriguing. I don't care about PPC machines. But I can't find any null bytes
> in my fields. I expected a null byte to be added to each Roman character,
> e.g. 0x00AF. But when I use "put byte N of field X" for the Roman
> characters, I can't find a null byte anywhere. Roman characters seem to be
> represented by one byte in my two-bytes fields. I know this doesn't sound
> right, so I must be missing something.

You aren't missing anything, that's how it works right now. If it's
ASCII, it doesn't get that extra byte. We will all be so happy when real
Unicode gets implemented.

Here is a shudderingly, terrifyingly ugly hack that seems to work, at
least with the sample text you posted:

on mouseUp
   get word 2 of the mouseCharChunk
   lock messages
   lock screen
   put the unicodeText of me into tOrigText
   replace space with "|" in me
   set the itemdel to "|"
   put the number of items in char 1 to it of me into tWordNum
   set the unicodeText of me to tOrigText
   put word tWordNum of me && tWordNum
   unlock screen
   unlock messages
end mouseUp

If you don't lock messages, it takes forever. There may be a way to
improve this, but that's the idea.

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com

Re: the mouseText and Unicode

J. Landman Gay
On 6/18/11 1:34 PM, J. Landman Gay wrote:

> Here is a shudderingly, terrifyingly ugly hack that seems to work, at
> least with the sample text you posted:

Never mind, it's off by one. I didn't test each letter. :(

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com

RE: the mouseText and Unicode

Slava Paperno (Bridge)
When I use word 4 of the mouseCharChunk (instead of word 2), following
Bernd's procedure, then your first method "the number of words in char 1 to
(word 4 of the mouseCharChunk) of me" works for all words in my sample, and
I think so does this "terrifying hack." I'm still testing with longer and
more varied texts...

Thanks to everyone!

Slava

RE: the mouseText and Unicode

BNig
In reply to this post by BNig
Slava,

this also works

-----------------------------
on mouseUp
   get word 4 of the clickCharChunk
   put the number of words in char 1 to it of me into tWordNum
   select word tWordNum of me
   put the unicodeText of  the selectedtext into tWordClicked
   set the unicodeText of field 2 to tWordClicked
end mouseUp
------------------------------

Kind regards

Bernd

the mouseText and Unicode: the Russian letter R

Slava Paperno (Bridge)
In reply to this post by BNig
Something is different about the Russian upper case R (Р, decimal 1056, hex 0420; you can copy it from Character Map in Windows). That's why Robert misbehaves, and so do Rwanda, Rhodesia, and even the USSR.

Seriously, when the word СССР is clicked on any of the first three letters, and I use one of the hacks we've been discussing, the first three letters of the word are displayed (but not the fourth one, the R). When I click the fourth letter (the Russian R) a rectangle is displayed instead of the word--a placeholder we often see for a character that is missing in the font. It is not a font problem--the word is displayed fine in the original field where I click it, and the same font is used in all fields.
 
Curiouser and curiouser...

Slava

Re: the mouseText and Unicode: the Russian letter R

Malte Brill
> Something is different about the Russian upper case R
byteToNum(byte 1 of the Russian letter R) is 32, which is (TADA) SPACE and thus a word delimiter. Did I mention I hate Unicode and the way it currently (not) works?

Cries bitter tears.

Malte
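
The byte value is easy to check outside LiveCode. This Python sketch assumes the field stores the letter as little-endian UTF-16 (consistent with the Intel machines mentioned in this thread, but still an assumption) and shows that the first byte of U+0420 is the same value as an ASCII space:

```python
letter = "\u0420"  # CYRILLIC CAPITAL LETTER ER, decimal 1056

little_endian = letter.encode("utf-16-le")  # low byte first
big_endian = letter.encode("utf-16-be")     # high byte first

first_byte = little_endian[0]
print(first_byte, first_byte == ord(" "))
```

On a big-endian layout the 0x20 byte would come second instead, which fits the earlier remark that the behaviour may differ on PPC.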


Re: the mouseText and Unicode: the Russian letter R

BNig
In reply to this post by Slava Paperno (Bridge)
Slava,

although Роберт is a nice guy he must give in:

I tried with

Саша, Наташа, Митя, Роберт, Robert, Jeffrey, and Соня Петрова, СССР, ССРРСС Слава
Паперно, Лора Баглай, Макс, Паперно
 Роберт, СССР, ССРРСС

-------------------
on mouseUp
   -- word 4 of the chunk expression is the end offset of the click
   get word 4 of the clickCharChunk
   put it into tSelPos
   put 0 into tStartSel
   
   -- scan backwards for the nearest space, return, or comma
   repeat with i = tSelPos down to 1
      put the htmlText of char i of field 1 into tHTML
      if  (tHTML contains  "<p> <" or tHTML is "<p></p>" or tHTML contains ">,<") then
         put i into tStartSel
         exit repeat
      end if
   end repeat
   
   -- scan forwards for the next space, return, or comma
   put the number of chars of field 1 into tEndSel
   repeat with i = tSelPos to the number of chars of field 1
      put the htmlText of char i of field 1 into tHTML
      if  (tHTML contains  "<p> <" or tHTML is "<p></p>" or tHTML contains ">,<") then
         put i into tEndSel
         exit repeat
      end if
   end repeat
   
   -- the clicked word lies between the two delimiters
   select char tStartSel + 1 to tEndSel -1 of me
   
   put the htmlText of  the selectedtext into tWordClicked
   -- put tWordClicked into field 3
   set the htmlText of field 2 to tWordClicked
end mouseUp
-------------------

please watch out for linebreaks

Selecting a word of either the Unicode kind or the Roman kind works with the above code.


I now test the htmlText for space, return and comma. I scan down and up from the clickCharChunk position until one of these tests is true. Then I exit the scan, 'declare' what lies between the two delimiters to be a word, select that word in the field and get the htmlText of the selectedText.
It should also work with the unicodeText of the selectedText instead of the htmlText I am using now.
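
The same scan, reduced to its logic, looks like this in Python. It is only a sketch of the idea: it works on character indexes in an ordinary string, so it sidesteps the byte-offset problem entirely, and the delimiter set and function name are assumptions made for the example.

```python
DELIMITERS = {" ", ",", "\n"}


def word_at(text: str, index: int) -> str:
    """Return the run of non-delimiter characters around a 0-based index."""
    if text[index] in DELIMITERS:
        return ""
    # Walk left until the character before `start` is a delimiter.
    start = index
    while start > 0 and text[start - 1] not in DELIMITERS:
        start -= 1
    # Walk right until the character after `end` is a delimiter.
    end = index
    while end < len(text) - 1 and text[end + 1] not in DELIMITERS:
        end += 1
    return text[start:end + 1]


sample = "Саша, Наташа, Митя, Роберт, Robert, СССР"
clicked = word_at(sample, sample.index("Роберт"))
print(clicked)
```

Unlike the handler above, this version keeps the last character when the clicked word is the final word of the text.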

If I look at the charToNum of Р (Russian R) I see it is made of ASCII 32 and ASCII 9. ASCII 32 being a space, maybe that is a clue to why it throws LiveCode off. I would consider this a, well, anomaly, and can only hope for LiveCode to eventually support Unicode more completely.

you said "Curiouser and curiouser..."
I would say "uglier and uglier..."

Kind regards

Bernd