Jumping cursors

Jumping cursors

Richmond Mathewson-2
Hey-Ho: more fun with my endless Devawriter Pro (Grantha Samyuktaksharas
for those who care about that sort of thing).

So, I have a line of code that says this:

set the text of the selectedText to numToCodePoint(0xFF001)

and it does what it is meant to do, i.e. it pops character 0xFF001 at
the end of the line in my text entry field, but I really want the
cursor to end up after that character, not in front of it.

Left "raw", the cursor ends up in front of 0xFF001.

if the line is followed up by something of this sort:

set the text of the selectedText to  "XXX"

the cursor ends up at the end of the line, after the triple Xs . . .

However: if I do this instead

select after fld "fRESULT"

(that's the name of the text entry field), the cursor ends up in front
of the 0xFF001 character . . . and, what is more, it does not allow me
to move the cursor in any way whatsoever to after the 0xFF001 character.

Richmond.
_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: Jumping cursors

J. Landman Gay
On 1/4/17 10:33 AM, Richmond Mathewson wrote:
>
> set the text of the selectedText to numToCodePoint(0xFF001)
>
> and it does do what it is meant to do; i.e. pops character 0xFF001 at
> the end of the line in my
> text entry field, but I really want the cursor to end up after that
> character not in front of it

I'm a little surprised that works at all. The "selectedtext" returns a
string, not a position. I'd use "selectedChunk" which would provide a
character location, enabling you to set the cursor at a specific position.
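
A minimal sketch of that approach, not taken from the thread: it reads
the selectedChunk to find where the selection starts, replaces the
selection, then selects after the last inserted character. The handler
name insertAtCaret is hypothetical, the field name "fRESULT" comes from
the original post, and "put ... into the selection" is assumed here as
the way to replace the selected text. It only positions the insertion
point by character offset; where the field then draws the caret can
still depend on how it classifies the inserted character.

```livecode
-- Hypothetical helper; field "fRESULT" is the field named in the original post.
command insertAtCaret pText
   local tChunk, tStart, tLast
   if pText is empty then exit insertAtCaret

   -- With a bare insertion point the selectedChunk looks like
   -- "char 5 to 4 of field 1"; word 2 is the start position.
   put the selectedChunk into tChunk
   if tChunk is empty then exit insertAtCaret
   put word 2 of tChunk into tStart

   -- Replace whatever is selected (possibly nothing) with the new text.
   put pText into the selection

   -- The new text now occupies chars tStart to tLast; put the caret after it.
   put tStart + the number of chars in pText - 1 into tLast
   select after char tLast of field "fRESULT"
end insertAtCaret
```

It could be called as: insertAtCaret numToCodepoint(0xFF001)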

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com


Re: Jumping cursors

Richmond Mathewson-2
I have been using "the selectedText" without a backward glance extremely
successfully for 6 years.

Richmond.



Re: Jumping cursors

dunbarxx
Neither construction works in HC. You would have had to:

put "foo" into the selectedChunk.

Both work in LC, though, and it is a case where LC is more forgiving than HC.

Craig

Re: Jumping cursors

Kay C Lan
In reply to this post by J. Landman Gay
On Thu, Jan 5, 2017 at 5:26 AM, J. Landman Gay <[hidden email]> wrote:
>
> I'm a little surprised that works at all. The "selectedtext" returns a
> string, not a position. I'd use "selectedChunk" which would provide a
> character location, enabling you to set the cursor at a specific position.
>
Whilst your definitions of 'selectedText' and 'selectedChunk' are
correct, the fact is that 'set the text of the selectedText to "abc"'
does replace whatever text you've hilited with whatever text you've
specified, regardless of whether the text you've hilited is a long
string, a short string or an empty string. The 'normal' result of
doing so is that the cursor ends up at the right-hand end of the
new text, but apparently not so if the new text is
numToCodePoint(0xFF001)

I think Richmond should file a Bug report because it does seem he's
found an anomaly, or at the very least, if there is a valid reason why
this is the case for 0xFF001 (and possibly others) then maybe a Note
in the Dictionary describing this situation would be useful.


Re: Jumping cursors

Ali Lloyd-2
0xFF001 appears to be an invalid unicode character, residing in the private
use area. When I try
> set the text of the selectedText to numToCodepoint(0xff001)
The text is replaced by a character that LiveCode appears to think is RTL,
and the cursor splits as it does when placed 'ahead' of an RTL character in
mixed text. If you replace all the text of a line with it, it will
therefore place the cursor to the left of the character.

Whether this is a bug or not depends on whether 0xFF001 *should* be treated
as RTL or not. I kind of suspect it shouldn't be, but making sure codepoints
from the private use area behave correctly in a field is unlikely to be a
high priority fix, unless you have a good reason for doing it!


Re: Jumping cursors

Richmond Mathewson-2
Um: this could be a "stupid Richmond" case rather than anything else as
I populated cells FF001
to FF01E with Grantha Samyuktaksharas: and those in FF002 and so on
behave perfectly well;
but FF001 could be a non-character which I had overlooked: If one goes
here:

http://www.unicode.org/charts/PDF/UF0000.pdf

information regarding FF001 is not much use . . .

The Range: F0000 - FFFFF is the Unicode Supplementary Private Use
Area-A; a bit like that area
in New Mexico.

Although "The entire plane is dedicated to private use with the
exception of the last two code points."
would seem to imply that FF001 should cause me no problems.

The difficulty with Unicode is that it is a standard that is always
changing, and that document about the Unicode Supplementary Private Use
Area-A is from Unicode version 6 (the current one is version 9).

So, back to the font editor and shift that character down to the other
end of the list . . .

The Unicode Consortium's website suffers from a prolixity that largely
serves to obfuscate rather than explain, indulging in long sentences
full of Latin neologisms.

Richmond.



Re: Jumping cursors

Mark Waddingham-2
On 2017-01-05 09:56, Richmond Mathewson wrote:
> Um: this could be a "stupid Richmond" case rather than anything else
> as I populated cells FF001
> to FF01E with Grantha Samyuktaksharas: and those in FF002 and so on
> behave perfectly well;

This is a case of 'stupid engine', rather than 'stupid Richmond':

http://quality.livecode.com/show_bug.cgi?id=19045

The implementation of the bidi algorithm in the engine is currently
computing surrogate pairs incorrectly. In this case, 0xFF001 is being
read as a character in the arabic script area in the BMP which has the
'Arabic RTL' attribute. This means that it is being treated as an RTL
character when it should not be.

> but FF001 could be a non-character which I had overlooked: If one goes
> here:
>
> http://www.unicode.org/charts/PDF/UF0000.pdf
>
> information regarding FF001 is not much use . . .
>
> The Range: F0000 - FFFFF is the Unicode Supplementary Private Use
> Area-A; a bit like that area
> in New Mexico.
>
> Although "The entire plane is dedicated to private use with the
> exception of the last two code points."
> would seem to imply that FF001 should cause me no problems.

Indeed - end user applications are free to use SPUA-A and SPUA-B for
whatever purpose they wish... With the only caveat that two uses of said
areas might be completely incompatible. (i.e. a font designed for use in
one application which uses these areas, might break horribly in an app
which uses the area for a completely different purpose).

Warmest Regards,

Mark.

--
Mark Waddingham ~ [hidden email] ~ http://www.livecode.com/
LiveCode: Everyone can create apps


Re: Jumping cursors

Richmond Mathewson-2
Thanks for a very clear explanation.

On 1/5/17 11:19 am, Mark Waddingham wrote:
> On 2017-01-05 09:56, Richmond Mathewson wrote:
>> Um: this could be a "stupid Richmond" case rather than anything else
>> as I populated cells FF001
>> to FF01E with Grantha Samyuktaksharas: and those in FF002 and so on
>> behave perfectly well;
>
> This is a case of 'stupid engine', rather than 'stupid Richmond':

Ha, Ha, Ha: possibly the first time ever that it hasn't been the latter :)
>
> http://quality.livecode.com/show_bug.cgi?id=19045

By 'stupid engine' do you mean the LiveCode engine, something else, or
code that has been co-opted
from elsewhere and folded into the LC engine?

>
> The implementation of the bidi algorithm in the engine is currently
> computing surrogate pairs incorrectly.

"The implementation of the bidi algorithm" . . . ouch.

Aah . . . "bidi" means 'BIDIrectional'

Obviously something rather jazzier than my feeble effort:
https://www.dropbox.com/s/rlw0t1ymwoghq5q/SURROGATER.rev.zip?dl=0

I, like a fool, had assumed that post LiveCode 7.0 the engine was,
somehow, avoiding surrogate pairs altogether, rather than fudging
around so things were *very pleasant indeed* for people like me when
leveraging glyphs occupying Unicode areas above the first plane.

Obviously things were slightly too good to be true.

> In this case, 0xFF001 is being read as a character in the arabic
> script area in the BMP which has the 'Arabic RTL' attribute. This
> means that it is being treated as an RTL character when it should not be.

Do you have any idea which other surrogate pairs it might be getting wrong?

Until (if ?) things get sorted out that would be a useful reference list
so as to know which Unicode slots
to avoid.

Writing as a lazy slob I feel no screaming urge to go back and recode
all those (0x4FFF6), (0x3EEDA)
hex codes as surrogate pairs . . .

>
>> but FF001 could be a non-character which I had overlooked: If one
>> goes here:
>>
>> http://www.unicode.org/charts/PDF/UF0000.pdf
>>
>> information regarding FF001 is not much use . . .
>>
>> The Range: F0000 - FFFFF is the Unicode Supplementary Private Use
>> Area-A; a bit like that area
>> in New Mexico.
>>
>> Although "The entire plane is dedicated to private use with the
>> exception of the last two code points."
>> would seem to imply that FF001 should cause me no problems.
>
> Indeed - end user applications are free to use SPUA-A and SPUA-B for
> whatever purpose they wish... With the only caveat that two uses of
> said areas might be completely incompatible. (i.e. a font designed for
> use in one application which uses these areas, might break horribly in
> an app which uses the area for a completely different purpose).

My Devawriter Pro application depends on my Devawriter.ttf font which
employs all 3 Private Use Areas to
deliver the conjunct consonants used in the Indian writing systems used
to write Sanskrit.

In fact I have spent nearly as much time developing my font as I have on
the Devawriter Pro application itself.

Obviously my font is not going to be much use outwith my application
beyond displaying
HTML, RTF and PDF documents derived from the application.

>
> Warmest Regards,
>
> Mark.
>
Best,

Richmond.

Re: Jumping cursors

Mark Waddingham-2
On 2017-01-05 11:01, Richmond Mathewson wrote:
> Ha, Ha, Ha: possibly the first time ever that it hasn't been the latter
> :)
>>
>> http://quality.livecode.com/show_bug.cgi?id=19045
>
> By 'stupid engine' do you mean the LiveCode engine, something else, or
> code that has been co-opted
> from elsewhere and folded into the LC engine?

Specifically the internal routine which fetches the Unicode 'properties'
for a run of characters is currently computing a surrogate pair's
codepoint incorrectly - in this case U+0FF001 is being treated as U+07BC
- which is an undefined codepoint and as such the property info being
fetched (in this case, BiDi class) is undefined.
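
As a worked check of that arithmetic (a sketch, not the engine's actual
code): U+FF001 is stored as the UTF-16 pair 0xDBBC 0xDC01. Standard
decoding gives 0x10000 + (0x3BC * 0x400) + 0x001 = 0xFF001, whereas
swapping the roles of the two surrogates and dropping the 0x10000
offset gives 0x3BC + (0x001 * 0x400) = 0x7BC. The transposed expression
is the one used for tWrongCodepoint in the listing script further down;
the mouseUp wrapper and the variable names are only for illustration.

```livecode
on mouseUp
   local tString, tLeading, tTrailing, tCorrect, tTransposed
   put numToCodepoint(0xFF001) into tString

   -- The two UTF-16 codeunits that make up the surrogate pair.
   put codepointToNum(codeunit 1 of tString) into tLeading
   put codepointToNum(codeunit 2 of tString) into tTrailing

   -- Standard decoding: the leading surrogate carries the high 10 bits.
   put 0x10000 + ((tLeading - 0xD800) * 1024) + (tTrailing - 0xDC00) into tCorrect

   -- Roles swapped and no 0x10000 offset, modelling the faulty lookup.
   put (tLeading - 0xD800) + ((tTrailing - 0xDC00) * 1024) into tTransposed

   -- Show both results in hexadecimal.
   put "correct:" && baseConvert(tCorrect, 10, 16) & return & \
         "transposed:" && baseConvert(tTransposed, 10, 16)
end mouseUp
```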

> I, like a fool, had assumed that post LiveCode 7.0 the engine was,
> somehow, avoiding surrogate pairs
> altogether, rather than fudging around so things were *very pleasant
> indeed* for people like me when
> leveraging glyphs occupying Unicode areas above the first plane.
>
> Obviously things were slightly too good to be true.

The engine does 'automatically' deal with surrogate pairs in UTF-16.
Indeed, the fact that they exist at all in the engine's internal
representation is generally not something the developer has to worry
about (modulo bugs, like the one above).

You can use the codeunit chunk to access a string's individual UTF-16
components, codepoint chunk to access a string as a sequence of actual
codepoints, and char to access a string as a sequence of graphemes
(approximation to what most people call 'letters' or 'characters').
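
A small sketch of the difference between the three chunk types,
assuming a string made of "A" followed by the supplementary-plane
codepoint discussed above (the handler and variable names are
hypothetical). One would expect 3 codeunits (one for "A", two for the
surrogate pair), 2 codepoints and 2 chars.

```livecode
on mouseUp
   local tString, tUnits, tPoints, tChars
   put "A" & numToCodepoint(0xFF001) into tString

   -- UTF-16 components: the supplementary codepoint takes two of these.
   put the number of codeunits in tString into tUnits

   -- Actual Unicode codepoints.
   put the number of codepoints in tString into tPoints

   -- Graphemes, i.e. what most people would call characters.
   put the number of chars in tString into tChars

   put tUnits & comma & tPoints & comma & tChars
end mouseUp
```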

> Do you have any idea which other surrogate pairs it might be getting
> wrong?
>
> Until (if ?) things get sorted out that would be a useful reference
> list so as to know which Unicode slots
> to avoid.

This should list all the codepoints in the SPUA-A which will cause
directionality problems (due to incorrect property lookup):

    local tList
    repeat with tCodepoint = 0xF0000 to 0xFFFFD
       -- Build the string so its two UTF-16 codeunits can be inspected
       get numToCodepoint(tCodepoint)

       local tLeading, tTrailing
       put codepointToNum(codeunit 1 of it) into tLeading
       put codepointToNum(codeunit 2 of it) into tTrailing

       -- Deliberately reproduce the engine's incorrect surrogate combination
       local tWrongCodepoint
       put (tLeading - 0xD800) + ((tTrailing - 0xDC00) * 2^10) into tWrongCodepoint

       -- Look up the bidi class of the codepoint the engine mistakenly uses
       get codepointProperty(numToCodepoint(tWrongCodepoint), "Bidi Class")
       if it contains "Right_To_Left" or it contains "Arabic" then
          put format("U+0x%6x has wrong bidi class - %s\n", tCodepoint, it) after tList
       end if
    end repeat
    put tList

> Writing as a lazy slob I feel no screaming urge to go back and recode
> all those (0x4FFF6), (0x3EEDA)
> hex codes as surrogate pairs . . .

Doing so wouldn't do you any good anyway. The bug lies in the processing
of the string *after* it has been constructed - whether it is
constructed directly from codepoints, or codeunits wouldn't make a
difference.

I've submitted a PR for a fix to the problem against the 8.1 branch
here:

    https://github.com/livecode/livecode/pull/5020

Warmest Regards,

Mark.

--
Mark Waddingham ~ [hidden email] ~ http://www.livecode.com/
LiveCode: Everyone can create apps


Re: Jumping cursors

Richmond Mathewson-2
Thank you: 373 wonky results!

Well, to be honest, I'm not going to wait for you and yours to sort that
out; I shall use the list to help
me avoid wonky Unicode addresses.

On 1/5/17 1:07 pm, Mark Waddingham wrote:

> This should list all the codepoints in the SPUA-A which will cause
> directionality problems (due to incorrect property lookup):
>
>    local tList
>    repeat with tCodepoint = 0xF0000 to 0xFFFFD
>       get numToCodepoint(tCodepoint)
>
>       local tLeading, tTrailing
>       put codepointToNum(codeunit 1 of it) into tLeading
>       put codepointToNum(codeunit 2 of it) into tTrailing
>
>       local tWrongCodepoint
>       put (tLeading - 0xD800) + ((tTrailing - 0xDC00)  * 2^10) into
> tWrongCodepoint
>
>       get codepointProperty(numToCodepoint(tWrongCodepoint), "Bidi
> Class")
>       if it contains "Right_To_Left" or it contains "Arabic" then
>          put format("U+0x%6x has wrong bidi class - %s\n", tCodepoint,
> it) after tList
>       end if
>    end repeat
>    put tList

Anyone who wants to mess around with this (I am on a Macintosh at the
moment) on Windows or Linux
can download this:

https://www.dropbox.com/s/i8ba0viztujs0dq/bad%20Unicode.livecode.zip?dl=0

>
>> Writing as a lazy slob I feel no screaming urge to go back and recode
>> all those (0x4FFF6), (0x3EEDA)
>> hex codes as surrogate pairs . . .
>
> Doing so wouldn't do you any good anyway. The bug lies in the
> processing of the string *after* it has been constructed - whether it
> is constructed directly from codepoints, or codeunits wouldn't make a
> difference.
>
> I've submitted a PR for a fix to the problem against the 8.1 branch here:
>
>    https://github.com/livecode/livecode/pull/5020

Presumably that also holds for the LiveCode 9 series.
>
> Warmest Regards,
>
> Mark.
>

Best, Richmond.

Re: Jumping cursors

Richmond Mathewson-2
In reply to this post by Mark Waddingham-2
Impressive:

"added this to the 8.1.3-rc-1
<https://github.com/livecode/livecode/milestone/126> milestone"

Has anyone any idea when to expect that release?

Richmond.

<snip>