Translating escape sequences

J. Landman Gay via use-livecode
I'm dealing with non-English languages, and JSON data retrieved from a
database comes in with Unicode escape sequences like this: Eduardo
Ba\u00f1uls.

I need to translate those. I can do it by replacing the "\u" with "0x"
and then using numToCodepoint() to get the UTF-16 character. But there
could be many of these in the same string, so I'm looking for a one-shot
command that might just do them all. I don't think we have one.

The alternative is to loop through all the text, getting an offset for
each "\u" and then calculating the number of characters after that to
use with numToCodepoint(). But will it always be 4 characters in any
language?

Or is there an easier way?

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com

_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
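
For comparison with the loop-over-offsets idea in the question, here is a minimal sketch of a one-pass replacement. It is written in Python rather than LiveCode, purely as an illustration. It assumes the JSON convention that every \u escape carries exactly four hex digits, and that characters beyond U+FFFF arrive as two consecutive escapes (a surrogate pair). The name decode_u_escapes is made up for this example, and other JSON escapes (such as an escaped backslash) are deliberately ignored.

```python
import re

# Assumption: JSON-style escapes, i.e. "\u" followed by exactly four hex digits.
ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_u_escapes(text: str) -> str:
    """Replace every \\uXXXX escape in text with the character it encodes."""
    # Pass 1: each escape becomes the UTF-16 code unit it names.
    units = ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Pass 2: a round trip through UTF-16 joins surrogate pairs
    # (e.g. \ud804\uddf4) into one character; lone surrogates pass through.
    data = units.encode("utf-16-le", "surrogatepass")
    return data.decode("utf-16-le", "surrogatepass")


# Hypothetical input taken from the question; the result is "Eduardo Bañuls".
result = decode_u_escapes(r"Eduardo Ba\u00f1uls")
```

Because the pattern consumes exactly four hex digits, whatever follows the escape is left untouched, however long it is.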
Re: Translating escape sequences

Mark Talluto via use-livecode
Does JavaScript have a way to do the translation?


Re: Translating escape sequences

Mark Talluto via use-livecode
What I mean is: retrieve the data through JS to avoid the escape sequences, then translate it to UTF-8 to pass to LC.

Might be too complicated though.


Re: Translating escape sequences

Richmond Mathewson via use-livecode
No; it won't always be 4 characters. Here's an admittedly extremely
obscure ancient Sinhala number: 0x111F4.

Of course the chances of encountering wacky characters like that are
small, but you'll have to make sure you can cope with them should they
crop up.

If you look at Eduardo Ba\u00f1uls, you will have to take what comes
after the '\', strip the prefix 'u' and the suffix 'uls', and then you
can cope with whatever is left.

Reasonably pseudo-code following:

set the itemDelimiter to "\"
put item 2 of tText into HOLDER -- tText is a placeholder name for the string; HOLDER is now "u00f1uls"
delete char 1 of HOLDER -- drop the 'u' prefix
delete char -3 to -1 of HOLDER -- drop the last 3 chars ('uls' in this example)
put "0x" & HOLDER into NUNUM -- "0x00f1"

At this point "NUNUM" could be almost any length, but that should not
matter unduly.

Richmond.
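
On the length question: as far as JSON itself is concerned (RFC 8259), a \u escape always carries exactly four hex digits. A code point above U+FFFF, such as the 0x111F4 mentioned above, is written as two escapes in a row, a UTF-16 surrogate pair. The sketch below, in Python purely for illustration, shows the arithmetic. json_escape is a made-up helper name, not part of any library.

```python
import json


def json_escape(codepoint: int) -> str:
    """Return the JSON \\u escape(s) for a single Unicode code point."""
    if codepoint < 0x10000:
        # Basic Multilingual Plane: one escape, always four hex digits.
        return f"\\u{codepoint:04x}"
    # Beyond the BMP: split the 20-bit offset into a high and a low surrogate.
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return f"\\u{high:04x}\\u{low:04x}"


small = json_escape(0x00F1)         # one escape: \u00f1
large = json_escape(0x111F4)        # two escapes: \ud804\uddf4
builtin = json.dumps(chr(0x111F4))  # Python's own encoder, quotes included
```

So a decoder for JSON data can always read four hex digits after each "\u", and only needs to pair up surrogates afterwards.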


Re: Translating escape sequences

Peter Brett via use-livecode
JsonImport() should handle those automatically.  Please let me know if
it doesn't!

                                         Peter

--
Dr Peter Brett <[hidden email]>

lcb-mode for Emacs: https://github.com/peter-b/lcb-mode
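
As a language-neutral illustration of a JSON parser doing this work, here is a short sketch using Python's standard json module. It stands in for the general idea only; nothing here claims or tests how JsonImport() itself behaves. The record, its keys, and its values are invented for the example.

```python
import json

# Hypothetical payload: raw JSON text exactly as it might arrive from the database.
raw = r'{"name": "Eduardo Ba\u00f1uls", "digit": "\ud804\uddf4"}'

# The parser decodes every \u escape while building the result,
# joining the surrogate pair into the single character U+111F4.
record = json.loads(raw)

name = record["name"]    # "Eduardo Bañuls"
digit = record["digit"]  # one character, code point 0x111F4
```

Parsing the whole document this way also takes care of the other JSON escapes, which a hand-rolled search for "\u" would miss.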


Re: Translating escape sequences

J. Landman Gay via use-livecode
The problem with the pseudo code is that there's no clear indication of how
many characters at the end to preserve. I'm not sure how the libraries deal
with that.

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com
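
One way around the "how many characters to preserve" problem is to count forward from the escape instead of backward from the end of the string. The sketch below (Python, for illustration only) mirrors the offset-based loop from the original question. It assumes JSON-style escapes of exactly four hex digits, handles only characters up to U+FFFF, and uses the invented name decode_by_scanning.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def decode_by_scanning(text: str) -> str:
    """Decode \\uXXXX escapes by searching for each "\\u" in turn."""
    out = []
    pos = 0
    while True:
        hit = text.find("\\u", pos)
        if hit == -1:
            out.append(text[pos:])  # no more escapes: keep the remainder
            break
        digits = text[hit + 2 : hit + 6]  # always the next four characters
        if len(digits) == 4 and all(c in HEX_DIGITS for c in digits):
            out.append(text[pos:hit])
            out.append(chr(int(digits, 16)))
            pos = hit + 6  # resume right after the escape
        else:
            # Not a well-formed escape: keep the "\u" literally and move on.
            out.append(text[pos : hit + 2])
            pos = hit + 2
    return "".join(out)


# Hypothetical input: seven characters follow the escape, all kept as they are.
decoded = decode_by_scanning(r"Ba\u00f1abcdefg")
```

The text after the escape never needs to be measured, so a name with three trailing characters and one with seven are handled the same way.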




Re: Translating escape sequences

Richmond Mathewson via use-livecode
Just knock off the last 3, and what is left is what you want.

Richmond.


Re: Translating escape sequences

J. Landman Gay via use-livecode
What if the user name has seven characters after the escape sequence?

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com


Re: Translating escape sequences

Bob S via use-livecode
Sounds like a mob hit. :-)

Bob S


> On Mar 15, 2017, at 13:16 , Richmond Mathewson via use-livecode <[hidden email]> wrote:
>
> Just knock off the last 3, and what is left is what you want.
>
> Richmond.



Re: Translating escape sequences

Mike Bonner via use-livecode
Does this mean one could replace \u with 0x and then replace uls with empty
and end up with the correct end result?

On Wed, Mar 15, 2017 at 2:16 PM, Richmond Mathewson via use-livecode <
[hidden email]> wrote:

> Just knock off the last 3, and what is left is what you want.
>
> Richmond.

Re: Translating escape sequences

J. Landman Gay via use-livecode
On 3/15/17 4:03 PM, Mike Bonner via use-livecode wrote:
> Does this mean one could replace \u with 0x and then replace uls with empty
> and end up with the correct end result?

Aha. Now I know what's been wrong with my scripts. I've been replacing
*nulls* with empty.

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com


Re: Translating escape sequences

Mark Talluto via use-livecode
(watches as the whole topic zooms over his head)


Re: Translating escape sequences

Richmond Mathewson via use-livecode
Ouch. My excuse is that I was working with the example you supplied.

Richmond.

On 15/03/17 22:36, J. Landman Gay via use-livecode wrote:

> What if the user name has seven characters after the escape sequence?

Re: Translating escape sequences

Richmond Mathewson via use-livecode
Should do.

Richmond.

On 15/03/17 23:03, Mike Bonner via use-livecode wrote:

> Does this mean one could replace \u with 0x and then replace uls with empty
> and end up with the correct end result?