Re: Translating escape sequence


Re: Translating escape sequence

J. Landman Gay via use-livecode

> Jacque wrote:
>
> I'm dealing with non-English languages, and JSON data retrieved from a
> database comes in with unicode escape sequences like this: Eduardo
> Ba\u00f1uls.
>
> I need to translate those. I can do it by replacing the "\u" with "0x"
> and then using numToCodepoint() to get the UTF16 character. But there
> could be many of these in the same string, so I'm looking for a one-shot
> command that might just do them all.
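
For a single escape, the manual approach described above might look like the sketch below. The handler name escapeToChar is made up for illustration, and it assumes its argument is exactly one sequence of the form \uXXXX. LiveCode reads a "0x"-prefixed string as a hexadecimal number, which is what lets numToCodepoint() accept it.

```livecode
-- Hypothetical helper: converts one "\uXXXX" sequence to its character.
function escapeToChar pEscape
   replace "\u" with "0x" in pEscape   -- "\u00f1" becomes "0x00f1"
   return numToCodepoint(pEscape)
end escapeToChar
```

With the example from the question, escapeToChar("\u00f1") should return ñ. A string with many escapes still needs a loop around this, which is the part the question is trying to avoid.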


JSONImport does it.
If the escaped string is not in JSON format, this function wraps it in JSON and then lets JSONImport do its thing.

put deEscape("Eduardo Ba\u00f1uls")

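function deEscape pEscapedText
        -- Wrap the text in a one-key JSON object so JSONImport decodes the \uXXXX escapes.
        -- The wrapper is built with quote directly, so apostrophes in the text are left untouched.
        local tJson, tArray
        put "{" & quote & "1" & quote & ":" & quote & pEscapedText & quote & "}" into tJson
        put JSONImport(tJson) into tArray
        return tArray["1"]
end deEscape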

Roundabout but does the trick.
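
When the data is already a complete JSON document, the wrapper is not needed and JSONImport can be called on it directly. A minimal sketch, assuming the JSON library that provides JSONImport is available; the handler name decodedName and the "name" key are invented for the example.

```livecode
-- Hypothetical example: decode the escapes in a complete JSON object.
function decodedName
   local tJson, tArray
   -- Builds {"name":"Eduardo Ba\u00f1uls"}; LiveCode string literals keep the backslash as-is.
   put "{" & quote & "name" & quote & ":" & quote & "Eduardo Ba\u00f1uls" & quote & "}" into tJson
   put JSONImport(tJson) into tArray
   return tArray["name"]
end decodedName
```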

Jim Lambert
_______________________________________________
use-livecode mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode

Re: Translating escape sequence

J. Landman Gay via use-livecode
Thanks. I actually was using jsonImport() with these strings
successfully (no wrapper required) but it has a bug on Android that
makes it unusable. That's what caused the problem in the first place,
because jsonToArray() doesn't deal with escape sequences.

So I went ahead and wrote a decoder for escaped sequences that works,
but found out I still can't use it. If I replace the escapes before
using jsonToArray(), jsonToArray throws an error; it can't deal with the
UTF16 strings. And I can't run my decoder through the keys of the
converted array after jsonToArray is finished, because they are already
munged into garbage characters by then.

So I'm stuck, I don't see any way to deal with these. I'll put in a bug
report about jsonImport() but it will probably be a while before it gets
fixed.

I hope someone else has an idea.

On 3/14/17 7:13 PM, Jim Lambert via use-livecode wrote:

>
>> Jacque wrote:
>>
>> I'm dealing with non-English languages, and JSON data retrieved from a
>> database comes in with unicode escape sequences like this: Eduardo
>> Ba\u00f1uls.
>>
>> I need to translate those. I can do it by replacing the "\u" with "0x"
>> and then using numToCodepoint() to get the UTF16 character. But there
>> could be many of these in the same string, so I'm looking for a one-shot
>> command that might just do them all.
>
>
> JSONImport does it.
> If the escaped string is not in JSON format this function will wrap it in JSON then let JSONImport do its thing.
>
> put deEscape("Eduardo Ba\u00f1uls")
>
> function deEscape pEscapedText
> put "{'1':'**dummy**'}" into temp
> replace "**dummy**" with pEscapedText in temp
> replace "'" with quote in temp
> put JSONImport(temp)into pArray
> return pArray[1]
> end deEscape
>
> Roundabout but does the trick.
>
> Jim Lambert


--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com


Re: Translating escape sequence

J. Landman Gay via use-livecode
Maybe right after you import the JSON data, preprocess it with something
like this:

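set the lineDelimiter to "\u"
put line 1 of tJsonData into tNewData -- text before the first escape has no hex digits to convert
repeat for each line tLine in line 2 to -1 of tJsonData
     put numToCodepoint("0x" & char 1 to 4 of tLine) & char 5 to -1 of tLine after tNewData
end repeat
put tNewData into tJsonData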

Then go on your merry way. Would that work?

Phil Davis



On 3/14/17 9:28 PM, J. Landman Gay via use-livecode wrote:

> Thanks. I actually was using jsonImport() with these strings
> successfully (no wrapper required) but it has a bug on Android that
> makes it unusable. That's what caused the problem in the first place,
> because jsonToArray() doesn't deal with escape sequences.
>
> So I went ahead and wrote a decoder for escaped sequences that works,
> but found out I still can't use it. If I replace the escapes before
> using jsonToArray(), jsonToArray throws an error; it can't deal with
> the UTF16 strings. And I can't run my decoder through the keys of the
> converted array after jsonToArray is finished, because they are
> already munged into garbage characters by then.
>
> So I'm stuck, I don't see any way to deal with these. I'll put in a
> bug report about jsonImport() but it will probably be a while before
> it gets fixed.
>
> I hope someone else has an idea.

--
Phil Davis



Re: Translating escape sequence

J. Landman Gay via use-livecode

> On 15 Mar 2017, at 3:28 pm, J. Landman Gay via use-livecode <[hidden email]> wrote:
>
> So I'm stuck, I don't see any way to deal with these. I'll put in a bug report about jsonImport() but it will probably be a while before it gets fixed.
>
> I hope someone else has an idea.

I do ;-)

Jansson (the library that mergJSON uses) does actually handle all escaped unicode codepoints just fine. There is, however, an issue with the JSONToArray function. Try patching it like I have done here https://github.com/montegoulding/mergJSON/pull/8

Cheers

Monte




Re: Translating escape sequence

J. Landman Gay via use-livecode
On 3/15/17 12:00 AM, Monte Goulding via use-livecode wrote:

>
>> On 15 Mar 2017, at 3:28 pm, J. Landman Gay via use-livecode
>> <[hidden email]> wrote:
>>
>> So I'm stuck, I don't see any way to deal with these. I'll put in a
>> bug report about jsonImport() but it will probably be a while
>> before it gets fixed.
>>
>> I hope someone else has an idea.
>
> I do ;-)
>
> Jansson (the library that mergJSON uses) does actually handle all
> escaped unicode codepoints just fine. There is, however, an issue
> with the JSONToArray function. Try patching it like I have done here
> https://github.com/montegoulding/mergJSON/pull/8

Cool. :) I'd patch it if I knew how to do that but I don't know enough
about git to even start. But since you've put in a pull request, I can
wait until the next dp.

Thanks much, Monte. I owe you another sandwich.

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com


Re: Translating escape sequence

J. Landman Gay via use-livecode

> On 15 Mar 2017, at 5:08 pm, J. Landman Gay via use-livecode <[hidden email]> wrote:
>
> Cool. :) I'd patch it if I knew how to do that but I don't know enough about git to even start. But since you've put in a pull request, I can wait until the next dp.

I meant you could patch it in your copy of LC. Just edit the script of stack “ws.goulding.script-library.mergjson”. You may need to mess with the permissions of the app bundle if you want to save on Mac.
>
> Thanks much, Monte. I owe you another sandwich.

;-)
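
One way to get at that script from the IDE is the edit command, typed into the message box. This is only a sketch: it assumes the mergJSON library stack is loaded under the name quoted above, and it says nothing about what the patch itself changes (that is in the pull request).

```livecode
-- Assumes the library stack is loaded in the IDE under this name.
edit the script of stack "ws.goulding.script-library.mergjson"
```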

Re: Translating escape sequence

J. Landman Gay via use-livecode
On 3/14/17 11:58 PM, Phil Davis via use-livecode wrote:

> Maybe right after you import the JSON data, preprocess it with something
> like this:
>
> set the lineDelimiter to "\u"
> repeat for each line tLine in tJsonData
>     put numToCodePoint("0x" & char 1 to 4 of tLine) & char 5 to -1 of
> tLine after tNewData
> end repeat
> put tNewData into tJsonData
>
> Then go on your merry way. Would that work?

Alas, no. I did something similar earlier this evening. If you replace
the escaped sequences before running it through jsonToArray, the
function throws an error and gives up. If you run jsonToArray first,
there are no escapes left to process; they have all been converted to
garbage by then.

But Monte has a fix in the pipes. I think he's been lurking here and
fixing bugs before I can report them. He just wants us to think he's
psychic.

I beat him to the bug in jsonImport though.

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com


Re: Translating escape sequence

J. Landman Gay via use-livecode
On 3/15/17 12:00 AM, Monte Goulding via use-livecode wrote:
>  Try patching it like I have done here https://github.com/montegoulding/mergJSON/pull/8

Hey, I figured out how to patch and where they hid your libraries. It
works! :)

--
Jacqueline Landman Gay         |     [hidden email]
HyperActive Software           |     http://www.hyperactivesw.com
