Character Encoding

Character Encoding

masmit
Does anyone know which character encoding Apple uses to display  
strings in iTunes?

I'm reading some text info from the m4a tags produced by iTunes, and  
there seem to be some double-byte characters in there. I have no  
experience of this, and know nothing, so I'm grasping at straws.

As an example of what I get, for 'Esoterik' (where the 2nd, lower-case
e should have an acute accent over it), I'm getting:

Esot <char value 195><char value 169> rik

What it should be, under whatever character set Rev defaults to on
Mac OS X, is a single char of value 142.

Any help on this would be great.

Thanks,

Mark
_______________________________________________
use-revolution mailing list
[hidden email]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution

Re: Character Encoding

xtalkprogrammer
Hi Mark,

I haven't tested it, but it might be UTF-8.

Best,

Mark

--

Economy-x-Talk
Consultancy and Software Engineering
http://economy-x-talk.com
http://www.salery.biz

Salery is the easiest way to get your own web store on-line:
http://www.salery.biz/salery.html



Kind regards,

Drs. Mark Schonewille

Economy-x-Talk Consultancy and Software Engineering
Homepage: http://economy-x-talk.com
Twitter: http://twitter.com/xtalkprogrammer
Facebook: http://facebook.com/LiveCode.Beginner
KvK: 50277553

Re: Character Encoding

Thierry Arbellot-2
In reply to this post by masmit
This is UTF-8 encoding.

Using the uniEncode and/or uniDecode functions, you should be able to
translate it to extended ASCII.
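
For anyone who wants to check the byte values outside Rev, here is a minimal Python sketch (not Rev code, and not from the thread) of what is going on. It assumes the tag bytes are UTF-8 and that Rev's default character set on Mac OS X is MacRoman, which is consistent with the value 142 in the original question. The variable names are made up for illustration.

```python
# Bytes as read from the m4a tag: "Esot", 195, 169, "rik".
tag_bytes = b"Esot" + bytes([195, 169]) + b"rik"

# Read as UTF-8, the pair 195,169 becomes one character (e with acute).
text = tag_bytes.decode("utf-8")
print(text)  # Esotérik

# Re-encoded as MacRoman (assumed native charset), that character is one byte.
native_bytes = text.encode("mac_roman")
print(native_bytes[4])  # 142
```

So the two "chars" 195 and 169 are one UTF-8 sequence, not two separate characters.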

I hope it helps,
Thierry


Re: Character Encoding

masmit
Mark and Thierry, thanks.

What seems to work, where iTunesStr is what I extract from the m4a tag:

put unidecode(uniencode(iTunesStr,"UTF8")) into decodedStr

This works on my (English) system, but are there gotchas that would
apply when using this, for instance, on a French or German system?

Best,

Mark


Re: Character Encoding

xtalkprogrammer
Mark,

The language of the system, whether Dutch, German, or English, is not
a problem. You would have problems if you tried to decode and display
tags that are actually written in Unicode and cannot be displayed with
normal ASCII characters. I'd try to display the tags as Unicode text;
there is a reason why they are encoded like that. Write me off-list
if you need help with Unicode in Rev.
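
A short Python sketch of the problem described here (Python rather than Rev, purely for illustration): text with no equivalent in the native single-byte character set cannot survive the conversion. MacRoman is an assumption for the native charset, and `to_native` is a hypothetical helper name.

```python
def to_native(utf8_bytes: bytes, charset: str = "mac_roman") -> bytes:
    """Transcode UTF-8 bytes to a single-byte charset, replacing any
    character the charset cannot represent with '?'."""
    return utf8_bytes.decode("utf-8").encode(charset, errors="replace")

# An accented Latin title fits in MacRoman.
print(to_native("Esotérik".encode("utf-8")))  # b'Esot\x8erik'

# A Japanese title does not: every character is lost.
print(to_native("音楽".encode("utf-8")))  # b'??'
```

This is why keeping such tags as Unicode text is the safer route when the tags may hold non-Latin titles.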

Best regards,

Mark


Re: Character Encoding

masmit
Mark, thanks for the offer, I may take you up on it. I'm trying to
avoid having to make users of id3lib use 'set the unicodeText of fld
"Title" to id3libGetTitle()' rather than just 'put id3libGetTitle()
into fld "Title"'.

In the case of m4a tags, since it seems like they're simply UTF-8, it
should be OK to simply uniEncode and uniDecode to whatever the default
encoding is on a user's system. ID3 tags, on the other hand, may use
UTF-16, which may well have to be dealt with differently.
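
As a rough illustration of why ID3 needs different handling: in ID3v2, each text frame starts with one byte that says how the rest is encoded (0 = ISO-8859-1, 1 = UTF-16 with a byte order mark, and in ID3v2.4 also 2 = UTF-16BE and 3 = UTF-8). The Python sketch below is not id3lib code. `decode_id3_text` is a hypothetical helper, and it assumes the frame body has already been extracted from the tag.

```python
# Encoding byte at the start of an ID3v2 text frame -> Python codec name.
ID3_ENCODINGS = {
    0: "latin-1",    # ISO-8859-1
    1: "utf-16",     # UTF-16 with BOM (the codec consumes the BOM)
    2: "utf-16-be",  # UTF-16BE without BOM (ID3v2.4)
    3: "utf-8",      # UTF-8 (ID3v2.4)
}

def decode_id3_text(frame_body: bytes) -> str:
    """Decode a text frame body: one encoding byte followed by the text."""
    if not frame_body:
        return ""
    codec = ID3_ENCODINGS.get(frame_body[0])
    if codec is None:
        raise ValueError(f"unknown ID3 text encoding byte: {frame_body[0]}")
    # Strip trailing NUL terminators, which some taggers append.
    return frame_body[1:].decode(codec).rstrip("\x00")

latin1_frame = b"\x00Esot\xe9rik"
utf16_frame = b"\x01" + "Esotérik".encode("utf-16")
print(decode_id3_text(latin1_frame) == decode_id3_text(utf16_frame))  # True
```

Once decoded to a Unicode string, the result can be handled the same way as the m4a case.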

Currently, id3lib is only decoding m4a tags, while it's simply  
returning whatever chars it finds in ID3 tags. Maybe I should make  
these things settable by the user of the lib.

Best,

Mark
