UTF-8 encoding

Post by jPV » Fri Feb 24, 2017 2:02 pm

It looks like there are a couple of issues with UTF-8 encoded text, at least on MorphOS, where I've been testing it.

1) UTF-8 encoded text doesn't seem to work at all with bitmap fonts. At least, I haven't yet found a bitmap font that works with it in Hollywood, although the same fonts do work with UTF-8 encoded content in some other programs.

2) With TrueType fonts, UTF-8 text works with the TextOut and CreateTextObject functions, but not with the Print or NPrint functions. According to the documentation, it should work with all of them once the encoding has been set with SetDefaultEncoding.

Code:
s$="Räikkönen" ; String with UTF-8 data
SetDefaultEncoding(#ENCODING_UTF8)

; Test with a truetype font
SetFont(#SANS,15)
NPrint(s$) ; Doesn't print correctly
Print(s$)  ; Doesn't print correctly
CreateTextObject(1,s$)
DisplayTextObject(1,0,30) ; Prints correctly
TextOut(0,45,s$) ; Prints correctly

; Test with a bitmap font
SetFont(#BITMAP_DEFAULT,8)
CreateTextObject(1,s$)
DisplayTextObject(1,0,100) ; Doesn't print correctly
TextOut(0,120,s$) ; Doesn't print correctly

WaitLeftMouse
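
For readers unfamiliar with this kind of symptom: one plausible reading of "doesn't print correctly" is that the UTF-8 bytes were drawn as if each byte were one character of a single-byte encoding. The thread does not show the actual output, so this is only an assumption, and ISO-8859-1 is picked purely for illustration. The sketch below is Python, not Hollywood code, and the helper name `misread_utf8_as_latin1` is hypothetical:

```python
# Illustration only: what UTF-8 bytes look like when each byte is treated
# as one ISO-8859-1 character. Assumption: the broken Print()/NPrint()
# output resembled this; the thread itself does not show it.

def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as ISO-8859-1."""
    utf8_bytes = text.encode("utf-8")      # non-ASCII letters become 2 bytes each
    return utf8_bytes.decode("iso-8859-1")  # every byte is rendered on its own

s = "Räikkönen"
garbled = misread_utf8_as_latin1(s)

print(len(s), "characters in the original string")
print(len(s.encode("utf-8")), "bytes once encoded as UTF-8")
print(len(garbled), "characters when the bytes are misread")
```

The two accented letters take two bytes each in UTF-8, so a renderer that ignores the encoding shows two characters where one was intended.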

Re: UTF-8 encoding

Post by airsoftsoftwair » Mon Feb 27, 2017 4:34 pm

Thanks for the report, but both of these have already been fixed. The next Hollywood version will introduce support for Unicode, and I discovered these two issues myself while implementing it. Here are the two entries from the history:

Code:
- Fix: Bitmap fonts didn't support #ENCODING_UTF8
...
- Fix: Contrary to the description in the documentation, Print() and NPrint() didn't respect
  the encoding that was set using SetDefaultEncoding()

Re: UTF-8 encoding

Post by midwan » Mon May 01, 2017 2:16 pm

I found an issue with UTF-8 encoding in Hollywood 7.0, though it is different from the ones above.

If we @INCLUDE a file in our project, and that included file is saved as Unicode or UTF-8 (in other words, non-ANSI), an error occurs as soon as Hollywood attempts to load that file: "Invalid symbol at line 1".

For the record, line 1 contained the beginning of a comment block in my file, like so:

Code:
/*
** Some comment
*/
code...


If I re-saved the file as ANSI, then I could work with it normally.

Steps to recreate:
1. Start a project which @INCLUDEs a separate file
2. Save that second file as UTF-8
3. Attempt to run the project

Re: UTF-8 encoding

Post by airsoftsoftwair » Tue May 02, 2017 8:53 pm

Yes, this is a known bug in @INCLUDE. As a workaround, just save your UTF-8 files without a BOM (byte order mark). Then there won't be any problems @INCLUDEing them. The error only happens with UTF-8 files that have a BOM.
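
If your editor cannot save without a BOM, the workaround can also be applied after the fact. The following is a minimal sketch in Python (an external helper, not part of Hollywood) that removes a leading UTF-8 BOM, i.e. the bytes EF BB BF, from a file. The function names and the file name in the usage note are hypothetical:

```python
import codecs
from pathlib import Path

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file without its BOM. Returns True if a BOM was removed."""
    p = Path(path)
    original = p.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        p.write_bytes(cleaned)  # only touch the file when something changed
        return True
    return False
```

Usage would look like `strip_bom_in_file("myinclude.hws")`, where the file name is only a placeholder for whichever file you @INCLUDE. Only the three BOM bytes at the very start are removed; the rest of the file, including any UTF-8 text in it, is left untouched.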

Re: UTF-8 encoding

Post by midwan » Wed May 03, 2017 9:30 pm

Ah, thanks for that.

I assume this will be fixed in a future update then?

Re: UTF-8 encoding

Post by airsoftsoftwair » Sun May 07, 2017 5:41 pm

Of course.

Re: UTF-8 encoding

Post by airsoftsoftwair » Thu Aug 10, 2017 9:33 pm

Code:
- Fix: @INCLUDE didn't handle the UTF-8 BOM correctly

