UTF-8 encoding

jPV
Posts: 603
Joined: Sat Mar 26, 2016 10:44 am
Location: RNO

UTF-8 encoding

Post by jPV »

It looks like there are a couple of issues with UTF-8 encoded text, at least on MorphOS, where I've been testing it.

1) UTF-8 encoded text doesn't seem to work at all with bitmap fonts. I haven't yet found a bitmap font that works with it, although the same fonts do work with UTF-8 encoded content in some other programs.

2) With TrueType fonts it works with the TextOut and CreateTextObject functions, but not with the Print or NPrint functions. The documentation says it should work with all of these once the encoding is set with SetDefaultEncoding.

Code:

s$="Räikkönen" ; String with UTF-8 data
SetDefaultEncoding(#ENCODING_UTF8)

; Test with a truetype font
SetFont(#SANS,15)
NPrint(s$) ; Doesn't print correctly
Print(s$)  ; Doesn't print correctly
CreateTextObject(1,s$)
DisplayTextObject(1,0,30) ; Prints correctly
TextOut(0,45,s$) ; Prints correctly

; Test with a bitmap font
SetFont(#BITMAP_DEFAULT,8)
CreateTextObject(1,s$)
DisplayTextObject(1,0,100) ; Doesn't print correctly
TextOut(0,120,s$) ; Doesn't print correctly

WaitLeftMouse
airsoftsoftwair
Posts: 5433
Joined: Fri Feb 12, 2010 2:33 pm
Location: Germany

Re: UTF-8 encoding

Post by airsoftsoftwair »

Thanks for the report, but both of these have already been fixed. The next Hollywood version will introduce Unicode support, and I discovered these two issues myself while implementing it. Here are the two entries from the history:

Code:

- Fix: Bitmap fonts didn't support #ENCODING_UTF8
...
- Fix: Contrary to the description in the documentation, Print() and NPrint() didn't respect
  the encoding that was set using SetDefaultEncoding()
midwan
Posts: 74
Joined: Sun Jun 19, 2016 1:15 pm
Location: Sweden

Re: UTF-8 encoding

Post by midwan »

I found an issue with UTF-8 encoding in Hollywood 7.0, though it is different from the one above.

If we @INCLUDE a file in our project and that included file is saved as Unicode or UTF-8 (in other words, non-ANSI), an error occurs as soon as Hollywood tries to load that file: "Invalid symbol at line 1".

For the record, line 1 contained the beginning of a comment block in my file, like so:

Code:

/*
** Some comment
*/
code...

If I re-saved the file as ANSI, I could work with it normally.

Steps to recreate:
1. Start a project which @INCLUDEs a separate file
2. Save that second file as UTF-8
3. Attempt to run the project
airsoftsoftwair
Posts: 5433
Joined: Fri Feb 12, 2010 2:33 pm
Location: Germany

Re: UTF-8 encoding

Post by airsoftsoftwair »

Yes, this is a known bug in @INCLUDE. As a workaround, just save your UTF-8 files without a BOM. Then there won't be any problems @INCLUDEing them. The error only happens with UTF-8 files that have a BOM.
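
If your editor can't save UTF-8 without a BOM, the workaround above can probably be applied after the fact by stripping the three BOM bytes (EF BB BF) from the start of the file. The following is only a rough sketch in Python, run outside Hollywood; the function name `strip_utf8_bom` and the file name in the usage note are made up for illustration and are not part of Hollywood.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Rewrite the file without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched. (Hypothetical helper, not a Hollywood API.)
    """
    p = Path(path)
    data = p.read_bytes()
    # codecs.BOM_UTF8 is the byte sequence EF BB BF
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Write back everything after the BOM; the rest of the file is unchanged
    p.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

For example, `strip_utf8_bom("included_file.hws")` (hypothetical file name) would rewrite that file in place, so keep a backup copy if you are unsure. The file is handled as raw bytes, so the remaining UTF-8 content is not re-encoded.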
midwan
Posts: 74
Joined: Sun Jun 19, 2016 1:15 pm
Location: Sweden

Re: UTF-8 encoding

Post by midwan »

Ah, thanks for that.

I assume this will be fixed in a future update then?
airsoftsoftwair
Posts: 5433
Joined: Fri Feb 12, 2010 2:33 pm
Location: Germany

Re: UTF-8 encoding

Post by airsoftsoftwair »

Of course.
airsoftsoftwair
Posts: 5433
Joined: Fri Feb 12, 2010 2:33 pm
Location: Germany

Re: UTF-8 encoding

Post by airsoftsoftwair »

Code:

- Fix: @INCLUDE didn't handle the UTF-8 BOM correctly