UTF-8 encoding

jPV
Posts: 135
Joined: Sat Mar 26, 2016 11:44 am
Location: RNO

UTF-8 encoding

Post by jPV » Fri Feb 24, 2017 2:02 pm

It looks like there are a couple of issues with UTF-8 encoded text, at least on MorphOS, where I've been testing it.

1) UTF-8 encoded text doesn't seem to work at all with bitmap fonts. I haven't yet found a bitmap font that works with it, although the same fonts do render UTF-8 encoded content correctly in some other programs.

2) With TrueType fonts it works with the TextOut and CreateTextObject functions, but not with Print or NPrint. The documentation says it should work with all of them once the encoding is set with SetDefaultEncoding.

Code:

s$="Räikkönen" ; String with UTF-8 data
SetDefaultEncoding(#ENCODING_UTF8)

; Test with a truetype font
SetFont(#SANS,15)
NPrint(s$) ; Doesn't print correctly
Print(s$)  ; Doesn't print correctly
CreateTextObject(1,s$)
DisplayTextObject(1,0,30) ; Prints correctly
TextOut(0,45,s$) ; Prints correctly

; Test with a bitmap font
SetFont(#BITMAP_DEFAULT,8)
CreateTextObject(1,s$)
DisplayTextObject(1,0,100) ; Doesn't print correctly
TextOut(0,120,s$) ; Doesn't print correctly

WaitLeftMouse

airsoftsoftwair
Posts: 2483
Joined: Fri Feb 12, 2010 3:33 pm
Location: Germany

Re: UTF-8 encoding

Post by airsoftsoftwair » Mon Feb 27, 2017 4:34 pm

Thanks for the report, but both of these have already been fixed. The next Hollywood version will introduce support for Unicode, and I discovered these two issues myself while implementing it. Here are the two entries from the history:

Code:

- Fix: Bitmap fonts didn't support #ENCODING_UTF8
...
- Fix: Contrary to the description in the documentation, Print() and NPrint() didn't respect
  the encoding that was set using SetDefaultEncoding()

midwan
Posts: 32
Joined: Sun Jun 19, 2016 1:15 pm

Re: UTF-8 encoding

Post by midwan » Mon May 01, 2017 2:16 pm

I found an issue with UTF-8 encoding in Hollywood 7.0, though it is different from the ones above.

If we @INCLUDE a file in our project, and that included file is saved as Unicode or UTF-8 (in other words, non-ANSI), an error occurs as soon as Hollywood attempts to load that file: "Invalid symbol at line 1".

For the record, line 1 contained the beginning of a comment block in my file, like so:

Code:

/*
** Some comment
*/
code...

If I re-saved the file as ANSI, then I could work with it normally.

Steps to recreate:
1. Start a project which @INCLUDEs a separate file
2. Save that second file as UTF-8
3. Attempt to run the project
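
Since the error points at line 1, it can help to look at the very first bytes the editor wrote when saving the file. The following is only a rough sketch in Python (not Hollywood code); the function names `describe_leading_bytes` and `describe_file` are made up for this illustration, and the check only recognises the standard byte order marks.

```python
import codecs

# Byte signatures that editors may write at the very start of a file.
# The UTF-32 marks are checked first because the UTF-32 LE mark begins
# with the same two bytes as the UTF-16 LE mark.
SIGNATURES = [
    (codecs.BOM_UTF32_LE, "UTF-32 little-endian"),
    (codecs.BOM_UTF32_BE, "UTF-32 big-endian"),
    (codecs.BOM_UTF8, "UTF-8 with BOM"),
    (codecs.BOM_UTF16_LE, "UTF-16 little-endian"),
    (codecs.BOM_UTF16_BE, "UTF-16 big-endian"),
]


def describe_leading_bytes(data: bytes) -> str:
    """Name the byte order mark at the start of data, if there is one."""
    for signature, label in SIGNATURES:
        if data.startswith(signature):
            return label
    return "no byte order mark"


def describe_file(path: str) -> str:
    """Read only the first four bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return describe_leading_bytes(handle.read(4))
```

Calling `describe_file` on the included script (whatever its path is in your project) shows whether anything precedes the `/*` that the parser is expected to see on line 1.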


Re: UTF-8 encoding

Post by airsoftsoftwair » Tue May 02, 2017 8:53 pm

Yes, this is a known bug in @INCLUDE. As a workaround, just save your UTF-8 files without a BOM. Then there won't be any problems @INCLUDEing them. The error only happens with UTF-8 files that start with a BOM.
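
If your editor has no option to save without a BOM, the three BOM bytes (EF BB BF) can also be removed from an existing file by an external tool. A minimal sketch in Python, assuming the file is already UTF-8; the helper names `strip_utf8_bom` and `strip_bom_in_place` are hypothetical and not part of Hollywood:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path: str) -> bool:
    """Rewrite the file without its BOM; return True if it was changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_utf8_bom(original)
    if cleaned == original:
        return False  # No BOM found, so the file is left untouched.
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```

Working on raw bytes rather than decoded text means the rest of the file is written back exactly as it was, so only the three leading bytes change.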


Re: UTF-8 encoding

Post by midwan » Wed May 03, 2017 9:30 pm

Ah, thanks for that.

I assume this will be fixed in a future update then?


Re: UTF-8 encoding

Post by airsoftsoftwair » Sun May 07, 2017 5:41 pm

Of course.


Re: UTF-8 encoding

Post by airsoftsoftwair » Thu Aug 10, 2017 9:33 pm

Code:

- Fix: @INCLUDE didn't handle the UTF-8 BOM correctly
