Possible Unicode parse bug

Report any Hollywood bugs here
SamuraiCrow
Posts: 213
Joined: Fri May 15, 2015 5:15 pm
Location: Ft. Collins, Colorado USA

Post by SamuraiCrow » Tue Dec 11, 2018 10:23 pm

When trying to parse in the library name at the beginning of the Hollywood command dump from the CLI, the ReadLine() function pulls in the string properly but LeftStr() doesn't see the two colons at the beginning of the first line.

My goal is to write a viewer and converter that generates the XML token list for the Annotate 3.0.1 editor's syntax highlighting feature. It should be independent of the Hollywood version used, so I'm targeting version 6.1 and upwards. I use a preprocessor version check to switch off UTF-8 when Hollywood 7 or later is detected.

I'm going to cross-compile for all the Amiga-like platforms from the Windows machine I am developing on, and release the source code as well (so other syntax highlighters can be added later).

airsoftsoftwair
Posts: 3022
Joined: Fri Feb 12, 2010 2:33 pm
Location: Germany

Re: Possible Unicode parse bug

Post by airsoftsoftwair » Wed Dec 12, 2018 7:26 pm

SamuraiCrow wrote:
Tue Dec 11, 2018 10:23 pm
When trying to parse in the library name at the beginning of the Hollywood command dump from the CLI, the ReadLine() function pulls in the string properly but LeftStr() doesn't see the two colons at the beginning of the first line.
Please provide some very brief demo code that shows the issue...

Post by SamuraiCrow » Thu Dec 27, 2018 1:43 pm

First generate a list of the Hollywood functions with:

Code: Select all

hollywood -exportcommands CommandList.txt
Then try to parse the list with:

Code: Select all

@VERSION 6,1
@IF #HW_VERSION>=7
  SetDefaultEncoding(#ENCODING_ISO8859_1, #ENCODING_ISO8859_1)
@ENDIF
Local fh=OpenFile(Nil, "CommandList.txt", #MODE_READ ), LineBuffer$, LineNum=0
LineBuffer$=ReadLine(fh)
DebugPrint("first line = " .. LineBuffer$)
If LeftStr(LineBuffer$, 2) = "::"
	LineBuffer$=MidStr(LineBuffer$, 2)
	DebugPrint("made it to header = " .. LineBuffer$)
Else
	DebugPrint("header = ", LeftStr(LineBuffer$, 3))
	DebugPrint("LineBuffer$ = ", LineBuffer$)
	DebugPrint("Malformed file format in Line " .. StrStr(LineNum))
EndIf
CloseFile(fh)
End
I get the following output:
Hollywood 7.1 [Windows] [64-bit] (c) by Andreas Falkenhahn
The Cross-Platform Multimedia Application Layer

Licenced to Samuel Crow

Loading plugin ahx.hwp...done
Loading plugin oggtheora.hwp...done
Loading plugin oggvorbis.hwp...done
Loading plugin rapagui.hwp...done
Loading plugin rebelsdl.hwp...done
Loading plugin zip.hwp...done
Opening script test1.hws...done
Compiling script...done
Preparing display...done
And Action!
first line = ::string
header = 
LineBuffer$ = ::string
Malformed file format in Line 0

Post by airsoftsoftwair » Thu Dec 27, 2018 2:48 pm

Can't reproduce this here. The output is:

Code: Select all

first line = ::string
made it to header = string
Maybe provide your CommandList.txt as well, because something might be wrong with it.

Post by SamuraiCrow » Thu Dec 27, 2018 6:11 pm

I found that there was a 3-byte UTF-8 BOM at the beginning of the text file. Once I loaded it into Notepad++, converted it to ANSI, and resaved it, the file was usable. Should I expect the BOM to be there every time I generate the command list?

Post by SamuraiCrow » Thu Dec 27, 2018 6:53 pm

Update:
A freshly generated file doesn't have the BOM, so I must have resaved the file in another editor, such as the IDE, which added it.
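
For anyone who hits the same problem: since an editor can silently add a UTF-8 BOM, a parser may want to strip it before checking for the "::" header. The sketch below is written in Python rather than Hollywood, purely to illustrate the byte-level logic. The helper name strip_utf8_bom and the sample data are hypothetical and are not part of Hollywood or of the script above.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical first line of a command list, once resaved with a BOM by an
# editor and once as freshly generated (no BOM).
with_bom = codecs.BOM_UTF8 + b"::string\n"
without_bom = b"::string\n"

for raw in (with_bom, without_bom):
    # Decode as ISO 8859-1, mirroring the encoding the script selects.
    line = strip_utf8_bom(raw).decode("latin-1")
    # After stripping, the two-character header check succeeds in both cases.
    print(line[:2] == "::")
```

Without the stripping step, the three BOM bytes decode as three separate ISO 8859-1 characters, so the first two characters of the line are not "::". That would match the behaviour reported above. A Hollywood script could presumably apply the same idea by comparing the first three bytes of the line against 0xEF, 0xBB, 0xBF before the LeftStr() check.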
