Possible Unicode parse bug

Posted: Tue Dec 11, 2018 10:23 pm
by SamuraiCrow
When I try to parse the library name at the beginning of the Hollywood command dump generated from the CLI, the ReadLine() function reads the string properly, but LeftStr() doesn't see the two colons at the beginning of the first line.

My goal is to write a viewer and converter that generates the XML token list for the syntax highlighting feature of the Annotate 3.0.1 editor. It should be independent of the Hollywood version used, so I'm targeting version 6.1 and upwards. I use a preprocessor version check to switch off UTF-8 if Hollywood version 7 or greater is detected.

I'm going to cross-compile for all the Amiga-like platforms from the Windows machine I'm developing on, and provide the source code as well (so other syntax highlighters can be added later).

Re: Possible Unicode parse bug

Posted: Wed Dec 12, 2018 7:26 pm
by airsoftsoftwair
SamuraiCrow wrote: Tue Dec 11, 2018 10:23 pm
When trying to parse in the library name at the beginning of the Hollywood command dump from the CLI, the ReadLine() function pulls in the string properly but LeftStr() doesn't see the two colons at the beginning of the first line.
Please provide some very brief demo code that shows the issue...

Re: Possible Unicode parse bug

Posted: Thu Dec 27, 2018 1:43 pm
by SamuraiCrow
First, generate a list of the Hollywood functions with:

Code:

hollywood -exportcommands CommandList.txt
Then try to parse the list with:

Code:

@VERSION 6,1
@IF #HW_VERSION>=7
  SetDefaultEncoding(#ENCODING_ISO8859_1, #ENCODING_ISO8859_1)
@ENDIF
Local fh=OpenFile(Nil, "CommandList.txt", #MODE_READ ), LineBuffer$, LineNum=0
LineBuffer$=ReadLine(fh)
DebugPrint("first line = " .. LineBuffer$)
If LeftStr(LineBuffer$, 2) = "::"
	LineBuffer$=MidStr(LineBuffer$, 2)
	DebugPrint("made it to header = " .. LineBuffer$)
Else
	DebugPrint("header = ", LeftStr(LineBuffer$, 3))
	DebugPrint("LineBuffer$ = ", LineBuffer$)
	DebugPrint("Malformed file format in Line " .. StrStr(LineNum))
EndIf
CloseFile(fh)
End
I get the following output:
Hollywood 7.1 [Windows] [64-bit] (c) by Andreas Falkenhahn
The Cross-Platform Multimedia Application Layer

Licenced to Samuel Crow

Loading plugin ahx.hwp...done
Loading plugin oggtheora.hwp...done
Loading plugin oggvorbis.hwp...done
Loading plugin rapagui.hwp...done
Loading plugin rebelsdl.hwp...done
Loading plugin zip.hwp...done
Opening script test1.hws...done
Compiling script...done
Preparing display...done
And Action!
first line = ::string
header = 
LineBuffer$ = ::string
Malformed file format in Line 0

Re: Possible Unicode parse bug

Posted: Thu Dec 27, 2018 2:48 pm
by airsoftsoftwair
Can't reproduce this here. The output is:

Code:

first line = ::string
made it to header = string
Maybe provide your CommandList.txt as well, because something might be wrong with it.

Re: Possible Unicode parse bug

Posted: Thu Dec 27, 2018 6:11 pm
by SamuraiCrow
I found a 3-byte UTF-8 BOM at the beginning of the text file. Once I loaded it into Notepad++, converted it to ANSI, and resaved it, the file was usable. Should I expect the BOM to be there every time I generate the command list?

Re: Possible Unicode parse bug

Posted: Thu Dec 27, 2018 6:53 pm
by SamuraiCrow
Update:
The freshly generated file doesn't have the BOM, so I must have saved the file in another editor, such as the IDE, which added it.
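
For anyone who hits the same thing: the header check presumably failed because, with ISO 8859-1 as the encoding, the three BOM bytes (EF BB BF) are read as three ordinary characters sitting in front of the "::". Below is a minimal sketch of a defensive check, written in Python rather than Hollywood purely to illustrate the byte-level logic. The function names strip_utf8_bom and parse_header are made up for this example and are not part of any Hollywood API.

```python
import codecs


def strip_utf8_bom(raw: bytes) -> bytes:
    """Return raw without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def parse_header(first_line: bytes) -> str:
    """Return the library name from a '::name' header line.

    Hypothetical helper: tolerates an optional UTF-8 BOM and treats the
    remaining bytes as ISO 8859-1, mirroring the encoding used in the thread.
    """
    line = strip_utf8_bom(first_line).decode("latin-1").rstrip("\r\n")
    if not line.startswith("::"):
        raise ValueError("Malformed header: " + repr(line))
    return line[2:]


if __name__ == "__main__":
    with_bom = b"\xef\xbb\xbf::string\r\n"
    # Without stripping, the first two "characters" are BOM bytes, not "::".
    print(repr(with_bom.decode("latin-1")[:2]))
    print(parse_header(with_bom))
    print(parse_header(b"::string\n"))
```

The same idea should carry over to the Hollywood script: compare the first three characters of the first line against the BOM bytes and skip them before testing for "::".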