HSZ

From OHRRPGCE-Wiki

.HSZ and .HSX files are lumped inside HSP files. They contain the raw binary script data, preceded by a small header and optionally followed by a table of string literals. The two formats are identical; 32-bit versions of HSpeak use the .HSZ extension only to hide the scripts from 16-bit scripting versions of Game, which do not check the script format version (and silently fail on a 32-bit script). When loading script id n, load whichever of n.HSX or n.HSZ exists.
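
The lookup rule above can be sketched as follows. This is only an illustration: the function name `pick_script_lump` is hypothetical, lump names are assumed to compare case-insensitively, and the preference for .HSZ when both lumps exist is an arbitrary choice, since the text only says to load whichever exists.

```python
def pick_script_lump(script_id, lump_names):
    """Return the name of the lump holding script `script_id`, or None.

    Assumption: lump names are matched case-insensitively.
    Assumption: if both lumps exist, .HSZ is tried first.
    """
    available = {name.upper() for name in lump_names}
    for ext in ("HSZ", "HSX"):
        candidate = f"{script_id}.{ext}"
        if candidate in available:
            return candidate
    return None
```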


Header

The header is of variable length, depending on the version of HSpeak used. If the header is too short to contain a field, that field takes its default value; this applies to every field after the first two (even the oldest HSX files have at least two fields). The header extends until the start of the script data.

Script format meanings

0 16-bit script
1 32-bit script with 16-bit string table offset
2 32-bit script with 32-bit string table offset

Script format 2

Type | Meaning | Default if missing
INT | Offset in bytes to the script data (also length of header in bytes) | NA
INT | Number of variables the script has. Both local variables and script arguments count towards this. | NA
INT | Number of arguments a script takes | NA
INT | Script format version. Current version is 2 | NA
LONG | Offset in bytes to the string table from the start of the file. 0 if no string table is present | 0
LONG | Length in bytes of the string table | String table extends until the end of the file
INT | Bitsets for optional features. bit 0: non-leaf nodes are appended with srcpos debugging information | 0
LONG | Offset in bytes from start of file to the local variable name string table, or 0 if not present | 0

Script format 0 or 1

Type | Meaning | Default if missing
INT | same as above | NA
INT | same as above | NA
INT | same as above | "any amount" (bypass checks)
INT | same as above | 0 (So the format version might not be specified)
INT | string table offset given as 16-bit INT instead | 0
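
A minimal sketch of reading the header fields listed above, falling back to the defaults when the header is too short. It assumes little-endian byte order, which this page does not state, and reads INT and LONG as signed 16-bit and 32-bit values. The function name, the dictionary keys, and the use of `None` for "any amount" and "extends until the end of the file" are all hypothetical.

```python
import struct


def parse_header(data):
    """Parse an .HSX/.HSZ header into a dict (little-endian is an assumption)."""
    header_len, num_vars = struct.unpack_from("<hh", data, 0)
    hdr = {
        "data_offset": header_len,   # also the header length in bytes
        "num_vars": num_vars,
        "num_args": None,            # None stands for "any amount"
        "format": 0,
        "strtable_offset": 0,
        "strtable_length": None,     # None: table runs to the end of the file
        "bitsets": 0,
        "varnames_offset": 0,
    }
    fields = [("num_args", "<h"), ("format", "<h")]
    pos, i = 4, 0
    while i < len(fields):
        name, fmt = fields[i]
        size = struct.calcsize(fmt)
        if pos + size > header_len:
            break                    # header too short: keep the defaults
        hdr[name] = struct.unpack_from(fmt, data, pos)[0]
        pos += size
        if name == "format":
            # The layout of the remaining fields depends on the format version.
            if hdr["format"] >= 2:
                fields += [("strtable_offset", "<i"), ("strtable_length", "<i"),
                           ("bitsets", "<h"), ("varnames_offset", "<i")]
            else:
                fields += [("strtable_offset", "<h")]
        i += 1
    return hdr
```

The script data then starts at byte `hdr["data_offset"]` of the lump.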

Script Data

The script is described as a block of binary data organised as a tree. Each node of the tree is a "command", and the command may have any number of arguments, each of which is a pointer to a child node. Here "command" refers to all objects in the language, including numbers, variables, functions, scripts, and flow control. Each command returns a value to its calling command, though some return values are garbage and some are discarded by the calling command.

For the remainder of this article, we shall refer to WORDs of data. A WORD is a signed, 16-bit INT for .HSX lumps, and a signed, 32-bit LONG for .HSZ lumps.

There are two types of nodes with slightly different structure. Each command is a series of WORDs and begins with:

  • The command "kind", which ranges from 1 to 7 (see below)
  • A command "id", the meaning of which varies on the command kind

This is the total content of command kinds 1, 3 and 4. Kinds 2, 5, 6 and 7 may have arguments, so they additionally have:

  • Number of arguments
  • One WORD per argument, which is an offset/pointer to that command in the script data. It is in WORDs from the beginning of the script data.
  • If the relevant bit is set in the header, then a 32-bit LONG srcpos debug datum follows (nodes with arguments only - even when stepping through commands in the debugger, the debugger doesn't usually stop on leaf nodes)
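
The node layout described above could be decoded along these lines. This is a sketch under assumptions: little-endian byte order, a srcpos that sits immediately after the last argument WORD, and a hypothetical function name and return shape.

```python
import struct


def read_node(script_data, offset, word_size, has_srcpos=False):
    """Decode the node at `offset` (counted in WORDs from the start of the data).

    word_size is 2 for .HSX lumps and 4 for .HSZ lumps.
    Assumption: little-endian, srcpos directly follows the argument list.
    """
    fmt = "<h" if word_size == 2 else "<i"

    def word(index):
        return struct.unpack_from(fmt, script_data, index * word_size)[0]

    node = {"kind": word(offset), "id": word(offset + 1), "args": [], "srcpos": None}
    if node["kind"] in (2, 5, 6, 7):          # kinds that carry an argument list
        argc = word(offset + 2)
        node["args"] = [word(offset + 3 + i) for i in range(argc)]
        if has_srcpos:
            byte_pos = (offset + 3 + argc) * word_size
            node["srcpos"] = struct.unpack_from("<I", script_data, byte_pos)[0]
    return node
```

Each entry of `args` is itself a WORD offset, so child nodes are decoded by calling `read_node` again with that offset.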

Execution begins at the root (at offset 0). The root command is always a do block containing the script's top-level commands. For each command encountered, its arguments (if any) are evaluated in order, and the return values are then fed into the specified function, which spits out a result that is returned in turn. Flow control statements (kind 2) are the exception: their arguments are evaluated selectively and not all of them may be executed, so processing must be done between the execution of each argument.
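
As an illustration of the "evaluate all arguments, then apply the function" rule, here is a toy evaluator over a list of already decoded WORDs. It handles only numbers (kind 1) and three math functions (kind 5, ids 4 to 6), and it starts from a bare expression rather than a root do block, so it is a sketch of the evaluation order and not of a real interpreter. The names `MATH` and `evaluate` are hypothetical.

```python
import operator

# Kind 5 ids handled by this sketch: 4 = multiplication, 5 = subtraction, 6 = addition.
MATH = {4: operator.mul, 5: operator.sub, 6: operator.add}


def evaluate(words, offset=0):
    """Evaluate the command at `offset` in a list of WORDs and return its value."""
    kind, ident = words[offset], words[offset + 1]
    if kind == 1:                     # number: the id is the value
        return ident
    if kind == 5:                     # math function: evaluate arguments in order
        argc = words[offset + 2]
        values = [evaluate(words, words[offset + 3 + i]) for i in range(argc)]
        return MATH[ident](*values)
    raise NotImplementedError(f"kind {kind} is not handled in this sketch")


# (2 + 3) * 4 laid out as a tree of commands.
example = [5, 4, 2, 5, 10,   # offset 0: multiply, arguments at offsets 5 and 10
           5, 6, 2, 12, 14,  # offset 5: add, arguments at offsets 12 and 14
           1, 4,             # offset 10: number 4
           1, 2,             # offset 12: number 2
           1, 3]             # offset 14: number 3
```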

A summary of the different command kinds:

Kind | Explanation | ID meaning | Arguments
1 | Number | The value of the number | None
2 | Flow control | Type of flow (0 to 15) | Yes
3 | Global variable | ID of the global to return (0 to 4095) | None
4 | Local variable | ID of the local to return (0 to # of locals - 1) | None
5 | Math function: a special group of functions which are basic math functions | ID of the function (0 to 25) | Yes
6 | "Builtin" function: a plotscripting function such as show textbox | ID of the command | Yes
7 | Script: load a script and execute it for a return value | ID number of the script | Yes

Several commands take a (reference to a) local or global variable as an argument. When variables appear as command kinds 3 and 4 they are NOT lvalues. So variable references are given as constants, specifying a variable like this:

  • If value < 0, local with ID abs(value + 1)
  • If value >= 0, global with ID value
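
The encoding above translates directly into a small helper. The function name and the tuple it returns are hypothetical.

```python
def decode_variable_ref(value):
    """Decode an encoded variable reference into (scope, id)."""
    if value < 0:
        return ("local", abs(value + 1))
    return ("global", value)
```

For example, -1 refers to local 0, -3 to local 2, and 17 to global 17.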

Kind 1: Number

Takes no arguments. The id is the value of the number, which is placed on the list of return values.

Kind 2: Flow Structure

The type of flow control depends on the command id, and the structure of each type differs.

Ids 1 and 2, begin and end, never occur in a compiled script, but are not rejected by the interpreter.

Id 0: do

A set of commands to execute in order. Each argument is a command. These may appear anywhere in a script (if a scripter is cheeky), but are normally the arguments to while, for and switch. The return value is 0.

Id 3: return

Sets the script's return value, does nothing else (doesn't actually stop the script).

Id 4: if

If always has these 3 arguments:

  1. a conditional expression
  2. a then
  3. an else

If a then or else was not specified in the script, an empty then or else with no arguments is created in its place.

Id 5: then

Exactly the same as do, but only called from if.

Id 6: else

Exactly the same as do, but only called from if. Likely to be empty.

Id 7: for

For has 5 arguments:

  1. ID of a variable to use as counter
  2. counter start value
  3. counter end value
  4. counter step per loop
  5. a do block

The do block is executed repeatedly until the counter is greater than the end value if the step is positive, or less than the end value if the step is negative.

As soon as the start value is read, the variable is set to it before executing other arguments. For example:

for(var, start, var + 10, 1) do()

is equivalent to

for(var, start, start + 10, 1) do()
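
The ordering described above (the counter is assigned as soon as the start value is known, before the end value is evaluated) can be modelled roughly like this. The sketch assumes the end value and step are evaluated once rather than on every pass, which this page does not specify, and assumes a non-zero step. `run_for` and its parameters are hypothetical names.

```python
def run_for(variables, counter, start, eval_end, step, body):
    """Model of a for loop; eval_end and body are callables taking `variables`."""
    variables[counter] = start        # counter is set before other arguments run
    end = eval_end(variables)         # so the end expression sees the new value

    def finished():
        value = variables[counter]
        return value > end if step > 0 else value < end

    while not finished():
        body(variables)
        variables[counter] += step
```

With `eval_end` written as `lambda v: v["i"] + 3` and a start of 1, the end value comes out as 4 regardless of what `i` held before the loop.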

Id 10: while

While's 2 arguments are:

  1. Conditional
  2. a do block

The do is repeated until the conditional evaluates to false.

Id 11: break

Break will cause the script interpreter to abort do blocks, according to its one argument, the number of do blocks to exit.

The only blocks that "count" towards this limit are actual do blocks (ID 0), not then, else or other such blocks. It also breaks out of switch blocks.

Id 12: continue

Continue will restart the nth do block higher than the current command, after evaluating conditionals, etc. n is continue's parameter. It defaults to 1, and is best explained with an example:

script, foo, begin
 variable(i)
 for(i,1,10,1) do, begin
  if(i == 5) then (continue)
  #do something with i
 end
end

In this script, i will have something done to it when it's 1, 2, 3, 4, 6, 7, 8, 9 and 10. 5 is skipped, due to the continue.

Note that only do blocks are restarted. Then, else and other such blocks don't count toward this.

Id 13: exitscript

Immediately stops the script. The current return value is returned to any calling script.

Id 14: exitreturning

As exitscript, but accepts a parameter to explicitly set the return value.

Id 15: switch

???

Id 16: case

Note: never appears in a compiled script

Kind 3: Global Variable

The ID is the number of the global variable (0 to 4095).

Kind 4: Local Variable

The ID is the number of the local variable, from 0 to number of variables - 1 (see Header).

Kind 5: Math/Builtin Function

Much like the Flow Control kind, the Math Function Kind is indexed by ID.

With a few exceptions, each operator works on 2 operands, which are its two arguments. The Left Hand Side (LHS) is the first argument, while the Right Hand Side (RHS) is the second. Some builtins use the LHS as an l-value.

In this list, math operations are listed using standard C-family operators/functions, as opposed to weird HamsterSpeak native syntax :-)

Id 0: random

  • 0 - Returns a random number between LHS and RHS, inclusive.

Ids 1 to 6: Basic arithmetic

  • 1 - exponent (LHS ** RHS)
  • 2 - modulus (LHS % RHS)
  • 3 - division (LHS / RHS)
  • 4 - multiplication (LHS * RHS)
  • 5 - subtraction (LHS - RHS)
  • 6 - addition (LHS + RHS)

Ids 7 to 9: Bitwise Operators

  • 7 - XOR (LHS ^ RHS)
  • 8 - OR (LHS | RHS)
  • 9 - AND (LHS & RHS)

Ids 10 to 15: Logical Comparison

  • 10 - equal (LHS == RHS)
  • 11 - not equal (LHS != RHS)
  • 12 - less than (LHS < RHS)
  • 13 - greater than (LHS > RHS)
  • 14 - less than or equal (LHS <= RHS)
  • 15 - greater than or equal (LHS >= RHS)

Ids 16 to 18: Variables

Note: The LHS is an encoded variable number. LHS >= 0 is a global variable, while LHS < 0 is a local (# ABS(LHS + 1))

  • 16 - set variable (LHS = RHS)
  • 17 - increment variable (LHS += RHS)
  • 18 - decrement variable (LHS -= RHS)

Ids 19 to 22: Logical Operators

  • 19 - not (! LHS)
  • 20 - logical and (LHS && RHS) -- NOTE: This operator short-circuits. I.e. if LHS = 0, then RHS is not evaluated, as the answer must be false.
  • 21 - logical or (LHS || RHS) -- NOTE: This operator short-circuits. I.e. if LHS != 0, then RHS is not evaluated, as the answer must be true.
  • 22 - logical xor (!LHS ^ !RHS)
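
The short-circuit behaviour of ids 20 and 21 means the RHS has to be passed in unevaluated. A rough model, with hypothetical function names and with 1 assumed as the "true" result (this page does not say which non-zero value is returned):

```python
def logical_and(eval_lhs, eval_rhs):
    """Id 20: the RHS is only evaluated when the LHS is non-zero."""
    if eval_lhs() == 0:
        return 0
    return int(eval_rhs() != 0)


def logical_or(eval_lhs, eval_rhs):
    """Id 21: the RHS is only evaluated when the LHS is zero."""
    if eval_lhs() != 0:
        return 1
    return int(eval_rhs() != 0)


def logical_xor(lhs, rhs):
    """Id 22: both sides are always needed, so plain values are enough."""
    return int((not lhs) ^ (not rhs))
```

Here `eval_lhs` and `eval_rhs` are zero-argument callables standing in for "evaluate this child node".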

Ids 23 to 25: More Math Functions

  • 23 - absolute value (abs (LHS))
  • 24 - sign ((LHS > 0) - (LHS < 0))
  • 25 - square root, rounded to nearest integer ((int)(sqrt(LHS)+0.5))
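
The C-style expressions for ids 24 and 25 carry over to Python almost unchanged. The function names are hypothetical, and the behaviour for a negative argument to the square root is not described on this page (this sketch simply lets `math.sqrt` raise an error).

```python
import math


def sign(lhs):
    """Id 24: -1, 0 or 1 depending on the sign of lhs."""
    return (lhs > 0) - (lhs < 0)


def rounded_sqrt(lhs):
    """Id 25: square root rounded to the nearest integer."""
    return int(math.sqrt(lhs) + 0.5)
```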

Kind 6: Builtin Function

The ID is the number of the built in function to run. A full list of these functions, and their IDs, may be found in Plotscr.hsd.

The arguments to this command are the parameters to the function.

Kind 7: Script

Runs a script. The script is loaded and run. When it is done, it returns a value.

String Table

This contains any string literals used by setstringfromtable and appendstringfromtable function calls (IDs 251 and 252) within the script.

setstringfromtable and appendstringfromtable have 2 arguments: a string ID, and the offset of the string literal from the beginning of the string table in 4-byte words.

The string literals can be of any length. Each is stored as a 4-byte length followed by the characters at 1 byte per character. (Strings are padded by up to 3 bytes between them.)
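
Fetching a literal from the table might look like the sketch below. It assumes a little-endian length field and decodes the bytes as Latin-1; neither the byte order nor the character encoding is given on this page. The function name is hypothetical.

```python
import struct


def read_table_string(table, word_offset):
    """Return the string literal stored at `word_offset` (in 4-byte words)."""
    pos = word_offset * 4
    (length,) = struct.unpack_from("<i", table, pos)
    raw = table[pos + 4 : pos + 4 + length]
    return raw.decode("latin-1")      # encoding is an assumption
```

Here `table` is the slice of the lump starting at the string table offset from the header.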

Debug Information

These are optional features; see the header.

srcpos

A "srcpos" is a 32-bit word that encodes the position of the token in a source file to which a command corresponds. The line number of the token is retrieved by counting newlines in the source file (see HSP#SOURCE.LUMPED) up to the token point.

Bit field | Meaning
0-7 | Length of the token in characters, including whitespace (aside from trailing whitespace), capped at 255. Included to remove the need to parse the source to print the original token.
8 | "Virtual" flag: this command was actually inserted by HSpeak, and does not exist in the source (eg. an empty else() added to an if). The srcpos indicates the parent construct (continuing the example, the if)
9-31 | The character number of the token in the file (counting from 1) + the "offset" of that file. The file and its offset are retrieved by scanning SRCFILES.TXT. The file's offset + 0 is reserved for referring to the whole file but not any position in it specifically.
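
The bit fields above can be unpacked with shifts and masks. The function name and dictionary keys are hypothetical; mapping the position back to a file still requires the offsets from SRCFILES.TXT, which this sketch does not cover.

```python
def decode_srcpos(srcpos):
    """Split a 32-bit srcpos into its three fields."""
    srcpos &= 0xFFFFFFFF              # tolerate a value that was read as signed
    return {
        "token_length": srcpos & 0xFF,           # bits 0-7
        "virtual": bool((srcpos >> 8) & 1),      # bit 8
        "position": (srcpos >> 9) & 0x7FFFFF,    # bits 9-31 (23 bits)
    }
```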

Local variable names

If present, local variable (and argument) names are stored in a table just like the string table. They are normally stored after the string table, but aren't needed at runtime; they are just for printing script state. Skip over the 1st..(n-1)th names to reach the nth.
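
Skipping to the nth name could be sketched as follows. It assumes the same entry layout as the string table (4-byte length, then the characters), that each entry is padded so the next one starts on a 4-byte boundary, little-endian lengths, and Latin-1 text. The function name is hypothetical, and n counts from 1 to match the wording above.

```python
import struct


def nth_local_name(table, n):
    """Return the n-th name (counting from 1) in the local variable name table."""
    pos = 0
    for _ in range(n - 1):
        (length,) = struct.unpack_from("<i", table, pos)
        pos += 4 + ((length + 3) // 4) * 4    # skip length field and padded text
    (length,) = struct.unpack_from("<i", table, pos)
    return table[pos + 4 : pos + 4 + length].decode("latin-1")
```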
