Talk:Henceforth VM

From OHRRPGCE-Wiki


S'orlok Reaves: Thanks! I was wondering how to do numbered lists, but didn't have time to look it up. :D

Stacks

In the article, you mention that "Currently, the OHR is not threaded, but future versions may be, and some other script might use the stack while you are "wait"ing."

Wouldn't it be better to maintain independent stacks for each thread? I believe that's how processes work on real CPUs. This way, a misbehaving script (buggy, or perhaps corrupted somehow) can't corrupt the whole stack, only its own.

If memory usage is a concern, then perhaps analyzing the scripts at compile time would be in order. It is possible to determine how much stack each script uses, and then add up the pushes and pops to find the net difference. Then you only have to allocate just enough for that usage (while still allowing the stack to grow in case the compiler can't work it out, as with recursive scripts).
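
A minimal sketch of that compile-time pass, assuming straight-line code and a made-up table of stack effects (the op names and their effects are hypothetical, not actual HVM or HSZ opcodes). It reports both the net change and the peak depth, since the peak is what an allocator would need:

```python
# Hypothetical stack effects: op name -> (values popped, values pushed).
STACK_EFFECT = {
    "push": (0, 1),
    "add": (2, 1),
    "dup": (1, 2),
    "drop": (1, 0),
}


def stack_usage(ops):
    """Return (net_change, peak_depth) for a straight-line list of op names."""
    depth = 0
    peak = 0
    for op in ops:
        pops, pushes = STACK_EFFECT[op]
        depth = depth - pops + pushes
        peak = max(peak, depth)
    return depth, peak


net, peak = stack_usage(["push", "push", "add", "dup", "drop"])
```

Loops and recursion are left out on purpose; those are the cases where the stack would still have to be allowed to grow.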

Or, just start the stack out small, and grow it as necessary.

Bob the Hamster: yes, I do believe we will maintain a separate stack for each script thread. The grow-as-necessary approach seems easiest so we will probably do that first.
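
For illustration, a rough sketch of the grow-as-necessary approach with one stack per script thread. The class name, the starting size and the doubling policy are all assumptions made for the example, not the engine's actual design:

```python
class ScriptStack:
    """A per-thread stack that starts small and doubles when it fills up."""

    def __init__(self, initial_capacity=16):
        self._slots = [0] * max(1, initial_capacity)
        self._top = 0  # number of values currently on the stack

    def push(self, value):
        if self._top == len(self._slots):
            # Out of room: double the backing storage.
            self._slots.extend([0] * len(self._slots))
        self._slots[self._top] = value
        self._top += 1

    def pop(self):
        if self._top == 0:
            raise IndexError("stack underflow")
        self._top -= 1
        return self._slots[self._top]

    @property
    def capacity(self):
        return len(self._slots)

    def __len__(self):
        return self._top


# Each script thread owns its own stack, so a runaway thread only grows its own.
stack_a = ScriptStack(initial_capacity=2)
stack_b = ScriptStack(initial_capacity=2)
for n in range(5):
    stack_a.push(n)
```

After the loop only `stack_a` has grown; `stack_b` is untouched, which is the isolation property discussed above.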


S'orlok Reaves: That's a very good suggestion, and avoids a lot of hackishness. I like it. One of my worries with having separate stacks was that (good) stack-based languages have implicit tail recursion by ignoring the keyword "return". This makes tight recursive loops much faster, and I was worried that having separate stacks would require copying some data on a "return". That was before I'd started reading the HSZ specs; I think we can now have separate stacks, and allow "return", "exitscript", and "exitreturning" to affect the parent stack. That way, we get the benefits of having separate stacks, and still retain the tail-recursion of any locally defined scripts.

I will change the documentation after I finish implementing the naive cross-compiler; I need a clear picture of this in my head before I start writing. Script-local stacks will definitely be an added feature, though. Thanks for mentioning it!

Mike C.: Wait, why would you need to copy anything? The stack used by a thread is used by all scripts executing in that thread. I.e., if we start with A, which calls B, and B returns something, it just leaves it on the stack for A. The point of having separate stacks is so that Foo, an infinite loop that does something every frame, doesn't affect A and B.

In other words, there is no "parent stack", just a single stack per thread.
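
A small sketch of that model, with hypothetical names (`Thread`, `script_a`, `script_b`, `script_foo`) standing in for the scripts A, B and Foo above. B leaves its result on the thread's stack for A, and Foo runs on a different thread with its own stack:

```python
class Thread:
    """One script thread; every script it runs shares this single stack."""

    def __init__(self, name):
        self.name = name
        self.stack = []


def script_b(thread):
    # B "returns" by leaving its result on the thread's stack.
    thread.stack.append(42)


def script_a(thread):
    script_b(thread)
    result = thread.stack.pop()      # A picks up what B left behind
    thread.stack.append(result + 1)


def script_foo(thread):
    # One iteration of the per-frame loop; it only touches its own thread's stack.
    thread.stack.append(0)


main = Thread("main")
background = Thread("foo")
script_foo(background)
script_a(main)
```

Nothing is copied between stacks here: A and B share `main.stack`, and Foo never sees it.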


S'orlok Reaves: Oh, I wish I'd read this earlier; I seem to have massively misunderstood threads. My idea was to have multitasking on the script level. Anyway, when multitasking is added to the classic OHR, I'll just narrow the HVM to match. Thanks for your clarification, though, I'll make sure to leave room in the design for that.


Moving Forward


The Mad Cacti: That native compiler/HS analyser tool looks cool! I think that it would be generally useful (to devs (ie. me)) outside of Henceforth. I sometimes want a tool to examine compiled HS scripts.

This page has inspired an idea. As I've probably mentioned, I want to rewrite the script interpreter for vanilla OHR too, also translating (at run time?) the HSZ format to something more suitable for interpretation. (Though, I have to watch out for certain games with 800kb script files - scripts will have to be translated on use.)

So, here's the idea: RPG files could continue to contain scripts in HSZ format (and hopefully won't contain duplicate copies in any other format). Instead, we can repurpose HSX as a format for describing scripts in a way convenient for tools like HSP2HF and HSDECMPL rather than intended for direct interpretation, and containing extra metadata. I think that the tree structure is good for this. I want to add arrays (with lots of complexity) in the near future, and also add debugging info to scripts, like line numbers and references to the script (the original text script files could be added to the RPG file for this purpose) and more. It would be nice to be able to do this without worrying about hacking/breaking the HSZ format to avoid severe penalties of direct interpretation. For example, to support hss file references, I could add a new "comment" type (say, #8) which wraps a command to state its text file location. HSZ interpreters which don't want that information could ignore it. Eg:

Offset 0  ::  Type: 2 flow,    Id: 0 do,      Args: 1, Arg0: (cmd) 4
Offset 4  ::  Type: 8 comment, Id: 1 hss pos, Args: 3, Arg0: (line no) 13, Arg1: (column no) 15, Arg2: (wrapped cmd) 10
Offset 10 ::  Type: 6 builtin, Id: 1 wait,    Args: 1, Arg0: (amount) 17
Offset 13 ::  Type: 1 int,     Id: 345   (line no)
Offset 15 ::  Type: 1 int,     Id: 2     (col no)
Offset 17 ::  Type: 1 int,     Id: 10    (wait amount)

Actually, this is just an example; perhaps it would be too bloaty and we need an alternative (binary?) format for hss references. But other features like arrays could work in a similar way. What do you think?
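
To make the "interpreters can ignore it" part concrete, here is a hedged sketch of a tree walker over the example table. The tuple layout keyed by offset is an illustrative in-memory form, not the real HSZ binary layout, and the `run` function and its `log`/`debug` parameters are invented for the example:

```python
# Node table mirroring the example above: offset -> (kind, id, args).
NODES = {
    0: ("flow", "do", [4]),
    4: ("comment", "hss pos", [13, 15, 10]),
    10: ("builtin", "wait", [17]),
    13: ("int", 345, []),
    15: ("int", 2, []),
    17: ("int", 10, []),
}


def run(offset, log, debug=None):
    """Evaluate the node at offset; record builtin calls in log."""
    kind, ident, args = NODES[offset]
    if kind == "int":
        return ident
    if kind == "comment":
        line_off, col_off, wrapped = args
        if debug is not None:
            # Only a debugging interpreter looks at the position info.
            debug.append((wrapped, run(line_off, log), run(col_off, log)))
        # Everyone else just steps through to the wrapped command.
        return run(wrapped, log, debug)
    if kind == "flow" and ident == "do":
        result = 0
        for child in args:
            result = run(child, log, debug)
        return result
    if kind == "builtin":
        values = [run(arg, log, debug) for arg in args]
        log.append((ident, values))
        return 0
    raise ValueError("unknown node kind: %r" % (kind,))


plain_log = []
run(0, plain_log)

debug_log, positions = [], []
run(0, debug_log, positions)
```

Both runs execute the same `wait`; only the second one collects the line and column attached to it.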

Your inlining rule seems pretty reasonable, I thought about it for a long while and couldn't think of any failure cases. I note that it might be possible to inline starting from the root rather than the leaves too.

You seem to have an awful lot of strings and hashmaps. Won't this eat memory? And, I really can't understand why the first 7 variables would be stored in the "Local Variable Array" and the rest in the "Local Variable Hash". Does this mean that local variables are numbered anyway?

What's this business about runtime modifiable/generated scripts? How are these created? I'm guessing that that information hasn't been added yet.


S'orlok Reaves: This is all good stuff! Let me see if I can cover it all:

1) I've read some of your emails about re-writing the script interpreter. Translating scripts the first time they're loaded makes sense to me, even for <800KB-scripted games. Henceforth is my attempt at just such an improvement; however, I think the standard OHR could do much better. For example, the HF bytecode is needed because I can't inject Java bytecode into J2ME apps. You guys, on the other hand, could optionally enable some kind of "inline threading" to really get some performance gains.

2) I like your idea for comments, and I'm sure you'll refactor it to perfection. Here are two ideas: a) You might keep "decorations" (file line numbers, etc.) in a separate file that's only loaded when debugging is enabled. b) You might do the "Color Forth" thing; make even-numbered Hamsterspeak nodes contain code, and odd-numbered ones contain the comments for the previous line. This keeps comments in the same file, and has no noticeable effect on the algorithm you already use ("load address 0, run it and jump to labeled addresses").

3) I think it'd be great if Hamsterspeak was extended with arrays. Keep me posted. (I really don't know enough about the internals of Hamsterspeak to have an informed opinion on arrays, sorry.)

4) As for (in)efficiency in the HVM, I'm being cautiously optimistic (and profiling, of course). If hashes bog it down, I'll give it my old classic: a) If the hash size is <20 elements or so, just store it in an array & search. :P (It's worked before). What do you mean by "a lot of strings"?

5) The first seven local variables ("function parameters", to you guys) are treated specially; this way, I can give them their own load/store bytecodes. This gives a useful boost to small scripts that are called often and have to store their variables, then load them several times. If a script takes more than seven parameters, it is probably so big that it won't notice the hit in performance from storing additional parameters in a hash. But, of course, I'll profile it and change it if I'm wrong; this is just a first approximation. Finally, local variables which are named are not numbered; declaring "@test" creates a test variable with no ID.

6) Runtime-modifiable scripts are those that can be "injected" at runtime. Let's say you want to constantly track the value of global variable 5. In this case, you can override the HVM's "draw" routine to poll GLOBAL[5] every tick. I haven't fleshed out the HF API for drawing yet, and it will probably have to wait until after the first release of the HVM.

7) Inlining from the leaves makes the most sense to me, although I see your point. Once I get a working prototype, the fun of comparing inlining policies will really begin.
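
Points 4 and 5 can be sketched together. This is only an illustration: the class names and the exact threshold are assumptions (the "20 or so" figure is taken from point 4), and real HVM load/store bytecodes are not modelled:

```python
FAST_SLOTS = 7    # the first seven locals get direct array slots
SMALL_LIMIT = 20  # up to this many entries, a plain list search stands in for a hash


class SmallMap:
    """Key/value store: linear search while small, a real dict once it grows."""

    def __init__(self):
        self._pairs = []    # list of [key, value] while small
        self._table = None  # dict once SMALL_LIMIT is exceeded

    def set(self, key, value):
        if self._table is not None:
            self._table[key] = value
            return
        for pair in self._pairs:
            if pair[0] == key:
                pair[1] = value
                return
        self._pairs.append([key, value])
        if len(self._pairs) > SMALL_LIMIT:
            self._table = {k: v for k, v in self._pairs}
            self._pairs = []

    def get(self, key):
        if self._table is not None:
            return self._table[key]
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def uses_hash(self):
        return self._table is not None


class Locals:
    """Locals 0..6 live in an array; anything else (e.g. named) goes to the map."""

    def __init__(self):
        self._fast = [0] * FAST_SLOTS
        self._rest = SmallMap()

    def store(self, key, value):
        if isinstance(key, int) and 0 <= key < FAST_SLOTS:
            self._fast[key] = value
        else:
            self._rest.set(key, value)

    def load(self, key):
        if isinstance(key, int) and 0 <= key < FAST_SLOTS:
            return self._fast[key]
        return self._rest.get(key)


frame = Locals()
frame.store(0, 11)       # parameter in a fast slot
frame.store("test", 5)   # named local with no ID, as with "@test"
```

Whether the array-then-search trick actually beats a hash at these sizes is exactly the kind of thing profiling would have to settle.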

Thanks for commenting on this article; I need all the input I can get. PS: The Script Inspector tool is open source, although it currently only handles HSZ files. I'd encourage you to wait for the first release of the HVM, and then bug me to bundle the Inspector separately; right now it's not fully capable.

Missing operator?

Mike C.: I notice that there's no primitive "or" operator. Is this by design, or an oversight? Or, are we expected to call "not", and then "and"? Same thing for the missing "b_or" operator.
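
For reference, building "or" out of "not" and "and" takes De Morgan's law: negate both inputs, "and" them, then negate the result. A quick sketch, assuming hypothetical primitives `h_not`/`h_and` that return 0 or 1 and bitwise `b_not`/`b_and` (whether the HVM actually provides these, in particular a bitwise "not", is an assumption):

```python
def h_not(a):
    return 0 if a else 1


def h_and(a, b):
    return 1 if (a and b) else 0


def h_or(a, b):
    # a or b == not ((not a) and (not b))
    return h_not(h_and(h_not(a), h_not(b)))


def b_not(a):
    return ~a


def b_and(a, b):
    return a & b


def b_or(a, b):
    # a | b == ~((~a) & (~b))
    return b_not(b_and(b_not(a), b_not(b)))
```

That is three "not"s and an "and" per "or", so a dedicated primitive would be noticeably cheaper if "or" is common in scripts.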
