This is a pretty bad case of "better late than never," since I've known about this problem for almost as long as it's been around. This is about a directory traversal/RCE vulnerability in Unreal Tournament (CVE-2024-34492).
Really, there's not much point in disclosing a vulnerability in a 25-year-old game except for some kind of closure, but I think the sheer obviousness of this (and another thing I'll get to in a second) is amazing evidence of how much game developers have dodged a bullet from sheer disinterest in games as attack vectors.
That disinterest is starting to wane though. There have been a few high-profile incidents lately, and it's really past time for developers to start taking the security of their products seriously before they turn into serious problems.
I reported this issue to Epic's security team in August 2022. They didn't respond, but they did delist Unreal Tournament from sale shortly after. Not sure if the two are related. I reported this to the OldUnreal team currently maintaining Unreal Tournament 90 days ago, so I think responsible disclosure time has passed.
Anyway, this originally started in 2005 when someone discovered the "DLL trick" and used it for UTDC, a third-party community-created anti-cheat mod, and the same trick was later used by the ACE anti-cheat mod.
Normally, when you connect to an Unreal Tournament server, it downloads any mods that the server is running, and it drops them into a cache directory with a hash-based filename and adds it to a lookup list that maps it to what the server says its filename is. Only a few file types are permitted to be downloaded from the server in the first place, and DLLs are not one of them.
Older Unreal Engine games also support native DLLs corresponding to their packages, supplying functions that are executed by native code in the DLLs.
For quite a while, if you connect to an Unreal Tournament server, you'd be greeted with an NPLoader popup that looks something like this:
If "native mods" isn't setting off alarm bells I don't know what will. The game client is not supposed to be downloading DLLs, yet somehow this mod is capable of "installing" DLLs by itself.
The way this works under the hood is that in Unreal, most assets are converted to internal formats, but music files are just raw binary blobs that are copies of the original file. Someone had the idea to import a DLL as a music file from Unreal Tournament's System directory, then have the mod load the Editor package and export the asset, which dumps it into the client's System directory, causing the game to load it the next time it connects.
There is no restriction at all on the import path, so this can really drop any file anywhere on the user's machine. Also, while this popup dialog is nice, it is completely unnecessary - a mod can export a file with no user interaction whatsoever.
The simplest solution to this would be to update the game client to forbid the client from loading the Editor package.
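Separately from blocking the Editor package, the missing path restriction is the kind of thing a generic containment check catches. The sketch below is not Unreal's code and not the fix suggested above; `is_inside` and the example paths are hypothetical names used only to illustrate rejecting any export path that resolves outside an allowed directory.

```python
import os


def is_inside(base_dir: str, requested: str) -> bool:
    """Return True only if `requested` resolves to a path inside `base_dir`.

    Hypothetical helper: resolves "..", symlinks, and absolute paths before
    comparing, so "../System/evil.dll" cannot escape the base directory.
    """
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, requested))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


allowed = is_inside("/tmp/cache", "music/track.umx")
escaped = is_inside("/tmp/cache", "../System/evil.dll")
```

Here `allowed` is true and `escaped` is false, because the second path resolves to a sibling of the cache directory rather than something inside it.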
This probably isn't the only serious problem with older Unreal titles. I've seen evidence that the UnrealScript loader, despite being strictly typed, does almost no validation of the bytecode at all, which means UObject pointers can be converted to integers and vice versa via constructed bytecode, which is almost certainly RCE-able by forging a UObject pointer into an attacker-controlled buffer somehow.
I might write a later piece on how to secure your game against malicious mods, especially if you're going to allow mods to contain code, but for now: Don't play Unreal Tournament on untrusted servers.
by OneEightHundred (noreply@blogger.com) at 2024-06-12 17:09
I think failures are educational and worth documenting, so here's the story of how FSE-on-GPU failed.
One of the things I'm working on right now is Gstd, an attempt to make a Zstandard variant that runs on GPU compute.
Zstandard uses a newer entropy coding method called FSE, which is a variant of tabled asymmetric numeral systems (tANS). Using this scheme requires two steps, both of which have challenges in how to implement them on GPU compute: Decoding the probabilities and converting those probabilities into a decode table.
GPUs are designed for highly parallel instruction execution, especially executing the same operation on many different inputs at once (a "vector" of values). Like Brotli-G and GDEFLATE, Gstd uses multiple bitstreams simultaneously to allow values to be parsed from all of them at once.
One of the reasons this type of scheme is so efficient is that GPUs have instructions to compute a running total for each of the inputs, so if you can create a vector containing 0 for lanes that don't need to be refilled and 1 for lanes that do need to be refilled, and do a prefix-sum op, then you'll get a vector where all of the lanes contain the offset from the current read position that they need to be refilled from.
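As a CPU-side illustration of that refill trick, here is a minimal sketch in Python. It assumes an exclusive prefix sum (each lane gets the count of refilling lanes before it) and the function name `refill_offsets` is made up for this example; on a GPU the whole thing would be a single wave prefix-sum op.

```python
from itertools import accumulate


def refill_offsets(needs_refill):
    """Exclusive prefix sum over 0/1 flags, one flag per lane.

    For every lane that needs a refill, the result is how many refilling
    lanes come before it, i.e. its offset from the current read position.
    """
    inclusive = list(accumulate(needs_refill))
    return [total - flag for total, flag in zip(inclusive, needs_refill)]


flags = [0, 1, 1, 0, 1, 0, 0, 1]
offsets = refill_offsets(flags)
# Lanes 1, 2, 4 and 7 need a refill and get offsets 0, 1, 2 and 3.
refill_lanes = {lane: offsets[lane] for lane, flag in enumerate(flags) if flag}
```

The values in non-refilling lanes are simply ignored, and the read position then advances by the total number of refilling lanes.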
Doing things in parallel this way however requires minimizing the dependency of sequential values. Running totals ARE dependent on preceding values, but they can be done fast because the GPU has an operation to do it.
This creates a problem for decoding FSE because it has multiple steps that are sequentially dependent.
Here's a quick summary of the FSE table decoding and generation algorithm:
The example given in the specification looks like this:
state order | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
state value | 1 | 39 | 77 | 84 | 122 |
width | 32 | 32 | 32 | 16 | 16 |
Number_of_Bits | 5 | 5 | 5 | 4 | 4 |
range number | 2 | 4 | 6 | 0 | 1 |
Baseline | 32 | 64 | 96 | 0 | 16 |
range | 32-63 | 64-95 | 96-127 | 0-15 | 16-31 |
Note that the baselines start at zero where bit usage is low and wrap around to the high bit usage ones, which are lower-numbered and come first. The size of the baseline range is always 2^Number_of_Bits, which ensures there is always a slot that will decode to any given state value. Like all ANS variants, these are encoded in the reverse order that they are decoded, so there always needs to be a slot that will decode to a symbol that, combined with the bits, equals the desired state value.
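The numbers in that example can be reproduced with a short sketch. This assumes a 128-entry table and a symbol that owns 5 states, which is what the widths in the table imply; `assign_states` is a made-up name and this is an illustration of the rule, not code from any Zstandard implementation.

```python
def assign_states(count: int, table_size: int):
    """Return (number_of_bits, baseline) for each state of one symbol.

    States are listed in state order. The first few states get double-width
    ranges (one extra bit); baselines start at 0 on the first narrow state
    and wrap around to the wide states.
    """
    next_pow2 = 1
    while next_pow2 < count:
        next_pow2 *= 2
    num_double = next_pow2 - count          # states that get one extra bit
    narrow_width = table_size // next_pow2  # range size of a narrow state
    narrow_bits = narrow_width.bit_length() - 1
    wide_width = narrow_width * 2

    result = []
    for order in range(count):
        if order < num_double:
            # Wide states come after all of the narrow ranges numerically.
            baseline = (count - num_double) * narrow_width + order * wide_width
            result.append((narrow_bits + 1, baseline))
        else:
            baseline = (order - num_double) * narrow_width
            result.append((narrow_bits, baseline))
    return result


example = assign_states(5, 128)
```

With those inputs, `example` comes out as 5 bits at baselines 32, 64 and 96 for the first three states, then 4 bits at baselines 0 and 16 for the last two, matching the table.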
There are several serial dependencies here:
The last one has a few implications. There are basically two ways to do it: Run a loop over the post-placement slots and figure out which occurrence of the symbol each entry is, or run a loop over pre-placement slots for each symbol to sort them. Both are seriously problematic due to low utilization. What if we could just assign the baselines to pre-scattered values instead of post-scatter?
This goes against the theory, but if it didn't hurt the compression ratio enough, it might make it work. A clever idea that turned out to be a big mistake. It's actually pretty nice in terms of implementation possibilities:
First, fill an empty table with zeroes and then pack the baseline, bits, and symbol into a value and put it into the table. On hardware with WaveMatch and equivalent ops, it's possible to create a bitfield containing 0 for lanes with zeroes and 1 for lanes with non-zero values, which can be used in tandem with masking and high-bit search ops to quickly find which lane preceding a given lane has a non-zero value. On hardware without it, it's possible to do it in logarithmic time by packing and transforming the values in a way that their desired order is numerically ascending and repeatedly using WaveMax.
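The bitfield-plus-high-bit-search step can be mimicked on the CPU with plain integers. This is only a sketch of the idea: `nearest_nonzero_lane` is a hypothetical name, the mask is built with a loop instead of a wave op, and it looks for the nearest non-zero lane at or before each lane.

```python
def nearest_nonzero_lane(values):
    """For each lane, find the closest lane at or before it that is non-zero.

    Returns -1 for lanes that have no non-zero lane at or before them.
    """
    # Bit N of the mask is set when lane N holds a non-zero value.
    mask = 0
    for lane, value in enumerate(values):
        if value != 0:
            mask |= 1 << lane

    result = []
    for lane in range(len(values)):
        # Keep only bits for lanes 0..lane, then find the highest set bit.
        at_or_before = mask & ((1 << (lane + 1)) - 1)
        result.append(at_or_before.bit_length() - 1)
    return result


lanes = nearest_nonzero_lane([7, 0, 0, 9, 0])
```

Each zero lane can then copy the packed value from the lane it found, which fills the gaps in the table without a serial loop.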
The baselines are ascending, but weren't monotonically ascending, so this basically involved negating the bit usage and coding a value that could be subtracted from the slot index and then multiplied by 2^BitUsage to get the actual baseline.
Once that's all done, loop over that table and multiply each index by the prime distribution value and take that value modulo the table size to get the final position. Great!
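A minimal sketch of that scatter step, assuming the step value Zstandard uses for spreading symbols, `(size >> 1) + (size >> 3) + 3`, and ignoring the special handling of less-than-one probabilities. The name `spread_positions` is made up for this example.

```python
def spread_positions(table_size: int):
    """Map each pre-scatter index to its final slot in the table.

    The step is odd, so for a power-of-2 table size it shares no factor with
    the size and every slot is hit exactly once.
    """
    step = (table_size >> 1) + (table_size >> 3) + 3
    return [(index * step) % table_size for index in range(table_size)]


positions = spread_positions(128)  # step is 83 for a 128-entry table
```

Because each final position depends only on its own index, every lane can compute its destination independently, which is what makes this step GPU-friendly.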
So, I tested this out with some sample data from OpenArena. The results were coming out pretty favorably (~15%) ahead of GDEFLATE and changing from post-scatter to pre-scatter was not having a big impact on the compression ratio.
However, FSE is still difficult to work with on GPU: It requires a big table, and that big table has to be indexed from groupshared memory because of how big it is, unless it can be crammed into a register somehow... But there's a lot of data that would have to be crammed into registers!
For 15%, it would be worth it though! Unfortunately, a painful realization hit while working on the decoder: The encoder was dropping the contents of incompressible blocks. Once that was fixed, the savings were only about 3%.
3% was not great to begin with, but was especially odd because GDEFLATE is not functionally different from Deflate, and Zstandard usually outperforms Deflate by a measurable margin. Upon running tests, there were a good number of blocks where Zstandard was compressing to about 10% larger than GDEFLATE. My best guess was, and still is, that GDEFLATE uses libdeflate as its compressor, and libdeflate is quite good... so, to test this possibility, I needed to make some code that could convert a Deflate block into a Zstandard block!
Now, in order to do that, I need to generate FSE probability tables for the match, literal length, and offset codes, and then rescale them to total a power of 2 as the scheme requires. How do you do that?
Well, normally for an arithmetic or range coder, the cost per occurrence of a symbol is -log(Probability/SumOfProbabilities)/log(2). As probability increases, the bit usage declines rapidly at first and then slower. Through some math, this can be simplified into smaller calculations, but there's one thing that's weird about FSE: The reduction of bit cost seems linear, doesn't it? Maybe the bit usage is just approximately-logarithmic but not actually? Close enough?
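To make the shape of that curve concrete, here is the formula as a tiny Python function, using a 128-slot total as an assumed example size. `ideal_bits` is just an illustrative name.

```python
import math


def ideal_bits(probability: int, total: int) -> float:
    """Ideal cost in bits of one occurrence of a symbol."""
    return -math.log2(probability / total)


# Cost for a symbol that owns 1..8 of 128 slots.
costs = [ideal_bits(n, 128) for n in range(1, 9)]
# How much each extra slot saves compared to the previous count.
savings = [costs[i] - costs[i + 1] for i in range(len(costs) - 1)]
```

Going from 1 slot to 2 saves a full bit, but going from 7 to 8 saves much less, so each extra slot is worth less than the one before it.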
Suppose the successor state for a symbol is uniformly random (and they do behave pretty randomly). Have a look at the table above. If you bump the slot usage for a symbol by 1, the effect is to split one of those 5-bit ranges into two 4-bit ranges, but those 4-bit ranges cover the same range of successor states, so really that fraction of the successor state ranges was reduced by 1 bit in cost. The problem is that all of those 5-bit ranges are the same size, so each time you add 1 slot, it keeps reducing the bits-per-symbol by the same amount, not an amount that's changing according to a logarithm curve. It continues doing this until it hits a power of 2, at which point it has to start splitting 4-bit ranges into 3-bit ranges.
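That argument can be checked numerically. The sketch below assumes a 128-entry table and a uniformly random successor state, and computes the expected number of bits read for a symbol with a given slot count. `uniform_successor_bits` is a hypothetical name for this illustration.

```python
def uniform_successor_bits(count: int, table_size: int) -> float:
    """Expected bits per symbol when the successor state is uniform.

    Narrow ranges cost `narrow_bits`, double-width ranges cost one more bit,
    and the chance of landing in a range is proportional to its width.
    """
    next_pow2 = 1
    while next_pow2 < count:
        next_pow2 *= 2
    num_double = next_pow2 - count
    narrow_width = table_size // next_pow2
    narrow_bits = narrow_width.bit_length() - 1
    # Fraction of successor states covered by the double-width ranges.
    wide_fraction = num_double * 2 * narrow_width / table_size
    return narrow_bits + wide_fraction


linear_costs = [uniform_successor_bits(n, 128) for n in range(4, 9)]
```

Between 4 and 8 slots the cost drops by exactly a quarter of a bit per slot, which is the straight line described above rather than a logarithmic curve.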
But that means the ideal slot to bump the probability of is the same until it hits a power-of-2 usage, resulting in a table that contains only power-of-2 values except for at most one probability - and all of those powers of 2 have integral bit usage, which is just Huffman.
This seemed very problematic so I had to run a few tests, and the answer was that FSE doesn't have this problem because given uniform successor states, FSE will produce predecessor states that are disproportionately lower values. (This is why the less-than-one distribution works in the first place.) It's somewhat easy to see why: Lower-numbered states take up more numeric range.
Doing baseline distribution before scattering breaks that assumption though, and it makes the predecessor states uniformly distributed.
Some further tests confirmed my suspicions: If you use a properly-scaled probability distribution with cell properties determined before scattering, then the result is worse than Huffman. It's only better than Huffman if the cell properties are determined AFTER scattering.
That means that in order to do this right, it was necessary to sort the entries, which I don't think there is any particularly good algorithm for.
Yeah, but Huffman is only a small amount of overhead too. FSE and arithmetic coding make it possible to get fractional bit usage, but the benefit of that depends a lot on the distribution of values. The fewer bits used by the most-probable symbols, the more likely it is to help, because a fractional bit of accuracy is a bigger portion of the values being encoded.
Ultimately, this realization is pretty lethal to using FSE on GPU, since the remaining options for table construction are quite bad, and the table is a problem in the first place.
The good news is that rANS should still work. In fact, the way GDEFLATE decodes Huffman codes is by using binary search within a vector register, and rANS symbol lookup would work the same way.
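For reference, rANS symbol lookup by binary search looks roughly like the following sketch. The frequencies are made-up example values and `find_symbol` is a hypothetical name; this shows the general technique on the CPU, not GDEFLATE's or Gstd's actual code.

```python
from bisect import bisect_right


def find_symbol(cumulative, slot):
    """Return the symbol whose slot range contains `slot`.

    `cumulative[s]` is the first slot owned by symbol `s`, in ascending order.
    """
    return bisect_right(cumulative, slot) - 1


frequencies = [3, 1, 4]  # example values, total of 8 slots
cumulative = [0, 3, 4]   # running total of the frequencies before each symbol
decoded = [find_symbol(cumulative, slot) for slot in range(8)]
```

In a decoder, the slot would be the low bits of the rANS state. The lookup only needs the cumulative totals, so there is no big per-slot table to build.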
So, Gstd isn't dead yet (it will be dead if it turns out that the larger blocks are due to format inefficiencies instead of libdeflate being really good), but it will not be using FSE, and I think the work of getting FSE to work on GPU compute was unfortunately wasted.
by OneEightHundred (noreply@blogger.com) at 2024-06-08 17:21
This is part 3 of a 4-part series, mirrored from the Gale Force Games page.
The only other engine in ScummVM that really resembles mTropolis in terms of the development style and challenges is Director, and ScummVM's Director engine is able to take advantage of a tool called ProjectorRays that converts exported and protected Director movies back into a form that Director itself can load, including decompiling Lingo bytecode into source code.
Wouldn't something like that for mTropolis be useful too? Unfortunately, no, for multiple reasons:
In the last installment, I covered how MTDisasm came into being and ultimately was able to fully dump Obsidian's game data. At that point, making it work again was clearly an achievable goal, it was just going to take a lot of work to do it.
My initial plan was to convert MTDisasm into a library and use that as the loader for a new program called MTEmu (mUlator was a bit too on-the-nose). Unfortunately, while I had done a bunch of OS interface stuff for my Glider PRO port, very little of it except for the resource loader was going to be useful for Obsidian. Even the StuffIt unpack tool, ported from The Unarchiver, wasn't going to be too useful - the installer format used a different algorithm, so I'd have to port that too. I was basically going to be starting from 2D drawing boilerplate, and writing boilerplate sucks.
Trying to add it to ScummVM was somewhat under consideration, and it already had things I needed: QuickTime parsing and decoding, PE-COFF resource parsing (needed to get cursors out of DLLs), MIDI output, and all of the OS boilerplate. My biggest reservation though was the license. I am not a GPL fan, everything I do is MIT/Apache licensed if I can help it because I think reaching more platforms is more important than contributor reciprocation.
I ultimately decided to try doing it as a ScummVM addition anyway for three reasons:
I want to keep the possibility open, but bringing it to console would mean re-implementing the things mentioned above (basically going the MTEmu route) and paying for certification. It wouldn't be cheap or easy, and given the level of interest in it, I think it would be difficult to justify.
I didn't think it would be re-released in the first place though, so who knows? If the people with the rights to it want ink on a dotted line, they know where to find me.
ScummVM has a "create_engine" tool now, but at the time the mTropolis work was started, the recommendation was to clone the engine used for acclaimed 1993 Game of the Year Plumbers Don't Wear Ties, so that was used to get a simple foundation, but one immediate problem was that the initial plan of bringing MTDisasm in as a library was going to be a no-no with the project's code standards (which, among other things, prefers using its own integer types instead of stdint.h).
Because of this, the MTDisasm data object loaders were mostly brought in by hand. One thing I knew I wanted to do from the start was separate the loading of object data from instantiation of the objects, partly for cleanliness (though at the cost of duplication), partly to deal with disconnects between how data was stored on disk vs. how it should exist in a loaded object.
One of the first technical challenges was planning for linking up object references, which has 2 problematic aspects: The first problem was that if objects are loaded in some order, then it's possible that an object references another object that hasn't been loaded yet, so object link-up has to be its own phase. The second, and much bigger problem, was mTropolis's system of "aliases."
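The first problem, load order, is the classic two-phase pattern. Here is a minimal Python sketch of it; the class and function names are invented for illustration and do not correspond to anything in the ScummVM engine, and aliases are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class LoadedObject:
    guid: int
    name: str
    ref_guids: list                           # references as stored on disk
    refs: list = field(default_factory=list)  # resolved in the link-up phase


def load_all(records):
    """Phase 1: create every object, keeping references as raw GUIDs."""
    return {guid: LoadedObject(guid, name, list(ref_guids))
            for guid, name, ref_guids in records}


def link_all(objects):
    """Phase 2: resolve GUIDs now that every object exists."""
    for obj in objects.values():
        obj.refs = [objects[guid] for guid in obj.ref_guids]


# The first record references an object that is loaded after it.
objects = load_all([(1, "button", [2]), (2, "counter", [])])
link_all(objects)
```

Since nothing is resolved until everything exists, the order of the records on disk stops mattering.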
As briefly mentioned in an earlier part, a modifier in mTropolis may be converted into an "alias" that allows the modifier to be reused in multiple different objects. This was especially important in tandem with "behaviors," which are modifiers containing other modifiers. One thing that might not be apparent from that though is that the behaviors have to be cloned when they are brought into the scene as aliases. If a behavior includes a variable inside of the behavior, for instance, then each instance of that behavior has its own copy of the variable - and other things inside of the behavior reference that variable!
On top of that, if a variable is converted to an alias, then the aliased variable has another special behavior: Changes to any instance of the alias apply to all of them, making it effectively a variable reference.
Aliased variables were actually implemented via an incorrect assumption for quite a while. Initially, they were implemented via a somewhat complicated approach where the aliased variable modifier existed globally, but references to it could be put in multiple parts of the project. It later turned out that aliased variable modifiers are actually distinct objects - If you add them to the scene twice, each one has a different GUID. The engine was eventually changed to handle this correctly, cloning variable modifiers and making them reference a shared storage object instead.
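The corrected model can be sketched in a few lines of Python. The names here are hypothetical and not the engine's real classes; the point is only that each alias instance is its own object with its own GUID, while the value lives in one shared storage object.

```python
class VariableStorage:
    """Holds the actual value, shared by every instance of an alias."""

    def __init__(self, value):
        self.value = value


class VariableModifier:
    """A distinct object with its own GUID that points at shared storage."""

    def __init__(self, guid, storage):
        self.guid = guid
        self.storage = storage

    def clone_as_alias(self, new_guid):
        # The clone is a new modifier, but it reuses the same storage.
        return VariableModifier(new_guid, self.storage)

    def get(self):
        return self.storage.value

    def set(self, value):
        self.storage.value = value


first = VariableModifier(1001, VariableStorage(10))
second = first.clone_as_alias(1002)
second.set(42)
```

After the `set` call, reading through `first` also gives 42, even though the two modifiers have different GUIDs.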
As a legacy of that though, the mTropolis engine's loader has several steps to making an object exist:
"Materializing" an object does 3 things:
Even now, it's not entirely clear how object references are supposed to work though. Muppet Treasure Island, another mTropolis title, had numerous problems with variables having duplicate names but different contents and behaving in a way where the correct behavior must have involved resolving the references to a different one than what the GUID pointed to, but in other cases, the one pointed to by the GUID was the correct one.
Dealing with a complex thing like this is really hard if you can't actively see what's happening. The first line of defense is logging messages. Since mTropolis depends heavily on message-passing for logic, it's really important to be able to see what messages are being sent and where they are going. If you run the game with debug level 3 or higher (e.g. by putting "-d 3" in the command line), then ScummVM will log all message propagation to the console.
This is extremely important, but printing things to the console doesn't provide a lot of information about the state of the scene. To help with a lot of development problems at once, one feature that went in very early was the debug overlay. It was so high-priority that its position in the to-do list was right after being able to reach the first screen.
If you've paid attention while adding Obsidian to ScummVM (or if you go to the options by hitting Ctrl-F5 in the ZOOM/Steam version to exit to the launcher...), you may notice that there's an option marked "Start with debugger."
This launches the game with a debug overlay and a few buttons on the side, and a display that shows you the name of the active scene and active shared scene.
Unfortunately, the step-through debugger part was planned (and much of the internal architecture built around it), but never actually came to fruition because it turned out to not be the right tool for debugging most of the logic bugs that popped up.
Step-through debugging is useful if you want to be able to analyze the state of a program while it's in the middle of running scripts, for example, but most problems with game logic working properly were not due to scripts executing incorrectly, they were due to problems with messaging - sending messages in the wrong order, sending messages that weren't supposed to be sent, not sending messages that were supposed to be sent, and so on.
Debugging message problems, on the other hand, didn't benefit a lot from having a step-through debugger. It mostly depended on looking at the disassembled scene to figure out how it was supposed to work, comparing that with the message log to figure out what it was actually doing instead, and then setting up test scenarios in mTropolis to see what messages actually get sent - and in what order - in a similar situation.
One example of how messaging went wrong is having to figure out the exact point where queued messages were discharged during a scene transition, something that was causing the wrong music to play in the Statue lower level.
The project viewer and inspector, however, were incredibly helpful:
These let you see all kinds of information in real-time: What is loaded, what the values of variables are set to, what the GUIDs of objects in the scene are (which allows them to be cross-referenced with MTDisasm output), and any other information that's been exposed for that object.
There are also some toast notifications that pop up at the bottom:
Warnings are colored yellow and errors are colored red. You won't see many warnings these days, but the main source of warnings is that every modifier and element in the ScummVM mTropolis engine has a "support level," which is either unimplemented, partially-finished, or finished.
Entering a scene with a partially-finished or unimplemented element or modifier results in a warning notification. This is to make it clear that a scene with things in an unfinished state has been entered, and if something isn't working correctly, one of those unfinished things is likely to be the culprit!
(You may be wondering why there's a warning about text labels in a scene with no text labels. That's because Obsidian has its own debug overlay text label, but in the retail version, it's moved off-screen.)
Error popups mostly occur due to Miniscript errors. If you play Obsidian, you'll notice several of these. That's not actually a problem with ScummVM - the game has some scripts that do invalid things, and in all of the cases that I'm aware of, I've confirmed that the proper behavior is in fact to throw an error and stop the script.
Making this also ran into a bit of a problem with ScummVM's internal architecture. It has its own GUI system, which you can see in the in-game settings and launcher. It's not a great GUI architecture, but it's not really bad for what it does either. It does have one problem though: It assumes exclusive control until you leave it, which means it is completely unusable while the game is running. Dealing with that required rolling a new small UI kit to make the windows, scroll bars, hierarchy tree, etc. that you see in the debug overlay.
Having that UI kit also required everything else to be aware of it. For example, if the game is supposed to change your mouse cursor, it actually doesn't, it changes the cursor assignment of the main game "window" so that the mouse is visible, and not the game cursor, when it's over one of the debug overlays. Same goes for detecting mouse movement if the mouse isn't in the game window.
Of course, the big irony of all of this is that these debugging capabilities are considerably more advanced and informative than what mTropolis gives you, so I actually had way more information available to me than the developers of the game did!
One fun thing about being able to see everything in the game data is being able to see things that were either obscure or in one case, really not meant to be found at all, and find all kinds of new information about the game that was either not well-known or not known at all.
The internal name of the bureau realm is "Labyrinth" and the name of the spider realm is "Abraxas."
It turns out pressing the space bar skips most cinematics, except for important story cinematics, letting you get through the game much faster. Huh. Doing this causes some bugs sometimes though, like preventing the music from stopping when you beat the Bureau chapter.
You may have noticed that there is a help booth in the first part of the Bureau named "Sources" that shows a falling book, referencing the Myst intro, and if you click the screen, you get a crazed man telling you to bring him "the blue pages," another Myst reference. What you may not know is that all of the booths are reachable from the phone puzzle, and if you call the Sources booth, the phone is answered by Henry Stauf from The 7th Guest.
Speaking of the phone puzzle, I've looked at the code for it, and I still don't understand it. Each of the dials actually represents a positive or negative coordinate, and a coordinate is only valid if either all three are negative or positive (determined by which half of the slider it's on), but that means there are actually 2 valid coordinates for any point you want to reach.
There are supposed to be fireflies in the Bismuth junkyard, but you probably won't see them on a normal playthrough because when you first land in it, you're actually in a duplicate of it that belongs to the previous chapter, and the duplicate doesn't have the fireflies. Normally, leaving that landing scene triggers a disk change, which puts you in the actual section. Try going back down and looking around!
Not only was this one completely unknown previously, it was actually discovered by accident due to the unfinished modifier toast popup mentioned earlier. Viewing the zoetrope popped up a warning about an unimplemented "Path Motion Modifier," which was a bit strange because the wobbling frame was working just fine and nothing seemed to be unusual, let alone having problems because of a missing modifier that basically existed to move sprites around.
Looking at the scene in the debugger showed some interesting things.
The path motion modifier responds to a message named B206_Start_Drop. Drop what though? Further down there is a "Click Behavior" but it's actually not a click behavior, it has 5 key detection triggers and a counter that fires when all 5 are reached.
After looking through the logic, it turns out that if you type "Voxel" after beating the puzzle, then a head appears and the animated bird drops a turd on it.
To make matters worse, the path motion modifier is not used for anything else in the game except for this, which meant I had to implement its behavior just for this bird poop Easter egg that nobody even knew about.
You thought nobody would ever find out...
No game can survive without reusable systems though, but making systems out of objects in a scene hierarchy is particularly weird.
Much of Obsidian's navigation system is handled by a list of "nodes," and you're usually on one of those nodes. The node list is usually supplied by a compound variable that has to be named "cSR" in the subsection.
This is a fairly common recurring pattern: Behaviors are placed in the hierarchy alongside variables that the behaviors have to consume.
A pretty gnarly piece of "what were they thinking?" here though is how those values get in the list in the first place. The editor lets you set the initial value of variables, so I bet you're thinking that you can edit lists in the editor, right? Well, you would be wrong. The actual way is that you have a script set the values that you want, run your project, and then when you go back into the editor, the values that you set while running it have persisted into the variable. Intentionally leaking play state back into editor-persistent state as intended design? Whee!
Some other mTropolis users would later come up with their own patterns, like Muppet Treasure Island populating lists by broadcasting a message to part of the scene hierarchy and having all of the responders ping back. Why this was better than just giving the scripting language a "for" loop, I have no idea.
You're probably wondering then, if coding was so cumbersome in this, how did the library terminals work? Well, for one thing, the terminals are done via a combination of several different parts that handle specific functions, from text output to actually updating things. It's basically a finite state machine, so most of the logic behavior simply handles all possible states.
The game also includes a custom plug-in called "RSGKit" that includes a few modifiers used to do more complex tasks, like string manipulation (which Miniscript has no built-in support for) and creating WordMixer answers.
Speaking of WordMixer, the dictionary data is also completely stored in the RSGKit plug-in. The ScummVM mTropolis engine has its own internal plug-in system to implement these modifiers, including parsing out the dictionary data from hard-coded offsets in the DLLs, and the dictionaries are used for both WordMixer and the filing cabinets. Naturally, getting the word list exactly correct (including short words) is mandatory, since some important files are referenced by number in the scripts.
Sometimes, the way the game and its logic system work create problems that clash with how ScummVM is supposed to work in ways that are really hard to do anything about. A big one is the menus.
If you go to the game options in ScummVM and change the sound and music volume, you might notice that it... doesn't really work.
You might be thinking "well so what, just scale the volume of sounds with the SFX level and the MIDI music with the music level!" That's easier said than done. The volume levels are stored in the save files (by the game logic), and syncing them up with ScummVM's settings requires manually hooking into the game logic. Additionally, the sound volume system used in Obsidian was designed to evaluate volume levels when entering a scene - not while you're already in the scene! But one quirk you may not be expecting is that MIDI was used for some non-music sounds, like the keycard sound in the maze, and (surprise!) the vidbot screen on/off sound. (Remember that last one for the next installment!)
Similar problems when trying to load from the in-game menu:
Simply put, the game expects loads to happen at a certain time (i.e. only from the menu), and follow through on them a certain way. Saving a game in mTropolis is not done by the engine tracking everything that needs to be saved, it's done by the game logic, which has its own ideas about what needs to be saved and loaded. Fortunately, I was able to implement saving from the menus for Obsidian specifically, since whether or not you can press the Esc button to go to the menu is controlled by a single boolean variable, and you can save from anywhere that the menu is available.
Even supporting a menu is more complicated than it may seem. The game has to be able to support transitioning out of and then back into just about any scene, which is a problem if there is game state from puzzles persisting through scenes. That's probably why the piazza puzzle has a blocker that prevents you from saving anywhere in the main puzzle area, and there is at least one bug where reloading the game in a particular scene skips the puzzle (which is not a ScummVM bug - it was present in the retail version).
I think this has covered most of the process involved in getting the ScummVM mTropolis engine into a working state. The next and final installment of this series will cover how it went beyond the original, with better MIDI support, widescreen mode, subtitles, improved color depth, and a tiny little quirk to get the save screenshots right! It'll also have some final thoughts about wrapping this project up, and the future of the mTropolis engine in ScummVM.
by OneEightHundred (noreply@blogger.com) at 2024-03-06 01:48
This is part 1 of a 4-part series, mirrored from the Gale Force Games page.
I'm putting this together to tell the weird story and technical details of how Obsidian was brought back to life and added to ScummVM. I didn't originally want to do this because I think dev blogs are more interesting when they show the state of the art and help people understand how to solve similar problems, and if there are two things that stand out about porting Obsidian, it's how unusual of a project it was, and how much luck was involved in it being as easy as it turned out to be, and possible at all.
Maybe that weirdness will be an interesting story on its own? Who knows.
In 1993, Cyan released Myst, a 3D-rendered surreal point-and-click adventure game heavy on puzzles and atmosphere, and it was a massive success. Numerous other games attempted to follow in its footsteps, and most were... not very good. In 1997, Rocket Science Games released an ambitious, high-budget game with a similar gameplay concept and fantastic visuals, set in a series of surreal dream worlds full of creative and similarly-surreal puzzles. Unfortunately, for various reasons, the game didn't sell well, and RSG was forced to close its doors.
With Obsidian relying on QuickTime, the seeds of its demise as a playable piece of software were sown. Apple built an entirely new operating system, leaving the Mac version unplayable when the emulation environment for their old operating system was discontinued. They also discontinued QuickTime for Windows, already a notoriously finicky piece of software, and future updates to Windows made it quite difficult to run.
Fast-forward 25 years to August 2022, and support for Obsidian was added to ScummVM. Barely over a year later, the rights to the game were acquired by Jordan Freeman Group and it was officially put back into print on ZOOM Platform and Steam.
In 1984, Apple released the Macintosh computer, which laid the foundation of the user interface used by desktop computers to this day, with later models incorporating high-resolution displays and high-fidelity waveform sound output. It was this design, combined with the release of QuickTime, Apple's video authoring and playback system, that kicked off an era of "multimedia" experiences.
With the release of Windows 3.1 in 1992 and especially Windows 95 in 1995, Windows had largely caught up to Macs in multimedia capability, but much of the multimedia authoring ecosystem was still on Mac, with popular tools such as HyperCard, Authorware, SuperCard, and Director. HyperCard and SuperCard never got Windows support, and Director didn't get it until its fourth major version.
mTropolis entered the market in 1995 with a Mac-only authoring environment and runtime support on Mac and Windows. Unlike Director, a project exported from mTropolis couldn't be packaged into a stand-alone application; it had to be bundled with a player program. This also came with a major caveat (which we will come back to, because it's really important!): Projects exported for Mac and Windows were in different formats.
Before we get into the format differences, I need to explain the nature of the operating systems in this environment. Another way that Macs were a major departure from DOS, and much closer to how things work today, was how application development for it worked. Applications were developed against a large suite of functions provided by the operating system, unlike DOS where your application was mostly shoving information directly into the hardware.
One of Apple's big innovations at the time was that a huge amount of multimedia capability was standard and provided by the operating system. An API called QuickDraw handled drawing, Sound Manager handled sound output, and later QuickTime handled video and audio decompression and playback. It also had its own image format (PICT) that interfaced with so many of its capabilities that it's difficult to open on Windows to this day.
With Windows, Microsoft sought to have similar facilities. In particular, an API called GDI existed to provide much of the functionality that QuickDraw provided on Mac. Naturally, both APIs had different memory structures and formats that they used to provide their data to their drawing systems.
Another concept that will quickly become important here is byte ordering. All numbers on computers are stored as a sequence of binary data, usually chopped up into 8-bit numbers or bytes. Most numbers are either 8-bit, 16-bit, 32-bit, or 64-bit in size. I'll be discussing most numbers in hexadecimal here.
When a number larger than a byte is stored, there is an important question to ask: What order do you store the bytes in? Say you have decimal number 123456789, that is 75bcd15 in hexadecimal. There are two ways of storing this: Least-significant byte first ("little endian"), which produces the byte sequence 15 cd 5b 07, or most-significant byte first ("big endian"), which produces the sequence 07 5b cd 15. Why would you use one versus the other? That's a question for the electrical engineers, but Macs were originally developed for the Motorola 68000 CPU series, which used MSB-first ordering when loading large values from memory, while DOS and Windows were developed for Intel 8086 derivatives which used LSB-first ordering.
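To make the two orderings concrete, here is a small Python sketch using the standard `struct` module. It just packs the same number both ways and then shows what happens if you read the bytes back with the wrong assumption; the variable names are mine, purely for illustration.

```python
import struct

value = 123456789  # 0x075BCD15

# "<" means least-significant byte first (Intel), ">" means
# most-significant byte first (Motorola 68000). "I" is an unsigned 32-bit value.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

print(little.hex(" "))  # 15 cd 5b 07
print(big.hex(" "))     # 07 5b cd 15

# Reading little-endian bytes as if they were big-endian gives a
# completely different number.
wrong = struct.unpack(">I", little)[0]
print(hex(wrong))       # 0x15cd5b07
```

The bytes are the same in both cases, just in opposite order, which is exactly the property that becomes useful later.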
Combining all of these things, we start to see one of the timeless dilemmas of portability: A lot of multimedia stuff is being provided by the operating system, but the operating system is determining the format that the data has to be in to use it. How do you deal with this? There are two strategies: Use a common format that you can convert into the format that the operating system expects at runtime, or export to the format that the target operating system expects.
There is a third option, which is to load the OS-native format of one operating system on the other operating system. This turns out to be a pretty bad option for Mac-based authoring for two reasons: First, many of the memory and data formats that have first-class citizenship on MacOS were thinly-documented, and second, Apple liked to have complicated (but capable) file formats, often integrating with other pieces of their multimedia infrastructure, making supporting them robustly on other platforms difficult. (Windows, by comparison, tended to use fairly simple formats.)
One thing that helped with this process though (but ultimately consigned Obsidian to bit-rot) was that Apple had ported QuickTime to Windows. While Microsoft had come up with their own competing technology, Video for Windows (commonly known for playing AVI files), repackaging QuickTime files into AVIs was generally not a simple process due to QuickTime's more advanced features not being supported in VFW. Ultimately, mTropolis could play back AVI files on the Windows version of a project, but it had to be in AVI format ahead of time.
This became a major problem further down the line though, as Apple discontinued support for QuickTime on Windows, and getting QuickTime to work became more and more difficult with later versions.
mTropolis was positioned as an alternative to Macromedia Director, but ultimately failed in the market. mFactory was bought out by Quark, who released mTropolis 2.0 only as an update to existing mTropolis licensees and then closed the company, favoring their own product, Quark Immedia (which also failed). What's not so clear about this is how mTropolis was actually sold to licensees though.
Director was pretty pricey, but could be bought through a whole bunch of authorized resellers. mTropolis was considerably more rare, and VERY expensive. The first version was $4,500 per copy, equivalent to $9,200 today, and I've yet to find any evidence of it being sold as a mass-market product. It may have only been sold directly to licensees, and numerous credit lines and mentions suggest that mFactory was more hands-on with their licensees. Was it being offered under a more "premium" business-to-business model where they provided lots of support? Was the plan to go boxed-product after they'd worked out the kinks with early licensees? Who knows. Either way, used copies of it are extremely scarce.
While I was eventually able to obtain a used copy of the full version, the easiest way around this problem is actually the demo version, which was bundled with some reference material (which is also rare) and on a few issues of the UK magazine MacFormat, specifically the February 1997 (MF47) and May 1998 (MF63) issues. While the demo versions don't allow you to save your projects, they do allow you to export them, and since we're only interested in the exported format, that's perfectly fine! Now I just need to dust off an old piece of fruit company hardware to run it.
Sorry about those aftermarket prices
After finally obtaining a copy, I wasn't really expecting to contribute more than as much information as I could find and hope that someone else who knew what they were doing would pick up development of it. Most engines in ScummVM are developed by reverse-engineering the game executable with IDA or Ghidra, tools which can help turn compiled code run by the CPU into a more readable format. This requires a considerable amount of work, since the structure of more complex types and the usage of values have to all be determined by hand, and sometimes two different values wind up being treated as the same value because they were mapped to the same hardware register, and all other kinds of problems.
Doing that is often crucial though, because often you really need to know what the program is doing to handle certain things. File formats and behaviors can be cryptic, logic can be hard-coded, and being unable to look at the actual code doing things is a big problem.
I had no idea how to do any of that (although I've picked up on it somewhat since!), but I did know how to poke around data formats, and I had just come off of porting Glider PRO, which involved doing a lot of research on Mac-native formats documented in the Inside Macintosh reference book series, and getting a copy of mTropolis meant I had room to play with it. At this point, actually making it work again was not even remotely the plan.
mTropolis authoring interface
All elements can have modifiers attached, which is where the real fun is. Variables are modifiers, scripts are modifiers, logical things are modifiers, movement is done via modifiers, basically everything interesting is done by attaching a modifier to an object.
We'll get to how all of this works in a later installment of this series. First, we need to talk about how to analyze this thing. If you go to the "Build" menu and select "Build Title..." then you get this nice dialog.
Build dialog
"mFactory Animation?" That could be a serious problem. We'll talk about why later. For now, let's click Build to export our project!
Prompt for path for startup segment export
Looks like we forgot something! We need to set the directory to export to. mTropolis was designed from the ground up for multi-CD titles, and the way it does this is by letting you create segment labels, each of which corresponds to a single exported file, and then assign each section in the project to a segment label. We can set these up further in the project config:
Segment mapping editor
But anyway, let's set our startup segment to the "Move to PC" directory here and build it with a Macintosh target, and then build it again with a Windows target.
Exported projects
Success! Let's move the files to my machine running the best operating system for open source development and everything else (Windows) and pop it open in XVI32.
Exported project in XVI32
There's not a lot that we can tell from the outset, but resizing the window horizontally causes things to line up. That seems to suggest that there's a table with a bunch of fixed-size elements in it. Of what purpose? We'll find out soon, but I'm also going to highlight another pattern that occurs quite often here.
Aligned header table, length-prefixed strings
We can tell from this now that this size table takes up 34 bytes per row, but it's also not clear where it starts. But also, look at the boxes highlighted purple and green. Both of them contain strings with a length prefix. While C represents the length of strings by terminating them with zero bytes, MacOS was written mainly in assembly and Pascal, and Pascal uses length-prefixed strings. Even on MacOS, there was an extension to insert length prefixes to strings in C because the MacOS APIs preferred that so much, and that mentality carried over to a lot of Mac software, like this.
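For anyone who hasn't run into the two string conventions before, here's a rough Python sketch of reading each one. It assumes the classic Pascal layout of a single length byte followed by the characters, which is not necessarily how wide the prefix is in this file format. The function names and the sample bytes are made up for illustration.

```python
def read_pascal_string(data, offset):
    """Read a string stored as a 1-byte length followed by that many bytes.

    Returns the decoded text and the offset just past the string.
    The 1-byte prefix is an assumption (classic Pascal strings).
    """
    length = data[offset]
    start = offset + 1
    raw = data[start:start + length]
    if len(raw) != length:
        raise ValueError("string runs past the end of the data")
    return raw.decode("mac_roman"), start + length


def read_c_string(data, offset):
    """Read a string that ends at the first zero byte."""
    end = data.index(b"\x00", offset)
    return data[offset:end].decode("mac_roman"), end + 1


# Made-up sample: one Pascal string followed by one C string.
blob = b"\x05Hello" + b"World\x00"

text, next_offset = read_pascal_string(blob, 0)
print(text, next_offset)
print(read_c_string(blob, next_offset))
```

The practical difference for file analysis is that a length-prefixed string tells you where it ends before you read it, while a terminated one has to be scanned.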
However, that also brings us to another problem: How many bytes long is that length prefix? That's actually a recurring problem throughout this kind of thing: Knowing how big a value is, and where it starts and ends, is really important, but most of the time, values are going to be too small to take up their entire capacity, so it's not clear if the zeroes in the other half belong to that value, or represent something else.
Let's open the Windows version, "Untitled-11.MPL", and compare it side-by-side to the Mac version.
Position-aligned byte swaps
I've highlighted a few blocks here, but really this type of thing is all over the format. Remember what I said earlier about Mac and Windows machines using different byte order? If you compare the files, data tends to appear in the same positions, but in reverse byte order. This is immensely helpful, because it means the overwhelming majority of data start positions and sizes can now be determined by just exporting the data into both formats and looking for the byte order flips.
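The comparison can be automated, at least roughly. The following Python sketch is not how MTDisasm works; it's just an illustration of the idea, with a made-up function name and made-up sample bytes. It walks two exports in lockstep and reports places where one file holds the other's bytes in reverse.

```python
def find_byte_swaps(mac, win, widths=(4, 2)):
    """Return (offset, width) pairs where win holds mac's bytes reversed.

    Wider fields are tried first. Fields that read the same in both
    directions (all zeroes, for example) are skipped, since they don't
    prove anything about where a value starts or ends.
    """
    hits = []
    offset = 0
    limit = min(len(mac), len(win))
    while offset < limit:
        for width in widths:
            a = mac[offset:offset + width]
            b = win[offset:offset + width]
            if len(a) == width and a != b and a == b[::-1]:
                hits.append((offset, width))
                offset += width
                break
        else:
            # No swap found at this offset, move on one byte.
            offset += 1
    return hits


# Made-up sample: a 32-bit value, a 16-bit value, then text that is
# identical in both files.
mac = bytes.fromhex("0000003d" "0001" "41424300")
win = bytes.fromhex("3d000000" "0100" "41424300")
print(find_byte_swaps(mac, win))
```

A real file needs more care than this, because small values leave a lot of zero bytes around and a 4-byte flip can also be two adjacent things, but it gives a decent first map of where the multi-byte values live.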
There did turn out to be a few caveats to this though. The developers of this decided to try sticking to platform-native formats whenever they could, so a few things have data that is in a completely different format. Among those, the rectangle and point structures are the source of much deception. On MacOS, a point is typically two 16-bit values, with the first representing the vertical position and the second representing the horizontal position. On Windows, it is the reverse. The problem with this is that the member swapping combined with the byte swapping results in a complete 32-bit swap, so it is often difficult to tell if a 4-byte flip is a 32-bit number or a point.
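Here is that ambiguity spelled out in Python, assuming 16-bit members on both platforms as described above. The coordinate values are arbitrary and chosen so the individual bytes are easy to follow.

```python
import struct

v, h = 0x0102, 0x0304  # arbitrary vertical and horizontal positions

# Mac: vertical first, then horizontal, each big endian.
mac_point = struct.pack(">hh", v, h)
# Windows: horizontal first, then vertical, each little endian.
win_point = struct.pack("<hh", h, v)

# A plain 32-bit number made of the same bytes, exported both ways.
mac_u32 = struct.pack(">I", 0x01020304)
win_u32 = struct.pack("<I", 0x01020304)

print(mac_point.hex(" "), "|", win_point.hex(" "))
print(mac_u32.hex(" "), "|", win_u32.hex(" "))

# Both pairs are byte-for-byte identical, so the hex editor can't tell
# a point from a 32-bit integer.
print(mac_point == mac_u32 and win_point == win_u32)
```

The only way to tell them apart is context: change the value in the editor and see whether the two halves move independently.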
This is still a bunch of salad though, and the earliest stages of doing something like this are the most painful because, while there are numerous concepts used repeatedly throughout this, they haven't been discovered yet. Even today, I haven't actually figured out what all of the values in the file header mean.
We can start with one basic one though: Most of the time, when a file format has a catalog, it has size and position data. In this case, some of the values in the catalog are a bit deceptive: The "sceneStr" text is actually just junk that was left in memory when the catalog data was being written. Only the part with the stream type up until the first 0-byte terminator is relevant, and the scene type is a fixed-size 24-byte character buffer.
We can look at some of the values and see what is likely to be catalog positioning. An easy guess is that if two values sum up into a third value, then the value that gets summed into is probably the file position and the other one is the size.
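That guess is easy to turn into a brute-force search. This Python sketch takes a list of already-decoded 32-bit values (how you slice them out of the file is up to you) and reports every case where two of them add up to a third. The function name and the sample numbers are invented for the example.

```python
def find_sum_triples(values):
    """Return (i, j, k) index triples where values[i] + values[j] == values[k].

    Zeroes are ignored because they make everything "sum" trivially.
    """
    positions = {}
    for index, value in enumerate(values):
        positions.setdefault(value, []).append(index)

    hits = []
    for i, a in enumerate(values):
        for j in range(i + 1, len(values)):
            b = values[j]
            if a == 0 or b == 0:
                continue
            for k in positions.get(a + b, []):
                if k != i and k != j:
                    hits.append((i, j, k))
    return hits


# Made-up catalog-like data: position, size, next position, size, ...
fields = [0x100, 0x40, 0x140, 0x20, 0x160]
print(find_sum_triples(fields))
```

On real data this will also turn up coincidences, so it's a way of generating candidates to check by hand rather than an answer.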
Catalog position and size data, and the first recorded position
Once we've identified this though, we still need to figure out if that data belongs to the scene type string before or after it. In this case, that can be guessed pretty easily by seeing if the data after the last string makes sense as a catalog entry or not. In this case, it does, so the data probably belongs to the string preceding it.
This can be poked at a little further by adding more assets and scenes to the project, which causes the value at 0x30 to correspondingly increase. Great, we know where the substreams are located!
At this point, I started making a program called MTDisasm to do further analysis, in particular dumping the substreams of a project to separate files. I'll spare most of the details, but the catalog is a series of streams marked by segment number, and the file positions are relative to the start of the segment file. Segment files have no header!
Thankfully, this byte-swapping thing is unbelievably helpful, but our adventure is just beginning. We still don't know what all of that data is before the "bootStream" or what the contents of any of them mean. This is where we get into "black box analysis" where we start doing things to our project, exporting it, comparing exports, and using those comparisons to understand how what we do in the editor is reflected in the data.
In theory, if it's possible to completely determine how everything you can do in the editor maps to the file format, then you can basically determine almost all of what the creators had in their project before they exported it.
There are still many issues in front of us though. Knowing what the data says doesn't always tell us how things are supposed to behave, sometimes the exported data is missing important things, and there's still all the work of making it all actually happen.
I'll cover the details of this in a later installment, but when I said this required a lot of luck, the byte-swapping stuff was a massive windfall, but there is one other type of thing that could have easily derailed this, at least to the point of really needing to start reverse-engineering the actual code: Proprietary compression formats!
For the most part, if data is fixed-size, it is relatively easy to load, but proprietary compression formats can be a nightmare. If you look at the deflate compression algorithm used in the ZIP format, for instance, it has LZ77 commands and literals compressed by Huffman trees, and the Huffman trees themselves are compressed, and there's very weak correlation between what the bits in the data are and what values they actually represent.
Even simple compression formats like Cinepak have multiple layers of information processing before you can get the data that you want, and being lossy formats, there isn't a reliable way to put data into it where you get a predictable result out.
mTropolis uses a proprietary compression format for its animations called "mFactory Animation" and fortunately, that format turned out to be relatively simple. This is actually somewhat expected: mTropolis was supposed to run on low-end Macs, at a time when high-end Macs could only decode an MP3 file in real-time if you weren't doing anything else, so compression formats were often relatively simple.
This type of thing was a pretty close call though. If it weren't for that, this might not have happened.
I'll get to how that was reverse-engineered in a later installment of this series.
Plot twists, surprises, an anamorphic filter created solely for the intro video of a dead company, and of course, jank! But maybe less jank than you expect. In future installments I'll talk about:
by OneEightHundred (noreply@blogger.com) at 2024-02-27 20:11
This is part 2 of a 4-part series, mirrored from the Gale Force Games page.
In the last part, I went over the basics of analyzing values in an exported project, but this doesn't explain how to get actual data out of the scene. To skip ahead a bit, one problem is that objects in mTropolis aren't the same size. Without any knowledge of the structure, we aren't really even sure how to determine where objects start and end, and that can be a bit difficult.
A simple way of dealing with variable-sized objects is just make multiple identical ones and see how far apart they wind up in the data. It also isn't entirely clear yet how the stream catalog even associates with scenes in the project tree.
Let's export it and use MTDisasm's "bin" mode to dump the streams, then open up stream-3-1.scene for the Mac and Win versions in XVI32 again.
A bunch of useful things to note here. We can see that the element ends with a name and a null byte, and by noting the spacing, we can guess where the first one starts, and determine that they're 61 bytes in size. 61 is 3D in hexadecimal, and we can see the size marked in the data as well. The scene stream also seems to start with a scene definition of some sort too, maybe? Also of interest, there is a 32-bit value that is 8 only in the last one - This turns out to be a flags field, and the 8 is a bit flag indicating that the element is the last one in the list of elements for that level of the tree.
(Fun fact: One of the flag bits is for if the object is expanded in the structure window, which is useless outside of the editor.)
The start of the element also provides some useful clues, which led to figuring out the format after investigating different types of objects: Most (but not all!) object types start with a 32-bit object type ID, a 16-bit revision number, a 32-bit field of unknown usage, and a 32-bit size that includes the preceding values.
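As a rough illustration of that common layout, here's a Python sketch that reads the four fields with `struct`. The field order and widths come from the description above; the class name, function name, and the sample type ID are placeholders, and this is not MTDisasm's actual code.

```python
import struct
from typing import NamedTuple


class ObjectHeader(NamedTuple):
    type_id: int   # 32-bit object type ID
    revision: int  # 16-bit revision number
    unknown: int   # 32-bit field of unknown usage
    size: int      # 32-bit size, which includes the fields before it


HEADER_SIZE = 14  # 4 + 2 + 4 + 4 bytes, with no padding


def read_object_header(data, offset, big_endian):
    """Read the common header. Mac exports are big endian, Windows little."""
    fmt = (">" if big_endian else "<") + "IHII"
    return ObjectHeader(*struct.unpack_from(fmt, data, offset))


# Made-up sample: type ID 0x1234 is a placeholder, size 0x3D matches the
# 61-byte elements seen earlier.
mac_bytes = struct.pack(">IHII", 0x1234, 2, 0, 0x3D)
win_bytes = struct.pack("<IHII", 0x1234, 2, 0, 0x3D)

print(read_object_header(mac_bytes, 0, big_endian=True))
print(read_object_header(win_bytes, 0, big_endian=False))
```

Since the size includes the header itself, the next object should start at the current offset plus `size`, at least for the object types that follow this layout.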
Unfortunately, while the first 2 values are common across all objects, the next 2 are not always followed. That turned out to be a problem when a fallback handler for unknown object types was sometimes flying off outside of the data due to the size field not actually being there. Plug-ins (which we'll get to later) are particularly weird, with the size value being garbled, but still decodable, only on Windows.
There is actually an unusual quirk here: That piece at the start of the file is not the scene definition. One way that you can tell that it isn't is that if you add modifiers to the scene, they don't show up in the scene stream, they show up in... the boot stream?
Well, now we get to one of the weirder aspects of mTropolis's architecture, a quirk that actually causes an annoying bug in the Windows version of Obsidian. The object at the start is some kind of stream header, of unknown purpose. The definition of the actual scene object is not stored in the scene stream, it's stored in the boot stream and it always exists, but the objects inside of them are loaded on-demand.
Not only that, but scenes aren't really even scenes. Projects have their own type, as do sections and subsections, but scenes are just graphic elements... or at least they are most of the time. If you link them to an image or QuickTime movie, they change type into that kind of element, so scenes are actually the same type of object as a scene element.
So what is the bug in Obsidian that this causes? It's the bug where if you trigger the rebellion music in the first chapter, and then go the wrong way, the music keeps playing forever, even if you restart the game, because the MIDI modifier that plays the music is directly under the scene in the project hierarchy, preventing it from ever being unloaded. The Mac version fixes this issue by... deleting that music track.
The strategy for MTDisasm for the most part has been to create structures for all of the data types, byte arrays by default if all of the values seem to be the same, and larger values if they have visible byte swaps. Fields with unknown use are named "Unknown" with a number after it, and for convenience, if an unknown field turns out to be multiple values, it is not renumbered, but rather split so other values maintain their number. For example, if a field named unknown3 turns out to be 2 values, it is split into unknown3_1 and unknown3_2.
Here's what our test scene looks like cranked through MTDisasm's text mode today:
Let's take one of the more minor examples of how this can go wrong by creating a floating point variable modifier and setting its initial value to pi.
So far so good, export the files, dump the streams with MTDisasm bin mode and...
Uh oh, something is very wrong here! The Windows version has a value in it that XVI32 is telling us looks like a double-precision float encoding pi at the end, but the Mac version is 2 bytes longer and looks pretty different. What is going on?
Let's think like a Mac programmer here: Motorola 68000 CPUs didn't originally have floating point support, and when they did, it was added via the MC68881 FPU add-on, later incorporated into the 68040 CPU. That FPU preferred float data in an 80-bit format, with a sign bit, a 15-bit exponent, and a 64-bit mantissa whose top bit is an explicit integer bit that is set for normalized numbers.
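Decoding that kind of value isn't too bad. Here's a Python sketch that converts 10 bytes of 80-bit extended float into a regular Python float, assuming the bytes are stored big endian with the sign and exponent first, followed by the 64-bit mantissa. That layout is my assumption about the file, the function name is made up, and denormals aren't given any special treatment.

```python
import math
import struct


def decode_extended(data):
    """Decode a big-endian 80-bit extended float into a Python float."""
    sign_exponent, mantissa = struct.unpack(">HQ", data)
    sign = -1.0 if sign_exponent & 0x8000 else 1.0
    exponent = sign_exponent & 0x7FFF

    if exponent == 0x7FFF:
        # All-ones exponent: infinity if the fraction bits are clear, else NaN.
        fraction = mantissa & 0x7FFFFFFFFFFFFFFF
        return sign * math.inf if fraction == 0 else math.nan

    # The mantissa is a 64-bit integer with the binary point after its top
    # bit, so scale by 2^(exponent - bias - 63). Zero falls out naturally.
    return sign * math.ldexp(float(mantissa), exponent - 16383 - 63)


pi_bytes = bytes.fromhex("4000c90fdaa22168c235")
print(decode_extended(pi_bytes))
```

Precision is lost going from a 64-bit mantissa to a double's 53 bits, but for values typed into an editor field that's not going to matter.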
That format has a few advantages, such as being relatively efficient to work with on CPUs that only support integer arithmetic, but the specifics of floating point behavior are well-documented in other places, and the important thing here is that the Mac and Windows versions are using completely different number formats for floats.
This behavior exists in some other places too: Color values on Mac are 6 bytes per color (due to Macs having 16-bit precision per channel), but 4 byte BGRA format on Windows. The auxiliary data in QuickTime elements is particularly different, with 14 extra bytes in the header on Mac and 12 extra bytes in the footer on Windows.
Now that we've covered much of how the data is structured, it's time to talk about what is either the best or worst feature of mTropolis: Object behavior and communication.
Almost everything that makes objects interesting in mTropolis is accomplished via modifiers and messages. Modifiers are a special type of object that can be attached to scene elements (and, in a few specific cases, other modifiers), making them do things. Importantly, many modifiers do not really modify the object, but provide other functionality, and simply exist as modifiers. Variables, for instance, are modifiers. Saving and loading games is a modifier. Scripts are modifiers.
Also central to this is one of mTropolis's most important features: Behaviors, which are containers for other modifiers. Behaviors can be switchable, allowing them to be turned on and off, potentially turning off all of the modifiers inside of the behavior, and they can also be aliased, allowing you to save a behavior into the alias palette and reuse it across multiple objects.
Inter-object and inter-modifier communication is done via messages. Most modifiers trigger when they respond to a message, and the message is either a built-in message type (such as a "Mouse Over" message firing when the mouse moves over an element) or a custom message type created by the author. Messages can usually be sent with a value as well, potentially affecting how the recipient responds.
Recreating mTropolis's behavior requires extensively testing what messages fire when, and in what order. Sometimes the order is complicated. For instance, message senders usually have an "immediate" flag that determines whether the message sends immediately (before the message currently being propagated continues propagating), or is added to a message queue. The timing of when the message queue is processed was behind a bug where the Statue area was playing the wrong music: it turns out that if a non-immediate message is sent, and then something causes a scene transition, the queued-up message has to be sent after the scene transition.
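To illustrate just that ordering rule, here's a toy dispatcher in Python. Everything about it (the class, the method names, the idea of an "end of frame" step) is invented for the example and is not how the ScummVM engine is structured. The only thing taken from above is that immediate messages are delivered right away, and queued ones are delivered after a pending scene transition.

```python
from collections import deque


class Dispatcher:
    """Toy model of immediate versus queued message delivery."""

    def __init__(self):
        self.scene = "start"
        self.pending_scene = None
        self.queue = deque()
        self.log = []  # (scene at delivery time, message)

    def send(self, message, immediate):
        if immediate:
            self.deliver(message)
        else:
            self.queue.append(message)

    def deliver(self, message):
        self.log.append((self.scene, message))

    def request_scene(self, name):
        self.pending_scene = name

    def end_of_frame(self):
        # The scene transition happens first...
        if self.pending_scene is not None:
            self.scene = self.pending_scene
            self.pending_scene = None
        # ...and only then do the queued messages go out.
        while self.queue:
            self.deliver(self.queue.popleft())


d = Dispatcher()
d.send("A", immediate=True)
d.send("B", immediate=False)
d.request_scene("statue")
d.end_of_frame()
print(d.log)
```

Flip the two steps in `end_of_frame` and message B lands in the old scene instead, which is the kind of subtle difference that ends up playing the wrong music.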
It turns out that having a spaghetti of message senders and recipients isn't really enough to express your game logic efficiently all of the time, so mTropolis does supply a scripting system in the form of the Miniscript modifier.
Miniscript is infamous for being hobbled by intentionally not having any loop constructs to encourage developers to use things other than code instead. If you want a loop, you have to queue up a message of some kind to re-trigger the modifier, and possibly chain things together. That's really only the beginning of its problems though: it's full of weird cases where the semantics aren't really clear at all. For example, as mentioned earlier, variables are modifiers, but if you use a variable from a script, it may treat it as referencing the value of the variable or referencing the variable modifier itself, depending on what you're doing. What triggers the implicit conversion of a variable to the value it contains? Who knows, just have to endure the bugs and find out.
One case involves a bunch of things being miscompiled in Obsidian due to a design fault that is invisible, and impossible to fix without trashing the project and starting over. For example, let's look at the Miniscript modifier with GUID 0034164d, named "Init on PE". Here's what MTDisasm currently tells us that it says when decompiled:
if element.width <> 640 then
    set element.width to 640
end if
if element.height <> 360 then
    set element.height to 360
end if
if element.visible <> true then
    set element.visible to true
end if
if element.position <> (0, 0) then
    set element.position to (0, 0)
end if
if element.direct <> true then
    set element.direct to true
end if
if element.paused <> true then
    set element.paused to true
end if
if label:(4:3ae) <> true then
    set element.loop to true
end if
What is going on with this last one? What is a label?
Well, let's say you write this code in Miniscript:
if loop <> true then
    set element.loop to false
end if
What have you actually written here? When you save a Miniscript modifier, if there is an identifier like "loop" in this case, then it first tries to resolve it to a global value. If that fails, then it has to look it up, which internally is done using a subscript (i.e. by looking up a value "contained in" something else). At first I thought the thing it was subscripting was something that meant, basically, "go look for it." Nope. The default thing to subscript is the current element, but if you subscript an element by name, it will still look further up in the scene hierarchy for that object.
The exact search order had to be sussed out via repeated tests, 'cause you really don't want to be looking up the wrong thing, do you?
MTDisasm will add some additional annotations to note how a value is being resolved. But what is a label? Well, in this case, mTropolis supports a feature where you can embed time markers in AIFF sound files, and then allow those time markers to trigger a media cue modifier, among other things, but those time markers are named, and those names go into the global label table. In this case, someone imported a sound file with a time marker named "loop."
There is no way to view the global label table, there is no way to delete a label after it's been imported, and the labels permanently become names that identifiers in scripts resolve to.
So, some poor sap is now comparing the value of a sound marker to "true" instead of the element's "loop" property like he would have been if that sound file was never imported.
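Here's a toy version of that failure in Python. It's a heavily simplified model with invented names: a dictionary stands in for the global names, and each element has a dictionary of members and a parent. The real search order is more involved than this; the sketch only shows globals being checked first and the lookup walking up the hierarchy after that.

```python
class Element:
    """Toy scene element with named members and an optional parent."""

    def __init__(self, name, members, parent=None):
        self.name = name
        self.members = members
        self.parent = parent


def resolve(identifier, element, global_names):
    """Resolve an identifier: globals first, then the element and its parents."""
    if identifier in global_names:
        return ("global", global_names[identifier])
    node = element
    while node is not None:
        if identifier in node.members:
            return (node.name, node.members[identifier])
        node = node.parent
    raise NameError(identifier)


scene = Element("scene", {"volume": 100})
movie = Element("movie", {"loop": False}, parent=scene)

# Before the sound file is imported, "loop" finds the element's property.
print(resolve("loop", movie, {}))

# Afterwards, a label named "loop" sits in the global names and wins.
print(resolve("loop", movie, {"loop": "label"}))
```

Nothing in the script changed between the two calls, which is what makes this kind of bug so hard to spot from inside the editor.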
Another thing that must be handled properly when running Miniscript is errors. Many errors will terminate a script execution, while some things (like sending a message to an object that doesn't exist) will not, and games have code that doesn't work correctly if that behavior isn't mirrored correctly.
How are we even getting these scripts back though? Do the Miniscript modifiers just have the program source in them? Well, in very early versions of mTropolis, yes. In every shipped title I've seen, no, the export process removes the Miniscript code and only leaves in a compiled version.
Let's take a look at the Miniscript object that we just mentioned in Obsidian.
After isolating the values, we see what we're looking at. Quick note: The reason for the "program header" is that there are actually two modifiers that use Miniscript programs: The Miniscript modifier, and the "If Messenger Modifier" which uses a Miniscript program to evaluate a condition.
In this case, there are no references. A reference is just a name and an object GUID. There is a field indicating the size of the instruction data, which looks like a blob of... stuff. It's not aligned, so the instructions aren't fixed-size. Let's look a bit closer.
If we look carefully, there is a bit of a pattern here. Each instruction has a size 4 bytes in, which makes it possible to isolate individual instructions. Convenient!
The first guess of how this works, which turned out to be correct, is that this is a stack machine, one of the easiest ways to write a scripting system. The way a stack machine works, for the unfamiliar, is that instructions do some combination of removing ("popping") values from the end of a list and appending ("pushing") new values to the end of the list ("stack"). For example, if you had to evaluate the expression:
((A*B)+(C*D))
Then the stack machine instructions would look something like:
push A
push B
multiply
push C
push D
multiply
add
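If you want to see that run, here's a tiny stack machine in Python that executes exactly those instructions. The tuple-based instruction format and the `run` function are made up for the example; Miniscript's real encoding is binary and looks nothing like this.

```python
def run(program, variables):
    """Execute a list of (opcode, *operands) tuples on a value stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(variables[args[0]])
        elif op == "multiply":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


program = [
    ("push", "A"),
    ("push", "B"),
    ("multiply",),
    ("push", "C"),
    ("push", "D"),
    ("multiply",),
    ("add",),
]

print(run(program, {"A": 2, "B": 3, "C": 4, "D": 5}))  # 26
```

Note that there are no parentheses anywhere in the program. The grouping is implied entirely by the order of the instructions.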
After a lot of messing around, this is what MTDisasm outputs for the instructions of that first condition block:
0: push_global default
1: get_child 00000000
2: push_value double 640
3: cmp_neq
4: jump conditional unknown=62 skip=5
5: push_global,args=1 default
6: get_child,args=1 00000000
7: push_value double 640
8: set
Okay, so what are these? "default" in the disassembly really means "element", so what we are doing here is pushing the value of "element", then getting the attribute with index 0 from the attribute list (which is "width"), then pushing the value 640.0 and doing a comparison (which pops the last 2 values from the stack and pushes true or false onto the stack). Then it does a conditional jump based on the value, and if it's false, skips 5 instructions ahead. If that skip didn't happen, it pushes the element (with a write-intention flag in the arguments!), gets the child (also with a write-intention flag), pushes the value 640.0, and then runs a set operation that applies the value.
One irony here is that it would have been extremely easy to add support for loops by just allowing negative offsets in the jump instruction; the developers simply chose not to because... reasons?
This is kind of hard to read though, so what can we do? We decompile it!
Decompilation for something like this normally involves 2 steps: Converting stack machine instructions back into expression trees, and converting the control flow graph back into structured control flow.
The first one is straightforward, since Miniscript never has branches in the middle of an expression. Remember that pseudocode for the stack machine I had earlier? Well, if you just keep track of each input to something that consumes values from the stack, you can just regurgitate them into expression text. The only thing complicated there is some precedence analysis, which is done to avoid having to put parentheses around everything... It also doesn't quite work right, if I remember, but it works enough that I can tell what's going on.
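As a rough sketch of that idea (not MTDisasm's actual code), the same loop that would evaluate the stack machine can push expression text instead of values. The opcode names and the precedence table below are assumptions for illustration.

```python
# Higher number binds tighter. Operands pushed directly never need parentheses.
PRECEDENCE = {"add": 1, "multiply": 2}
SYMBOL = {"add": "+", "multiply": "*"}
ATOM = 99


def decompile_expression(program):
    """Turn stack machine instructions back into infix expression text."""
    stack = []  # each entry is (text, precedence of its outermost operator)
    for op, *args in program:
        if op == "push":
            stack.append((args[0], ATOM))
            continue
        prec = PRECEDENCE[op]
        rhs_text, rhs_prec = stack.pop()
        lhs_text, lhs_prec = stack.pop()
        # Parenthesize an operand only if it binds more loosely than this
        # operator. The right side uses <= so that equal-precedence operands
        # on the right keep their grouping.
        if lhs_prec < prec:
            lhs_text = f"({lhs_text})"
        if rhs_prec <= prec:
            rhs_text = f"({rhs_text})"
        stack.append((f"{lhs_text} {SYMBOL[op]} {rhs_text}", prec))
    text, _ = stack.pop()
    return text


flat = decompile_expression([
    ("push", "A"), ("push", "B"), ("multiply",),
    ("push", "C"), ("push", "D"), ("multiply",),
    ("add",),
])
grouped = decompile_expression([
    ("push", "A"), ("push", "B"), ("add",),
    ("push", "C"), ("multiply",),
])
```

The first program needs no parentheses at all, while the second one needs them around the addition.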
Control flow analysis, which involves converting jumps back into if/else/end if blocks, is the harder part. Let's say we write a script that roughly looks like this:
if A then
do something A
else
do something B
end if
do something C
This will compile into stack machine code that looks something like this:
0: push A
1: jump conditional to instruction 4
2: Do something A
3: jump to instruction 5
4: Do something B
5: Do something C
These "Do something ..." blocks can be any number of instructions. In order to decompile this, we need to convert it into a control flow graph. Control flow graphs in modern compilers can be quite complicated, especially if they form loops, but since Miniscript doesn't have loops, we're left with what's called a directed acyclic graph, meaning no path through the graph ever flows through the same node more than once.
To do this, we split sets of instructions out into blocks. A block starts with any instruction that is the target of a jump, and ends with either a jump instruction or where another block starts.
Note that there is an "end of program" block. This block is always added to the end and contains no instructions, but it's useful for the decompile algorithm.
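Here is a small sketch of that block-splitting step, applied to the six-instruction if/else example above. The `(opcode, target)` tuples and the `jump`/`jump_cond` opcode names are hypothetical stand-ins rather than the real Miniscript encoding, and the conditional jump is assumed to list its fall-through successor first.

```python
def split_blocks(instrs):
    """Split (op, target) instructions into blocks keyed by start index."""
    end = len(instrs)  # index of the empty "end of program" block
    leaders = {0, end}
    for i, (op, target) in enumerate(instrs):
        if op in ("jump", "jump_cond"):
            leaders.add(target)  # a jump target starts a block
            leaders.add(i + 1)   # so does whatever follows a jump
    starts = sorted(leaders)
    blocks = {}
    for n, start in enumerate(starts):
        if start == end:
            # The "end of program" block has no instructions or successors.
            blocks[start] = {"stop": end, "succ": []}
            continue
        stop = starts[n + 1]
        op, target = instrs[stop - 1]
        if op == "jump":
            succ = [target]
        elif op == "jump_cond":
            succ = [stop, target]  # fall through, or take the jump
        else:
            succ = [stop]          # runs straight into the next block
        blocks[start] = {"stop": stop, "succ": succ}
    return blocks


PROGRAM = [
    ("push", None),    # 0: push A
    ("jump_cond", 4),  # 1: jump conditional to instruction 4
    ("do_a", None),    # 2: do something A
    ("jump", 5),       # 3: jump to instruction 5
    ("do_b", None),    # 4: do something B
    ("do_c", None),    # 5: do something C
]
BLOCKS = split_blocks(PROGRAM)
```

This yields blocks starting at instructions 0, 2, 4 and 5, plus the empty block at index 6.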
So how do we work with this?
First, for each one of these nodes, we have to determine several pieces of information about it:
I'm sure there are better ways to do this, but the way MTDisasm does this is by creating "islands" that each correspond to an emittable block of code. All islands start with a block of code and end in a "sink" which exits the island. Initially, there is one island that starts with the first block and sinks into the "end of program" block. It then repeatedly processes islands until it's done, and each time an island is processed, it can produce additional islands that also must be processed.
Processing an island is done as follows:
First, check if there is more than one post-dominator of the starting block. If there is, then we need to find one that terminates the island. Post-dominators are always reached in a fixed sequence, so we can find it by just following the first successor until we hit the first post-dominator. The island is then split into 2 islands, one that sinks into the post-dominator, and one that starts at the post-dominator (with the original sink).
Then, the successor list of the block is checked. For each successor, it checks if there is an island starting at that successor already. Doing that handles "if" statements that have no "else" block and go to a successor of the "else" block when the condition is false, for instance. If there is no island starting at that successor, a new island is created, starting with that successor's first block and ending in the original island's sink. Since we know the island is now post-dominated by its sink, we know that is the end of the control flow path of the new island.
This process is repeated until there are no islands left.
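The post-dominator lookup in the first step can be sketched like this. It only covers computing post-dominator sets on an acyclic graph and walking first successors to find the nearest one; the island bookkeeping is left out, and the block names and function names are made up for the example rather than taken from MTDisasm.

```python
def post_dominators(succ, exit_node):
    """Map each node of an acyclic graph to the set of nodes post-dominating it."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for nxt in succ.get(node, []):
            visit(nxt)
        order.append(node)  # appended only after all of its successors

    for node in succ:
        visit(node)
    visit(exit_node)

    pdom = {}
    for node in order:
        nexts = succ.get(node, [])
        if not nexts:
            pdom[node] = {node}
        else:
            # A node is post-dominated by itself plus whatever
            # post-dominates every one of its successors.
            common = set.intersection(*(pdom[n] for n in nexts))
            pdom[node] = common | {node}
    return pdom


def first_post_dominator(succ, pdom, start):
    """Follow first successors until reaching a post-dominator of start."""
    strict = pdom[start] - {start}
    if not strict:
        return None
    node = start
    while node not in strict:
        node = succ[node][0]
    return node


# The if/else example: B0 branches to B1 or B2, both rejoin at B3.
SUCC = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": ["END"], "END": []}
PDOM = post_dominators(SUCC, "END")
JOIN = first_post_dominator(SUCC, PDOM, "B0")
```

For the if/else example, B0 is post-dominated by B3 and END, and the walk stops at B3, which is where the island would be split.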
After that, we convert the islands into code. Starting with the first island, the island's instructions are converted back into an expression tree and emitted. Then, if the block ends in a condition, an "if" statement with the condition expression is emitted, followed by the "true" island, and an "else" followed by the "false" island if present, and then an "end if", followed by the sink island (which can chain into more conditions and sink island).
That's it! We can now see the code!
Most of the other data can be figured out using this test-case analysis. Create a bunch of the same thing, change one thing, see what changes. Some things are formatted interestingly, like a lot of things have a "rate" parameter which is actually exported as a large value divided by the rate, turning it into a delay instead.
Most of the asset formats are fairly straightforward too. Images are flipped and have different channel order on Windows, but otherwise are uncompressed. Audio is uncompressed. Sometimes things are compressed, but are compressed with a QuickTime codec and have the codec ID included, and nearly all QuickTime codecs have been reverse-engineered already.
What hasn't been reverse-engineered is that "mFactory Animation" codec. mTropolis has an animation format called "mToon" designed for possibly-moving image sequences. The mTropolis demo doesn't allow saving projects, but it does allow creating and saving mToon assets. I guess it would have to, otherwise you couldn't try it out, and it's not like you can do much with an mToon file if you can't save your project, right? (... Right? ... Yeah.)
In the state of the art of the mid-90's, there were basically three ways of compressing animation in a way that could feasibly be played back in real time: Vector quantization, run-length encoding (RLE), or other simple lossless schemes like LZ77. Vector quantization involves taking chunks of the image and converting them into a smaller set of similar chunks (the codebook), and replacing parts of the image with a lookup into the codebook. It's similar to creating a palette for an image (do people even know what that means these days?), except it creates lookups for blocks of pixels instead of single pixels. Run-length encoding just replaces series of identical pixels with a code indicating how long the series is.
How do we poke at this to start guessing what it does? Well, we can pop open Color It!, one of the many programs of the era designed to be an affordable Photoshop replacement, and try doing some things. Let's make an arrow, and give it a weird size like 12x13 so we can find the size easily, then make 2 identical frames so we can determine the frame stride and hopefully match that up with the size somewhere.
Okay, looks like we've got it. Let's set the compression to "None" and export our project and crack it open!
Important thing to note here: The segment asset data appears after all of the scene objects, so the way mTropolis handles some things here is by storing the metadata with the scene object, but the frame data is in the asset data area.
We have a problem though. We told it to be uncompressed, which means we should be seeing something that looks like an arrow shape in the hex data, but we don't. It compressed it anyway. The good news is, we can see from the codec ID and frame headers that this looks like an RLE codec, which is really good news because RLE codecs are way easier to figure out than VQ codecs.
There are a lot of ways to write an RLE codec. You can set a minimum run length to increase the numeric range of runs by adding it to the run count, you can expect that runs and non-run byte sequences alternate, you can let runs go past the end of a row, and so on, but it's very easy to correlate because you can just make test images with runs of given lengths and see how they're encoded.
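To show what a couple of those design choices look like in practice, here is a decoder for a made-up byte-oriented RLE scheme. This is explicitly not the mToon format: the control-byte layout and the `MIN_RUN` value are assumptions picked purely for illustration.

```python
MIN_RUN = 3  # hypothetical: runs shorter than this are never encoded as runs


def rle_decode(data):
    """Decode a made-up RLE stream.

    Control byte with the high bit set: repeat the next byte
    (low 7 bits + MIN_RUN) times.
    Control byte with the high bit clear: copy the next
    (control + 1) bytes verbatim.
    """
    out = bytearray()
    pos = 0
    while pos < len(data):
        control = data[pos]
        pos += 1
        if control & 0x80:
            count = (control & 0x7F) + MIN_RUN
            out.extend(data[pos:pos + 1] * count)
            pos += 1
        else:
            count = control + 1
            out.extend(data[pos:pos + count])
            pos += count
    return bytes(out)


# A run of five 0xFF pixels followed by two literal pixels.
encoded = bytes([0x82, 0xFF, 0x01, 0x10, 0x20])
decoded = rle_decode(encoded)
```

Adding the minimum run length to the stored count is what stretches the numeric range: a 7-bit count covers runs of 3 to 130 here instead of 0 to 127. Making a test image with a run of a known length and looking for that length (or that length minus a small constant) in the output is how this kind of thing gets pinned down.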
This did turn out to be slightly more complicated: Most animations have "temporal compression" which allows for frames where runs of pixels are skipped and reused from the previous frame, and later it turned out that there was a special code for vertical displacement, which skips many rows of pixels at once. The 16-bit version is just the 8-bit version, but uses 16-bit values!
Internally, this was handled in a somewhat sneaky way too: ScummVM's mTropolis engine converts the RLE-16 format to RLE-32 so it can decode it on the fly and blast pixels straight into a 32-bit frame buffer... but that's for another time, because...
With all of this information, MTDisasm reached a point where it could dump all of Obsidian's data in readable, viewable, and listenable formats. I finally had something resembling the "source code" to the game. That's still a long way from actually running it, but it meant that there was no more information about how it's supposed to work that was out-of-reach.
Upon reaching this point, I still wasn't on the ScummVM team or really looking to join it. I had just come off of porting Glider PRO, which involved re-implementing tons of MacOS stuff by hand, so I really was not shy about spending excessive amounts of my free time on this type of thing and doing it myself.
That story is for the next installment though. Next up, I'll take you down the path of turning a mTropolis player re-implementation into reality.
by OneEightHundred (noreply@blogger.com) at 2024-02-24 08:48
This is part 4 of a 4-part series, mirrored from the Gale Force Games page.
In the last installment, I discussed the process of getting Obsidian back to a point where it was, at least, no buggier than the original game. That is still not quite the case, though. There is one bug that doesn't occur in the original version: Inventory items getting swept across the screen during pan transitions.
Ironically, that bug isn't actually visible on modern machines running the original game because the pan transitions were all set to run at maximum rate, which is as fast as possible. If you don't like the bug, you can always turn off the "Enable short transitions" option in the game options. Today though, we're not here to talk about bugs, we are here to talk about making Obsidian in ScummVM the best Obsidian experience that ever Obsidianed.
The first topic though falls into both categories: MIDI
Obsidian uses MIDI for music, played back through QuickTime. This seems like an odd choice, given that Myst and most other games were using digital music, and QuickTime's tinny MIDI instruments were very limiting, but based on interviews, this may have been driven by Thomas Dolby's love of technology and MIDI in particular. It sounds like a lot of the rationale was driven by a desire to have a dynamic music system, which didn't quite pan out. Dynamic music is used in exactly two places in the full game: During the church puzzle, and after solving the statue puzzle (which cuts out the piano track).
Before we start, we have to talk about what MIDI is. MIDI is an acronym for Musical Instrument Digital Interface, and it's designed to let MIDI-compatible controllers (such as keyboards or programmable sequencers) interface with devices that consume MIDI commands, such as synthesizers, computers, and drum machines.
It was also commonly used in the early DOS days for various reasons related to being able to decode it using low-CPU-usage algorithms, or offloading it to sound hardware where it didn't use CPU at all, combined with its small size on disk, so ScummVM already had ample support for playing back MIDI - Sort of.
When you request a MIDI device from ScummVM, it might be a software backend like FluidSynth, it might be a physical MIDI playback device, it might be anything, but you only get one. This is a problem because Obsidian often has multiple MIDI files playing at once. The vidbot on/off sound for instance is a MIDI sound, and that plays simultaneously with the music. There's also one even more surprising one which I'll get to in a moment.
This is a problem because how MIDI works is that there are several different "channels" each of which responds to a command, and a MIDI controller sends commands assigned to channels. This allows a MIDI device to play multiple instruments at once by assigning instruments to different channels.
Being a protocol designed for real-time use, MIDI commands that play notes also do not have an associated duration; instead, there are separate commands for activating and deactivating a playing note. In turn, the file format used for MIDI in Obsidian, known as SMF (Standard MIDI File), basically encodes commands to send to the MIDI device and what time to send them.
However, this creates a problem: If we stop sending commands from a MIDI file while a note is playing, then we'll never send the command to stop playing the note. When using QuickTime's software synthesizer, this isn't a problem, because it just stops running the software synthesizer for that MIDI source. If we only have one continuously-active MIDI output though, then we can't do that. Fortunately, most notes have finite duration anyway, but sometimes they don't, and for that, Obsidian has a gnarly hack: A MIDI file that sends an "all notes off" command to every channel in response to various commands, but also periodically on a 30-second timer.
In fact, you may notice a moment in the intro where the music pauses dramatically before displaying the title - that pause is at exactly the 30 second mark, presumably timed exactly to prevent that music note-stopper from ruining the theme.
So, the good news is that despite not having a proper multi-source output, the game is still capable of functioning with a single-source output, as long as you're willing to tolerate abrupt cuts in the music every 30 seconds.
There is another problem: MIDI sources have individually-controlled volume, so how do we handle that? Well, the simple way is that when a MIDI note is sent, it is sent with a "velocity" which roughly corresponds to the volume of the note, but really means something like "intensity." It's not exactly a volume scale, because you might get more attack in the sound with higher velocity, for instance, with a higher-quality MIDI renderer. But, it's close enough for what we need, so it was done by intercepting note on and note off commands, modulating the velocity, stuffing the new velocity back into the command, and sending it out.
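A minimal sketch of that velocity trick, assuming each MIDI message arrives as a tuple of raw bytes (this is not ScummVM's driver interface, and `scale_velocity` is a made-up name). Note on is status 0x9n and note off is 0x8n, with the velocity in the second data byte.

```python
def scale_velocity(message, volume):
    """Scale the velocity of note on/off messages by a 0-1 source volume.

    Any other message (controllers, program changes, ...) passes through.
    """
    status = message[0]
    kind = status & 0xF0
    if kind not in (0x80, 0x90):
        return message
    _, note, velocity = message
    scaled = min(127, round(velocity * volume))
    if kind == 0x90 and velocity > 0:
        # A note on with velocity 0 is conventionally treated as a note off,
        # so a quiet note must not be rounded down to 0. This also means a
        # volume of 0 still plays at velocity 1 in this sketch.
        scaled = max(1, scaled)
    return (status, note, scaled)


half = scale_velocity((0x90, 60, 100), 0.5)
quiet = scale_velocity((0x93, 60, 1), 0.1)
program_change = scale_velocity((0xC0, 5), 0.5)
```

The low nibble of the status byte is the channel, so it is left untouched and only the velocity byte changes.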
Just because it works doesn't really mean it's good though, and the music periodically getting chopped is really annoying. But, like I said, I like digging through file formats and standards almost as much as I hate having free time, so I decided to create the secret weapon to solve this problem: The dynamic MIDI mixer. This fun feature is enabled if you enable the "Improved music mixing" option in the game options (it's on by default).
So, what does this actually do? Well, basically we are going to implement multiple MIDI drivers that funnel into a single MIDI driver in a more intelligent way. First, we have a way of tracking the full state of every single MIDI channel:
We track this for every channel of the output device and also every channel of each MIDI source.
Each MIDI source in the game is assigned to MidiCombinerSource, which is a thing that looks like a MIDI driver to it, but in the dynamic music mixer, is a funnel into the combiner.
Aside from starting and stopping notes though, there are several other types of MIDI messages: There are controllers, which can alter the qualities of a channel in some way, there's the program (which affects what instrument to play), and also two features called "sustain" and "sostenuto." So far, ScummVM doesn't support any games known to use sostenuto, but I wanted to get this right the first time. Sustain and sostenuto are normally controlled by pedals, and if the pedal is pressed while a note is active, then that note continues playing after the "note off" command until the pedal is released, so to track an active note, we need to track whether or not it is sustained as well.
Internally, in order for anything to happen, a MIDI source channel has to be dynamically assigned to an available "physical" channel on the output device. The way that works is: All channels are unassigned by default. If a controller change happens, and the channel is unassigned, then it updates the MidiChannelState of the source channel, but otherwise does nothing.
When a note is played on an unassigned channel, the combiner tries to find an output channel of the right type (one channel is typically reserved for percussion), picking the one that stopped playing its last note the longest time ago. If there is still an active source assigned to that channel, then that source channel is unassigned. Then, the new source channel is assigned to that output channel, the channel state of the output channel and the source's channel are compared, and anything that is different is adjusted by sending the necessary MIDI commands to do so, and then the note is played. If a source channel is still actively assigned to an output channel, then control commands are sent to the output immediately.
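The assignment step could be sketched roughly like this. It is a heavily simplified model, not the real combiner: the class and field names are invented, channel state is reduced to a dictionary of controller values, and active-note and sustain tracking are left out entirely.

```python
PERCUSSION_CHANNEL = 9  # channel 10 in 1-based numbering, by General MIDI convention


class OutputChannel:
    def __init__(self, index):
        self.index = index
        self.owner = None         # which source channel currently owns this
        self.last_note_end = 0.0  # when its last note stopped playing
        self.state = {}           # controller number -> last value sent


class Combiner:
    def __init__(self, num_channels=16):
        self.outputs = [OutputChannel(i) for i in range(num_channels)]
        self.sent = []  # messages forwarded to the real output driver

    def assign(self, owner, wanted_state, percussion=False):
        """Give a source channel the output channel that went quiet longest ago."""
        candidates = [c for c in self.outputs
                      if (c.index == PERCUSSION_CHANNEL) == percussion]
        chosen = min(candidates, key=lambda c: c.last_note_end)
        chosen.owner = owner  # any previous owner is implicitly unassigned
        # Only send controller changes where the output differs from the source.
        for controller, value in wanted_state.items():
            if chosen.state.get(controller) != value:
                self.sent.append((0xB0 | chosen.index, controller, value))
                chosen.state[controller] = value
        return chosen


combiner = Combiner()
music = combiner.assign("music:0", {7: 100})
music.last_note_end = 5.0
vidbot = combiner.assign("vidbot:0", {7: 64})
drums = combiner.assign("music:9", {}, percussion=True)
```

Because the music channel was active more recently, the vidbot source lands on a different output channel instead of stealing it, and only the controller values that actually differ get sent.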
Because of this, the "all notes off" trigger actually doesn't do anything any more. It updates the MIDI combiner source's channel state, but since nothing in that MIDI file ever plays a note, none of it ever goes to the output driver.
We can also use the MIDI gain control, which is an actual volume control, instead of having to modulate velocity, and since we know which output channels are being used by a source, we can completely quiet a source by just sending an "all sound off" command (which bypasses sustain) to silence the channel, and then deallocate it.
Well that's unfortunate. Individual sounds in Obsidian can have their own volume level, which is modulated by the global volume level, so setting the volume to the maximum doesn't necessarily mean that the MIDI source is set to the maximum possible volume. As mentioned earlier, the vidbot sounds are MIDI, and by default their volume is set to 50%, but they do seem significantly quieter when using gain control vs. modulating the velocity. What's going on here?
After a bunch of trawling around, it turns out the answer is in the General MIDI Level 2 specification at page 7: "gain in dB = 40 * log10(cc7/127)"
... What does that mean? Unfortunately, I'm actually finding out that I did the fix wrong as I'm writing this, and need to fix it again! "cc7" is the MIDI channel volume value. The volume scale on MIDI sources is a linear scale, meaning a value half as large causes the amplitude to be halved. Decibels (dB) are a logarithmic scale, meaning any 2 values the same distance apart differ by the same proportion. One problem with this though is figuring out what we're measuring. Many measurements in electrical and sound engineering are the square or square root of other measurements due to various physics interactions. Decibels are a scale designed around factor-of-10 changes, but whether a 10x change is +10 or +20 depends on what quantity is being measured, due to that problem of quantities being squared.
What we want is the measure that the volume is supposed to be scaling, which is amplitude, which is on the +20 = 10x scale.
So converting normalized MIDI volume (a.k.a. the volume rescaled to a 0-1 scale) to modulation involves squaring it, so we need to figure out how to convert scaled modulation back into a new normalized volume.
... okay, great, so basically all we have to do to compute the new MIDI volume is scale the MIDI source volume to a 0-1 scale, and then multiply the original MIDI volume by the square root of that value. Cool. This means our vidbots playing at 50% volume are now a 0.7071x multiplier instead of 0.5. Unfortunately, the original implementation of the decibel scale was wrong and was using a 4th-root, which I guess is better than it being too quiet, but still wrong.
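The arithmetic can be checked with a few lines. The formula is the one quoted above from the General MIDI Level 2 specification; the function names are made up for this sketch, and cc7 = 0 is left out because its logarithm is undefined (it just means silence).

```python
import math


def gain_db(cc7):
    """Channel volume to gain, per the quoted formula: 40 * log10(cc7 / 127)."""
    return 40.0 * math.log10(cc7 / 127.0)


def amplitude(cc7):
    """Amplitude multiplier for a channel volume, using the +20 dB = 10x scale."""
    # 10 ** (40 * log10(x) / 20) simplifies to x ** 2, so amplitude is the
    # square of the normalized volume.
    return 10.0 ** (gain_db(cc7) / 20.0)


def scaled_cc7(cc7, source_volume):
    """New channel volume that scales the amplitude by a 0-1 source volume."""
    # Amplitude goes as cc7 squared, so scaling amplitude by s means
    # scaling cc7 by sqrt(s).
    return cc7 * math.sqrt(source_volume)


multiplier = math.sqrt(0.5)
ratio = amplitude(scaled_cc7(100, 0.5)) / amplitude(100)
```

A 50% source volume gives a channel volume multiplier of about 0.7071, and feeding the result back through the formula confirms the amplitude really does come out halved.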
Anyway, it's fixed-er now!
Most monitors today try to run games in 32-bit 8-bits-per-channel (and 8 bits of waste to help with memory alignment) color mode, or sometimes HDR if the game supports it. Attempting to run Obsidian in 32-bit mode will greet you with this error though:
Obsidian was released at a time when 16-bit color depth (or "thousands of colors" as it was called on Mac) was fairly new; most things ran in 8-bit depth with a color lookup table, so 16-bit was fairly cutting edge.
All of Obsidian's images are 16-bit images though, so we're not really losing anything by running with a 16-bit render target, are we? Actually, we are. The images are 16-bit. The videos are not. The videos are encoded with Cinepak, which in full-color mode can render out to 8 bits per channel, and even if the input images to Cinepak were in 16-bit color, the averaging out that occurs during the encoding process (and from the YUV-to-RGB perceptual transform) produces values with more accuracy per color channel. So, we want to render in 32-bit.
That's accomplished by having an override that just lies to the scripts about what color depth the game is being run at.
When Obsidian came out, most desktop monitors were 640x480, which is a 4:3 aspect ratio. Most displays today are 16:9 widescreen. However, the images and videos in Obsidian are all 640x360, also 16:9. Well that's pretty neat. It would be cool to play the game in widescreen and get rid of the letterboxing on both sides of the monitor, wouldn't it?
Okay, let's just offset the game frame up 60 pixels, cut down the resolution, and lie to the game about the resolution it's running at so it doesn't throw a startup error! That's a great start, but what about the inventory items that display below the frame?
There's an internal system for overriding object behaviors, which can somewhat handle this. It's more complicated than it could be, because with how Obsidian is broken up, sometimes you carry the item into different sections and subsections, which means the elements that display them are duplicated.
Fortunately, the inventory items are color-keyed already even though they're on a black background, so I didn't need to do anything to make them not render a black box outline, but they did run into a problem with the security survey in the maze.
Uh oh, the keycards are overlapping the survey, and I really need to let him know that I eat my ice cream straight from a bowl! Well I guess we could just change the layer order so the form's on top of the cards, right? Actually, no, because the security form image isn't just the security form, it also includes a bit to the left.
Ultimately, this was resolved by detecting this specific situation. When the security form is displayed, the cards are moved off-screen, and moved back when the security form is dismissed. Now I can eat my ice cream in peace.
There was one last problem in widescreen mode:
... the Rocket Science Games logo is too big! Now you might be thinking, "so what, that company hasn't existed for 25 years, who caaaaaares!?" Unfortunately, I care, so we need to fix this! But what can we do? Shrink it? But then we get letterboxing on the sides again, and that doesn't look nice! What we need, which you may have seen if you watch old TV shows re-broadcast on widescreen, is an anamorphic filter.
An anamorphic filter works by stretching out the sides of the image more than the center areas. This was done by computing an exponential curve that has a derivative of 0 at the point where it's supposed to stop (meaning, basically, the rate of pixel coordinate change becomes normal where the curve tops, preventing a noticeable seam), and applying that to the pixel grid. Here's what the filter applied to an 8x8 grid pattern looks like:
And here's the filter applied to the logo video:
Much better!
While practically a foreign concept these days, checkpoints weren't always a thing. Like most games of its time, if you wanted to keep your progress, you had to save the game. The game even reminds you to save when you try to quit! Having auto-save would be really nice though, wouldn't it?
The auto-save feature works partly on a timer, like most ScummVM games, but there's an option (enabled by default) to also auto-save at progress points. Most puzzle solutions in Obsidian set some variable, but you aren't always allowed to save, so this creates a bit of a problem: Finding how to detect puzzle solutions, and then finding a safe place to save.
Auto-save detection is done in two ways: One way is by detecting arrival in a specific scene while coming from a specific other scene, which is used to detect things like chapter transitions, but also things like beating the maze without the proper document. The other way is detecting arrival in a specific scene with a puzzle completion variable set differently than what it was the last time the game was loaded or restarted. Normally, the latter category is done by triggering in the scene you would wind up in after the puzzle.
There is a minor omission in this scheme: If you complete a puzzle, save the game, then reload it, the autosave won't trigger... but in that case, you've pretty much saved right there anyway, so who cares?
ScummVM shows screenshots of the game where you saved it in its save game UI, but there's one problem: If you save the game from the in-game UI, then you're not looking at the game, you're looking at a menu. So, some hooks had to be added to take a screenshot before transitioning to the menu and using that screenshot for the save instead.
People seem to like subtitles, so why not add them too? Well, they were added, and you can download them from ScummVM's add-ons page.
Subtitles mostly work by detecting when a specific sound asset is activated and popping up the subtitles. There is however an option for a very small number of subtitles:
Without spoiling too much, there's one puzzle that involves sound, but popping up subtitles for the sound at all kind of gives away the answer, and it's one of the neatest puzzles in the game. So, depending on what your rationale is for enabling subtitles, you can keep this option off if you want to have subtitles, and can hear sound, but don't want to spoil the challenge.
Overall, there are 686 voice lines in the game. Deciding when to split a subtitle up, and where to split the lines, was an ongoing challenge that I lack the expertise to do, but I did the best I could.
In some cases, this involved getting help for figuring out lines I couldn't make out. I'd never actually heard the term "lousing up" before in my life. Also, I added speaker names to the subtitles, but in some cases, we don't really know the names of these characters and just have to guess.
The identity of the characters also clearly has some story implications, but without knowing for sure who they are, that has to be danced around, a problem made more difficult by Obsidian's casting. While some of the characters are professional actors, many of them (especially the vidbots) are Rocket Science employees.
The bureau chief, for instance, is almost certainly Howard Cushnir, who also appears in a celebration video, and I think is also a character in a chapter intro cinematic, but it's not clear if that's because it's the same character, the same person playing multiple characters, or if I'm mistaken and it's not even the same person. Adding to this, the brief appearance in a cinematic is as Max's teacher, but he appears to be the project administrator in the journal. (At least, that seems to be the implication - that his appearance as the chief in the bureau realm is a reflection of his bureaucratic authority status in the real world.) It would be odd for that to be the same person, then, but maybe he's a graduate professor at a research university and it is the same character? There's no way to tell.
Another amusing case was the eye test in the bureau maze. The voiceover there drops to a soft, barely audible level, and it's supposed to be a gag that it's unintelligible, so he tells you to go to the hearing test booth (which sends you back for an eye test). The problem is that it is intelligible if you isolate the sound and turn the volume up, so the intent is that it's unintelligible, but there are actual coherent words in the line. So, should the line show the words, or something like "<Unintelligible>" to keep the gag? I kind of split the difference.
Doing this involved a lot of testing. In many cases, I found things in the logic that looked like they caused bugs, confirmed that they caused bugs, and verified that the same bugs occurred in the original game. That type of bug is much harder to fix, and ultimately only one of them was (a progression blocker that occurs if you save before logging into the journal).
The inventory panning bug is the only bug left that hasn't been fixed, and due to how the widescreen mod works, giving it a proper fix is difficult. In non-widescreen mode, the fix is to only pan the part of the screen with main-scene elements.
In the end though, the two most important lessons I've learned in life are that shipping beats perfection, and motivation is a finite resource. Working on this was always a race against burnout, and eventually it started reaching the point where it was hard to justify any further work vs. moving on to other things. Eventually it's time to say "it's good, and that's good enough," stick a fork in it, and move on to other things.
mTropolis was used to ship a few dozen titles, and a few of them have been on the list as possible additions: Muppet Treasure Island, S.P.Q.R. - The Empire's Darkest Hour, Star Trek: The Game Show, and MindGym. The first one is done and was added in ScummVM in the 2.8.0 release, the next two I have copies of. S.P.Q.R. is the next one on the to-do.
However, as mentioned above, motivation is a finite resource, and unlike some members of the team, I don't really approach things with nostalgia, or as a historian that thinks it's not their place to judge the quality of things from the past. My goal is to save valuable things from oblivion, and the further down the to-do list I go, the more questionable that "value" gets, in my opinion. So, we'll see what happens, and when it happens, but that's the plan for now.
It wouldn't be a good rant if I didn't leave you with some tales from other mTropolis games, would it? Muppet Treasure Island has numerous hacks to deal with duplicated aliased compound variables needing to be linked up in a way that I still haven't made any sense of, and music doesn't work in S.P.Q.R. right now because it depends on sending messages to objects in a scene that is never loaded.
Also, I finally figured out what those extra 8 bytes in the catalog header are: Eventually mFactory realized that it was a problem to have separate formats for Mac and Windows, so they decided to make a cross-platform format. Does the cross-platform format work by being in a common format that works on both platforms? Of course not. It exports the Mac and Windows versions into the same file, and de-duplicates the asset data.
That's all for now. See you in the future, somewhere.
by OneEightHundred (noreply@blogger.com) at 2024-02-24 08:47
After reviewing the code for the simple YAML parser I wrote, I decided it was getting a little messy, so before continuing, I decided to refactor it a little bit.
The simplest thing to do was to separate the serialisation and the deserialisation into separate classes, and simply call those from within the YamlConvert class in the existing methods. This approach tends to be what other JSON and YAML libraries do, with added functionality such as being able to control aspects of the serialisation/deserialisation process for specific types.
I currently don’t need, or want, to do that, as I’m taking a much more brute force approach - however it is something to consider for a future refactor. Maybe.
I ended up with the following for YamlConvert:
public static class YamlConvert
{
private static YamlSerialiser Serialiser;
private static YamlDeserialiser Deserialiser;
static YamlConvert()
{
Serialiser = new YamlSerialiser();
Deserialiser = new YamlDeserialiser();
}
public static string Serialise(YamlHeader header)
{
return Serialiser.Serialise(header);
}
public static YamlHeader Deserialise(string filePath)
{
if (!File.Exists(filePath)) throw new FileNotFoundException("Unable to find specified file", filePath);
var content = File.ReadAllLines(filePath);
return Deserialise(content);
}
public static YamlHeader Deserialise(string[] rawHeader)
{
return Deserialiser.Deserialise(rawHeader);
}
}
It works quite well, as it did before, and looks a lot better. There is no dependency configuration to worry about since, as I mentioned above, I’m not worried about swapping out the serialisation/deserialisation process at any time.
Previously we left off with a method which could parse the YAML header in one of our markdown files, and it was collecting each line between the --- header markers, for further processing.
One of the main requirements for the overall BlogHelper9000 utility is to be able to standardise the YAML headers in each source markdown file for a post. Some of the posts had a mix of different tags that were essentially doing the same thing, so one of the aims is to be able to collect those, and transform the values into the correct tags.
In order to achieve this, we can specify a collection of the valid header properties up front, and also a collection of the ‘other’ properties that we find, which we can hold until later in the process, when we’ve written the code to handle those properties. The YamlHeader class has already been defined, and we can use a little reflection to load that class up and pick the properties out.
private static Dictionary<string, object?> GetYamlHeaderProperties(YamlHeader? header = null)
{
var yamlHeader = header ?? new YamlHeader();
return yamlHeader.GetType()
.GetProperties(BindingFlags.DeclaredOnly | BindingFlags.Public | BindingFlags.Instance)
.Where(p => p.GetCustomAttribute<YamlIgnoreAttribute>() is null)
.ToDictionary(p =>
{
var attr = p.GetCustomAttribute<YamlNameAttribute>();
return attr is not null ? attr.Name.ToLower() : p.Name.ToLower();
}, p => p.GetValue(yamlHeader, null));
}
We need to be careful to ignore properties that are not part of the YAML header in the markdown files but that we keep on the YamlHeader class for further processing - such as the one holding the ‘extra’ properties that we’ll need to match up with their valid counterparts in a later step. Thus we have the custom YamlIgnoreAttribute that we can use to ensure we drop properties that we don’t care about. We also need to ensure that we can match up C# property names with the actual YAML header name, so we also have the YamlNameAttribute to handle this.
Then we just need a way of parsing the individual lines and pulling the header name and the value out.
(string property, string value) ParseHeaderTag(string tag)
{
tag = tag.Trim();
var index = tag.IndexOf(':');
var property = tag.Substring(0, index);
var value = tag.Substring(index+1).Trim();
return (property, value);
}
Here we just return a simple tuple after doing some simple substring manipulation, which is greatly helped by the header and its value always being separated by ‘:’.
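To make the splitting rule concrete, here is a minimal sketch of the same idea, written in Python purely for brevity. The function name and the error handling are my own assumptions, not part of BlogHelper9000. It splits on the first ‘:’ only, so values that themselves contain a colon survive, and it rejects lines with no colon at all (the C# version above would throw from Substring in that case, because IndexOf returns -1).

```python
def parse_header_tag(tag: str) -> tuple[str, str]:
    """Split a 'name: value' header line on the first ':' only."""
    name, sep, value = tag.strip().partition(":")
    if not sep:
        # Hypothetical handling: the original does not guard against this case.
        raise ValueError(f"not a header line: {tag!r}")
    return name.strip(), value.strip()


print(parse_header_tag("title: 'Dynamic port: assignment'"))
```

Splitting on the first separator rather than every separator is what keeps a featured_image path or a quoted title containing ‘:’ intact.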
Then if we put all that together we can start to parse the header properties.
private static YamlHeader ParseYamlHeader(IEnumerable<string> yamlHeader)
{
var parsedHeaderProperties = new Dictionary<string, object>();
var extraHeaderProperties = new Dictionary<string, string>();
var headerProperties = GetYamlHeaderProperties();
foreach (var line in yamlHeader)
{
var propertyValue = ParseHeaderTag(line);
if (headerProperties.ContainsKey(propertyValue.property))
{
parsedHeaderProperties.Add(propertyValue.property, propertyValue.value);
}
else
{
extraHeaderProperties.Add(propertyValue.property, propertyValue.value);
}
}
return ToYamlHeader(parsedHeaderProperties, extraHeaderProperties);
All we need to do is set up some dictionaries to hold the header properties, get the dictionary of valid header properties, and then loop through each line, parsing the header tag and checking whether the property is a ‘valid’ one that we definitely know we want to keep, or one we need to hold for further processing. You’ll notice that the above code is missing an end brace: this is deliberate, because the ParseHeaderTag method and ToYamlHeader method are both nested methods.
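The known/extra split described above can be sketched as follows, again in Python for brevity. The set of valid names here is an assumption taken from the unit test header in this series, and lower-casing the name before the lookup is my own choice; the C# code lower-cases the dictionary keys but looks up the raw name.

```python
# Assumed set of valid header names, taken from the example header in this series.
VALID_PROPERTIES = {"layout", "title", "tags", "featured_image", "featured", "hidden"}


def split_properties(pairs, valid=VALID_PROPERTIES):
    """Sort (name, value) pairs into known header properties and 'extra' ones."""
    parsed, extra = {}, {}
    for name, value in pairs:
        # Case-insensitive match against the valid names (an assumption).
        target = parsed if name.lower() in valid else extra
        target[name] = value
    return parsed, extra


parsed, extra = split_properties([("layout", "post"), ("category", "tools")])
print(parsed, extra)
```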
Reading through the code to write this post has made me realise that we can do some refactoring to make this look a little nicer.
So we’ll look at that next.
The next thing to do to get BlogHelper9000 functional is to write a command which provides some information about the posts in the blog. I want to know:
I also know that I want to introduce a command which will allow me to fix the metadata in the posts, which is a little messy. I’ve been inconsistently blogging since 2007, originally starting off on a self-hosted python blog I’ve forgotten the name of before migrating to Wordpress, and then migrating to a short lived .net static site generator before switching over to Jekyll.
Obviously, Markdown powered blogs like Jekyll have to provide non-markdown metadata in each post, and for Jekyll (and most markdown powered blogs) that means: YAML.
There are a couple of options when it comes to parsing YAML. One would be to use YamlDotNet, which is a stable library that conforms to v1.1 and v1.2 of the YAML specifications.
But where is the fun in that?
I’ve defined a POCO called YamlHeader which I’m going to use as the in-memory object to represent the YAML metadata header at the top of a markdown file.
If we take a leaf from different JSON converters, we can define a YamlConvert class like this:
public static class YamlConvert
{
public static string Serialise(YamlHeader header)
{
}
public static YamlHeader Deserialise(string filePath)
{
}
}
With this, we can easily serialise a YamlHeader into a string, and deserialise a file into a YamlHeader.
Deserialising is the slightly more complicated of the two, so let’s start with that.
Our first unit test looks like this:
[Fact]
public void Should_Deserialise_YamlHeader()
{
var yaml = @"---
layout: post
title: 'Dynamic port assignment in Octopus Deploy'
tags: ['build tools', 'octopus deploy']
featured_image: /assets/images/posts/2020/artem-sapegin-b18TRXc8UPQ-unsplash.jpg
featured: false
hidden: false
---
post content that's not parsed";
var yamlObject = YamlConvert.Deserialise(yaml.Split(Environment.NewLine));
yamlObject.Layout.Should().Be("post");
yamlObject.Tags.Should().NotBeEmpty();
}
This immediately requires us to add an overload for Deserialise to the YamlConvert class, which takes a string[]. This means our implementation for the first Deserialise method is simply:
public static YamlHeader Deserialise(string filePath)
{
if (!File.Exists(filePath)) throw new FileNotFoundException("Unable to find specified file", filePath);
var content = File.ReadAllLines(filePath);
return Deserialise(content);
}
Now we get into the fun part. And a big caveat: I’m not sure if this is the best way of doing this, but it works for me and that’s all I care about.
Anyway. A YAML header block is identified by a single line of only --- followed by n lines of YAML, which is signified to have ended by another single line of only ---. You can see this in the unit test above.
The algorithm I came up with goes like this:
For each line in lines:
    if line is '---' then
        if header start marker not found then
            header start marker found
            continue
        break loop
    store line
parse each line of found header
So in a nutshell, it loops through each line in the file, looks for the first --- to identify the start of the header, and then, until it hits another ---, it gathers the lines for further processing.
Translated into C#, the code looks like this:
public static YamlHeader Deserialise(string[] fileContent)
{
var headerStartMarkerFound = false;
var yamlBlock = new List<string>();
foreach (var line in fileContent)
{
if (line.Trim() == "---")
{
if (!headerStartMarkerFound)
{
headerStartMarkerFound = true;
continue;
}
break;
}
yamlBlock.Add(line);
}
return ParseYamlHeader(yamlBlock);
}
This is fairly straightforward, and isn’t where I think some of the problems with the way it works actually are - all that is hidden behind ParseYamlHeader, and is worth a post on its own.
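For comparison, here is a small sketch of the same marker-scanning loop in Python, with one deliberate difference that I should flag as my own assumption rather than what the C# does: it only stores lines once the opening --- has been seen. The loop above also stores any lines that appear before the first marker, which is harmless for files that start with ---, but not otherwise.

```python
def extract_header(lines):
    """Return the lines found between the first two '---' marker lines."""
    block = []
    in_header = False
    for line in lines:
        if line.strip() == "---":
            if in_header:
                break  # second marker: the header is complete
            in_header = True  # first marker: start collecting
            continue
        if in_header:
            block.append(line)
    return block


print(extract_header(["---", "layout: post", "title: x", "---", "body text"]))
```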
In the introductory post to this series, I ended with issuing a command to initialise a new console project, BlogHelper9000. It doesn’t matter how you create your project, be it from Visual Studio, Rider or the terminal, the end result is the same, as the templates are all the same.
With the new .net 6 templates, the resulting Program.cs is somewhat sparse: all you get in the file is a single comment and a Console.WriteLine("Hello, World!");, thanks to all the new wizardry in the latest versions of the language and the framework.
Thanks to this new fangled sorcery, the app still has a static main method, you just don’t need to see it, and as such, the args string array is still there. For very simple applications, this is all you really need. However, once you get past a few commands, with a few optional flags, things can get complicated, fast. This can turn into a maintenance headache.
In the past I’ve written my own command line parsing abstractions, I’ve used Mono.Options and other libraries, and I think I’ve finally settled on Oakton as my go to library for quickly and easily adding command line parsing to a console application. It’s intuitive, easy to use and easy to maintain. This means you can easily introduce it into a team environment and have everyone understand it immediately.
After following Oakton’s getting started documentation, you can see how easy it is to get going with a basic implementation. I recommend introducing the ability to have both synchronous and asynchronous commands able to be executed, and you achieve this by a small tweak to the Program.cs, taking into consideration the top-level statements in .net 6, like this:
using Oakton;
using System.Reflection;
var executor = CommandExecutor.For(_ =>{
_.RegisterCommands(typeof(Program).GetTypeInfo().Assembly);
});
var result = await executor.ExecuteAsync(args);
return result;
In .net 5, or if you don’t like top-level statements and have a static int Main, you can make it static Task<int> Main instead and return the result of executor.ExecuteAsync instead of awaiting it.
In some console applications, different commands can have the same optional flags, and I like to put mine in a class called BaseInput. Because I know I’m going to have several commands in this application, I’m going to add some base classes so that the different commands can share some of the same functionality. I’ve also used this in the past to, for example, create a database instance in the base class, which is then passed into each inheriting command. It’s also a good place to add some common argument/flag validation.
What I like to do is have an abstract base class, which inherits from the Oakton command, and add an abstract Run method to it, and usually a virtual bool ValidateInput too; these can then be overridden in our actual Command implementations and have a lot of nice functionality automated for us in a way that can be used across all Commands.
Some of the detail of these classes is elided to stop this from being a super long post; you can see all the details in the Github repo.
public abstract class BaseCommand<TInput> : OaktonCommand<TInput>
where TInput : BaseInput
{
public override bool Execute(TInput input)
{
return ValidateInput(input) && Run(input);
}
protected abstract bool Run(TInput input);
protected virtual bool ValidateInput(TInput input)
{
/* ... */
}
}
This ensures that all the Commands we implement can optionally decide to validate the inputs that they take in, simply by overriding ValidateInput.
The async version is exactly the same… except async:
public abstract class AsyncBaseCommand<TInput> : OaktonAsyncCommand<TInput>
where TInput : BaseInput
{
public override async Task<bool> Execute(TInput input)
{
// Task<bool> values cannot be combined with && directly, so await each one
return await ValidateInput(input) && await Run(input);
}
protected abstract Task<bool> Run(TInput input);
protected virtual Task<bool> ValidateInput(TInput input)
{
/* ... */
}
}
There is an additional class I’ve not yet shown, which adds some further reusable functionality between each base class, and that’s the BaseHelper class. I’ve got a pretty good idea that any commands I write for the app are going to operate on posts or post drafts, which in jekyll are stored in _posts and _drafts respectively. Consequently, the commands need an easy way of having these paths to hand, so a little internal helper class is a good place to put this shared logic.
internal class BaseHelper<TInput> where TInput : BaseInput
{
public string DraftsPath { get; }
public string PostsPath { get; }
private BaseHelper(TInput input)
{
DraftsPath = Path.Combine(input.BaseDirectoryFlag, "_drafts");
PostsPath = Path.Combine(input.BaseDirectoryFlag, "_posts");
}
public static BaseHelper<TInput> Initialise(TInput input)
{
return new BaseHelper<TInput>(input);
}
public bool ValidateInput(TInput input)
{
if (!Directory.Exists(DraftsPath))
{
ConsoleWriter.Write(ConsoleColor.Red, "Unable to find blog _drafts folder");
return false;
}
if (!Directory.Exists(PostsPath))
{
ConsoleWriter.Write(ConsoleColor.Red, "Unable to find blog _posts folder");
return false;
}
return true;
}
}
This means that our base class implementations can now become:
private BaseHelper<TInput> _baseHelper = null!;
protected string DraftsPath => _baseHelper.DraftsPath;
protected string PostsPath => _baseHelper.PostsPath;
public override bool Execute(TInput input)
{
_baseHelper = BaseHelper<TInput>.Initialise(input);
return ValidateInput(input) && Run(input);
}
protected virtual bool ValidateInput(TInput input)
{
return _baseHelper.ValidateInput(input);
}
Note the null!, where I am telling the compiler to ignore the fact that _baseHelper is being initialised to null, as I know better.
This allows each command implementation to hook into this method and validate itself automatically.
Now that we have some base classes to work with, we can start to write our first command. If you check the history in the repo, you’ll see this wasn’t the first command I actually wrote… but it probably should have been. In any case, it only serves to illustrate our first real command implementation.
public class InfoCommand : BaseCommand<BaseInput>
{
public InfoCommand()
{
Usage("Info");
}
protected override bool Run(BaseInput input)
{
var posts = LoadPosts();
var blogDetails = new Details();
DeterminePostCount(posts, blogDetails);
DetermineDraftsInfo(posts, blogDetails);
DetermineRecentPosts(posts, blogDetails);
DetermineDaysSinceLastPost(blogDetails);
RenderDetails(blogDetails);
return true;
}
/**...*/
}
LoadPosts is a method in the base class which is responsible for loading the posts into memory, so that we can process them and extract meaningful details about the posts. We store this information in a Details class, which is what we ultimately use to render the details to the console. You can see the details of these methods in the github repository, however they all boil down to simple Linq queries.
In this post we’ve seen how to set up Oakton and configure a base class to extend the functionality and give us more flexibility, and an initial command. In subsequent posts, we’ll cover more commands and I’ll start to use the utility to tidy up metadata across all the posts in the blog and fix things like images for posts.
I just had to set up my vimrc and vimfiles on a new laptop for work, and had some fun with Vim, mostly as it’s been years since I had to do it. I keep my vimfiles folder in my github, so I can grab it wherever I need it.
To recap, one of the places that Vim will look for things is $HOME/vimfiles/vimrc, where $HOME is actually the same as %USERPROFILE%. In most corporate environments, the %USERPROFILE% is actually stored in a networked folder location, to enable roaming profile support and help when a user gets a new computer.
So you can put your vimfiles there, but it’s a network folder - it’s slow to start an instance of Vim. Especially if you have a few plugins.
Instead, what you can do is to edit the _vimrc file in the Vim installation folder (usually in C:\Program Files (x86)\vim), delete the entire contents and replace it with:
set rtp+=C:\path\to\your\vimfiles
set viminfo+=nC:\path\to\your\vimfiles\or\whatever
source C:\path\to\your\vimfiles\vimrc
What this does is add your vimfiles folder to Vim’s runtime path, tell Vim where to keep its viminfo file, and then source your vimrc file and use that.
This post largely serves as a memory aid for myself, so that when I need to do this again in future I won’t spend longer than I need to googling how to do it, but I hope it helps someone else.
Recently I was inspired by @buhakmeh’s blog post, Supercharge Blogging With .NET and Ruby Frankenblog to write something similar, both as an exercise and excuse to blog about something, and as a way of tidying up the metadata on my existing blog posts and adding header images to old posts.
The initial high level requirements I want to support are:
The next series of posts will cover implementing the above requirements… not necessarily in that order. First I will go over setting up the project and configuring Oakton.
After that I will probably cover implementing fixes to the existing blog metadata, as I think that is going to be something that will be required in order for any sort of Info function to work properly, as all of the yaml metadata will need to be consistent.
Then I think I’ll tackle the image stuff, which should be fairly interesting, and should give a nice look to the existing posts, as having prominent images for posts is part of the theme for the blog, which I’ve not really taken full advantage of.
I’ll try to update this post with links to future posts, or else make it all a big series.
dotnet new console --name BlogHelper9000
At work, we have recently been porting our internal web framework into .net 6. Yes, we are late to the party on this, for reasons. Suffice it to say I currently work in an inherently risk averse industry.
Anyway, one part of the framework is responsible for getting reports from SSRS.
The way it did this is to use a wrapper class around a SOAP client generated from good old ReportExecution2005.asmx?wsdl, using our faithful friend svcutil.exe. The wrapper class used some TaskCompletionSource magic on the events in the client to make the client.LoadReportAsync and the other *Async methods actually async, as the generated client was not truly async.
Fast forward to the modern times, and we need to upgrade it. How do we do that?
Obviously, Microsoft are a step ahead: svcutil has a dotnet version - dotnet-svcutil. We can install it and get going:
dotnet tool install --global dotnet-svcutil
Once installed, we can call it against the endpoint:
dotnet-svcutil http://server/ReportServer/ReportExecution2005.asmx?wsdl
In our wrapper class, the initialisation of the client has to change slightly, because the generated client is different to the original svcutil implementation. Looking at the diff between the two files, it’s because the newer version of the client uses more modern .net functionality.
The wrapper class constructor has to be changed slightly:
public Wrapper(string url, NetworkCredential credentials)
{
var binding = new BasicHttpBinding(BasicHttpSecurityMode.TransportCredentialOnly);
binding.Security.Transport.ClientCredentialType = HttpClientCredentialType.Ntlm;
binding.MaxReceivedMessageSize = 10485760; // this is a 10mb limit
var address = new EndpointAddress(url);
_client = new ReportExecutionServiceSoapClient(binding, address);
_client.ClientCredentials.Windows.AllowedImpersonationLevel = TokenImpersonationLevel.Impersonation;
_client.ClientCredentials.Windows.ClientCredential = credentials;
}
Then, the code which actually generates the report can be updated to remove all of the TaskCompletionSource plumbing, which actually simplifies it a great deal:
public async Task<byte[]> RenderReport(string reportPath, string reportFormat, ParameterValue[] parameterValues)
{
await _client.LoadReportAsync(null, reportPath, null);
await _client.SetExecutionParametersAsync(null, null, parameterValues, "en-gb");
var deviceInfo = @"<DeviceInfo><Toolbar>False</Toolbar></DeviceInfo>";
var request = new RenderRequest(null, null, reportFormat, deviceInfo);
var response = await _client.RenderAsync(request);
return response.Result;
}
You can then do whatever you like with the byte[], like return it in an IActionResult or load it into a MemoryStream and write it to disk as the file.
Recently we realised that we had quite a few applications being deployed through Octopus Deploy, and that we had a number of Environments, and a number of Channels, and that managing the ports being used in Dev/QA/UAT across different servers/channels was becoming… problematic.
When looking at this problem, it’s immediately clear that you need some way of dynamically allocating a port number on each deployment. This blog post from Paul Stovell shows the way, using a custom Powershell build step.
As we’d lost track of what sites were using what ports, and as we also have ad-hoc websites in IIS that aren’t managed by Octopus Deploy, we thought that asking IIS “Hey, what ports are the sites you know about using?” might be a way forward. We also had the additional requirement that some of our servers might have arbitrary services using a port too, so we could bump into a situation where the chosen port was already being used by a non-IIS application/website.
Researching the first situation, it’s quickly apparent that you can do this in Powershell, using the Webadministration module. Based on the answers to this question on Stackoverflow, we came up with this:
Import-Module Webadministration
function Get-IIS-Used-Ports()
{
$Websites = Get-ChildItem IIS:\Sites
$ports = foreach($Site in $Websites)
{
$Binding = $Site.bindings
[string]$BindingInfo = $Binding.Collection
[string]$Port = $BindingInfo.SubString($BindingInfo.IndexOf(":")+1,$BindingInfo.LastIndexOf(":")-$BindingInfo.IndexOf(":")-1)
$Port -as [int]
}
return $ports
}
To get the list of ports on a machine that are not being used is also fairly straightforward in Powershell:
function Get-Free-Ports()
{
$availablePorts = 49000..65000
$usedPorts = @(Get-NetTCPConnection | Select -ExpandProperty LocalPort | Sort -Descending | Where { $_ -ge 49000})
$unusedPorts = foreach($possiblePort in $availablePorts)
{
$unused = $possiblePort -notin $usedPorts
if($unused)
{
$possiblePort
}
}
return $unusedPorts
}
With those two functions in hand, you can work out what free ports are available to be used as the ‘next port’ on a server. It’s worth pointing out that if a site in IIS is stopped, then IIS won’t allow that port to be used in another website (in IIS), but the port also doesn’t show up as a used port in netstat -a, which is kind of what Get-NetTCPConnection does.
function Get-Next-Port()
{
$iisUsedPorts = Get-IIS-Used-Ports
$freePorts = Get-Free-Ports
$port = $freePorts | Where-Object { $iisUsedPorts -notcontains $_} | Sort-Object | Select-Object -First 1
Set-OctopusVariable -Name "Port" -Value "$port"
}
Then you just have to call it at the end of the script:
Get-Next-Port
You’d also want to have various Write-Host or other logging messages so that you get some useful output in the build step when you’re running it.
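Stripped of the IIS and Octopus specifics, the selection logic is just "lowest port in the range that neither list mentions". Here is a minimal sketch of that in Python; the function name, the parameters and the default range are assumptions for illustration, with the range borrowed from the script above.

```python
def next_free_port(iis_ports, tcp_ports, low=49000, high=65000):
    """Return the lowest port in [low, high] used by neither IIS nor any TCP connection."""
    taken = set(iis_ports) | set(tcp_ports)
    for port in range(low, high + 1):
        if port not in taken:
            return port
    return None  # the whole range is in use


print(next_free_port(iis_ports=[49000, 49002], tcp_ports=[49001]))
```

Keeping the IIS ports as a separate input matters because, as noted above, a stopped site still reserves its port in IIS without showing up among the active TCP connections.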
If you found this because you have a build server which is ‘offline’, without any external internet access because of reasons, and you can’t get your build to work because dotnet fails to restore the tool you require for your build process because of said lack of external internet access, then this is for you.
In hindsight, this may be obvious for most people, but it wasn’t for me, so here it is.
In this situation, you just need to shy away from local tools completely, because as of yet, I’ve been unable to find any way of telling dotnet not to try to restore them, and they fail every build.
Instead, I’ve installed the tool(s) as a global tool, in a specific folder, e.g. C:\dotnet-tools, which I’ve then added to the system path on the server. You may need to restart the build server for it to pick up the changes to the environment variable.
One challenge that remains is how to ensure the dotnet tools are consistent on both the developer machine, and the build server. I leave that as an exercise for the reader.
I’m leaving this here so I can find it again easily.
We had a problem updating the Visual Studio 2019 Build Tools on a server, after updating an already existing offline layout.
I won’t go into that here, because it’s covered extensively on Microsoft’s Documentation website.
The installation kept failing, even when using --noweb. It turns out that when your server is completely cut off from the internet, as was the case here, you also need to pass --noUpdateInstaller.
This is because (so it would seem) even though --noweb correctly tells the installer to use the offline cache, it doesn’t prevent the installer from trying to update itself, which will obviously fail in a totally disconnected environment.
Since a technical breakdown of how Betsy does texture compression was posted, I wanted to lay out how the compressors in Convection Texture Tools (CVTT) work, as well as provide some context of what CVTT's objectives are in the first place to explain some of the technical decisions.
First off, while I am very happy with how CVTT has turned out, and while it's definitely a production-quality texture compressor, providing the best compressor possible for a production environment has not been its primary goal. Its primary goal is to experiment with compression techniques to improve the state of the art, particularly finding inexpensive ways to hit high quality targets.
A common theme that wound up manifesting in most of CVTT's design is that encoding decisions are either guided by informed decisions, i.e. models that relate to the problem being solved, or are exhaustive. Very little of it is done by random or random-like searching. Much of what CVTT exists to experiment with is figuring out techniques which amount to making those informed decisions.
Anyway, CVTT's ParallelMath module is kind of the foundation that everything else is built on. Much of its design is motivated by SIMD instruction set quirks, and a desire to maintain compatibility with older instruction sets like SSE2 without sacrificing too much.
Part of that compatibility effort is that most of CVTT's ops use a UInt15 type. The reason for UInt15 is to handle architectures (like SSE2!) that don't support unsigned compares, min, or max, which means performing those operations on a 16-bit number requires flipping the high bit on both operands. For any number where we know the high bit is zero for both operands, that flip is unnecessary - and a huge number of operations in CVTT fit in 15 bits.
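To illustrate the high-bit flip, here is a small scalar sketch in Python (not CVTT code; the function names are made up for the example). It emulates having only a signed 16-bit compare, shows why a plain signed compare gives the wrong answer for unsigned values, and shows why values known to fit in 15 bits can skip the flip entirely.

```python
def to_signed16(x):
    """Reinterpret a 16-bit pattern as a signed integer, as a signed compare would."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def unsigned_less_than(a, b):
    """Unsigned 16-bit compare built from a signed compare by flipping the high bit."""
    return to_signed16(a ^ 0x8000) < to_signed16(b ^ 0x8000)


def uint15_less_than(a, b):
    """Both high bits are known to be zero, so the signed compare is already correct."""
    return to_signed16(a) < to_signed16(b)


# A plain signed compare gets 0x7FFF < 0x8000 wrong, the flipped version does not.
print(to_signed16(0x7FFF) < to_signed16(0x8000), unsigned_less_than(0x7FFF, 0x8000))
```

The two XORs are the extra work per compare, min, or max that a UInt15 type avoids.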
The compare flag types are basically vector booleans, where either all bits are 1 or all bits are 0 for a given lane - There's one type for 16-bit ints, and one for 32-bit floats, and they have to be converted since they're different widths. Those are combined with several utility functions, some of which, like SelectOrZero and NotConditionalSet, can elide a few operations.
The RoundForScope type is a nifty dual-use piece of code. SSE rounding modes are determined by the CSR register, not per-op, so RoundForScope when targeting SSE will set the CSR, and then reset it in its destructor. For other architectures, including the scalar target, the TYPE of the RoundForScope passed in is what determines the operation, so the same code works whether the rounding is per-op or per-scope.
While the ParallelMath architecture has been very resistant to bugs for the most part, where it has run into bugs, they've mostly been due to improper use of AnySet or AllSet - Cases where parallel code can behave improperly because lanes where the condition should exclude it are still executing, and need to be manually filtered out using conditionals.
by OneEightHundred (noreply@blogger.com) at 2021-01-03 23:21
The plan to post a play-by-play for dev kind of fell apart as I preferred to focus on just doing the work, but the Windows port was a success.
If you want some highlights:
The SDL variant ("AerofoilSDL") is also basically done, with a new OpenGL ES 2 rendering backend and SDL sound backend for improved portability. The lead version on Windows still uses D3D11 and XAudio2 though.
Unfortunately, I'm still looking for someone to assist with the macOS port, which is made more difficult by the fact that Apple discontinued OpenGL, so I can't really provide a working renderer for it any more. (Aerofoil's renderer is actually slightly complicated, mostly due to postprocessing.)
In the meantime, the Android port is under way! The game is fully playable so far, most of the work has to do with redoing the UI for touchscreens. The in-game controls use corner taps for rubber bands and battery/helium, but it's a bit awkward if you're trying to use the battery while moving left due to the taps being on the same side of the screen.
Most of the cases where you NEED to use the battery, you're facing right, so this was kind of a tactical decision, but there are some screens (like "Grease is on TV") where it'd be really nice if it was more usable facing left.
I'm also adding a "source export" feature: The source code package will be bundled with the app, and you can just use the source export feature to save the source code to your documents directory. That is, once I figure out how to save to the documents directory, which is apparently very complicated...
Anyway, I'm working on getting this into the Google Play Store too. There might be some APKs posted to GitHub as pre-releases, but there may (if I can figure out how it works) be some Internal Testing releases via GPS. If you want to opt in to the GPS tests, shoot an e-mail to codedeposit.gps@gmail.com
Maybe, but there are two obstacles:
The game is GPL-licensed and there have reportedly been problems with Apple removing GPL-licensed apps from the App Store, and it may not be possible to comply with it. I've heard there is now a way to push apps to your personal device via Xcode with only an Apple ID, which might make satisfying some of the requirements easier, but I don't know.
Second, as with the macOS version, someone would need to do the port. I don't have a Mac, so I don't have Xcode, so I can't do it.
by OneEightHundred (noreply@blogger.com) at 2020-10-20 11:09
by OneEightHundred (noreply@blogger.com) at 2019-11-23 20:43
by OneEightHundred (noreply@blogger.com) at 2019-10-10 02:03
by OneEightHundred (noreply@blogger.com) at 2019-09-06 00:47
by OneEightHundred (noreply@blogger.com) at 2018-03-30 05:26
terVec3 lb = ti->points[1] - ti->points[0];
terVec3 lc = ti->points[2] - ti->points[0];
terVec2 lbt = ti->texCoords[1] - ti->texCoords[0];
terVec2 lct = ti->texCoords[2] - ti->texCoords[0];
// Generate local space for the triangle plane
terVec3 localX = lb.Normalize2();
terVec3 localZ = lb.Cross(lc).Normalize2();
terVec3 localY = localX.Cross(localZ).Normalize2();
// Determine X/Y vectors in local space
float plbx = lb.DotProduct(localX);
terVec2 plc = terVec2(lc.DotProduct(localX), lc.DotProduct(localY));
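// Solve for the S and T texture gradients in the local basis.
// Note: plbx and plc[1] are zero for degenerate triangles, so those need to be skipped beforehand.
terVec2 tsvS, tsvT;
tsvS[0] = lbt[0] / plbx;
tsvS[1] = (lct[0] - tsvS[0]*plc[0]) / plc[1];
tsvT[0] = lbt[1] / plbx;
tsvT[1] = (lct[1] - tsvT[0]*plc[0]) / plc[1];
// Convert the gradients back out of local space and normalize
ti->svec = (localX*tsvS[0] + localY*tsvS[1]).Normalize2();
ti->tvec = (localX*tsvT[0] + localY*tsvT[1]).Normalize2();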
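For what it's worth, here is one way to read the two divisions above, assuming the S coordinate varies linearly across the triangle's plane. The symbols are my own shorthand, not names from the code: $\mathbf{g}_s$ is the S gradient, $a$ and $b$ correspond to `tsvS[0]` and `tsvS[1]`, $\Delta s_b$ and $\Delta s_c$ to `lbt[0]` and `lct[0]`, $p_{bx}$ to `plbx`, and $(p_{cx}, p_{cy})$ to `plc`. Since `localX` is parallel to `lb`, the edge `lb` has no `localY` component, which is why the first equation has only one unknown.

```latex
\begin{aligned}
\mathbf{g}_s &= a\,\hat{\mathbf{x}} + b\,\hat{\mathbf{y}} \\
\Delta s_b &= \mathbf{g}_s \cdot \mathbf{l}_b = a\,p_{bx}
  &&\Rightarrow\quad a = \frac{\Delta s_b}{p_{bx}} \\
\Delta s_c &= \mathbf{g}_s \cdot \mathbf{l}_c = a\,p_{cx} + b\,p_{cy}
  &&\Rightarrow\quad b = \frac{\Delta s_c - a\,p_{cx}}{p_{cy}}
\end{aligned}
```

The T gradient follows the same pattern with `lbt[1]` and `lct[1]` in place of the S deltas.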
by OneEightHundred (noreply@blogger.com) at 2012-01-08 00:23
by OneEightHundred (noreply@blogger.com) at 2011-12-07 18:39
#define SH_AMBIENT_FACTOR (0.25f)
#define SH_LINEAR_FACTOR (0.5f)
#define SH_QUADRATIC_FACTOR (0.3125f)
void LambertDiffuseToSHCoefs(const terVec3 &dir, float out[9])
{
    // Constant
    out[0] = 1.0f * SH_AMBIENT_FACTOR;
    // Linear
    out[1] = dir[1] * SH_LINEAR_FACTOR;
    out[2] = dir[2] * SH_LINEAR_FACTOR;
    out[3] = dir[0] * SH_LINEAR_FACTOR;
    // Quadratics
    out[4] = ( dir[0]*dir[1] ) * 3.0f*SH_QUADRATIC_FACTOR;
    out[5] = ( dir[1]*dir[2] ) * 3.0f*SH_QUADRATIC_FACTOR;
    out[6] = ( 1.5f*( dir[2]*dir[2] ) - 0.5f ) * SH_QUADRATIC_FACTOR;
    out[7] = ( dir[0]*dir[2] ) * 3.0f*SH_QUADRATIC_FACTOR;
    out[8] = 0.5f*( dir[0]*dir[0] - dir[1]*dir[1] ) * 3.0f*SH_QUADRATIC_FACTOR;
}
void RotateCoefsByMatrix(float outCoefs[9], const float pIn[9], const terMat3x3 &rMat)
{
    // DC
    outCoefs[0] = pIn[0];
    // Linear
    outCoefs[1] = rMat[1][0]*pIn[3] + rMat[1][1]*pIn[1] + rMat[1][2]*pIn[2];
    outCoefs[2] = rMat[2][0]*pIn[3] + rMat[2][1]*pIn[1] + rMat[2][2]*pIn[2];
    outCoefs[3] = rMat[0][0]*pIn[3] + rMat[0][1]*pIn[1] + rMat[0][2]*pIn[2];
    // Quadratics
    outCoefs[4] = (
        ( rMat[0][0]*rMat[1][1] + rMat[0][1]*rMat[1][0] ) * ( pIn[4] )
        + ( rMat[0][1]*rMat[1][2] + rMat[0][2]*rMat[1][1] ) * ( pIn[5] )
        + ( rMat[0][2]*rMat[1][0] + rMat[0][0]*rMat[1][2] ) * ( pIn[7] )
        + ( rMat[0][0]*rMat[1][0] ) * ( pIn[8] )
        + ( rMat[0][1]*rMat[1][1] ) * ( -pIn[8] )
        + ( rMat[0][2]*rMat[1][2] ) * ( 3.0f*pIn[6] )
    );
    outCoefs[5] = (
        ( rMat[1][0]*rMat[2][1] + rMat[1][1]*rMat[2][0] ) * ( pIn[4] )
        + ( rMat[1][1]*rMat[2][2] + rMat[1][2]*rMat[2][1] ) * ( pIn[5] )
        + ( rMat[1][2]*rMat[2][0] + rMat[1][0]*rMat[2][2] ) * ( pIn[7] )
        + ( rMat[1][0]*rMat[2][0] ) * ( pIn[8] )
        + ( rMat[1][1]*rMat[2][1] ) * ( -pIn[8] )
        + ( rMat[1][2]*rMat[2][2] ) * ( 3.0f*pIn[6] )
    );
    outCoefs[6] = (
        ( rMat[2][1]*rMat[2][0] ) * ( pIn[4] )
        + ( rMat[2][2]*rMat[2][1] ) * ( pIn[5] )
        + ( rMat[2][0]*rMat[2][2] ) * ( pIn[7] )
        + 0.5f*( rMat[2][0]*rMat[2][0] ) * ( pIn[8] )
        + 0.5f*( rMat[2][1]*rMat[2][1] ) * ( -pIn[8] )
        + 1.5f*( rMat[2][2]*rMat[2][2] ) * ( pIn[6] )
        - 0.5f * ( pIn[6] )
    );
    outCoefs[7] = (
        ( rMat[0][0]*rMat[2][1] + rMat[0][1]*rMat[2][0] ) * ( pIn[4] )
        + ( rMat[0][1]*rMat[2][2] + rMat[0][2]*rMat[2][1] ) * ( pIn[5] )
        + ( rMat[0][2]*rMat[2][0] + rMat[0][0]*rMat[2][2] ) * ( pIn[7] )
        + ( rMat[0][0]*rMat[2][0] ) * ( pIn[8] )
        + ( rMat[0][1]*rMat[2][1] ) * ( -pIn[8] )
        + ( rMat[0][2]*rMat[2][2] ) * ( 3.0f*pIn[6] )
    );
    outCoefs[8] = (
        ( rMat[0][1]*rMat[0][0] - rMat[1][1]*rMat[1][0] ) * ( pIn[4] )
        + ( rMat[0][2]*rMat[0][1] - rMat[1][2]*rMat[1][1] ) * ( pIn[5] )
        + ( rMat[0][0]*rMat[0][2] - rMat[1][0]*rMat[1][2] ) * ( pIn[7] )
        + 0.5f*( rMat[0][0]*rMat[0][0] - rMat[1][0]*rMat[1][0] ) * ( pIn[8] )
        + 0.5f*( rMat[0][1]*rMat[0][1] - rMat[1][1]*rMat[1][1] ) * ( -pIn[8] )
        + 0.5f*( rMat[0][2]*rMat[0][2] - rMat[1][2]*rMat[1][2] ) * ( 3.0f*pIn[6] )
    );
}
float3 SampleSHQuadratic(float3 dir, float3 shVector[9])
{
    float3 ds1 = dir.xyz*dir.xyz;
    float3 ds2 = dir*dir.yzx; // xy, zy, xz
    float3 v = shVector[0];
    v += dir.y * shVector[1];
    v += dir.z * shVector[2];
    v += dir.x * shVector[3];
    v += ds2.x * shVector[4];
    v += ds2.y * shVector[5];
    v += (ds1.z * 1.5 - 0.5) * shVector[6];
    v += ds2.z * shVector[7];
    v += (ds1.x - ds1.y) * 0.5 * shVector[8];
    return v;
}
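If you want to sanity-check coefficients on the CPU, a scalar port of the shader function is handy. The following is only a sketch: `Vec3`, `LambertCoefs` and `SampleSHQuadraticScalar` are hypothetical stand-ins (for `terVec3`, `LambertDiffuseToSHCoefs` and a single-channel `SampleSHQuadratic`), and it assumes the lobe direction passed in is unit length.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for terVec3; only the three components are needed.
struct Vec3 { float x, y, z; };

// Same values as SH_AMBIENT_FACTOR, SH_LINEAR_FACTOR and SH_QUADRATIC_FACTOR.
const float kAmbient = 0.25f;
const float kLinear = 0.5f;
const float kQuadratic = 0.3125f;

// Scalar copy of LambertDiffuseToSHCoefs for a lobe pointing along dir.
void LambertCoefs(const Vec3 &dir, float out[9])
{
    // Assumption: dir is normalized.
    assert(std::fabs(dir.x*dir.x + dir.y*dir.y + dir.z*dir.z - 1.0f) < 1e-3f);
    out[0] = kAmbient;
    out[1] = dir.y * kLinear;
    out[2] = dir.z * kLinear;
    out[3] = dir.x * kLinear;
    out[4] = ( dir.x*dir.y ) * 3.0f*kQuadratic;
    out[5] = ( dir.y*dir.z ) * 3.0f*kQuadratic;
    out[6] = ( 1.5f*( dir.z*dir.z ) - 0.5f ) * kQuadratic;
    out[7] = ( dir.x*dir.z ) * 3.0f*kQuadratic;
    out[8] = 0.5f*( dir.x*dir.x - dir.y*dir.y ) * 3.0f*kQuadratic;
}

// Single-channel CPU port of SampleSHQuadratic, same coefficient order.
float SampleSHQuadraticScalar(const Vec3 &dir, const float sh[9])
{
    float v = sh[0];
    v += dir.y * sh[1];
    v += dir.z * sh[2];
    v += dir.x * sh[3];
    v += dir.x*dir.y * sh[4];
    v += dir.y*dir.z * sh[5];
    v += (dir.z*dir.z * 1.5f - 0.5f) * sh[6];
    v += dir.x*dir.z * sh[7];
    v += (dir.x*dir.x - dir.y*dir.y) * 0.5f * sh[8];
    return v;
}
```

With a lobe pointing along +Z, sampling at +Z gives 1.0625, sampling at +X gives 0.09375, and sampling at -Z gives 0.0625. An exact clamped cosine would give 1, 0 and 0, so the difference is what you'd expect from cutting the series off at the quadratic band.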
void SHForDirection(const terVec3 &dir, float out[9])
{
    // Constant
    out[0] = 1.0f;
    // Linear
    out[1] = dir[1] * 3.0f;
    out[2] = dir[2] * 3.0f;
    out[3] = dir[0] * 3.0f;
    // Quadratics
    out[4] = ( dir[0]*dir[1] ) * 15.0f;
    out[5] = ( dir[1]*dir[2] ) * 15.0f;
    out[6] = ( 1.5f*( dir[2]*dir[2] ) - 0.5f ) * 5.0f;
    out[7] = ( dir[0]*dir[2] ) * 15.0f;
    out[8] = 0.5f*( dir[0]*dir[0] - dir[1]*dir[1] ) * 15.0f;
}
terVec3 RandomDirection(int (*randomFunc)(), int randMax)
{
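    // Uniform height in [-1, 1]. This assumes randomFunc() returns values in [0, randMax - 1];
    // if it can return randMax itself, u can exceed 1 and the sqrtf below goes NaN.
    float u = (((float)randomFunc()) / (float)(randMax - 1))*2.0f - 1.0f;
    // Radius of the horizontal circle at that height
    float n = sqrtf(1.0f - u*u);
    // Uniform angle around the Z axis
    float theta = 2.0f * M_PI * (((float)randomFunc()) / (float)(randMax));
    return terVec3(n * cos(theta), n * sin(theta), u);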
}
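Here is a rough sketch of how the last two functions can be combined: average a function times the `SHForDirection` weights over random directions to get its nine coefficients. None of this is from the original code beyond the weights themselves. `Vec3` is a hypothetical stand-in for `terVec3`, `ProjectClampedCosine` is a made-up name, and I've swapped the function-pointer RNG for `std::mt19937` with a fixed seed so the result is repeatable.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical stand-in for terVec3.
struct Vec3 { float x, y, z; };

// Same weights as SHForDirection above, written against Vec3.
void SHForDirection(const Vec3 &d, float out[9])
{
    out[0] = 1.0f;
    out[1] = d.y * 3.0f;
    out[2] = d.z * 3.0f;
    out[3] = d.x * 3.0f;
    out[4] = ( d.x*d.y ) * 15.0f;
    out[5] = ( d.y*d.z ) * 15.0f;
    out[6] = ( 1.5f*( d.z*d.z ) - 0.5f ) * 5.0f;
    out[7] = ( d.x*d.z ) * 15.0f;
    out[8] = 0.5f*( d.x*d.x - d.y*d.y ) * 15.0f;
}

// Uniform direction on the unit sphere: uniform height, uniform angle.
// Same construction as RandomDirection, but drawing from std::mt19937.
Vec3 RandomDirection(std::mt19937 &rng)
{
    const double kPi = 3.14159265358979323846;
    const double kInvRange = 1.0 / 4294967296.0; // mt19937 yields 32-bit values
    double u = (double)rng() * kInvRange * 2.0 - 1.0;  // in [-1, 1)
    double n = std::sqrt(1.0 - u*u);
    double theta = 2.0 * kPi * ((double)rng() * kInvRange);
    return Vec3{ (float)(n * std::cos(theta)), (float)(n * std::sin(theta)), (float)u };
}

// Monte Carlo projection of max(0, dot(d, normal)) onto the nine coefficients.
// normal is assumed to be unit length.
void ProjectClampedCosine(const Vec3 &normal, int sampleCount, float out[9])
{
    assert(sampleCount > 0);
    double sum[9] = { 0.0 };
    std::mt19937 rng(12345u); // fixed seed so runs are repeatable
    for (int i = 0; i < sampleCount; ++i)
    {
        Vec3 d = RandomDirection(rng);
        float cosine = d.x*normal.x + d.y*normal.y + d.z*normal.z;
        if (cosine <= 0.0f)
            continue; // back-facing samples contribute zero
        float weights[9];
        SHForDirection(d, weights);
        for (int j = 0; j < 9; ++j)
            sum[j] += cosine * weights[j];
    }
    // Averaging over all samples, including the skipped ones
    for (int j = 0; j < 9; ++j)
        out[j] = (float)(sum[j] / sampleCount);
}
```

Calling `ProjectClampedCosine(Vec3{0, 0, 1}, 200000, out)` should land close to 0.25, 0.5 and 0.3125 in `out[0]`, `out[2]` and `out[6]`, with the rest near zero. Those are the same three constants used by `LambertDiffuseToSHCoefs`, up to sampling noise.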
by OneEightHundred (noreply@blogger.com) at 2011-12-02 12:22