[quote author=Nightmist link=topic=983.msg5089#msg5089 date=1183540520]
1. Brute Force every single possible mix of actions for the first 3 rounds (1-3).
1(a). In the event of expected death before 3 rounds then stop that particular branch to save computing time.
1(b). Remove any entries that extend for over 3 rounds.
1(c). If all combat strats extends for over 3 rounds then throw a abort error?
[/quote]
Glad to see someone taking a closer look.

That's pretty close. It has a couple of special initial actions (steal, Entangling Noodles) which it treats as special, so it really kind of checks 1+3 actions deep.
Just like a KoLmafia custom combat script, the simulator (and, later, the strategy executor) will loop on the last action until the battle ends with victory or defeat. So it doesn't actually need it to cause death within 3 rounds -- it just needs to settle down so that the last action can be repeated to beat the monster. For example, if you have high moxie but a low-damage attack, the script could find "Disco Face-Stab, attack" as a strategy even if it takes 15 rounds of attacks to whittle away the monster's HP.
Pruning the search tree is very important to keep it running fast. While it's searching it keeps track of the lowest-cost winning strategy that it's found. As it's simulating it may cut a simulation short (and prune that whole branch of the search tree) if it's clear that everything in the branch will be more expensive than that lowest cost. That's when the "getting too expensive" debug message is shown. For example, if it knows "Entangling Noodles, Stream of Sauce, Stream of Sauce" at 9MP is a working strategy, then it will not bother to explore "Weapon of the Pastalord" plus anything because it can see that the MP cost of WotP immediately makes that whole branch more expensive.
The pruning is enhanced by sorting the actions by MP cost, and iterating over them from lowest-cost to highest-cost. If it considered the actions in random order with a simple foreach, it would still get the same answer, but it'd start out finding a poor strategy at first and then refine it to better strategies later. So it would prune less and ultimately search a lot more of the tree, which means it'd be a lot slower.
[quote author=Nightmist link=topic=983.msg5089#msg5089 date=1183540520]
2. Find the Meat cost for the remaining mix of actions (actions that cause death of monster within 3 turns).
2(a). Meat cost is based off expected HP/MP losses.
3. Find the lowest Meat cost of using these actions.
3(a i). In the event of tied costs then the shortest path is taken.
3(a ii). In the event they are tied and of the same cost then the first found "winning" strat is kept and the new one is dropped.
4. Use the first action of the winning strat in a single round of combat.
5. Take the response from the action taken then update the strat accordingly (if set to update the strat mid-battle).
5(a). If not set to update mid-battle then run the next action.
5(b). The updated search for the strat works for the next 3 rounds (2-4) and work under the same assumptions as 1 and is effectively a loop of 1-5.
[/quote]
Yup. So at every round it's looking 3 actions deep ... however, since it will loop the last action it can actually go more than 3 rounds deep in the simulation. I've tried having it search 4 actions deep but it didn't really help in most situations. It might make some sense for Disco Bandit combos, but otherwise looping on the last action is usually sufficient.
For the second and later rounds, it already knows its old strategy is probably pretty good. So it compares every candidate against that strategy's cost, which again usually leads to a lot of pruning and a fast search.
The code isn't super-clean everywhere -- there's a lot of it, and there are some vestiges of different approaches that I tried and abandoned scattered in there. It's kind of overdue for a total rewrite at some point once the NS13 dust has settled.