What is Tigress?

Tigress is a virtualizer for the C language that supports many novel defenses, both static and dynamic, against well-known de-virtualization attacks. In addition to the virtualization transformation, Tigress contains a collection of traditional obfuscating transformations such as control-flow flattening, opaque predicate insertion, and function merging and splitting. These are used to make the generated interpreters stealthier, more diverse, and more resilient to attack.

Design. Tigress is a source-to-source transformer built in OCaml on top of the CIL infrastructure.

This has multiple advantages: Tigress supports all of the C99 language, including gcc extensions; the transformed code can be easily examined, which is useful in a pedagogical setting; and Tigress' output, once compiled and stripped of symbols, becomes a good target for reverse engineering and de-virtualization exercises. Tigress' design is similar to that of commercial tools, such as Cloakware/IRDETO's C/C++ Transcoder, although the set of transformations we support is, obviously, much more limited.

Diversity. Tigress is designed such that, from a single source program, it is possible to generate large numbers of highly diversified variants. This diversity is both static and dynamic, i.e. two variants will differ both in their machine code and in the resulting instruction traces. In essence, every decision Tigress makes is dependent on a randomization seed, controllable by the user. In contrast to previous implementations, Tigress goes to great lengths to provide as many variants of each transformation as possible. For example, our flattening transformation supports three kinds of dispatch, can optionally split basic blocks, and can use two different kinds of opaque predicates to encode the next variable. The user interacts with Tigress by giving an input C file, a seed, and a sequence of transformations.

Applications. A tool such as Tigress has many potential applications:

  1. Tigress was originally designed as the backend of a system for distributed application tamper detection via continuous software updates. The idea was to force rapid updates to the code running on an untrusted remote site in order to increase the workload of the attacker who has to crack, and re-crack, the code as it is constantly updated.
  2. We are currently using Tigress for studies into diversity.
  3. We are planning to use Tigress to generate collections of software protection benchmark programs. These will provide the community with much-needed attack targets that, hopefully, will allow us to devise uniform and generally accepted evaluation procedures for software protection algorithms.
  4. In particular, we are hoping future de-virtualization research projects will use Tigress-generated interpreters as one of their attack targets, allowing us to further explore the virtualizer/de-virtualizer cat-and-mouse game.

Education. Tigress is also useful as an educational tool. For example, we are currently using Tigress to generate reverse engineering exam/challenge problems for the students in a course we're teaching: we first use Tigress to generate a unique random program for every student in the class, then transform the program using some appropriate combination of obfuscations, and finally give the resulting program to students as a cracking target. The difficulty of the challenge can be easily varied by picking different sequences of transformations, and, since diversity guarantees that every program instance is unique, cheating is made more difficult.

Future. Tigress is under active development and we continue to add new features to the virtualizer. A further goal is to make Tigress the first freely available C language obfuscator to support a large collection of classic obfuscating and tamperproofing transformations, the way that SandMark did for Java. The absence of a general tool for experimentation into the security and performance of software protection algorithms for binary code has severely hampered progress in the area, and we hope Tigress will fill this void.


 

Function Virtualization

This transformation turns a function into an interpreter, whose bytecode language is specialized for this function. The transformation has been designed to induce as much diversity as possible, i.e. every decision made is dependent on the randomization seed. The diversity is both static and dynamic, i.e. each interpreter variant differs in the structure of its code as well as in its execution pattern.

Design. For this transformation, Tigress first constructs a type-annotated abstract syntax tree (AST) from the C source, from which it generates control-flow graphs of instruction trees. Tigress then selects a random instruction set architecture (ISA) and, using this ISA, generates a bytecode program specialized for the input function. Finally, Tigress selects a random dispatch method and produces an output program.

Static diversity. Tigress supports two mechanisms for generating ISAs with a high degree of static diversity: instructions can pass arguments in arbitrary combinations of stack locations and registers, and instructions can be arbitrarily long (with highly complex semantics) through the use of superoperators.

Dynamic diversity. We ensure that dynamic execution patterns are diversified by merging randomized bogus functions with the "real" function. We can furthermore impede dynamic analysis by making instruction traces artificially long.

Static stealth. Not only diversity but also stealth is important for interpreters. For static stealth, the split transformation can break up the interpreter loop into smaller pieces, and the AddOpaque transformation can make instruction handlers less conspicuous.

Dynamic stealth. For dynamic stealth, Tigress interpreters can be made reentrant, meaning only a few iterations of the dispatch loop are executed at a time, effectively mixing instructions executed from the interpreter with instructions executed by the rest of the program. This is of particular interest when wanting to hide the execution pattern from analysts, and when the exact time that the function executes is not important, as long as it completes eventually.

Generating Interpreters

To generate an interpreter, you give the --Transform=Virtualize option. The options below are available to control the kind of interpreter that gets generated.

Option  Arguments  Description
--Transform Virtualize Turn a function into an interpreter.
--VirtualizeShortIdents bool Generate shorter identifiers to produce interpreters suitable for publication. Default=false.
--VirtualizeIsWindows bool Set this to true if you're on Windows rather than a Unix system. Currently only relevant when generating bogus functions.
--VirtualizeDispatch switch, direct, indirect, call, ifnest, linear, binary, interpolation, ? Select the interpreter's dispatch method. Default=switch.
  • switch = dispatch by while(){switch(next){...}}
  • direct = dispatch by direct threading
  • indirect = dispatch by indirect threading
  • call = dispatch by call threading
  • ifnest = dispatch by nested if-statements
  • linear = dispatch by searching a table using linear search
  • binary = dispatch by searching a table using binary search
  • interpolation = dispatch by searching a table using interpolation search
  • ? = Pick a random dispatch method
--VirtualizeOperands stack, registers, mixed, ? Type of operands to allow in the ISA. Default=stack.
  • stack = use only stack arguments to instructions
  • registers = use only register arguments to instructions
  • mixed = same as stack,registers
  • ? = select an argument at random.
--VirtualizeMaxDuplicateOps INTSPEC Number of ADD instructions, for example, with different signatures. Default=0.
--VirtualizeRandomOps bool Should opcodes be randomized, or go from 0..n? Default=true.
--VirtualizeSuperOpsRatio Float>0.0 Desired number of super operators. Default=0.0.
--VirtualizeMaxMergeLength INTSPEC Longest sequence of instructions to be merged into one. Default=0.
--VirtualizeMaxOpaque INTSPEC Number of opaques to add to each instruction handler. Default=0.
--VirtualizeNumberOfBogusFuns INTSPEC Weave the execution of random functions into the execution of the original program. This makes certain kinds of pattern-based dynamic analysis more difficult. Default=0.
--VirtualizeBogusFunKinds trivial, arithSeq, collatz, * The kind of bogus function to generate. Comma-separated list. Default=arithSeq,collatz.
  • trivial = insert a trivial computation
  • arithSeq = insert a simple arithmetic loop
  • collatz = insert a computation of the Collatz sequence
  • * = select all options
--VirtualizeBogusLoopKinds trivial, arithSeq, collatz, * Insert a bogus loop for each instruction list. This will extend the length of the trace, making dynamic analysis more difficult. Default=collatz.
  • trivial = insert a trivial computation
  • arithSeq = insert a simple arithmetic loop
  • collatz = insert a computation of the Collatz sequence
  • * = select all options
--VirtualizeBogusLoopIterations INTSPEC Adjust this value to balance performance and trace length. Default=0.
--VirtualizeReentrant bool Make the function reentrant. Default=false.
--VirtualizeOptimizeBody BOOLSPEC Clean up after superoperator generation by optimizing the body of the generated function. Default=false.
--VirtualizeOptimizeTreeCode BOOLSPEC Do constant folding etc. prior to interpreter generation. Default=false.
--VirtualizeTrace bool Insert tracing code to show the stack and the virtual instructions executing. Default=false.
--VirtualizeComment bool Insert comments in the generated interpreter. Default=false.
--VirtualizeDump tree, ISA, instrs, types, vars, strings, calls, bytes, array, stack, * Dump internal data structures used by the virtualizer. Comma-separated list. Default=dump nothing.
  • tree = dump the expression trees generated from the CIL representation
  • ISA = dump the Instruction Set Architecture
  • instrs = dump the generated virtual instructions
  • types = dump the types found
  • vars = dump the local variables found
  • strings = dump the strings found
  • calls = dump the function calls found
  • bytes = dump the bytecode array
  • array = dump the instruction array
  • stack = dump the evaluation stack
  • * = select all options

Dispatch Method Selection

For both static and dynamic diversity, Tigress supports eight different dispatch methods. The following code is generated for the different methods, where Ξop1; is the instruction handler for operator op1:

Dispatch  Generated code
switch
switch(prog[pc]) {
   case op1: Ξop1; break;
   case op2: Ξop2; break;
}
direct
goto *prog[pc];
op1hdl: Ξop1; goto *prog[pc];
op2hdl: Ξop2; goto *prog[pc];
indirect
goto *jtab[prog[pc]];
op1hdl: Ξop1; goto *jtab[prog[pc]];
op2hdl: Ξop2; goto *jtab[prog[pc]];
call
void op1fun(){Ξop1}
void op2fun(){Ξop2}
…
call *prog[pc]();
ifnest
if (prog[pc]==op1) Ξop1
else if (prog[pc]==op2) Ξop2
else if …
linear, binary, interpolation
alg = linear|binary|interpolation|…
top: 
   goto *(searchalg(map,prog[pc]));
op1hdl: Ξop1; goto top;
op2hdl: Ξop2; goto top;

Note

Several dispatch methods make use of the labels-as-values extension supported by gcc and clang. For other compilers, only the switch and ifnest dispatch methods should be used.
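
The two portable dispatch methods can be illustrated with a small hand-written interpreter. This is only a sketch: the opcodes (OP_PUSH, OP_ADD, OP_MUL, OP_HALT) and the functions run_switch and run_ifnest are hypothetical names for a toy stack machine, not the ISA or the code that Tigress generates, and the fixed-size stack is not bounds-checked.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine; this is not Tigress's ISA. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* "switch" dispatch: a loop around switch(prog[pc]). */
int run_switch(const int *prog) {
    int stack[16];
    int sp = 0;        /* index of the next free stack slot */
    size_t pc = 0;     /* virtual program counter */
    for (;;) {
        switch (prog[pc]) {
        case OP_PUSH: stack[sp++] = prog[pc + 1]; pc += 2; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; pc += 1; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; pc += 1; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1;   /* unknown opcode */
        }
    }
}

/* "ifnest" dispatch: the same handlers, selected by nested if-statements. */
int run_ifnest(const int *prog) {
    int stack[16];
    int sp = 0;
    size_t pc = 0;
    for (;;) {
        int op = prog[pc];
        if (op == OP_PUSH)      { stack[sp++] = prog[pc + 1]; pc += 2; }
        else if (op == OP_ADD)  { sp--; stack[sp - 1] += stack[sp]; pc += 1; }
        else if (op == OP_MUL)  { sp--; stack[sp - 1] *= stack[sp]; pc += 1; }
        else if (op == OP_HALT) { return stack[sp - 1]; }
        else                    { return -1; }
    }
}
```

Both functions run the same bytecode and differ only in how the handler is selected, which is the property the dispatch option varies.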

Instruction Set Architecture Generation

Instruction sets can use stacks, registers, or both to pass values between instructions. By default, the following, very simple, instruction set is used:

  labels:         l ∈ Labels 
  functions:      f ∈ Funs 
  variables:      x ∈ Vars 
  strings:        s ∈ Strings 
  temporaries:     t ::= regint | stackint  
  binary operators: binop ::= add | sub | …
  unary operators:  unop ::= uminus | neg | …
  types:           τ ::= int | float | … | void *
  literals:        λ ::= intlit | floatlit | …
  instructions: e ::=  
       t ← constant τ λ
     | t ← local  x
     | t ← global  x
     | t ← formal  x
     | t ← string  s
     | t ← binary  τ  binop t t
     | t ← unary  τ  unop t
     | t ← convert  τ τ t
     | t ← ternary  τ t t t
     | t ← load  τ t
     | store τ t t
     | t ← memcpy  t t int
     | call  f
     | x, x, … ← asm  s  t, t, …
     | indirectCall  t
     | return  τ t
     | goto  l
     | t ← addrOfLabel  l
     | indirectGoto  t
     | branchIfTrue  t  l 
     | switch  τ t  λ  λ  l ⟨l, l, …⟩ 
     | merged  ⟨ e, e, …⟩ 

However, a high degree of diversity can be achieved from the way instructions communicate with each other, through values stored on the stack or passed in virtual registers. Tigress can generate instructions that use any combination of registers and stack storage for the inputs they read or the output they produce.

Tigress can induce further diversity by merging instructions into superoperators. New, merged instructions can have almost arbitrarily complex semantics, involving multiple arithmetic operations and operations on both the stack and virtual registers. For more information on superoperators, see Optimizing an ANSI C interpreter with superoperators by Todd Proebsting. The complex semantics of instructions generated by superoperators make manual analysis of generated interpreters, such as discussed by Rolles in Unpacking virtualization obfuscators, difficult.
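
To make the difference between stack operands, register operands, and superoperators concrete, here is a minimal hand-written sketch. The opcode names and the function run are hypothetical and chosen for illustration; they are not the instructions Tigress generates. The same addition is encoded three ways, and run reports how many dispatches each encoding needs.

```c
/* Hypothetical toy ISA mixing stack and register operands. */
enum { PUSH, ADD_STA, LOADI_REG, ADD_REG, PUSH_REG, PUSH_PUSH_ADD, HALT };

/* Runs prog and stores the number of dispatch-loop iterations in *dispatches. */
int run(const int *prog, int *dispatches) {
    int stack[16];
    int regs[4] = {0};
    int sp = 0, pc = 0;
    *dispatches = 0;
    for (;;) {
        (*dispatches)++;
        switch (prog[pc]) {
        case PUSH:        /* literal -> stack */
            stack[sp++] = prog[pc + 1]; pc += 2; break;
        case ADD_STA:     /* both operands and the result on the stack */
            sp--; stack[sp - 1] += stack[sp]; pc += 1; break;
        case LOADI_REG:   /* literal -> register */
            regs[prog[pc + 1]] = prog[pc + 2]; pc += 3; break;
        case ADD_REG:     /* both operands and the result in registers */
            regs[prog[pc + 1]] = regs[prog[pc + 2]] + regs[prog[pc + 3]];
            pc += 4; break;
        case PUSH_REG:    /* register -> stack */
            stack[sp++] = regs[prog[pc + 1]]; pc += 2; break;
        case PUSH_PUSH_ADD: /* superoperator: PUSH; PUSH; ADD_STA merged */
            stack[sp++] = prog[pc + 1] + prog[pc + 2]; pc += 3; break;
        case HALT:
            return stack[sp - 1];
        default:
            return -1;    /* unknown opcode */
        }
    }
}
```

With the programs {PUSH, 40, PUSH, 2, ADD_STA, HALT} and {PUSH_PUSH_ADD, 40, 2, HALT}, the result is the same but the merged form needs fewer dispatches and leaves a different trace.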

Examples

Consider setting --VirtualizeMaxDuplicateOps=2 and --VirtualizeOperands=mixed, resulting in two store-int instructions, one that takes both arguments in registers, and one that takes one argument on the stack and the other in a register. Tigress will choose between them randomly. Here are the corresponding instruction handlers:

case _0__store_int$left_REG_0$right_REG_1: 
   (_0__pc[0]) ++;
   *((int *)_0__regs[0][*((int *)_0__pc[0])]._void_star) = _0__regs[0][*((int *)(_0__pc[0] + 4))]._int;
   _0__pc[0] += 8;
   break;

case _0__store_int$right_STA_0$left_REG_0: 
   (_0__pc[0]) ++;
   *((int *)_0__regs[0][*((int *)_0__pc[0])]._void_star) = _0__stack[0][_0__sp[0] + 0]._int;
   (_0__sp[0]) --;
   _0__pc[0] += 4;
   break;

Consider next setting --VirtualizeSuperOpsRatio=2.0 and --VirtualizeMaxMergeLength=10, resulting in virtual instructions with highly complex semantics. Here is the instruction handler for one such instruction, made up by merging 10 primitive instructions:

case _0__local$result_STA_0$value_LIT_0__\
   convert_void_star2void_star$left_STA_0$result_REG_0__\
   load_int$result_REG_0$left_REG_1__\
   local$result_STA_0$value_LIT_0__\
   convert_void_star2void_star$left_STA_0$result_REG_0__\
   store_int$left_REG_0$right_REG_1__\
   local$result_REG_0$value_LIT_1__\
   local$result_STA_0$value_LIT_0__\
   convert_void_star2void_star$left_STA_0$result_REG_0__\
   load_int$result_STA_0$left_REG_0: 
    (_0__pc[0]) ++;
    _0__regs[0][*((int *)(_0__pc[0] + 4))]._void_star = (void *)(_0__locals + *((int *)_0__pc[0]));
    _0__regs[0][*((int *)(_0__pc[0] + 8))]._int = *((int *)_0__regs[0][*((int *)(_0__pc[0] + 12))]._void_star);
    _0__regs[0][*((int *)(_0__pc[0] + 20))]._void_star = (void *)(_0__locals + *((int *)(_0__pc[0] + 16)));
    *((int *)_0__regs[0][*((int *)(_0__pc[0] + 24))]._void_star) = _0__regs[0][*((int *)(_0__pc[0] + 28))]._int;
    _0__regs[0][*((int *)(_0__pc[0] + 32))]._void_star = (void *)(_0__locals + *((int *)(_0__pc[0] + 36)));
    _0__regs[0][*((int *)(_0__pc[0] + 44))]._void_star = (void *)(_0__locals + *((int *)(_0__pc[0] + 40)));
    _0__stack[0][_0__sp[0] + 1]._int = *((int *)_0__regs[0][*((int *)(_0__pc[0] + 48))]._void_star);
    (_0__sp[0]) ++;
    _0__pc[0] += 52;
    break;

Note that the instruction name really is almost 400 characters long; the backslashes are here only for display purposes! Also note that the instruction itself is 53 bytes long, almost as long as the longest VAX instruction (EMODH, 54 bytes) and much longer than the longest x86 instruction (15 bytes).

Instruction Handler Obfuscation

Add opaques etc. to the generated interpreter. This is useful to break up the instruction handlers and the dispatch logic, making them less conspicuous.

Bogus Functions

Generate bogus functions that are virtualized along with the "real" function. Instructions from the bogus and real function are executed cyclically and in sequence, i.e. first an instruction from the real function, then one from bogus function number 1, then one from bogus function number 2, etc., and then the process repeats with an instruction from the real function. The purpose is to frustrate dynamic analyses that try to locate the virtual program counter.
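
The cyclic execution described above can be sketched by hand with one bogus function. The names (struct vm, step, run_interleaved) and the toy opcodes are assumptions made for this illustration; restarting the bogus function when it halts is also an assumption, made so that it keeps contributing instructions until the real function finishes.

```c
/* Hypothetical opcodes for a toy stack machine; not Tigress's ISA. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Per-function interpreter state: each function has its own virtual pc. */
struct vm {
    const int *prog;
    int pc;
    int sp;
    int stack[8];   /* not bounds-checked in this sketch */
    int halted;
};

/* Execute exactly one virtual instruction of machine m. */
static void step(struct vm *m) {
    switch (m->prog[m->pc]) {
    case OP_PUSH: m->stack[m->sp++] = m->prog[m->pc + 1]; m->pc += 2; break;
    case OP_ADD:  m->sp--; m->stack[m->sp - 1] += m->stack[m->sp]; m->pc += 1; break;
    default:      m->halted = 1; break;   /* OP_HALT or unknown opcode */
    }
}

/* Alternate between the real and the bogus function, one instruction each. */
int run_interleaved(const int *real, const int *bogus) {
    struct vm r = { real,  0, 0, {0}, 0 };
    struct vm b = { bogus, 0, 0, {0}, 0 };
    while (!r.halted) {
        step(&r);                 /* one instruction from the real function */
        if (b.halted) {           /* restart the bogus function (assumption) */
            b.pc = 0; b.sp = 0; b.halted = 0;
        }
        step(&b);                 /* one instruction from the bogus function */
    }
    return r.stack[r.sp - 1];     /* only the real result is used */
}
```

Because two program counters advance in the same loop, an analysis that looks for a single monotonically updated variable sees two candidates.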

Bogus Loops

Add random computations to every iteration of the dispatch loop. Use this to frustrate dynamic analysis by

  1. inserting bogus instructions between consecutive iterations of the dispatch loop, thereby making the dispatch harder to recognize;
  2. making traces longer and thereby harder to store and analyze.
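
As a rough illustration of the idea, the sketch below places a bounded Collatz computation in front of every dispatch of a toy interpreter. All names here (collatz_steps, bogus_sink, run_with_bogus_loop) are hypothetical, and the placement and seeding of the bogus loop are assumptions rather than what Tigress emits.

```c
/* Hypothetical opcodes for a toy stack machine; not Tigress's ISA. */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Count Collatz steps from n down to 1, giving up after max_iter steps. */
static unsigned collatz_steps(unsigned n, unsigned max_iter) {
    unsigned steps = 0;
    while (n > 1 && steps < max_iter) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}

/* volatile so the compiler cannot discard the bogus work. */
static volatile unsigned bogus_sink;

int run_with_bogus_loop(const int *prog, unsigned iterations) {
    int stack[16];
    int sp = 0, pc = 0;
    for (;;) {
        /* Bogus loop: lengthens the trace, does not affect the result. */
        bogus_sink += collatz_steps((unsigned)pc + 7u, iterations);
        switch (prog[pc]) {
        case OP_PUSH: stack[sp++] = prog[pc + 1]; pc += 2; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; pc += 1; break;
        case OP_HALT: return stack[sp - 1];
        default:      return -1;   /* unknown opcode */
        }
    }
}
```

The iterations argument plays the role of the performance versus trace length trade-off: 0 disables the extra work, larger values add more instructions per dispatch.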

Reentrant Interpreters

Make interpreters that can execute a few instructions, return, and later resume to execute a few more instructions, until, eventually, they terminate. This is particularly useful when it is not important exactly when a piece of code executes, as long as it executes eventually, and where the stealthiness of the computations is paramount.

You must prepare your code in the following ways:

  1. The function you want to virtualize must have an argument int* operation. It can occur anywhere among the formal parameters:

    void foo(int* operation, int n, int* result) {…}
    
  2. The first time foo gets called, operation must be <0, and you must pass actual arguments to foo that it will use throughout the computation:

    int operation = -10; 
    foo(&operation,n,&result);
    

    "-10" here means to initialize foo and execute 10 instructions.

  3. Sprinkle calls to foo throughout your program, making sure that operation>0:

    operation = 10;
    foo(&operation,bogus1,&bogus2);  
    

    Here you can pass whatever arguments you want to foo, they won't be used. Rather, the ones that were passed in the first call will be used throughout. "10" here means to resume foo and execute 10 instructions.

  4. You can check if foo has terminated by testing the value of operation after the call:

    operation = 10;
    foo(&operation,bogus1,&bogus2);  
    if (operation > 0) {
       /* we're done! */
    } else if (operation < 0) {
       /* more work to do! */
    }
    
  5. If you want to make sure that foo has terminated --- because you really want its result at a particular point --- set operation to a large enough value:

    operation = 1000;
    foo(&operation,bogus1,&bogus2);  
    
  6. Additional calls to foo once termination has been reached are safe; no additional instructions will be executed.

  7. If you want to call foo to compute a new value, call it again with operation<0:

       int operation = -10; 
       foo(&operation,n,&result);
    

Notes

Our current implementation doesn't handle function results, so make sure your function is void, and returns the result in a global or in a formal parameter.

To ensure termination you can

  1. experiment yourself with how many iterations are necessary to finish the computation;
  2. make sure that the last call to foo is passed a huge value to 'operation';
  3. put the last call to foo in a loop:
       foo(&operation);   
       while (operation < 0) {
          /* some other computation here */
          operation = 10;
          foo(&operation);   
       } 
       /* result is available here */
    

It is a good idea to combine reentrant interpreters with superoperators. Superoperators produce long instructions that perform more work during each iteration, and as a result the number of dispatches (i.e. loop iterations) is reduced. In other words, if you want to frustrate dynamic analysis that looks for evidence of the dispatch loop in the instruction trace, superoperators combined with reentrant interpreters will reduce the presence of such artifacts.
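
The calling protocol can be tried out without Tigress by writing a function that follows it by hand. The sketch below is such a stand-in, not generated code: foo sums 1..n, treats each loop step as one "instruction", and keeps its state in static variables. The state layout and the exact values written back to operation (1 for done, -1 for more work) are assumptions; the text above only fixes their signs.

```c
/* Hand-written stand-in for a reentrant interpreter (hypothetical). */
void foo(int *operation, int n, int *result) {
    static int i, limit, acc;
    static int done = 1;        /* nothing to do until initialized */
    static int *out;
    int budget;

    if (*operation < 0) {       /* initialize and remember the arguments */
        i = 1; limit = n; acc = 0; out = result; done = 0;
        budget = -*operation;
    } else {                    /* resume: n and result are ignored */
        budget = *operation;
    }

    while (!done && budget > 0) {
        if (i > limit) { *out = acc; done = 1; }   /* final "instruction" */
        else           { acc += i; i++; }
        budget--;
    }

    *operation = done ? 1 : -1; /* >0: terminated, <0: more work to do */
}
```

A caller would start with operation = -2 and real arguments, then set operation = 2 before each later call, passing bogus arguments, until operation comes back positive.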


 

Control-Flow Flattening

This is a classic control-flow transformation that removes structured flow. Similar to the virtualization transformation, we support several kinds of "dispatch," i.e. how the next block is selected.

Option  Arguments  Description
--Transform Flatten Flatten a function using Chenxi Wang's algorithm
--FlattenDispatch switch, goto, indirect, ? Dispatch method. Default=switch.
  • switch = dispatch by while(1) {switch (next) {blocks}}
  • goto = dispatch by {labl1: block1; goto block2;}
  • indirect = dispatch by goto* (jtab[next])
  • ? = select a dispatch method at random.
--FlattenObfuscateNext BOOLSPEC Whether the dispatch variable should be obfuscated with opaque expressions or not. Default=true.
--FlattenOpaqueStructs list, array, * Type of opaque predicate to use. Traditionally, for this transformation, array is used. Default=array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--FlattenSplitBasicBlocks BOOLSPEC If true, then basic blocks (sequences of assignment and call statements without intervening branches) will be split up into individual blocks. If false, they will be kept intact. Default=true.
--FlattenTrace bool Print a message before each block gets executed. Useful for debugging. Default=false.

For more information, see Chenxi Wang's thesis.
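
For readers who have not seen flattening before, here is a small function next to a hand-flattened equivalent using switch dispatch. This is an illustrative sketch only; the function names and block numbering are invented, and the dispatch variable next is left unobfuscated, unlike the default output described above.

```c
/* Original, structured control flow. */
int sum_to(int n) {
    int s = 0;
    for (int i = 1; i <= n; i++)
        s += i;
    return s;
}

/* Hand-flattened equivalent: every basic block becomes a case, and the
   variable next selects which block runs in the following iteration. */
int sum_to_flat(int n) {
    int s = 0, i = 0;
    int next = 0;
    while (1) {
        switch (next) {
        case 0: s = 0; i = 1;             next = 1; break;  /* loop init */
        case 1: next = (i <= n) ? 2 : 3;            break;  /* loop test */
        case 2: s += i; i++;              next = 1; break;  /* loop body */
        case 3: return s;                                   /* exit      */
        default: return -1;               /* not reached in this sketch */
        }
    }
}
```

The loop structure is no longer visible in the control-flow graph: all blocks are siblings under one switch, and the order of execution is encoded only in the values assigned to next.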


 

Function Splitting

Outline pieces of a function into their own functions. This transformation is useful, for example, to break a large, virtualized, function into smaller, less conspicuous, pieces. Four different splitting methods are supported. The order in which they are tried can affect the naturalness of the resulting code.

Option  Arguments  Description
--Transform Split Outline pieces of a function
--SplitKinds top, block, deep, recursive Comma-separated list specifying the order in which different split methods are attempted. Default=top,block,deep,recursive.
  • top = split the top-level list of statements into two functions funcname_split_1 and funcname_split_2.
  • block = split a basic block (list of assignment and call statements) into two functions.
  • deep = split out a nested control structure of height>2 into its own function funcname_split_1.
  • recursive = same as block, but calls to split functions are also allowed to be split out.
--SplitCount INTSPEC How many times to attempt the split. Default=1.
--SplitName string If set, the split out functions will be named prefix_name_number, otherwise they will be named prefix_originalName_split_number.

Example

This command first tries to split function foo at most 100 times, then applies the block split transformation to the resulting outlined functions. Note the use of a regular expression to specify the names of the functions that were generated in the first transformation:

tigress  \
   --Transform=Split --Seed=0 --SplitKinds=deep,block,top --SplitCount=100 --Functions=foo \
   --Transform=Split --Seed=0 --SplitKinds=block --SplitCount=100 --Functions=/.\*foo_split.\*/ \
   --out=foo prog.c
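
To show what a top split amounts to, the following sketch outlines the two halves of a function body by hand. The function stats and the way locals are passed by pointer are assumptions made for the example; only the funcname_split_N naming pattern is taken from the option table above.

```c
/* Original function: two top-level loops over the same array (n >= 1). */
int stats(const int *a, int n) {
    int sum = 0, max = a[0];
    for (int i = 0; i < n; i++) sum += a[i];
    for (int i = 1; i < n; i++) if (a[i] > max) max = a[i];
    return sum + max;
}

/* Hand-written "top" split: each half of the statement list is outlined.
   Locals that live across the split are passed by pointer (assumption). */
static void stats_split_1(const int *a, int n, int *sum) {
    for (int i = 0; i < n; i++) *sum += a[i];
}

static void stats_split_2(const int *a, int n, int *max) {
    for (int i = 1; i < n; i++) if (a[i] > *max) *max = a[i];
}

int stats_after(const int *a, int n) {
    int sum = 0, max = a[0];
    stats_split_1(a, n, &sum);
    stats_split_2(a, n, &max);
    return sum + max;
}
```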

 

Function Merging

Merge multiple functions into one. An extra formal argument is added to allow call sites to call any of the functions. This transformation is useful as a precursor to virtualization: if you want to virtualize both foo and bar, first merge them together, then virtualize the result.

The transformation merges the argument list and the local variables of the functions, thereby tying them together.

It is a good idea to run a RndArgs transformation after this one to hide the obvious extra argument that's been added to the function.

There are several ways to merge. In a simple merge, the function bodies are simply put in an if-nest. This is simplistic, of course, but sufficient if you are going to, say, virtualize the merged function. If you set --MergeFlatten=true then constituent functions are first flattened, then the resulting blocks are merged together, and finally a dispatch method is added (switch, goto, or indirect, selected by --MergeFlattenDispatch).

Option  Arguments  Description
--Transform Merge Merge of two or more functions. Two different types of merge are supported: simple merge (if () function1 else if () function2 else ...) and flatten merge, where the functions are first flattened, and then the resulting blocks are woven together. This transformation modifies the signature of the function (an extra formal selector argument is added that selects between the constituent functions at runtime), and this cannot be done for functions whose address is taken. --Functions=\* merges together all functions in the program whose signatures can be changed, --Functions=%50 merges together about half of them, etc. It is a good idea to follow this transform by a RndArgs transform to hide the extra selector argument.
--MergeName string If set, the merged function will be named prefix_name, otherwise it will be named prefix_originalName1_originalName2. Note that it's unpredictable which function will be the first and the second, so it's better to set the merged name explicitly.
--MergeObfuscateSelect BOOLSPEC Whether the extra parameter passed to the merged function should be obfuscated with opaque expressions or not. Default=true.
--MergeOpaqueStructs list, array, * Type of opaque predicate to use. Traditionally, for this transformation, array is used. Default=array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--MergeFlatten BOOLSPEC Whether to flatten before merging or not. Default=true.
--MergeFlattenDispatch switch, goto, indirect, ? Dispatch method used for flattened merge. Default=switch.
  • switch = dispatch by while(1) {switch (next) {blocks}}
  • goto = dispatch by {labl1: block1; goto block2;}
  • indirect = dispatch by goto* (jtab[next])
  • ? = select a dispatch method at random.

Notes

The merged function is named

   prefix ^ fun1 ^ "_" ^ fun2  ^ "_" ^ ...

where ^ is concatenation.
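
A simple (if-nest) merge can be pictured as follows. This is a hand-written sketch with invented function names and an invented selector encoding; it shows the merged argument list and the extra selector argument, but not the obfuscation of the selector that --MergeObfuscateSelect adds.

```c
/* Two original functions. */
int add(int a, int b) { return a + b; }
int negate(int x)     { return -x; }

/* Hand-written simple merge: the argument lists are concatenated and an
   extra selector argument picks the constituent body at runtime. */
int merged_add_negate(int selector, int a, int b, int x) {
    if (selector == 0) {
        return a + b;       /* body of add */
    } else if (selector == 1) {
        return -x;          /* body of negate */
    }
    return 0;               /* unreachable for valid selectors */
}

/* Call sites are rewritten to pass the selector plus dummy values
   for the arguments that belong to the other function. */
int call_add(int a, int b) { return merged_add_negate(0, a, b, 0); }
int call_negate(int x)     { return merged_add_negate(1, 0, 0, x); }
```

The constant selector at each call site is what makes the extra argument conspicuous, which is why following up with RndArgs is suggested above.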


 

Control-Flow Splitting by Opaque Predicate Insertion

Break up code blocks by inserting opaque predicates. Requires that at least one --Transform=InitOpaque option and, preferably, one or more --Transform=UpdateOpaque options have been given previously.

Option  Arguments  Description
--Transform AddOpaque Add opaque predicates to split up control-flow.
--AddOpaqueCount INTSPEC How many opaques to add to each function. Default=1.
--AddOpaqueKinds call, bug, true, junk, fake, * Comma-separated list of the types of insertions of bogus computation allowed. Default=call,bug,true,junk.
  • call = if (false) RandomFunction()
  • bug = if (false) BuggyStatement else RealStatement
  • true = if (true) RealStatement
  • junk = if (false) asm(".byte random bytes")
  • fake = if (false) NonExistingFunction()
  • * = Turns all options on.

This is the code generated for the argument options to --AddOpaqueKinds:

Argument Generated code
call
if expr=false then
   call to random existing function
         
fake
if expr=false then
   call to non-existing function
         
true
if expr=true then
   existing statement
         
bug
if expr=true then
   existing statement
else
   buggified version of the statement
        
junk
if expr=false then
   asm(".byte RandomBytes")
         

Notes

fake will result in undefined symbols being generated. You need to coerce the linker to ignore such errors. With gcc you can use this option:

   -Wl,--unresolved-symbols=ignore-in-object-files 

No similar option seems to exist for clang.
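
The true and bug insertion kinds can be sketched by hand as shown below. Note the substitution: Tigress builds its opaque expressions from the linked-list and array structures set up by InitOpaque, whereas this example swaps in a simple arithmetic identity (the product of two consecutive integers is even) so that it is self-contained. The function names are hypothetical.

```c
/* Opaquely true predicate: x*(x+1) is always even. Unsigned wraparound
   is reduction modulo a power of two, which preserves parity.
   This identity is a stand-in for Tigress's list/array-based opaques. */
static int opaque_true(unsigned x) {
    return (x * (x + 1u)) % 2u == 0u;
}

/* Original statement. */
int scale_plain(int v) {
    return v * 3;
}

/* Same computation after hand-applied "bug" and "true" style insertions.
   x is any value available at runtime; it never changes the result. */
int scale_opaque(int v, unsigned x) {
    int r;
    if (opaque_true(x)) {
        r = v * 3;          /* existing statement */
    } else {
        r = v * 3 + 1;      /* buggified version, never executed */
    }
    if (opaque_true(x + 1u)) {
        r = r + 0;          /* "true" kind: real path always taken */
    }
    return r;
}
```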


 

Function Argument Randomization

Randomize the order of arguments to a function, and optionally add extra bogus arguments. Useful to run after the --Transform=Merge transform (to hide the extra selector argument) or the --Transform=EncodeLiterals --EncodeLiteralsKinds=string transform (to hide the otherwise obvious signature of the generated string encoder function).

Option  Arguments  Description
--Transform RndArgs Randomize the order of arguments to a function and add extra bogus arguments.
--RndArgsBogusNo INTSPEC Number of bogus arguments to add. Default=0.

Issues

Doesn't work with functions with varargs.

Doesn't work for functions whose address is taken and then called through a function pointer.
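
The effect on a function signature and its call sites can be pictured with a hand-written before/after pair. The function, the chosen permutation, and the bogus argument values are all invented for illustration; they correspond to what a setting such as --RndArgsBogusNo=2 might produce, not to actual output.

```c
/* Original function. */
int area_minus(int w, int h, int cut) {
    return w * h - cut;
}

/* After a hypothetical argument randomization: the order is permuted
   and two bogus arguments are added. */
int area_minus_rnd(int bogus1, int h, int cut, int bogus2, int w) {
    (void)bogus1;           /* bogus arguments are never used */
    (void)bogus2;
    return w * h - cut;
}

/* Every direct call site must be rewritten to match the new signature. */
int call_site(int w, int h, int cut) {
    return area_minus_rnd(17, h, cut, -3, w);
}
```

The need to rewrite every call site is the reason the transformation cannot handle functions that are called through a function pointer.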


 

Encode Literals

Replace integer and/or string literals (such as 42 or "42") with opaque expressions. Requires that at least one --Transform=InitOpaque option and, preferably, one or more --Transform=UpdateOpaque options have been given previously.

Note that the generated string encoding function is trivial, by design. It should itself be transformed, for example using the Virtualize transformation.

Option  Arguments  Description
--Transform EncodeLiterals Replace literal integers and strings with less obvious expressions.
--EncodeLiteralsKinds integer, string, * Specify the types of literals to encode. Default=integer,string.
  • integer = Replace literal integers with opaque expressions
  • string = Replace literal strings with calls to a function that generates them
  • * = Same as integer,string
--EncodeLiteralsEncoderName string The name of the generated encoder function (only for encoded strings). Default=None.
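
As a rough picture of both literal kinds, the sketch below replaces the integer 42 with an expression whose extra term is always zero, and replaces two string literals with calls to an encoder function that writes them character by character. The names encoded_42 and string_encoder, the index-based interface, and the arithmetic identity are assumptions for illustration; Tigress derives its opaque expressions from the structures created by InitOpaque.

```c
/* Integer literal 42 hidden behind an opaque expression. x is any runtime
   value; x*(x+1) is always even, so the first term is always 0.
   The identity is a stand-in for Tigress's opaque expressions. */
int encoded_42(unsigned x) {
    return (int)(((x * (x + 1u)) % 2u) + 42u);
}

/* Hypothetical string encoder: string number n is written into buf
   character by character, so no literal appears in the data section.
   buf must have room for at least 3 bytes in this sketch. */
void string_encoder(int n, char *buf) {
    switch (n) {
    case 0:  buf[0] = 'h'; buf[1] = 'i'; buf[2] = '\0'; break;
    case 1:  buf[0] = '4'; buf[1] = '2'; buf[2] = '\0'; break;
    default: buf[0] = '\0'; break;
    }
}
```

An encoder of this shape is easy to recognize, which matches the remark above that it should itself be transformed further.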


 

Encoding Branches

Branch Functions

This transformation implements a simplistic version of Linn and Debray's Obfuscation of Executable Code to Improve Resistance to Static Disassembly. Linn and Debray's algorithm replaces direct jumps with calls to a special branch function which sets the return address to the target of the original branch, and then returns.

The generated code looks like this, where the call to the branch function bf actually results in a direct jump to lab2:

void bf(unsigned long offset) {
  __asm__  volatile   ("addq  %0, 8(%%rbp)": : "r" (offset));
}

int main() {
   bf((unsigned long)(&& lab2) - (unsigned long)(&& lab3));
   lab3: 
       __asm__  volatile   (".byte 0x76,0x9b,0x8e,0x1b,0x4d":);
   ...
   lab2: ...;
}

By default, a function is flattened prior to direct jumps being replaced by calls to a branch function (turn this off with --EncodeBranchesFlatten=false). This creates more direct jumps and hence more opportunities to apply the branch function transformation.

Before branches can be replaced by calls to a branch function, at least one such function needs to be constructed, using the --Transform=InitBranchFuns transformation:

Option  Arguments  Description
--Transform InitBranchFuns Create branch functions.
--InitBranchFunsOpaqueStructs list, array, * Comma-separated list of the kinds of opaque constructs to use for branch functions. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--InitBranchFunsCount INTSPEC How many branch functions to create. Default=1.
--InitBranchFunsObfuscate BOOLSPEC Obfuscate the branch function. Default=true.

The branch function is not obfuscated and hence trivial to find. It's therefore a good idea to merge it with other functions in the program.

Our implementation of branch functions doesn't use perfect hash tables, as suggested in Linn and Debray's paper, since this is hard to do as a source-to-source transformation. Rather, we simply pass the offset to jump to as an argument to the branch function.

There are many attacks published on branch functions, including Static Disassembly of Obfuscated Binaries by Christopher Kruegel, William Robertson, Fredrik Valeur and Giovanni Vigna, and Deobfuscation: Reverse engineering obfuscated code by Sharath Udupa, Saumya Debray, and Matias Madou.

X86 Branch Obfuscations

We implement two standard branch obfuscations used by many packers (see Binary-code obfuscations in prevalent packer tools by Kevin A. Roundy and Barton P. Miller):

      push target
      call lab
      ret
lab:
      ret

and

      push target
      ret

Option  Arguments  Description
--Transform EncodeBranches Replace unconditional branches (gotos) with other constructs.
--EncodeBranchesKinds branchFuns, goto2call, goto2push, * Comma-separated list of the kinds of constructs jumps can be replaced with. Default=branchFuns.
  • branchFuns = Generate calls to branch functions. --Transform=InitBranchFuns must be given prior to this transform
  • goto2call = Replace goto L with push L; call lab; ret; lab: ret
  • goto2push = Replace goto L with push L; ret
  • * = Same as branchFuns,goto2call,goto2push
--EncodeBranchesOpaqueStructs list, array, * Comma-separated list of the kinds of opaque constructs to use in a call to a branch function. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--EncodeBranchesObfuscate BOOLSPEC Obfuscate the branch function call. Default=true.
--EncodeBranchesFlatten BOOLSPEC Flatten before replacing jumps. This opens up more opportunities for replacing unconditional branches. Default=true.
--EncodeBranchesReturnAddressOffset integer The offset (in bytes) of the return address on the stack, for branch functions. May differ based on operating system, word size, and compiler. Default=8.

Issues

Our implementation of branch obfuscations has many issues, and should only be used with great care:
  1. It appears that goto2push and goto2call will often cause clang to generate the wrong code.
  2. gcc 4.6 appears to do the right thing.
  3. gcc 4.8 appears to occasionally hang when compiling our generated code.
The issue is that the generated inline assembly code contains jumps. Newer versions of gcc have an asm goto construct which ought to help with this. Clang lacks this feature.

Make sure you set the --Environment=... option appropriately if you are going to use goto2push and goto2call and test the generated code thoroughly. goto2push and goto2call are turned off by default.


 

Encode Arithmetic

Replace integer arithmetic with more complex expressions. Currently, the identities are taken from the book Hacker's Delight. For example, the following identities can be used to encode integer addition:

   x + y = x - ¬ y - 1
         = (x ⊕ y) + 2·(x ∧ y) 
         = (x ∨ y) + (x ∧ y) 
         = 2·(x ∨ y) - (x ⊕ y) 

For example, Tigress might replace

     z = x + y + w
with
  z = (((x ^ y) + ((x & y) << 1)) | w) + 
      (((x ^ y) + ((x & y) << 1)) & w);

Many other encodings are possible, which is good for diversity.
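
The identities above can be checked mechanically. The following is a minimal sketch, not Tigress output: it writes each encoding of addition as a C function over unsigned 32-bit integers (an assumption made here so that wrap-around is well defined), and the function names are made up for illustration.

```c
#include <stdint.h>

/* Four equivalent encodings of x + y (mod 2^32). */
static uint32_t add_neg(uint32_t x, uint32_t y)     { return x - ~y - 1u; }
static uint32_t add_xor_and(uint32_t x, uint32_t y) { return (x ^ y) + ((x & y) << 1); }
static uint32_t add_or_and(uint32_t x, uint32_t y)  { return (x | y) + (x & y); }
static uint32_t add_or_xor(uint32_t x, uint32_t y)  { return ((x | y) << 1) - (x ^ y); }

/* The composed example from the text: z = x + y + w.
   x + y uses the xor/and identity, and the second addition uses or/and. */
static uint32_t add3_encoded(uint32_t x, uint32_t y, uint32_t w) {
    uint32_t t = (x ^ y) + ((x & y) << 1);
    return (t | w) + (t & w);
}
```

Because every function computes the same sum, a tool is free to pick a different one at each addition site, which is where the diversity comes from.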

Option Arguments Description
--Transform EncodeArithmetic Replace integer arithmetic with more complex expressions.
--EncodeArithmeticKinds integer Specify the types to encode. Currently, only integer is available. Default=integer.
  • integer = Replace integer arithmetic.


 

Encode Data

Encode integer variables so that they have a non-standard data representation. The goal is for a variable's real value (and the values of intermediate expressions used to compute it) to never be revealed, until it is printed or otherwise escapes the program. For example, an integer variable v could be replaced with:

   v' = a*v + b
where a is a random odd integer and b a random integer.

For example, given this program

int main () {
  int arg1 = ...
  int arg2 = ...
  int a = arg1;
  int b = arg2;
  int x = a*b;
  printf("x=%i\n",x);
}
Tigress might produce the following:
  a = 1789355803 * arg1 + 1391591831;
  b = 1789355803 * arg2 + 1391591831;
  x = ((3537017619 * (a * b) - 3670706997 * a) - 3670706997 * b) + 3171898074;
  printf("x=%i\n", -757949677 * x - 3670706997);
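
The reason a must be odd is that odd numbers are invertible modulo 2^32, so the encoding can be undone exactly. The sketch below illustrates this under the assumption of 32-bit unsigned arithmetic; it reuses the two constants from the example above, but the function names are hypothetical and the code is not what Tigress generates (Tigress expands the products into a single expression, as shown above).

```c
#include <stdint.h>

/* Multiplicative inverse of an odd a modulo 2^32 (Newton iteration). */
static uint32_t inverse_mod_2_32(uint32_t a) {
    uint32_t inv = a;              /* correct to 3 bits, since a*a = 1 (mod 8) */
    for (int i = 0; i < 5; i++)
        inv *= 2u - a * inv;       /* each step doubles the number of correct bits */
    return inv;
}

static const uint32_t A = 1789355803u;  /* odd multiplier from the example above */
static const uint32_t B = 1391591831u;  /* additive constant from the example above */

/* v' = a*v + b, and its inverse v = a^-1 * (v' - b). */
static uint32_t encode(uint32_t v) { return A * v + B; }
static uint32_t decode(uint32_t e) { return inverse_mod_2_32(A) * (e - B); }

/* Multiply two encoded values; the result is the encoding of the product,
   so the cleartext product never appears as an intermediate value. */
static uint32_t mul_encoded(uint32_t ea, uint32_t eb) {
    uint32_t inv = inverse_mod_2_32(A);
    return inv * (ea - B) * (eb - B) + B;
}
```

For instance, decode(mul_encoded(encode(6), encode(7))) yields 42, while the intermediate value handed to decode is still in encoded form.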

A typical invocation of this transformation lists a collection of local variables, formal parameters, and global variables:

   --Transform=EncodeData --GlobalVariables='g1,g2' --LocalVariables='fun1:L1,L2;fun2:L3' --EncodeDataCodecs=poly1

These variables should all be integers, pointers to integers, arrays of integers, or combinations of these. In the example above, g1 may be an int, L1 an int*, L2 an array of ints, and L3 an array of pointers to ints.

This transformation is based on ideas from several Cloakware/IRDETO papers and patents: Compiler-Based Infrastructure for Software-Protection, Information Hiding in Software with Mixed Boolean-Arithmetic Transforms, System and method for obscuring bit-wise and two's complement integer computations in software.

Option Arguments Description
--Transform EncodeData Replace integer variables with a different encoding. Use --GlobalVariables and --LocalVariables to specify the variables that should be transformed. In addition to the variables specified, any other variables that are related through aliasing will be transformed. Only integer variables, arrays of integers, and pointers to integers are currently supported. Avoid structs, since our alias analysis algorithm conflates all fields.
--EncodeDataCodecs poly1, xor, add, * Comma-separated list of the kinds of codecs that may be used. Only poly1 currently makes sense; avoid the others. Default=poly1.
  • poly1 = Linear transformation of the form a*x+b.
  • xor = Exclusive-or with a constant.
  • add = Add a constant and promote to next largest integer type. Will fail for the largest integer type.
  • * = Same as poly1,xor,add


 

Opaque Expressions

Several transformations rely on boolean and integer expressions whose values are known at obfuscation time but are hard for an analyst to deduce; these are known as opaque predicates and opaque expressions. To construct these, data structures with precise invariants are added to the code. See Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection by Collberg and Nagra for more details.

Option Arguments Description
--Transform InitOpaque Add opaque initialization code. This initialization code has to be added to a function that gets called before any uses of opaque predicates, usually, but not necessarily, to main.
--InitOpaqueStructs list, array, * Comma-separated list of the kinds of opaque constructs to add. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--InitOpaqueCount INTSPEC How many opaque data structures (lists or arrays) to add to the program. They will be split roughly evenly between the different declared opaque structures. Default=1.
--InitOpaqueSize INTSPEC Size of opaque arrays. Default=30.

To frustrate analysis, updates that maintain the invariants should be sprinkled throughout the program. This is done by the --Transform=UpdateOpaque option.

Option Arguments Description
--Transform UpdateOpaque Add code that makes updates to opaque predicates.
--UpdateOpaqueCount INTSPEC How many updates to opaque data structures to add to the function. Default=1.
--UpdateOpaqueAllowAddNodes bool Is it safe to malloc new nodes for the opaque data structure in this function? Only set to true if the function is called sparingly. Default=false.
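
To make the idea of an invariant-carrying data structure concrete, here is a minimal sketch of an array-based opaque predicate. The invariant used (every element is congruent to 1 modulo 3) and all names are assumptions chosen for illustration; the invariants Tigress actually generates are not described here and may be quite different.

```c
#include <stddef.h>

#define OPAQUE_SIZE 30   /* mirrors the --InitOpaqueSize default */

static unsigned opaque_array[OPAQUE_SIZE];

/* Establish the invariant: every element is congruent to 1 modulo 3. */
static void opaque_init(void) {
    for (size_t i = 0; i < OPAQUE_SIZE; i++)
        opaque_array[i] = 3u * (unsigned)i + 1u;
}

/* Change the contents without breaking the invariant. Adding a multiple
   of 3 and reducing modulo 3000000 (itself a multiple of 3) both preserve
   the residue, and the reduction keeps the value from wrapping around. */
static void opaque_update(size_t i, unsigned k) {
    unsigned *p = &opaque_array[i % OPAQUE_SIZE];
    *p = (*p + 3u * (k % 1000u)) % 3000000u;
}

/* Opaquely true predicate: holds for every index while the invariant holds. */
static int opaque_true(size_t i) {
    return opaque_array[i % OPAQUE_SIZE] % 3u == 1u;
}

/* Opaque expression whose value is always 1. */
static unsigned opaque_one(size_t i) {
    return opaque_array[i % OPAQUE_SIZE] % 3u;
}
```

In this sketch, opaque_init plays the role of the initialization code added by InitOpaque, and calls to opaque_update correspond to the updates that UpdateOpaque scatters through a function. An analyzer has to prove the invariant across all updates before it can fold opaque_true to a constant.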

Notes

Tigress will generate copious numbers of extra local variables and statements of the form _*__BARRIER_* = 1, _*__BEGIN_* = 1, _*__END_* = 1. They will be removed by any competent compiler, or by the --Transform=CleanUp --CleanUpKinds=annotations transformation.


 

Collecting Entropy

Some transformations need a source of randomness during execution. For this reason, we can insert statements that collect random values, preferably from variables that are input dependent.

At a minimum, you should issue the --Transform=InitEntropy transformation, since this creates the variables that hold the entropy:

Option Arguments Description
--Transform InitEntropy Add initialization of the entropy variables.

You should issue as many --Transform=UpdateEntropy as you can, making sure you collect entropy from variables that are truly input dependent:

Option Arguments Description
--Transform UpdateEntropy Add updates to the entropy variables.
--UpdateEntropyVar IDENTSPEC Add to the entropy variables from these variables. Default=*.

Example

This command initializes the entropy variables in main, and then collects randomness from variables x,y,z in function inputData, from variable packet in function acceptNetworkPacket, and from all variables in function random:

tigress \
   --Transform=InitEntropy --Functions=main \
   --Transform=UpdateEntropy --Functions=inputData --UpdateEntropyVar=x,y,z \
   --Transform=UpdateEntropy --Functions=acceptNetworkPacket --UpdateEntropyVar=packet \
   --Transform=UpdateEntropy --Functions=random --UpdateEntropyVar=\* \
   --out=foo prog.c


 

Download

Tigress is currently not open-source, but is available for binary download (see the FAQ for the reasoning behind this).

Version Mac OS X Linux Release Notes
Unstable Mac OS X 10.9, x86/64 Linux, x86/64 Release notes
1.3 Mac OS X 10.9, x86/64 Linux, x86/64 Release notes
1.2 Mac OS X 10.9, x86/64 Linux, x86/64 Release notes
1.1 Mac OS X 10.9, x86/64 Linux, x86/64 Release notes
1.0 Mac OS X 10.9, x86/64 Linux, x86/64
0.9 Mac OS X 10.9, x86/64 Linux, x86/64
Examples examples.zip


 

Controlling Tigress

To apply a sequence of transformations, Tigress is invoked like this, where OBFTYPE is the name of the obfuscation and IDENTSPEC is one or more functions to which it should be applied:

   tigress --out=OUTFILE.c \
              --Transform=OBFTYPE --Functions=IDENTSPEC [EXTRA_OPTS...] \
              --Transform=OBFTYPE --Functions=IDENTSPEC [EXTRA_OPTS...] \
                         ....
              --Transform=OBFTYPE --Functions=IDENTSPEC [EXTRA_OPTS...] \
           FILE.c
 

A typical invocation looks like this:

    # The transformations are applied in the order given: InitOpaque on main,
    # then UpdateOpaque on f, then AddOpaque on f. The result is written to
    # x.c; the input file simple1.c comes last.
    tigress \
         --Transform=InitOpaque   --Functions=main \
         --Transform=UpdateOpaque --Functions=f \
         --Transform=AddOpaque    --Functions=f --AddOpaqueCount=2 \
         --out=x.c \
         simple1.c

Note that Tigress accepts exactly one C file as input. If your project has multiple files you must first merge them together into one:

$TIGRESS_HOME/cilly --merge --keepmerged x1.c x3.c x2.c -o merged.o

The merged source will be in the file merged.o_comb.c which can subsequently be passed to Tigress for transformation. See CIL's documentation to learn more about the merging process. Different invocations of the merger may be necessary if your project is more complex, if, for example, you need to pass different options to different files.

Note that options passed through to the compiler have one dash ("-"), while options passed to Tigress start with two ("--").

Top-Level Options

Option Arguments Description
--Environment string A string that describes the architecture, operating system, and compiler being used. We currently recognize the following two strings: x86_64:Linux:Gcc:4.6 and x86_64:Darwin:Clang:5.1. This is mostly necessary because Clang does not support some features (most notably asm goto) that Gcc does. In the future we will use this to provide better support for 32-bit binaries. Default=0.
--out file.c The file to write to.
--Seed INTSPEC The randomization seed. --Seed=0 makes Tigress generate its own seed.
--FilePrefix AUTO, NONE, string Use this if you intend to run tigress multiple times on each file to avoid name clashes. Only set this option once. Default=NONE.
  • AUTO = generate a prefix to add to all symbols
  • NONE = don't add any prefix
  • string = add this prefix
--Verbosity int Tigress' chattiness level. --Verbosity=0 makes Tigress quiet. --Verbosity=1 prints each transformation as it is being applied. Default=0.

Selecting Transformations

Each transformation is specified, at a minimum, by the --Transform option that selects the type of transformation and the --Functions option that selects the function(s) to which it should be applied.

The following transformations are currently available:

Transformation Description
Ident The identity transformation; it does nothing.
Virtualize Turn a function into a specialized interpreter.
Flatten Remove control flow from a function.
Merge Merge two functions into one.
Split Split a function into smaller parts.
InitEntropy Create variables necessary to collect randomness.
UpdateEntropy Collect randomness from input-dependent variables.
RndArgs Reorder function arguments and/or add bogus arguments.
InitOpaque Create types and variables necessary to introduce opaque predicates and expressions.
AddOpaque Split up control flow by adding opaque branches.
UpdateOpaque Update opaque variables to make them harder to analyze.
EncodeLiterals Replace literals by less obvious expressions.
EncodeData Replace integer variables with different representations.
InitBranchFuns Create branch functions.
EncodeBranches Replace direct branches with calls to a branch function.
RandomFuns Generate random functions to be used as targets in cracking exercises.
CleanUp Last transformation to run, to clean up the generated code.
Info Print internal information.

Selecting Transformation Targets

To avoid name clashes and to let you refer to the results of a transformation, prefixes can be added to all new identifiers. For example, after a Split transformation, you may want to perform additional transformations on the newly formed functions, and thus need to know their new names. You can use the --Prefix option for this. Also, if you intend to run Tigress multiple times on the same file (rather than applying all transformations in one run), you need to make sure that new names don't clash with old ones. Use --FilePrefix for this.

Option Arguments Description
--Prefix string Add this prefix to each new generated symbol. This is in addition to the --FilePrefix. You can set this for every transformation. Default=_number_, where number is the order number of the transformation given on the command line.
--Exclude string-list Comma-separated list of the functions to exclude from obfuscation. Useful after a --Functions=* or --Functions=?int option, like this: --Functions=* --Exclude=main
--Functions IDENTSPEC The functions to which the transformation should be applied. See below for how to specify a set of functions.
--GlobalVariables IDENTSPEC The global variables to which the transformation should be applied. Currently only used for the --Transform=EncodeData transformation.
--LocalVariables LOCALSPEC The local variables and formal parameters to which the transformation should be applied. Currently only used for the --Transform=EncodeData transformation.

Thus with the options

   --FilePrefix=AAA_ --Transform=InitOpaque --Prefix=BBB

we would generate symbols of the form

   AAA_BBB_opaque_list1

and with the options

   --FilePrefix=AAA_ --Transform=InitOpaque

they would look like this:

   AAA__0__opaque_Node

Argument Specifications

For options that take an integer as an argument we provide an INTSPEC notation that allows randomized selection of the value. There's a similar BOOLSPEC notation for booleans.

All transformations require you to specify the set of functions to which they should be applied. Trivially, you can say --Functions=foo to apply the obfuscation only to foo, but frequently you need more flexibility than that. Identifier specifications provide this functionality. Some transformations also use identifier specifications to specify variables, as in --UpdateEntropyVar=\* which would select all variables of a function.

Option Arguments Description
INTSPEC ?, int?int, int The INTSPEC notation allows randomized selection of integer valued options.
  • ? = select a 32-bit random number
  • int?int = select a random integer value in the range [int,int]
  • int = select this value
BOOLSPEC ?, true, false The BOOLSPEC notation allows randomized selection of boolean valued options.
  • ? = select a random boolean value
  • true = select true
  • false = select false
IDENTSPEC *, ?int, %int, /regexp/, string Many transformations require you to specify the set of functions to which they should be applied. Trivially, you can say --Functions=foo to apply the obfuscation only to foo, but frequently you need more flexibility than that. The IDENTSPEC notation provides this functionality. Some transformations also use identifier specifications to specify variables, as in --UpdateEntropyVar=\* which would select all variables of a function.
  • * = select all available identifiers
  • ?int = randomly select int number of identifiers
  • %int = randomly select int percent of available identifiers
  • /regexp/ = select the identifiers that match the regular expression
  • string = select this identifier
LOCALSPEC The LOCALSPEC notation is used to specify a set of local variables and formal parameters. For example, --LocalVariables='main:i,j;foo:\*' would select all variables of foo and i and j of main. The notation is a semicolon-separated list of IDENTSPEC:IDENTSPEC.
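
As an illustration of how the INTSPEC notation can be read, the following sketch resolves a specification string to a concrete value. It is not Tigress code: the function names, the xorshift generator, and the restriction to 32-bit bounds are all assumptions made for this example. What it does take from the table above is the three forms "?", "int?int" (an inclusive range), and "int".

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Small xorshift generator so that results depend only on the seed.
   The state must be initialized to a non-zero value. */
static uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Resolve an INTSPEC string ("?", "lo?hi", or "n") to a value.
   Returns 0 on success and -1 if the string is malformed. */
static int resolve_intspec(const char *spec, uint32_t *state, int64_t *out) {
    char *end;
    if (strcmp(spec, "?") == 0) {                 /* "?": a random 32-bit number */
        *out = (int64_t)next_random(state);
        return 0;
    }
    long lo = strtol(spec, &end, 10);
    if (end == spec || lo < INT32_MIN || lo > INT32_MAX) return -1;
    if (*end == '\0') {                           /* "n": exactly this value */
        *out = lo;
        return 0;
    }
    if (*end != '?') return -1;
    const char *rest = end + 1;
    long hi = strtol(rest, &end, 10);
    if (end == rest || *end != '\0' || hi < lo || hi > INT32_MAX) return -1;
    /* "lo?hi": a random value in the inclusive range [lo, hi] */
    uint64_t span = (uint64_t)((int64_t)hi - (int64_t)lo + 1);
    *out = (int64_t)lo + (int64_t)(next_random(state) % span);
    return 0;
}
```

With a fixed seed, a specification such as 2?5 always resolves to the same value in [2,5], which matches the way every randomized decision in Tigress is tied to the --Seed option.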

Examples

Randomly select 3 functions and "foo":

    --Functions=?3,foo   

Add entropy from all variables in function foo:

    --Transform=UpdateEntropy --Functions=foo --UpdateEntropyVar=\*

Split 20% of all functions:

    --Transform=Split --Functions=%20 

Note that some care needs to be exercised when specifying identifiers, since some renaming can happen during obfuscation.

Debugging

Use --Transform=Info to print information about the ongoing transformations. This command can be issued multiple times on the command line to see, for example, how control flow graphs are being transformed.

Option Arguments Description
--Transform Info Print internal information.
--InfoKind cfg, fun, linear, WS, DG, CG, alias, global Information to print. For cfg, fun, and linear use --Functions, as usual, to specify which functions to print.
  • cfg = Control Flow Graph
  • fun = Function in internal format
  • linear = Function in internal linearized block format (used as a starting point for flattening and branch functions)
  • WS = Working Set
  • DG = Dependency Graph
  • CG = Call Graph
  • alias = Print the pointer-graphs
  • global = List of global symbols in the original program.


 

OS/Machine Dependence

Mac OS X weirdness

  1. Include the following at the top of your C file, to get past CIL not properly handling some OS X extensions:

    #ifdef __APPLE__
    #include<Availability.h>
    #undef __OSX_AVAILABLE_STARTING
    #define __OSX_AVAILABLE_STARTING(_mac, _iphone)
    #undef __OSX_AVAILABLE_BUT_DEPRECATED
    #define __OSX_AVAILABLE_BUT_DEPRECATED(_osxIntro, _osxDep, _iosIntro, _iosDep)
    #undef __OSX_AVAILABLE_BUT_DEPRECATED_MSG
    #define __OSX_AVAILABLE_BUT_DEPRECATED_MSG(_osxIntro, _osxDep, _iosIntro, _iosDep, _msg)
    #undef __BLOCKS__
    #endif
    
  2. Compile with

    -fgnu89-inline  
    

    to get past a redeclaration bug in Mac OS X 10.9. For an explanation, see, for example, http://sourceforge.net/p/resil/tickets/6.

  3. Compile with

    -Wno-builtin-requires-header 
    

    to avoid a spurious warning generated by clang.

32-vs-64-bit machine models

By default, we assume you're generating code for the machine on which you run Tigress. If this is not the case, in particular if your target machine has a different word size, you must

  1. set this environment variable with the relevant C type sizes

    CIL_MACHINE="short=2,2 int=4,4 long=4,4 long_long=8,8 pointer=4,4 \
                 alignof_enum=4 float=4,4 double=8,8 long_double=12,12 \
                 void=1 bool=1,1 fun=1,1 alignof_string=1 max_alignment=16 \
                 size_t=unsigned_int wchar_t=int char_signed=true const_string_literals=true \
                 big_endian=false __thread_is_keyword=true __builtin_va_list=true \
                 underscore_name=true";export CIL_MACHINE;
    
  2. run Tigress with the --envmachine option.

For the current version of Tigress, this is really only relevant for the virtualize transformation. See the CIL documentation for more information.


 

Generate Challenge Problems

One of the uses of Tigress is as an educational tool. The --Transform=RandomFuns option will generate a random function that can subsequently be transformed using any combination of Tigress obfuscations, and then given to students as a cracking target.

Depending on the sophistication of your students, you can vary the length of the transformation sequence, the difficulty of the transformations, the options to the transformations, the complexity of the generated challenge function, and either give them source to untangle (a good way to learn about particular transformations), or stripped compiled code (for a more real-world challenge).

Below is part of the script we use to generate take-home exams for our students. It contains two assets, a password check and an expired time check, and it's the students' job to disable these.

# Generate the cleartext challenge program. This is hidden from the students.
# empty.c is just an empty file.
tigress --Verbosity=1 --Seed=$seed6 \
      --Transform=RandomFuns --RandomFunsName=SECRET \
         --RandomFunsType=long \
         --RandomFunsInputSize=1 --RandomFunsStateSize=1 --RandomFunsOutputSize=1 \
         --RandomFunsCodeSize=10 \
         --RandomFunsTimeCheckCount=1 \
         --RandomFunsActivationCodeCheckCount=1 --RandomFunsActivationCode=42 \
         --RandomFunsPasswordCheckCount=1 --RandomFunsPassword=secret \
         --RandomFunsFailureKind=segv \
      --out=6-input.c empty.c

# Generate an empty program with the same interface as the challenge program
# for the students to fill out
tigress --Verbosity=1 --Seed=$seed6 \
      --Transform=RandomFuns --RandomFunsName=SECRET \
         --RandomFunsType=long \
         --RandomFunsInputSize=1 --RandomFunsStateSize=1  --RandomFunsOutputSize=1 \
         --RandomFunsCodeSize=0 \
      --out=6-answer.c empty.c

# Obfuscate the challenge program. 
tigress --Verbosity=1 --Seed=$seed6 --FilePrefix=obf \
      --Transform=InitEntropy \
         --Functions=main\
      --Transform=InitOpaque \
         --Functions=main --InitOpaqueCount=1 --InitOpaqueStructs=list,array\
      --Transform=InitBranchFuns \
         --InitBranchFunsCount=2\
      --Transform=EncodeLiterals \
         --Functions=SECRET --EncodeLiteralsKinds=string --EncodeLiteralsEncoderName=STRINGS\
      --Transform=Virtualize \
         --Functions=STRINGS --VirtualizeDispatch=switch --VirtualizeOperands=stack,registers \
         --VirtualizeMaxMergeLength=2 --VirtualizeSuperOpsRatio=1.0 \
      --Transform=AddOpaque \
         --Functions=SECRET --AddOpaqueKinds=call,bug,true --AddOpaqueCount=4\
      --Transform=Virtualize \
         --Functions=SECRET --VirtualizeDispatch=indirect --VirtualizeOperands=stack,registers \
         --VirtualizeMaxMergeLength=2 --VirtualizeSuperOpsRatio=1.0 \
      --Transform=Virtualize \
         --Functions=SECRET --VirtualizeDispatch=ifnest --VirtualizeOperands=stack,registers \
         --VirtualizeMaxMergeLength=2 --VirtualizeSuperOpsRatio=1.0 --VirtualizeNumberOfBogusFuns=1\
      --Transform=EncodeLiterals \
         --Functions=SECRET --EncodeLiteralsKinds=integer \
      --Transform=EncodeBranches \
         --Functions=SECRET --EncodeBranchesFlatten=true \
      --Transform=CleanUp \
         --CleanUpKinds=annotations,constants,names \
      --out=6-challenge.c 6-input.c
Option Arguments Description
--Transform RandomFuns Generate a random function useful as an attack target.
--RandomFunsInputSize INTSPEC Size of input. Default=1.
--RandomFunsStateSize INTSPEC Size of internal state. Default=1.
--RandomFunsOutputSize INTSPEC Size of output. Default=1.
--RandomFunsCodeSize INTSPEC Size of the generated code. Currently only 0 (empty body) and 1 (arbitrary non-zero size) make sense. Default=1.
--RandomFunsType int, long, float, double Type of input/output/state. Default=long.
  • int = C int type
  • long = C long type
  • float = C float type
  • double = C double type
--RandomFunsName string The name of the generated function.
--RandomFunsFailureKind message, abort, segv The manner in which a triggered asset may fail. Comma-separated list. Default=segv.
  • message = Print a message.
  • abort = Call the abort function.
  • segv = Die with a segmentation fault.
--RandomFunsActivationCode int The code the user has to enter (as the first command line argument) to be allowed to run the program. Default=42.
--RandomFunsPassword string The password the user has to enter (read from standard input) to be allowed to run the program. Default="42".
--RandomFunsTimeCheckCount int The number of checks for expired time (gettimeofday() > someTimeInThePast) to be inserted in the program. Default=0.
--RandomFunsActivationCodeCheckCount int The number of checks for correct activation code to be inserted in the program. Default=0.
--RandomFunsPasswordCheckCount int The number of checks for correct password to be inserted in the program. Probably only 0 and 1 make sense here, since the user will be prompted for a password once for every check. Default=0.


 

Known Issues

  1. The virtualizer only accepts asm functions with literal strings, not arguments.
  2. The virtualizer and flattener completely restructure the code, which means that arithmetic on the program counter is not going to work, such as in this example taken from gcc's comp-goto-1.c torture test:
    goto *(base_addr + insn.f1.offset);
    


 

Transformation Examples

Below you will find a collection of examples showing how to invoke Tigress, and what the resulting transformed code looks like. Perusing these examples is a good first step to building successful attacks on Tigress, such as you are asked to do in the Challenges section.

As you are reading the code, there are a couple of interesting things to note:

Obfuscations based on Opaque Predicates

Add Opaque Branches
Break up code by inserting bogus branches, protected by opaque predicates.
tigress --Verbosity=1  \
   --Transform=InitOpaque --Functions=main \
   --Transform=UpdateOpaque --Functions=fib --UpdateOpaqueCount=10 \
   --Transform=AddOpaque --Functions=fib --AddOpaqueCount=10  --AddOpaqueKinds=call,bug,true,junk \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/opaque.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/opaque c-files/opaque.c
test1.c opaque.sh opaque.c

Obfuscate Literals
Replace literal integers with opaque expressions.
tigress --Verbosity=1  \
   --Transform=InitOpaque --Functions=main \
   --Transform=EncodeLiterals --Functions=\* \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/obfuscateLiterals.c test1.c 
gcc -Wno-builtin-requires-header -v -fgnu89-inline -o bin-files/obfuscateLiterals c-files/obfuscateLiterals.c

test1.c obfuscateLiterals.sh obfuscateLiterals.c

Trivial Randomizations

Randomize Function Arguments
Reorder and add bogus arguments to fib.
tigress --Verbosity=1  \
   --Transform=RndArgs --Seed=0 --RndArgsBogusNo=2?5 --Functions=fib \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/rndArgs.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/rndArgs c-files/rndArgs.c
test1.c rndArgs.sh rndArgs.c

Split and Merging Functions

Split
Split up fib in as many pieces as possible.
tigress --Verbosity=1  \
   --Transform=Split --Seed=0 --SplitKinds=deep,block,top --SplitCount=100 --Functions=fib \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/split1.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/split1 c-files/split1.c
test1.c split1.sh split1.c

Split ⇒ Split
Split up fib in as many pieces as possible, and then split up the resulting functions as well.
tigress --Verbosity=1  \
   --Transform=Split --Seed=0 --SplitKinds=block,top,deep --SplitCount=100 --Functions=fib --SplitName=SPLIT \
   --Transform=Split --Seed=0 --SplitKinds=block --SplitCount=100 --Functions=/.\*SPLIT.\*/ \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/split2.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/split2 c-files/split2.c
test1.c split2.sh split2.c

Merge
Merge fib and fac into fac_fib.
tigress --Verbosity=1  \
   --Transform=InitEntropy --Functions=main \
   --Transform=InitOpaque --Functions=main --InitOpaqueCount=2 --InitOpaqueStructs=list,array \
   --Transform=Merge --Functions=fac,fib \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/merge.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/merge c-files/merge.c
test1.c merge.sh merge.c

Merge ⇒ Split
Merge fac and fib into fac_fib, and then split up fac_fib.
tigress --Verbosity=1  \
   --Transform=InitEntropy --Functions=main \
   --Transform=InitOpaque --Functions=main --InitOpaqueCount=2 --InitOpaqueStructs=list,array \
   --Transform=Merge --Functions=fac,fib --MergeName=MERGED \
   --Transform=Split --SplitKinds=block,top,deep --SplitCount=10 --Functions=MERGED \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/merge-split.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/merge-split c-files/merge-split.c
test1.c merge-split.sh merge-split.c

Control Flow Flattening

Flatten
Flatten fib in test1.c using each of the dispatch methods.
tigress --Verbosity=1  \
   --Transform=Flatten --Functions=fib --FlattenDispatch=dispatch \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/... test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/... c-files/...
sw id go
sh out.c   sh out.c   sh out.c

Flatten ⇒ Flatten
Flatten fib in test1.c using two levels of flattening.
tigress --Verbosity=1  \
   --Transform=Flatten --Functions=fib --FlattenDispatch=dispatch1 \
   --Transform=Flatten --Functions=fib --FlattenDispatch=dispatch2 \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/... test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/.. c-files/..
sw go id
sw   sh out.c   sh out.c   sh out.c
go   sh out.c   sh out.c   sh out.c
id   sh out.c   sh out.c   sh out.c

Flatten
Flatten all functions with switch dispatch and opaque expressions.
tigress --Verbosity=1  \
   --Transform=Flatten --Functions=fib,fac --FlattenObfuscateNext=false --FlattenDispatch=switch \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/flatten_switch.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/flatten_switch c-files/flatten_switch.c
test1.c flatten_switch_opaque.sh flatten_switch_opaque.c

Virtualization

Virtualize
Virtualize fib in test1.c using each of the dispatch methods.
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=dispatch \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/... test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/... c-files/...
sw if di id ca li bi ip
sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c

Virtualize ⇒ Virtualize
Virtualize fib in test1.c using two levels of interpretation.
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=dispatch1 \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=dispatch2 \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/... test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/.. c-files/..
sw if di id ca li bi ip
sw   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
if   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
di   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
id   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
ca   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
li   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
bi   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c
ip   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c   sh out.c

Virtualize
Virtualize fib using a switch dispatch, mixed register and stack arguments, and at most two instruction variants of each kind (i.e., no more than 2 ADD instructions, etc.).
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=switch \
   --VirtualizeMaxDuplicateOps=2 --VirtualizeOperands=* \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize_mixed.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize_mixed c-files/virtualize_mixed.c
Files: test1.c, virtualize_mixed.sh, virtualize_mixed.c

Virtualize
Virtualize fib using a switch dispatch, register and stack arguments, at most two instruction variants of each kind, and superoperators of length no more than 10.
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=switch \
   --VirtualizeMaxDuplicateOps=2 --VirtualizeOperands=* \
   --VirtualizeSuperOpsRatio=2.0 --VirtualizeMaxMergeLength=10 \
   --VirtualizeOptimizeBody=true \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize_super.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize_super c-files/virtualize_super.c
Files: test1.c, virtualize_super.sh, virtualize_super.c

Virtualize
Virtualize fib using a switch dispatch, register and stack arguments, at most two instruction variants of each kind, and superoperators of length no more than 10; add opaque expressions to the dispatch, and split up instruction handlers using opaque predicates.
tigress --Verbosity=1  \
   --Transform=InitOpaque --Functions=main \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=switch \
   --VirtualizeMaxDuplicateOps=2 --VirtualizeOperands=* \
   --VirtualizeSuperOpsRatio=2.0 --VirtualizeMaxMergeLength=10 \
   --VirtualizeOptimizeBody=true \
   --VirtualizeMaxOpaque=5 \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize_obfuscate.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize_obfuscate c-files/virtualize_obfuscate.c
Files: test1.c, virtualize_obfuscate.sh, virtualize_obfuscate.c

Virtualize
Virtualize fib using an interpolation dispatch, running a bogus function in parallel (to thwart virtual PC pattern matching attempts), and inserting bogus computation between instruction executions (to increase the length of instruction traces).
tigress --Verbosity=1  \
   --Transform=InitEntropy --Functions=main \
   --Transform=UpdateEntropy --Functions=fac --UpdateEntropyVar=n \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=interpolation \
   --VirtualizeNumberOfBogusFuns=1 --VirtualizeBogusFunKinds=collatz \
   --VirtualizeBogusLoopIterations=10 --VirtualizeBogusLoopKinds=collatz \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize_bogus.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize_bogus c-files/virtualize_bogus.c
Files: test1.c, virtualize_bogus.sh, virtualize_bogus.c

Virtualize
Virtualize fib using an ifnest dispatch, and make it reentrant, i.e. call fib from multiple places in the program, executing a few instructions at a time, to make the trace less conspicuous. Make the superoperators as long as possible, to further reduce the number of times the dispatch loop executes.
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=ifnest \
   --VirtualizeSuperOpsRatio=2.0 --VirtualizeMaxMergeLength=20 \
   --VirtualizeReentrant=true \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize_reentrant.c test2.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize_reentrant c-files/virtualize_reentrant.c
Files: test2.c, virtualize_reentrant.sh, virtualize_reentrant.c

Sequences of Transformations

Virtualize ⇒ Split
Virtualize fib, and split up the resulting function in order to make the dispatch loop more statically stealthy.
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=switch \
   --VirtualizeMaxDuplicateOps=2 --VirtualizeOperands=* \
   --VirtualizeSuperOpsRatio=2.0 --VirtualizeMaxMergeLength=10 \
   --VirtualizeOptimizeBody=true \
   --Transform=Split --Seed=0 --SplitKinds=deep,block,top --SplitCount=100 --Functions=fib \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize-split.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize-split c-files/virtualize-split.c
Files: test1.c, virtualize-split.sh, virtualize-split.c

Virtualize ⇒ Flatten
Virtualize fib using an ifnest dispatch and flatten the resulting function using a goto dispatch.
tigress --Verbosity=1  \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=ifnest \
   --Transform=Flatten --Functions=fib   --FlattenObfuscateNext=true --FlattenDispatch=goto \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize-flatten.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize-flatten c-files/virtualize-flatten.c
Files: test1.c, virtualize-flatten.sh, virtualize-flatten.c

Merge ⇒ Flatten
Merge fac and fib into MERGED, and then flatten MERGED.
tigress --Verbosity=1  \
   --Transform=InitEntropy --Functions=main \
   --Transform=InitOpaque --Functions=main --InitOpaqueCount=2 --InitOpaqueStructs=list,array \
   --Transform=Merge --Functions=fac,fib --MergeName=MERGED \
   --Transform=Flatten --Functions=MERGED \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/merge-flatten.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/merge-flatten c-files/merge-flatten.c
Files: test1.c, merge-flatten.sh, merge-flatten.c

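The simple form of merging can be sketched by hand as follows. This is an illustration only, not Tigress output: the name merged, the selector parameter which, and its values 0 and 1 are assumptions, and the bodies of fac and fib in test1.c may differ from the ones used here.

```c
#include <assert.h>

/* Simple merge of fac and fib: the extra selector argument `which`
   chooses the constituent function at run time (0 = fac, 1 = fib). */
long merged(int which, long n) {
    assert(which == 0 || which == 1);
    if (which == 0) {
        /* body of fac */
        long r = 1;
        while (n > 1)
            r *= n--;
        return r;
    } else {
        /* body of fib */
        long a = 0, b = 1;
        while (n-- > 0) {
            long t = a + b;
            a = b;
            b = t;
        }
        return a;
    }
}
```

Every call site has to be rewritten as well, so that fac(5) becomes merged(0, 5) and fib(10) becomes merged(1, 10). This is why merging cannot be applied to functions whose address is taken. In the flatten merge, the blocks of both bodies would instead be woven into a single dispatch loop.
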
Flatten ⇒ Merge
Flatten fac and fib and then merge them into fac_fib.
tigress --Verbosity=1  \
   --Transform=InitEntropy --Functions=main \
   --Transform=InitOpaque --Functions=main --InitOpaqueCount=2 --InitOpaqueStructs=list,array \
   --Transform=Flatten --Functions=fac,fib --FlattenObfuscateNext=true --FlattenDispatch=switch \
   --Transform=Merge --Functions=fac,fib \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/flatten-merge.c test1.c 
gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/flatten-merge c-files/flatten-merge.c
Files: test1.c, flatten-merge.sh, flatten-merge.c

Merge ⇒ Flatten ⇒ RndArgs ⇒ Virtualize ⇒ AddOpaque ⇒ Split
Merge fac and fib, flatten, add bogus arguments, replace literals with opaque expressions, virtualize, split up control flow with opaque predicates, and split up the resulting function.
tigress --Verbosity=1  \
   --Transform=InitEntropy --Functions=main \
   --Transform=InitOpaque --Functions=main --InitOpaqueCount=2 --InitOpaqueStructs=list,array \
   --Transform=Merge --Functions=fac,fib --MergeName=MERGED \
   --Transform=Flatten --Functions=MERGED --FlattenObfuscateNext=true --FlattenDispatch=indirect \
   --Transform=RndArgs --RndArgsBogusNo=2?5 --Functions=MERGED \
   --Transform=EncodeLiterals --Functions=MERGED \
   --Transform=Virtualize --Functions=MERGED --VirtualizeDispatch=ifnest \
   --Transform=UpdateOpaque --Functions=MERGED --UpdateOpaqueCount=10 \
   --Transform=AddOpaque --Functions=MERGED --AddOpaqueCount=10  --AddOpaqueKinds=call,bug,true,junk \
   --Transform=Split --SplitKinds=deep,block,top --SplitCount=100 --Functions=MERGED --SplitName=SPLIT \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/combined1.c test1.c 
gcc -Wno-builtin-requires-header -v -fgnu89-inline -o bin-files/combined1 c-files/combined1.c

Files: test1.c, combined1.sh, combined1.c

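The true and bug insertion kinds used by AddOpaque in the command above can be pictured with the hand-written sketch below. The predicate is a simple arithmetic stand-in chosen for this example (the product of two consecutive integers is always even); Tigress itself builds its opaque expressions from the list and array structures set up by InitOpaque. The names opaque_true, sum_true, and sum_bug are hypothetical.

```c
#include <assert.h>

/* Always returns 1: x * (x + 1) is even, and unsigned wraparound
   modulo a power of two does not change the parity. */
int opaque_true(unsigned x) {
    return (x * (x + 1u)) % 2u == 0u;
}

/* "true" kind: if (true) RealStatement */
long sum_true(const long *v, int n, unsigned seed) {
    long s = 0;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (opaque_true(seed + (unsigned)i))
            s += v[i];
    }
    return s;
}

/* "bug" kind: if (false) BuggyStatement else RealStatement */
long sum_bug(const long *v, int n, unsigned seed) {
    long s = 0;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++) {
        if (!opaque_true(seed ^ (unsigned)i))
            s -= v[i] + 1;   /* plausible but wrong; never executed */
        else
            s += v[i];
    }
    return s;
}
```

A predicate this simple is easy for a compiler or an analyst to see through; it only shows the shape of the inserted control flow, not the strength of the protection.
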
Virtualize ⇒ Virtualize
Virtualize fib twice, calling Tigress twice from the command line. Use the --FilePrefix option to avoid name clashes.
tigress --Verbosity=1 \
   --FilePrefix=v1 \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=ifnest \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/v1.c test1.c

tigress --Verbosity=1 \
   --FilePrefix=v2 \
   --Transform=Virtualize --Functions=fib --VirtualizeDispatch=ifnest \
   --Transform=CleanUp --CleanUpKinds=annotations \
   --out=c-files/virtualize-virtualize-prefix.c c-files/v1.c

gcc -Wno-builtin-requires-header -fgnu89-inline -o bin-files/virtualize-virtualize-prefix c-files/virtualize-virtualize-prefix.c
test1.cvirtualize-virtualize-prefix.shvirtualize-virtualize-prefix.c


 

All Options

Option Arguments Description
--Environment string A string that describes the architecture, operating system, and compiler being used. We currently recognize the following two strings: x86_64:Linux:Gcc:4.6 and x86_64:Darwin:Clang:5.1. This is mostly necessary because Clang does not support some features (most notably asm goto) that Gcc does. In the future we will use this to provide better support for 32-bit binaries. Default=0.
--out file.c The file to write to.
--Seed INTSPEC The randomization seed. --Seed=0 makes Tigress generate its own seed.
--FilePrefix AUTO, NONE, string Use this to avoid name clashes if you intend to run Tigress multiple times on the same file. Only set this option once. Default=NONE.
  • AUTO = generate a prefix to add to all symbols
  • NONE = don't add any prefix
  • string = add this prefix
--Verbosity int Tigress' chattiness level. --Verbosity=0 makes Tigress quiet. --Verbosity=1 prints each transformation as it is being applied. Default=0.
INTSPEC ?, int?int, int The INTSPEC notation allows randomized selection of integer valued options.
  • ? = select a 32-bit random number
  • int?int = select a random integer value in the range [int,int]
  • int = select this value
BOOLSPEC ?, true, false The BOOLSPEC notation allows randomized selection of boolean valued options.
  • ? = select a random boolean value
  • true = select true
  • false = select false
IDENTSPEC *, ?int, %int, /regexp/, string Many transformations require you to specify the set of functions to which they should be applied. Trivially, you can say --Functions=foo to apply the obfuscation only to foo, but frequently you need more flexibility than that. The IDENTSPEC notation provides this functionality. Some transformations also use identifier specifications to specify variables, as in --UpdateEntropyVar=\* which would select all variables of a function.
  • * = select all available identifiers
  • ?int = randomly select int number of identifiers
  • %int = randomly select int percent of available identifiers
  • /regexp/ = select the identifiers that match the regular expression
  • string = select this identifier
LOCALSPEC The LOCALSPEC notation is used to specify a set of local variables and formal parameters. For example, --LocalVariables='main:i,j;foo:*' would select all variables of foo and i and j of main. The notation is a semicolon-separated list of IDENTSPEC:IDENTSPEC.
--Prefix string Add this prefix to each newly generated symbol, in addition to the --FilePrefix. You can set this for every transformation. Default=_number_, where number is the order number of the transformation given on the command line.
--Exclude string-list Comma-separated list of the functions to exclude from obfuscation. Useful after a --Functions=* or --Functions=?int option, like this: --Functions=* --Exclude=main
--Functions IDENTSPEC The functions to which the transformation should be applied. See IDENTSPEC for how to specify a set of functions.
--GlobalVariables IDENTSPEC The global variables to which the transformation should be applied. Currently only used for the --Transform=EncodeData transformation.
--LocalVariables LOCALSPEC The local variables and formal parameters to which the transformation should be applied. Currently only used for the --Transform=EncodeData transformation.
--Transform Virtualize Turn a function into an interpreter.
--VirtualizeShortIdents bool Generate shorter identifiers to produce interpreters suitable for publication. Default=false.
--VirtualizeIsWindows bool Set this to true if you're on Windows rather than a Unix system. Currently only relevant when generating bogus functions.
--VirtualizeDispatch switch, direct, indirect, call, ifnest, linear, binary, interpolation, ? Select the interpreter's dispatch method. Default=switch.
  • switch = dispatch by while(){switch(next){...}}
  • direct = dispatch by direct threading
  • indirect = dispatch by indirect threading
  • call = dispatch by call threading
  • ifnest = dispatch by nested if-statements
  • linear = dispatch by searching a table using linear search
  • binary = dispatch by searching a table using binary search
  • interpolation = dispatch by searching a table using interpolation search
  • ? = Pick a random dispatch method
--VirtualizeOperands stack, registers, mixed, ? Type of operands to allow in the ISA. Default=stack.
  • stack = use only stack arguments to instructions
  • registers = use only register arguments to instructions
  • * = same as stack,registers
  • ? = select one at random.
--VirtualizeMaxDuplicateOps INTSPEC Number of ADD instructions, for example, with different signatures. Default=0.
--VirtualizeRandomOps bool Should opcodes be randomized, or go from 0..n? Default=true.
--VirtualizeSuperOpsRatio Float>0.0 Desired number of super operators. Default=0.0.
--VirtualizeMaxMergeLength INTSPEC Longest sequence of instructions to be merged into one. Default=0.
--VirtualizeMaxOpaque INTSPEC Number of opaques to add to each instruction handler. Default=0.
--VirtualizeNumberOfBogusFuns INTSPEC Weave the execution of random functions into the execution of the original program. This makes certain kinds of pattern-based dynamic analysis more difficult. Default=0.
--VirtualizeBogusFunKinds trivial, arithSeq, collatz, * The kind of bogus function to generate. Comma-separated list. Default=arithSeq,collatz.
  • trivial = insert a trivial computation
  • arithSeq = insert a simple arithmetic loop
  • collatz = insert a computation of the Collatz sequence
  • * = select all options
--VirtualizeBogusLoopKinds trivial, arithSeq, collatz, * Insert a bogus loop for each instruction list. This will extend the length of the trace, making dynamic analysis more difficult. Default=collatz.
  • trivial = insert a trivial computation
  • arithSeq = insert a simple arithmetic loop
  • collatz = insert a computation of the Collatz sequence
  • * = select all options
--VirtualizeBogusLoopIterations INTSPEC Adjust this value to balance performance and trace length. Default=0.
--VirtualizeReentrant bool Make the function reentrant. Default=false.
--VirtualizeOptimizeBody BOOLSPEC Clean up after superoperator generation by optimizing the body of the generated function. Default=false.
--VirtualizeOptimizeTreeCode BOOLSPEC Do constant folding etc. prior to interpreter generation. Default=false.
--VirtualizeTrace bool Insert tracing code to show the stack and the virtual instructions executing. Default=false.
--VirtualizeComment bool Insert comments in the generated interpreter. Default=false.
--VirtualizeDump tree, ISA, instrs, types, vars, strings, calls, bytes, array, stack, * Dump internal data structures used by the virtualizer. Comma-separated list. Default=dump nothing.
  • tree = dump the expression trees generated from the CIL representation
  • ISA = dump the Instruction Set Architecture
  • instrs = dump the generated virtual instructions
  • types = dump the types found
  • vars = dump the local variables found
  • strings = dump the strings found
  • calls = dump the function calls found
  • bytes = dump the bytecode array
  • array = dump the instruction array
  • stack = dump the evaluation stack
  • * = select all options
--Transform Flatten Flatten a function using Chenxi Wang's algorithm
--FlattenDispatch switch, goto, indirect, ? Dispatch method. Default=switch.
  • switch = dispatch by while(1) {switch (next) {blocks}}
  • goto = dispatch by {labl1: block1; goto block2;}
  • indirect = dispatch by goto* (jtab[next])
  • ? = select a dispatch method at random.
--FlattenObfuscateNext BOOLSPEC Whether the dispatch variable should be obfuscated with opaque expressions or not. Default=true.
--FlattenOpaqueStructs list, array, * Type of opaque predicate to use. Traditionally, for this transformation, array is used. Default=array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--FlattenSplitBasicBlocks BOOLSPEC If true, then basic blocks (sequences of assignment and call statements without intervening branches) will be split up into individual blocks. If false, they will be kept intact. Default=true.
--FlattenTrace bool Print a message before each block gets executed. Useful for debugging. Default=false.
--Transform Split Outline pieces of a function
--SplitKinds top, block, deep, recursive Comma-separated list specifying the order in which different split methods are attempted. Default=top,block,deep,recursive.
  • top = split the top-level list of statements into two functions funcname_split_1 and funcname_split_2.
  • block = split a basic block (list of assignment and call statements) into two functions.
  • deep = split out a nested control structure of height greater than 2 into its own function funcname_split_1.
  • recursive = same as block, but calls to split functions are also allowed to be split out.
--SplitCount INTSPEC How many times to attempt the split. Default=1.
--SplitName string If set, the split out functions will be named prefix_name_number, otherwise they will be named prefix_originalName_split_number.
--Transform Merge Merge two or more functions. Two different types of merge are supported: simple merge (if () function1 else if () function2 else ...) and flatten merge, where the functions are first flattened, and then the resulting blocks are woven together. This transformation modifies the signature of the function (an extra formal selector argument is added that selects between the constituent functions at runtime), and this cannot be done for functions whose address is taken. --Functions=\* merges together all functions in the program whose signatures can be changed, --Functions=%50 merges together about half of them, etc. It is a good idea to follow this transform by a RndArgs transform to hide the extra selector argument.
--MergeName string If set, the merged function will be named prefix_name, otherwise it will be named prefix_originalName1_originalName2. Note that it's unpredictable which function will be the first and the second, so it's better to set the merged name explicitly.
--MergeObfuscateSelect BOOLSPEC Whether the extra parameter passed to the merged function should be obfuscated with opaque expressions or not. Default=true.
--MergeOpaqueStructs list, array, * Type of opaque predicate to use. Traditionally, for this transformation, array is used. Default=array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--MergeFlatten BOOLSPEC Whether to flatten before merging or not. Default=true.
--MergeFlattenDispatch switch, goto, indirect, ? Dispatch method used for flattened merge. Default=switch.
  • switch = dispatch by while(1) {switch (next) {blocks}}
  • goto = dispatch by {labl1: block1; goto block2;}
  • indirect = dispatch by goto* (jtab[next])
  • ? = select a dispatch method at random.
--Transform RndArgs Randomize the order of arguments to a function and add extra bogus arguments.
--RndArgsBogusNo INTSPEC Number of bogus arguments to add. Default=0.
--Transform InitOpaque Add opaque initialization code. This initialization code has to be added to a function that gets called before any uses of opaque predicates, usually, but not necessarily, to main.
--InitOpaqueStructs list, array, * Comma-separated list of the kinds of opaque constructs to add. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--InitOpaqueCount INTSPEC How many opaque data structures (lists or arrays) to add to the program. They will be split roughly evenly between the different declared opaque structures. Default=1.
--InitOpaqueSize INTSPEC Size of opaque arrays. Default=30.
--Transform AddOpaque Add opaque predicates to split up control-flow.
--AddOpaqueCount INTSPEC How many opaques to add to each function. Default=1.
--AddOpaqueKinds call, bug, true, junk, fake, * Comma-separated list of the types of insertions of bogus computation allowed. Default=call,bug,true,junk.
  • call = if (false) RandomFunction()
  • bug = if (false) BuggyStatement else RealStatement
  • true = if (true) RealStatement
  • junk = if (false) asm(".byte random bytes")
  • fake = if (false) NonExistingFunction()
  • * = Turns all options on.
--Transform UpdateOpaque Add code that makes updates to opaque predicates.
--UpdateOpaqueCount INTSPEC How many updates to opaque data structures to add to the function. Default=1.
--UpdateOpaqueAllowAddNodes bool Is it safe to malloc new nodes for the opaque data structure in this function? Only set to true if the function is called sparingly. Default=false.
--Transform InitBranchFuns Create branch functions.
--InitBranchFunsOpaqueStructs list, array, * Comma-separated list of the kinds of opaque constructs to use for branch functions. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--InitBranchFunsCount INTSPEC How many branch functions to create. Default=1.
--InitBranchFunsObfuscate BOOLSPEC Obfuscate the branch function. Default=true.
--Transform EncodeBranches Replace unconditional branches (gotos) with other constructs.
--EncodeBranchesKinds branchFuns, goto2call, goto2push, * Comma-separated list of the kinds of constructs jumps can be replaced with. Default=branchFuns.
  • branchFuns = Generate calls to branch functions. --Transform=InitBranchFuns must be given prior to this transform
  • goto2call = Replace goto L with push L; call lab; ret; lab: ret
  • goto2push = Replace goto L with push L; ret
  • * = Same as branchFuns,goto2call,goto2push
--EncodeBranchesOpaqueStructs list, array, * Comma-separated list of the kinds of opaque constructs to use in a call to a branch function. Default=list,array.
  • list = Generate opaque expressions using linked lists
  • array = Generate opaque expressions using arrays
  • * = Same as list,array
--EncodeBranchesObfuscate BOOLSPEC Obfuscate the branch function call. Default=true.
--EncodeBranchesFlatten BOOLSPEC Flatten before replacing jumps. This opens up more opportunities for replacing unconditional branches. Default=true.
--EncodeBranchesReturnAddressOffset integer The offset (in bytes) of the return address on the stack, for branch functions. May differ based on operating system, word size, and compiler. Default=8.
--Transform InitEntropy Add initialization of the entropy variables.
--Transform UpdateEntropy Add updates to the entropy variables.
--UpdateEntropyVar IDENTSPEC Add to the entropy variables from these variables. Default=*.
--Transform EncodeLiterals Replace literal integers and strings with less obvious expressions.
--EncodeLiteralsKinds integer, string, * Specify the types of literals to encode. Default=integer,string.
  • integer = Replace literal integers with opaque expressions
  • string = Replace literal strings with calls to a function that generates them
  • * = Same as integer,string
--EncodeLiteralsEncoderName string The name of the generated encoder function (only for encoded strings). Default=None.
--Transform EncodeArithmetic Replace integer arithmetic with more complex expressions.
--EncodeArithmeticKinds integer Specify the types to encode. Currently, only integer is available. Default=integer.
  • integer = Replace integer arithmetic.
--Transform EncodeData Replace integer variables with a different encoding. Use --GlobalVariables and --LocalVariables to specify the variables that should be transformed. In addition to the variables specified, any other variables that are related through aliasing will be transformed. Only integer variables, arrays of integers, and pointers to integers are currently supported. Avoid structs, since our alias analysis algorithm conflates all fields.
--EncodeDataCodecs poly1, xor, add, * Comma-separated list of the kinds of codecs that may be used. Only poly1 currently makes sense; avoid the others. Default=poly1.
  • poly1 = Linear transformation of the form a*x+b.
  • xor = Exclusive-or with a constant.
  • add = Add a constant and promote to next largest integer type. Will fail for the largest integer type.
  • * = Same as poly1,xor,add
--Transform RandomFuns Generate a random function useful as an attack target.
--RandomFunsInputSize INTSPEC Size of input. Default=1.
--RandomFunsStateSize INTSPEC Size of internal state. Default=1.
--RandomFunsOutputSize INTSPEC Size of output. Default=1.
--RandomFunsCodeSize INTSPEC Size of the generated code. Currently only 0 (empty body) and 1 (arbitrary non-zero size) make sense. Default=1.
--RandomFunsType int, long, float, double Type of input/output/state. Default=long.
  • int = C int type
  • long = C long type
  • float = C float type
  • double = C double type
--RandomFunsName string The name of the generated function.
--RandomFunsFailureKind message, abort, segv The manner in which a triggered asset may fail. Comma-separated list. Default=segv.
  • message = Print a message.
  • abort = Call the abort function.
  • segv = Die with a segmentation fault.
--RandomFunsActivationCode int The code the user has to enter (as the first command line argument) to be allowed to run the program. Default=42.
--RandomFunsPassword string The password the user has to enter (read from standard input) to be allowed to run the program. Default="42".
--RandomFunsTimeCheckCount int The number of checks for expired time (gettimeofday() > someTimeInThePast) to be inserted in the program. Default=0.
--RandomFunsActivationCodeCheckCount int The number of checks for correct activation code to be inserted in the program. Default=0.
--RandomFunsPasswordCheckCount int The number of checks for correct password to be inserted in the program. Probably only 0 and 1 make sense here, since the user will be prompted for a password once for every check. Default=0.
--Transform CleanUp Transformation to run last, to clean up the generated code.
--CleanUpKinds names, annotations, constants, * Specify types of cleanup to perform. Default=names,annotations,constants.
  • names = Replace identifiers with less obvious ones
  • annotations = Remove annotations that Tigress uses internally. Tigress should not be called again on a file that has had annotations removed
  • constants = Fold constant expressions
  • * = Same as names,annotations,constants
--Transform Info Print internal information.
--InfoKind cfg, fun, linear, WS, DG, CG, alias, global Information to print. For cfg, fun, and linear use --Functions, as usual, to specify which functions to print.
  • cfg = Control Flow Graph
  • fun = Function in internal format
  • linear = Function in internal linearized block format (used as a starting point for flattening and branch functions)
  • WS = Working Set
  • DG = Dependency Graph
  • CG = Call Graph
  • alias = Print the pointer-graphs
  • global = List of global symbols in the original program.
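
The poly1 codec listed under --EncodeDataCodecs is described as a linear transformation of the form a*x+b. The sketch below shows one way such an encoding can work. It is an assumption-laden illustration, not Tigress's implementation: it uses 32-bit unsigned arithmetic, the constants POLY_A and POLY_B are arbitrary (POLY_A must be odd so that it has an inverse modulo 2^32), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define POLY_A 0x9E3779B1u   /* arbitrary odd multiplier */
#define POLY_B 0x7F4A7C15u   /* arbitrary offset         */

/* Multiplicative inverse of an odd a modulo 2^32 (Newton iteration).
   x = a is already correct modulo 8; each step doubles the number
   of correct low-order bits. */
uint32_t inv32(uint32_t a) {
    uint32_t x = a;
    int i;
    assert(a & 1u);
    for (i = 0; i < 5; i++)
        x *= 2u - a * x;
    return x;
}

/* E(x) = a*x + b (mod 2^32) */
uint32_t poly1_encode(uint32_t x) {
    return POLY_A * x + POLY_B;
}

/* D(y) = a^-1 * (y - b) (mod 2^32) */
uint32_t poly1_decode(uint32_t y) {
    return inv32(POLY_A) * (y - POLY_B);
}

/* Addition directly on encoded values:
   E(x) + E(y) = a*(x + y) + 2b, so subtracting b yields E(x + y). */
uint32_t poly1_add(uint32_t ex, uint32_t ey) {
    return ex + ey - POLY_B;
}
```

The point of such an encoding is that a variable can stay encoded across operations like poly1_add, so its plain value never needs to appear in memory between uses.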


 

Challenge Programs

Here we provide pre-compiled challenge programs generated by Tigress. They have various levels of difficulty and can be used to evaluate the performance of reverse engineering techniques and de-virtualization tools. They are also useful in a pedagogical setting, giving budding reverse engineers the opportunity to cut their teeth on increasingly more challenging targets.

Source Programs

The challenges all take the following form:

#include <stdio.h>
#include <stdlib.h>

long foo (long x) {
   ...
}
int main(int argc, char** argv) {
   long x = atoi(argv[1]);
   long y = foo(x);
   printf("%lu\n", y);
} 

Information Recovery Types

There are three types of information that can be recovered: source, data, and metadata.

A particular challenge may specify the type of information to be recovered, or leave this to the reverse engineer.

Attack Types

There are two types of attacks that can be launched: singular attacks and class attacks.

Contest Rules

  1. A black-box attack (such as guessing the internals of foo simply by feeding it inputs and examining the outputs) is not considered a successful breach.
  2. Side-channel attacks (attacks that feed inputs to the program and examine behavior such as energy use) are accepted.
  3. Manual as well as automatic tool-based attacks are accepted.
  4. Static as well as dynamic attacks are accepted.
  5. The de-virtualized source should be in C, compilable with gcc, and should have the same behavior as the original binary.
  6. The winner is determined by the time of arrival of the email at our servers.
  7. A panel of judges from DAPA will determine whether a submitted solution constitutes a successful breach.

Prizes

  1. A successful source recovery class attack will be rewarded with a small cash or book prize. The amount will depend on the perceived difficulty of the challenge, but will be on the order of USD100, and/or a copy of Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection, signed by the authors.
  2. A successful non-class attack will be rewarded with a certificate issued by DAPA.

Submission Procedures

A successful breach must contain the following information:

  1. a statement specifying the nature of the attack (source, data, or metadata recovery; singular or class attack);
  2. a short description of the techniques used in the reverse engineering effort (manual or automatic attack, static or dynamic attack, etc.);
  3. a list of any tools used in the reverse engineering effort (disassemblers, decompilers, own scripts, etc.);
  4. an estimate of the amount of time (in person hours) used in the attack;
  5. a short description of the educational and professional experience of the attacker(s);
  6. for class attacks, the following additional data should be submitted:
    1. an attack script written in a well-known programming language for which there exists a free Linux implementation;
    2. a makefile that, when invoked, executes the script on the binary files of the challenge, producing de-virtualized programs as output.

The attack description should be sent in an email to us, consisting of all the relevant information above.

Descriptions of Training Problems

These trivial problems are for training purposes only, and there is no need to send us emails when you have cracked them. Some exercises are provided both as source code and binary. The source code exercises are a useful way to get to know Tigress' transformations and what's necessary to undo them, before embarking on a more challenging binary code analysis.

Descriptions of Challenges

Download

Version Links
Tigress 0.9 Mac OS X 10.9, x86/64.
Tigress 0.9 Linux, x86/64.


 

Learn More! Get Involved!

The following text is the standard reference for software protection:

If you want to learn more, please consider attending the next Int. Summer School on Information Security and Protection (ISSISP), the fifth in the series, which will take place in Verona, Italy, July 28-August 2. The summer school is open to graduate students and computing professionals. Previous summer schools were held in Beijing (2010), Gent (2011), Tucson (2012), and Xi'an (2013).

Also, please get involved in the software protection community by joining DAPA, The Digital Asset Protection Association.


 

Contributing

We welcome contributors who want to extend Tigress with new transformations. Send us email if you desire source code access. Keep in mind that you will have to be fluent in OCaml and CIL.

Acknowledgments

Contributors


 

Frequently Asked Questions