ido-lavi

16/11/2024

Imagine writing a message that you don’t want anyone else to be able to read. You might scramble it in a way only you can understand.

JavaScript obfuscation works similarly: it scrambles code to make it difficult to read while still allowing it to work the same way.

When JavaScript is obfuscated, it becomes harder for others to:

Steal the code (e.g., if it’s shared online).
Access sensitive information (such as hidden passwords or proprietary algorithms and logic).
Modify or hack the code, since it’s challenging to interpret.

For example, here’s a simple function:

    
function greet() {
    console.log("Hello, world!");
}

greet();

‍

After obfuscation, it might look something like this:

    
(function(_0x91fa09,_0x109b4b){
    var _0x3978e8=_0x11a8,_0xc89537=_0x91fa09();
    while(!![]){
        try{
            var _0x413f85=parseInt(_0x3978e8(0x12f))/0x1*(-parseInt(_0x3978e8(0x12d))/0x2)
            +parseInt(_0x3978e8(0x133))/0x3
            +-parseInt(_0x3978e8(0x134))/0x4
            +parseInt(_0x3978e8(0x12a))/0x5
            +-parseInt(_0x3978e8(0x132))/0x6*(parseInt(_0x3978e8(0x135))/0x7)
            +-parseInt(_0x3978e8(0x12b))/0x8*(-parseInt(_0x3978e8(0x130))/0x9)
            +-parseInt(_0x3978e8(0x12c))/0xa;
            if(_0x413f85===_0x109b4b)break;
            else _0xc89537['push'](_0xc89537['shift']());
        }catch(_0x3ba422){
            _0xc89537['push'](_0xc89537['shift']());
        }
    }
}(_0x5e12,0x26265));

function greet(){
    var _0x110708=_0x11a8;
    console[_0x110708(0x12e)](_0x110708(0x131));
}

function _0x11a8(_0x70a44d,_0x23b2c3){
    var _0x5e12c1=_0x5e12();
    return _0x11a8=function(_0x11a8ca,_0x530c9e){
        _0x11a8ca=_0x11a8ca-0x12a;
        var _0xe405db=_0x5e12c1[_0x11a8ca];
        return _0xe405db;
    },_0x11a8(_0x70a44d,_0x23b2c3);
}

function _0x5e12(){
    var _0x66d464=[
        '261779HlKuww','913385dwKjlD','8RZBXYn',
        '1509700TYcQmr','1432mxvjkA','log','201OSqMvM',
        '1935207VsdOPw','Hello,\x20world!',
        '24tkdLoe','878922Xlggqf','359756ZENfWa'
    ];
    _0x5e12=function(){return _0x66d464;};
    return _0x5e12();
}

greet();

‍

The obfuscated version is much harder to read but still does the same thing. You are welcome to run it yourself!

2. Understanding the AST (Abstract Syntax Tree)

To build our obfuscation tool, we need to understand what an Abstract Syntax Tree (AST) is. Think of the AST as a detailed blueprint that organizes code into a hierarchy where each part has a specific role. This structure allows us to efficiently locate and modify code elements for obfuscation.

How Code Transforms into an AST

Think of the AST as a map that breaks your code into labeled parts, each with its own job. Imagine your code as a sentence, and the AST as a way of diagramming each word based on its role. Just like nouns, verbs, and adjectives have unique roles in a sentence, each part of the AST represents a specific action or purpose in your code.

‍

‍

In reality, the AST for the code in the image (console.log("Hello, world!");) looks like this:

    
{
  "type": "Program",
  "start": 0,
  "end": 29,
  "body": [
    {
      "type": "ExpressionStatement",
      "start": 0,
      "end": 29,
      "expression": {
        "type": "CallExpression",
        "start": 0,
        "end": 28,
        "callee": {
          "type": "MemberExpression",
          "start": 0,
          "end": 11,
          "object": {
            "type": "Identifier",
            "start": 0,
            "end": 7,
            "name": "console"
          },
          "property": {
            "type": "Identifier",
            "start": 8,
            "end": 11,
            "name": "log"
          },
          "computed": false,
          "optional": false
        },
        "arguments": [
          {
            "type": "Literal",
            "start": 12,
            "end": 27,
            "value": "Hello, world!",
            "raw": "\"Hello, world!\""
          }
        ],
        "optional": false
      }
    }
  ],
  "sourceType": "module"
}

‍

You are welcome to paste the original code into AST Explorer and see the full AST structure for yourself.

Understanding the AST Structure

I know the example AST structure might seem like a lot to process, so let’s break it down a bit.

1. The Outer Container

    
{
  "type": "Program",
  "start": 0,
  "end": 29,
  "body": [...],
  "sourceType": "module"
}

‍

Think of this as the big box that holds everything. It's like saying "Here's a complete program that runs from character 0 to character 29."

‍

2. The expression statement

    
{
  "type": "ExpressionStatement",
  "start": 0,
  "end": 29,
  "expression": {...}
}

‍

This is like saying "Here's a single line of code that does something." In our case, it's calling console.log().

‍

3. The function call

    
{
  "type": "CallExpression",
  "callee": {...},
  "arguments": [...]
}

‍

This represents the actual action of calling a function. Like when you're calling someone, you need:

Who you're calling (the callee)
What you want to say (the arguments)

‍

4. The function name (The callee)

    
{
  "type": "MemberExpression",
  "object": {
    "type": "Identifier",
    "name": "console"
  },
  "property": {
    "type": "Identifier",
    "name": "log"
  }
}

‍

This is like an address book entry:

First, find "console" (the object)
Then look for "log" inside it (the property)
Together they make console.log

5. The message (The arguments)

    
{
  "type": "Literal",
  "value": "Hello, world!",
  "raw": "\"Hello, world!\""
}

‍

This is the actual message we want to print. It's called a "Literal" because it's literally just the text "Hello, world!".

‍

Putting It All Together

If we were to describe this in plain English:

We have a program
Inside it, we're making a function call
The function we're calling is log which lives inside console
We're passing one argument: the text "Hello, world!"

Think of it like giving instructions to a friend:

"Hey, could you find the thing called 'console', look for its 'log' feature, and use it to display 'Hello, world!' on the screen please?"

‍

Visual Hierarchy

    
Program
└── ExpressionStatement
    └── CallExpression (console.log())
        ├── Callee (console.log)
        │   ├── Object (console)
        │   └── Property (log)
        └── Arguments
            └── Literal ("Hello, world!")

‍

This tree-like structure lets us work with specific parts of the code. With an AST, we can find and change specific parts to obfuscate.
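
To make "find specific parts" concrete, here is a minimal sketch of walking such a tree by hand, without any library. The walk helper and the trimmed-down ast object are assumptions made for illustration (position fields are left out); a real traversal library handles many more cases.

```javascript
// Trimmed AST for: console.log("Hello, world!");  (start/end fields omitted)
const ast = {
  type: "Program",
  body: [{
    type: "ExpressionStatement",
    expression: {
      type: "CallExpression",
      callee: {
        type: "MemberExpression",
        object: { type: "Identifier", name: "console" },
        property: { type: "Identifier", name: "log" },
        computed: false
      },
      arguments: [{ type: "Literal", value: "Hello, world!" }]
    }
  }]
};

// Hypothetical helper: visit every object that has a "type" field, depth-first
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visit));
  } else if (node && typeof node === "object") {
    if (typeof node.type === "string") visit(node);
    Object.values(node).forEach((child) => walk(child, visit));
  }
}

const visitedTypes = [];
const stringLiterals = [];
walk(ast, (node) => {
  visitedTypes.push(node.type);
  // String literals are the nodes an obfuscator would want to rewrite
  if (node.type === "Literal" && typeof node.value === "string") {
    stringLiterals.push(node.value);
  }
});

console.log(visitedTypes.join(" > "));
console.log(stringLiterals);
```

The visit order follows the same top-down path as the hierarchy above, which is why a traversal can reach the string literal without knowing in advance where it sits.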

If you want to learn more about how AST parsing works in JavaScript, I highly recommend reading this article. It’s just great 🎉

Practical Approach

Libraries for AST Manipulation

To work with the AST in JavaScript, we’ll use three libraries:

Esprima: Turns JavaScript code into an AST.
Estraverse: Allows us to navigate (or “traverse”) the AST and make changes.
Escodegen: Converts the modified AST back into JavaScript code.

To install these libraries, run:

    
npm install esprima estraverse escodegen

‍

3. Step-by-Step Techniques for Obfuscation

Technique 1: Encoding Strings in a Base64 Array

Strings like "password" or "username" provide easy clues about code functionality. Encoding strings as Base64 in an array makes them unreadable at a glance, adding a layer of complexity.

How It Works

Find all strings in the code, such as "Hello, world!".
Convert each string to Base64 and store it in an array.
Replace each string with a function call that retrieves and decodes it.

This makes the code a bit trickier to decipher at first glance, because a reader has to decode each string before seeing what it says.

Example Code

Here’s how we encode strings in Base64 and store them in an array:

    
// Utility to encode strings in Base64
// Note: Buffer encodes text as UTF-8 while atob() decodes byte by byte,
// so this pairing only round-trips ASCII strings correctly
function base64Encode(str) {
    return Buffer.from(str).toString('base64');
}

function encodeStringsInArray(ast) {
    const strings = []; // Array to store encoded strings
    const stringIndices = new Map(); // Track each string's index (a Map is safe for strings like "constructor")
    const decoderFuncName = '_0x' + Math.random().toString(36).substring(2, 8); // Random function name

    // Traverse the AST using estraverse to locate string literals
    estraverse.traverse(ast, {
        enter(node, parent) {
            // Look for string literals
            if (node.type === 'Literal' && typeof node.value === 'string') {
                // Leave non-computed object keys such as { "key": 1 } alone; a call is not valid there
                if (parent && parent.type === 'Property' && parent.key === node && !parent.computed) {
                    return;
                }
                if (!stringIndices.has(node.value)) {
                    stringIndices.set(node.value, strings.length);
                    strings.push(base64Encode(node.value)); // Encode in Base64 before storing
                }
                // Replace the original string with a function call that retrieves and decodes it
                node.type = 'CallExpression';
                node.callee = { type: 'Identifier', name: decoderFuncName };
                node.arguments = [{ type: 'Literal', value: stringIndices.get(node.value) }];
            }
        }
    });

    // Define the decoder function that retrieves and decodes Base64 strings
    const decodeFunc = `
        function ${decoderFuncName}(_0xindex) {
            const _0xstrings = ${JSON.stringify(strings)};
            return atob(_0xstrings[_0xindex]);
        }
    `;
    const decodeFuncAst = esprima.parseScript(decodeFunc);
    ast.body.unshift(decodeFuncAst.body[0]);
}

‍

Output Example

Original:

    
console.log("Hello, world!");
console.log("Goodbye, world!");

‍

Obfuscated:

    
function _0xte7yht(_0xindex) {
    const _0xstrings = ["SGVsbG8sIHdvcmxkIQ==", "R29vZGJ5ZSwgd29ybGQh"];
    return atob(_0xstrings[_0xindex]);
}

console.log(_0xte7yht(0));
console.log(_0xte7yht(1));

‍

In this obfuscated example:

_0xstrings holds Base64-encoded versions of each string in the original code.
Each entry in _0xstrings is looked up by index and decoded back into a readable string with atob().
console.log(_0xte7yht(0)) decodes and prints "Hello, world!", while console.log(_0xte7yht(1)) decodes and prints "Goodbye, world!".

This makes the original strings much harder to read and understand.
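
One caveat is worth checking before relying on this pairing: Buffer.from(str) encodes the string as UTF-8 bytes, while atob() returns one character per byte, so non-ASCII text does not survive the round trip. The sketch below shows the mismatch and one possible fix. decodeUtf8Base64 is a hypothetical helper name, and the code assumes an environment where atob and TextDecoder are available as globals (modern browsers and recent Node.js versions).

```javascript
// Same encoder as above: UTF-8 bytes -> Base64 text
function base64Encode(str) {
  return Buffer.from(str).toString('base64');
}

// Hypothetical UTF-8-aware decoder: Base64 -> bytes -> string
function decodeUtf8Base64(encoded) {
  const bytes = Uint8Array.from(atob(encoded), (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// ASCII text survives the plain atob() round trip
const ascii = base64Encode('Hello, world!');
console.log(ascii);        // SGVsbG8sIHdvcmxkIQ==
console.log(atob(ascii));  // Hello, world!

// Non-ASCII text does not: 'h\u00e9llo' is "hello" with an accented e
const word = 'h\u00e9llo';
const accented = base64Encode(word);
console.log(atob(accented) === word);             // false
console.log(decodeUtf8Base64(accented) === word); // true
```

If the code being obfuscated contains only ASCII strings, the simpler atob() decoder is enough.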

‍

Technique 2: Renaming Identifiers

When we read code, its identifiers give us a basic understanding of what it does. Identifiers are names for functions, variables, and parameters (like username, password, and login). By renaming identifiers, we make it harder for people to understand what each part does.

How It Works

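Find all declared function and variable names in the code (like authenticate in function authenticate(user)).
Replace each name, and every reference to it, with a random, unreadable name.

Note that the code below only renames declared functions and variables; parameters such as user keep their original names.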

Example Code

Here’s how we rename identifiers:

    
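// Generates a random obfuscated name in the form '_0x' followed by a random string of up to 6 characters
function getObfuscatedName() {
    return '_0x' + Math.random().toString(36).substring(2, 8);
}

function renameIdentifiers(ast) {
    const nameMap = new Map(); // Maps original names to their obfuscated versions

    // First pass: collect declared function and variable names
    estraverse.traverse(ast, {
        enter(node) {
            const isDeclaration = node.type === 'FunctionDeclaration' || node.type === 'VariableDeclarator';
            if (isDeclaration && node.id && node.id.type === 'Identifier' && !nameMap.has(node.id.name)) {
                nameMap.set(node.id.name, getObfuscatedName());
            }
        }
    });

    // Second pass: rename declarations and all references to them,
    // including calls that appear before a hoisted function declaration
    estraverse.traverse(ast, {
        enter(node, parent) {
            if (node.type !== 'Identifier' || !nameMap.has(node.name)) {
                return;
            }
            // Skip property names such as the `log` in console.log or the key in { log: 1 }
            const isMemberProperty = parent && parent.type === 'MemberExpression' && parent.property === node && !parent.computed;
            const isPropertyKey = parent && parent.type === 'Property' && parent.key === node && !parent.computed;
            if (!isMemberProperty && !isPropertyKey) {
                node.name = nameMap.get(node.name);
            }
        }
    });
}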

‍

Output Example

Original:

    
function authenticate(user) {
    const password = "1234";
    if (user === "admin" && password === "1234") {
        return true;
    }
    return false;
}

‍

Obfuscated:

    
function _0xabc123(user) {
    const _0x123abc = "1234";
    if (user === "admin" && _0x123abc === "1234") {
        return true;
    }
    return false;
}
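
Because the names come from Math.random(), two identifiers could in principle receive the same obfuscated name, which would silently merge them. A minimal sketch of one way to rule that out is shown below; createNameGenerator is a hypothetical helper, not part of the obfuscator in this guide.

```javascript
// Hypothetical factory: returns a function that never hands out the same name twice
function createNameGenerator() {
  const used = new Set();
  return function nextName() {
    let name;
    do {
      // padEnd guards against short random strings, e.g. (0.5).toString(36) is "0.i"
      name = '_0x' + Math.random().toString(36).substring(2, 8).padEnd(6, '0');
    } while (used.has(name));
    used.add(name);
    return name;
  };
}

const nextName = createNameGenerator();
const names = Array.from({ length: 1000 }, () => nextName());

console.log(names.slice(0, 3));
console.log(new Set(names).size === names.length); // true
```

Keeping the used names in a closure means each obfuscation run starts with a fresh set.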

‍

Technique 3: Adding Decoy Variables

My favorite feature: increasing the code’s size! Decoy variables are meaningless variables that don’t affect functionality but add noise, making it harder to identify the variables that matter.

How It Works

Create a new random variable.
Assign it a random value.
Insert it into the code.

‍

Example Code

Here’s how we add decoy variables to the code:

    
// getObfuscatedName() is the random-name helper from 'Technique 2'
function addDecoyVariable(ast) {
    // Generate a decoy variable declaration as a string, using a random hexadecimal value
    const decoyVar = `
        var ${getObfuscatedName()} = parseInt('0x${Math.floor(Math.random() * 1000).toString(16)}', 16);
    `;

    // Parse the decoy variable string into an AST node
    const decoyAst = esprima.parseScript(decoyVar);

    // Insert the decoy variable AST node at the start of the main AST's body
    ast.body.unshift(decoyAst.body[0]);
}

‍

Output Example

Original:

    
function greet() {
    console.log("Hello, world!");
}

‍

Obfuscated:

    
var _0xabc123 = parseInt('0x1f4', 16); // Decoy variable
function greet() {
    console.log("Hello, world!");
}
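
As an alternative to building the decoy from a source string, the node can be constructed directly, which skips the extra parsing step. The sketch below is an illustration rather than the method used in this guide: makeDecoyDeclaration and insertDecoyAtRandomPosition are hypothetical helpers, the decoy value is a plain numeric literal instead of a parseInt call, and the node shapes follow the ESTree format. It also assumes the program has no directive such as "use strict" at the top, since placing a statement before a directive would change its meaning.

```javascript
// Hypothetical helper: build `var <name> = <value>;` as an ESTree node, no parsing step
function makeDecoyDeclaration(name, value) {
  return {
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name },
      init: { type: 'Literal', value, raw: String(value) }
    }]
  };
}

// Hypothetical helper: insert the decoy at a random top-level position
// instead of always at the start, and report where it landed
function insertDecoyAtRandomPosition(ast, decoy) {
  const index = Math.floor(Math.random() * (ast.body.length + 1));
  ast.body.splice(index, 0, decoy);
  return index;
}

// Stand-in program with two placeholder statements
const program = {
  type: 'Program',
  sourceType: 'script',
  body: [{ type: 'EmptyStatement' }, { type: 'EmptyStatement' }]
};

const decoy = makeDecoyDeclaration('_0xdecoy1', 212);
const position = insertDecoyAtRandomPosition(program, decoy);
console.log(position, program.body.map((statement) => statement.type));
```

Since var declarations are hoisted, the position of the decoy does not affect how the rest of the program behaves.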

‍

Technique 4: Wrapping Code in an IIFE

An IIFE (Immediately Invoked Function Expression) wraps the code in a function that’s executed immediately, creating a "shell" around the code.

‍

How It Works

Wrap the code in a self-calling function.
This isolates variables and functions from the global scope, hiding them and reducing the chance they’ll interfere with or be accessed by other parts of the program.

‍

Why It Helps

Wrapping code in an IIFE adds a protective layer: internal variables and functions are no longer exposed on the global scope, so they cannot be inspected or called directly from outside, for example from the browser console. This keeps the code self-contained and makes its inner logic harder to analyze from the outside. For obfuscation, that means one more step for anyone trying to understand or manipulate your code.

Example Code

Here’s how we wrap the code in an IIFE:

    
function wrapInIIFE(ast) {
    // Convert the AST to code and wrap the resulting string in an IIFE
    const wrappedCode = `
        (function() {
            ${escodegen.generate(ast)}
        })();
    `;

    // Parse the wrapped code string back into an AST
    return esprima.parseScript(wrappedCode);
}

‍

Output Example

Original:

    
function greet() {
    console.log("Hello, world!");
}

‍

Obfuscated:

    
(function() {
    function greet() {
        console.log("Hello, world!");
    }
})();
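
The isolation effect is easy to observe directly. In this small sketch (the variable names are made up for the demonstration), a variable declared inside the IIFE cannot be seen from the outside, while a top-level variable can.

```javascript
// Top-level declaration: visible to any code that runs in the same scope
var exposedCounter = 1;

(function () {
  // Declared inside the IIFE: only code inside this function can see it
  var hiddenCounter = 41;
  hiddenCounter += exposedCounter;
  console.log(hiddenCounter); // 42
})();

console.log(typeof exposedCounter); // "number"
console.log(typeof hiddenCounter);  // "undefined"
```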

‍

4. Putting It All Together

Here’s the complete code for the obfuscator with detailed comments. It combines all four techniques into a single obfuscate() function that takes JavaScript source code as input.

    
const esprima = require('esprima');      // Parse JavaScript into an AST
const estraverse = require('estraverse'); // Traverse and modify the AST
const escodegen = require('escodegen');  // Convert AST back to JavaScript

// Generates a random name for an obfuscated identifier
function getObfuscatedName() {
    return '_0x' + Math.random().toString(36).substring(2, 8);
}

// Helper function to encode strings in Base64
function base64Encode(str) {
    return Buffer.from(str).toString('base64');
}

// Technique 1: Encode strings in Base64
function encodeStringsInArray(ast) {
    const strings = [];
    const stringIndices = new Map(); // A Map is safe for strings like "constructor"
    const decoderFuncName = getObfuscatedName();

    estraverse.traverse(ast, {
        enter(node, parent) {
            if (node.type === 'Literal' && typeof node.value === 'string') {
                // Leave non-computed object keys such as { "key": 1 } alone
                if (parent && parent.type === 'Property' && parent.key === node && !parent.computed) {
                    return;
                }
                if (!stringIndices.has(node.value)) {
                    stringIndices.set(node.value, strings.length);
                    strings.push(base64Encode(node.value));
                }
                node.type = 'CallExpression';
                node.callee = { type: 'Identifier', name: decoderFuncName };
                node.arguments = [{ type: 'Literal', value: stringIndices.get(node.value) }];
            }
        }
    });

    const decodeFunc = `
        function ${decoderFuncName}(_0xindex) {
            const _0xstrings = ${JSON.stringify(strings)};
            return atob(_0xstrings[_0xindex]);
        }
    `;
    const decodeFuncAst = esprima.parseScript(decodeFunc);
    ast.body.unshift(decodeFuncAst.body[0]);
}

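// Technique 2: Rename identifiers to obscure names
function renameIdentifiers(ast) {
    const nameMap = new Map();

    // First pass: collect declared function and variable names
    estraverse.traverse(ast, {
        enter(node) {
            const isDeclaration = node.type === 'FunctionDeclaration' || node.type === 'VariableDeclarator';
            if (isDeclaration && node.id && node.id.type === 'Identifier' && !nameMap.has(node.id.name)) {
                nameMap.set(node.id.name, getObfuscatedName());
            }
        }
    });

    // Second pass: rename declarations and all references, skipping property names
    estraverse.traverse(ast, {
        enter(node, parent) {
            if (node.type !== 'Identifier' || !nameMap.has(node.name)) {
                return;
            }
            const isMemberProperty = parent && parent.type === 'MemberExpression' && parent.property === node && !parent.computed;
            const isPropertyKey = parent && parent.type === 'Property' && parent.key === node && !parent.computed;
            if (!isMemberProperty && !isPropertyKey) {
                node.name = nameMap.get(node.name);
            }
        }
    });
}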

// Technique 3: Add meaningless decoy variables
function addDecoyVariable(ast) {
    const decoyVar = `
        var ${getObfuscatedName()} = parseInt('0x${Math.floor(Math.random() * 1000).toString(16)}', 16);
    `;
    const decoyAst = esprima.parseScript(decoyVar);
    ast.body.unshift(decoyAst.body[0]);
}

// Technique 4: Wrap the code in an IIFE
function wrapInIIFE(ast) {
    const wrappedCode = `
        (function() {
            ${escodegen.generate(ast)}
        })();
    `;
    return esprima.parseScript(wrappedCode);
}

// Main obfuscation function combining all techniques
function obfuscate(inputCode) {
    const ast = esprima.parseScript(inputCode);

    encodeStringsInArray(ast);
    renameIdentifiers(ast);
    addDecoyVariable(ast);
    const wrappedAst = wrapInIIFE(ast);

    return escodegen.generate(wrappedAst);
}

// Usage example
const code = `
    function greetUser(role) {
        const adminMessage = "Welcome, Admin!";
        const guestMessage = "Greetings, Guest!";

        if (role === "admin") {
            console.log(adminMessage);
        } else {
            console.log(guestMessage);
        }
    }

    // Test cases
    greetUser("admin");
`;

const obfuscatedCode = obfuscate(code);
console.log('Obfuscated Code:\n', obfuscatedCode);

‍

Final Output Example

Original:

    
function greetUser(role) {
    const adminMessage = "Welcome, Admin!";
    const guestMessage = "Greetings, Guest!";
    
    if (role === "admin") {
        console.log(adminMessage);
    } else {
        console.log(guestMessage);
    }
}

// Test cases
greetUser("admin");

‍

Obfuscated:

    
(function () {
    var _0xhyqf5w = parseInt('0xd4', 16);
    function _0xv0k6sc(_0xindex) {
        const _0xfrts7e = [
            'V2VsY29tZSwgQWRtaW4h',
            'R3JlZXRpbmdzLCBHdWVzdCE=',
            'YWRtaW4='
        ];
        return atob(_0xfrts7e[_0xindex]);
    }
    function _0xf5eil4(role) {
        const _0xedyv0z = _0xv0k6sc(0);
        const _0x1mktfi = _0xv0k6sc(1);
        if (role === _0xv0k6sc(2)) {
            console.log(_0xedyv0z);
        } else {
            console.log(_0x1mktfi);
        }
    }
    _0xf5eil4(_0xv0k6sc(2));
}());

‍

And there you have it - the final JavaScript obfuscator code after combining all the techniques we’ve learned, making your JavaScript code much harder to read and reverse-engineer!
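
A quick sanity check after obfuscating is to confirm that both versions still behave the same. The sketch below runs the original and the obfuscated example from above with a stand-in console and compares what they log. captureLogs is a hypothetical helper, and the check assumes atob is available as a global (modern browsers and recent Node.js versions).

```javascript
// Hypothetical helper: run source code with a fake console and return everything it logged
function captureLogs(source) {
  const logs = [];
  const fakeConsole = { log: (...args) => logs.push(args.join(' ')) };
  // The code runs in its own function scope, with `console` shadowed by the fake
  new Function('console', source)(fakeConsole);
  return logs;
}

const original = `
function greetUser(role) {
    const adminMessage = "Welcome, Admin!";
    const guestMessage = "Greetings, Guest!";
    if (role === "admin") {
        console.log(adminMessage);
    } else {
        console.log(guestMessage);
    }
}
greetUser("admin");
`;

const obfuscated = `
(function () {
    var _0xhyqf5w = parseInt('0xd4', 16);
    function _0xv0k6sc(_0xindex) {
        const _0xfrts7e = [
            'V2VsY29tZSwgQWRtaW4h',
            'R3JlZXRpbmdzLCBHdWVzdCE=',
            'YWRtaW4='
        ];
        return atob(_0xfrts7e[_0xindex]);
    }
    function _0xf5eil4(role) {
        const _0xedyv0z = _0xv0k6sc(0);
        const _0x1mktfi = _0xv0k6sc(1);
        if (role === _0xv0k6sc(2)) {
            console.log(_0xedyv0z);
        } else {
            console.log(_0x1mktfi);
        }
    }
    _0xf5eil4(_0xv0k6sc(2));
}());
`;

const originalLogs = captureLogs(original);
const obfuscatedLogs = captureLogs(obfuscated);

console.log(originalLogs);
console.log(obfuscatedLogs);
console.log(JSON.stringify(originalLogs) === JSON.stringify(obfuscatedLogs));
```

Only the admin branch runs here, so this covers a single path; calling the function with another role in both versions would exercise the second message as well.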

A Beginner’s Guide to JavaScript Obfuscation
