Malware Detection

Firmware Backdoors

Sam Thomas » s.l.thomas@cs.bham.ac.uk
University of Birmingham

Outline

  • Backdoors as a class of malware
  • Problems related to their detection
  • Current state of the art in backdoor detection
  • Proposed system to detect backdoors (in firmware)

What is a Backdoor?

“…a mechanism surreptitiously introduced into a computer system to facilitate unauthorised access to a system…”

– Zhang et al.

What is a Backdoor?

  • Hard-coded credentials
  • Hidden, undocumented functionality (extra services, commands, etc.)
  • Cryptographic backdoors (e.g. Dual_EC_DRBG)

Backdoors in Firmware

  • Embedded device firmware security is already a disaster:
    • Poor coding practices (code laced with vulnerabilities typical of the 1990s)
    • Internet-facing “debug” interfaces

Scale of the Problem

  • Consumer router
  • In some configurations the backdoor is exploitable via WAN (and is therefore Internet-facing)
  • Backdoor facilitates complete control over the device

Scale of the Problem

TCP–32764

  • Backdoor found in routers from many manufacturers (Cisco, Linksys, Netgear, etc.)
  • A fair number of firmware images are exploitable via WAN without any special configuration

Scale of the Problem

  • Backdoor enables access to the device configuration panel without authentication
  • Trigger string: xmlset_roodkcableoj28840ybtide (reversed: editby04882joelbackdoor_teslmx)
  • Exploiting it is as simple as changing your browser’s user-agent string
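
The user-agent value xmlset_roodkcableoj28840ybtide hides a message when read backwards. A minimal Python check (the variable names are illustrative):

```python
# User-agent value that triggers the backdoor, as quoted on the slide.
USER_AGENT = "xmlset_roodkcableoj28840ybtide"

# Reversing the string reveals the human-readable text hidden inside it.
reversed_agent = USER_AGENT[::-1]
print(reversed_agent)  # editby04882joelbackdoor_teslmx
```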

Detection of Malware in Firmware

Overview

  • Lots of devices, lots of firmware
  • Heterogeneous in their architecture (ARM, MIPS, PPC, etc.)
  • Multiple firmware versions for each device
  • In general, can’t execute firmware
  • Manual analysis takes significant time and a degree of expertise

Backdoor Detection “by hand”

  1. Obtain firmware image
  2. Extract firmware filesystem
  3. Manually analyse each program/library using disassembler (e.g. IDA Pro)

Backdoor Detection “by hand”

  • Possible to automate some parts of analysis:
    • Extraction of filesystem
  • General analysis is impractical and expensive
  • Backdoors are dissimilar to other security vulnerabilities:
    • Generally not bugs in code
    • Often inserted deliberately

Types of Program Analysis

Static Analysis

  • Focus on the so-called dead listing (the disassembled code, read without executing it)
  • Possible even without the hardware/emulator required to run the firmware/program

Types of Program Analysis

Dynamic Analysis

  • Focus on analysing the changing state of a running program
  • A program can detect that it is being analysed and so defend itself (anti-debugging)

Current Research

A-WEASEL

Holz et al., 2013

  • Generalised algorithm to perform backdoor detection via dynamic analysis
  • Builds on differential/delta debugging
  • Interacts as a client with a server application with a known protocol
  • Attempts to identify suspicious code paths
  • Requires a live system and the ability to inject a debugging stub into said system

Avatar

Zaddach et al., 2014

  • Generalised framework for performing dynamic analysis of embedded device firmware
  • Offers a hybrid approach to emulation: uses the device itself together with symbolic execution on a host computer
  • Requires physical, intrusive access to hardware

My Research

Goals

  • Would like to perform large-scale, lightweight analysis of firmware
  • But… predicting program behaviour is undecidable in the general case (by reduction from the Halting problem)
  • Would like to check firmware for potential anomalous behaviour (backdoors)
  • Focus on devices with largest market share (embedded Linux-based systems)

Problems

  • Backdoors are hard to find – not many examples
  • Modern “malware” analysis utilises classifiers derived from machine learning algorithms:
    • These require a sufficiently large data set of benign/malicious examples, which does not exist for backdoors
    • How to apply to backdoor detection?

Proposed System

  • A classifier to infer the “type” of a given executable
  • A domain-specific language to encode expected executable behaviours as functionality profiles (that can be validated)
  • A best-effort attempt (static analysis can only estimate) to validate that executables identified as anomalous are actually used

Prototype: BackScan

Collaborative work with Tom Chothia, Flavio D. Garcia & Christopher Green

Prototype Overview

  1. Unpack firmware (using existing methods)
  2. For each executable:
    1. Classify executable
    2. Attempt to match the estimated functionality against the corresponding profile
    3. If anomalous, attempt to establish the usage of said executable
    4. Report anomalies to analyst
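
The per-executable loop above can be sketched as a small driver. This is only an outline under assumptions: the names `Anomaly`, `scan`, `classify`, `matches_profile` and `is_used` are hypothetical, and the three callables stand in for the classifier, the profile interpreter and the usage estimate described on later slides.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Anomaly:
    path: str              # executable that failed its profile
    executable_class: str  # class assigned by the classifier
    in_use: bool           # whether the usage estimate found it reachable


def scan(
    executables: Iterable[str],
    classify: Callable[[str], str],
    matches_profile: Callable[[str, str], bool],
    is_used: Callable[[str], bool],
) -> List[Anomaly]:
    """Run steps 2.1 to 2.4 over already-unpacked executables."""
    report = []
    for path in executables:
        executable_class = classify(path)            # 2.1 classify
        if matches_profile(path, executable_class):  # 2.2 match profile
            continue
        # 2.3 establish usage, 2.4 report to the analyst
        report.append(Anomaly(path, executable_class, is_used(path)))
    return report


# Toy stand-ins (hypothetical) so the sketch runs end to end.
report = scan(
    ["/bin/httpd", "/bin/ftpd"],
    classify=lambda path: "web-server" if "httpd" in path else "ftp-server",
    matches_profile=lambda path, cls: cls != "web-server",
    is_used=lambda path: True,
)
print(report)
```

Only executables that fail their profile reach the usage check, which keeps the more expensive step off the common path.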

Machine Learning

Classifier

Essentially learn a function:

classify : Executable → ExecutableClass

Possible Methods

  • Supervised learning (requires labelled training set)
  • Unsupervised learning (clusters data into “similar” groups)
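
To make the supervised option concrete, here is a deliberately simple nearest-neighbour sketch over sets of features. It is not the classifier used in the prototype (the slides do not name one); the Jaccard similarity, the function names and the training data are all illustrative assumptions.

```python
from typing import FrozenSet, List, Tuple

LabelledExample = Tuple[FrozenSet[str], str]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Similarity of two feature sets: size of overlap over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(features: FrozenSet[str], training: List[LabelledExample]) -> str:
    """Return the label of the most similar labelled training example."""
    best_label, best_score = "", -1.0
    for example_features, label in training:
        score = jaccard(features, example_features)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training set: feature sets mix imported functions and strings.
training = [
    (frozenset({"socket", "bind", "listen", "accept", "GET "}), "web-server"),
    (frozenset({"socket", "bind", "USER", "PASS", "RETR"}), "ftp-server"),
]
unknown = frozenset({"socket", "accept", "GET ", "Content-Type"})
predicted = classify(unknown, training)
print(predicted)  # web-server
```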

Feature Selection

  • High-level homogeneous attributes consistent amongst multiple architectures:
    • Imported and exported functions
    • Strings
  • Apply standard techniques (provided by the WEKA machine learning package) to select features with highest utility (Information Gain and Information Gain Ratio)
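
Information Gain scores a feature by how much knowing whether it is present reduces uncertainty about the class. The sketch below computes it for present/absent features. It illustrates the measure only and is not WEKA's implementation; the sample data and function names are invented.

```python
import math
from collections import Counter
from typing import FrozenSet, List, Sequence, Tuple

Sample = Tuple[FrozenSet[str], str]  # (features, class label)


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(samples: List[Sample], feature: str) -> float:
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    labels = [label for _, label in samples]
    present = [label for feats, label in samples if feature in feats]
    absent = [label for feats, label in samples if feature not in feats]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - remainder


samples = [
    (frozenset({"socket", "GET "}), "web-server"),
    (frozenset({"socket", "GET ", "POST "}), "web-server"),
    (frozenset({"socket", "USER"}), "ftp-server"),
    (frozenset({"socket", "RETR"}), "ftp-server"),
]
gain_get = information_gain(samples, "GET ")       # separates the classes perfectly
gain_socket = information_gain(samples, "socket")  # present everywhere, tells us nothing
print(gain_get, gain_socket)
```

A string such as "GET " that appears in every web server and no FTP server gets the maximum score here, while "socket", present in all samples, scores zero and would be dropped.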

Difficulties

  • Extraction of executables from binary blobs is imprecise (with current tools)
  • For the features selected, this matters: additional strings distort the input to the training algorithm
  • Develop an algorithm to rebuild each ELF executable prior to feature extraction
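
One plausible source of the imprecision is signature-based carving: the ELF magic bytes show where an executable starts but not where it ends, so whatever follows can be swept into the extracted file. The sketch below only locates candidate start offsets. It is an assumption about how such tools behave, not the rebuilding algorithm mentioned above, and `find_elf_offsets` is a hypothetical name.

```python
from typing import List

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def find_elf_offsets(blob: bytes) -> List[int]:
    """Return every offset in the blob at which the ELF magic occurs."""
    offsets = []
    start = blob.find(ELF_MAGIC)
    while start != -1:
        offsets.append(start)
        start = blob.find(ELF_MAGIC, start + 1)
    return offsets


# Toy blob: padding, one "executable", some unrelated bytes, another one.
blob = b"\x00" * 16 + ELF_MAGIC + b"junk" + ELF_MAGIC + b"tail"
offsets = find_elf_offsets(blob)
print(offsets)  # [16, 24]
```

Working out where each executable actually ends requires reading its ELF headers, which is what a rebuilding step would have to do before strings are extracted.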

Binary Functionality Description Language

Interpreter

profile : Executable × ExecutableClass → {true, false}

Specification

<top-level> ::= rule <ident>(<args>) = <expr>
              | import <string>

     <expr> ::= <rule>(<values>)
              | let <ident> = <expr> in <expr>
              | if <expr> then <expr> else <expr>
              | ! <expr>
              | <expr> <logic-op> <expr>
              | <value> <comp-op> <value>
              | forall <ident>(<values>) => <expr>
              | exists <ident>(<values>) => <expr>

     <rule> ::= <base-rule>
              | <ident>

<base-rule> ::= import_exists
              | export_exists
              | string_exists
              | function_ref
              | string_ref
              | architecture
              | endianness

    <value> ::= <const>
              | <ident>
              | <value> <arith-op> <value>

    <const> ::= <bool>
              | <int>
              | <string>

     <args> ::= ε
              | <variables>

<variables> ::= <variable>
              | <variable> , <variables>

   <values> ::= <value>
              | <value> , <values>

 <arith-op> ::= + | - | * | / | % | & | ^ | | | ~ | << | >>
  <comp-op> ::= == | != | < | > | <= | >=
 <logic-op> ::= || | && | ^^

Predicates

  • Verification of symbol existence:
    • import_exists/export_exists/string_exists
  • Verification of symbol use within code/data:
    • function_ref/string_ref
  • Architecture meta-data:
    • architecture/endianness
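
A minimal sketch of how these base predicates might be evaluated, assuming the facts about an executable have already been extracted into sets. The `Executable` model, its field names and the predicate signatures are hypothetical; only the predicate names come from the slides.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Executable:
    """Hypothetical summary of the facts static analysis recovered."""
    imports: FrozenSet[str] = frozenset()
    exports: FrozenSet[str] = frozenset()
    strings: FrozenSet[str] = frozenset()
    referenced_functions: FrozenSet[str] = frozenset()  # functions used in code
    referenced_strings: FrozenSet[str] = frozenset()    # strings used in code/data
    architecture: str = "unknown"
    endianness: str = "unknown"


# Symbol existence: is the name present in the executable's tables at all?
def import_exists(exe: Executable, name: str) -> bool:
    return name in exe.imports

def export_exists(exe: Executable, name: str) -> bool:
    return name in exe.exports

def string_exists(exe: Executable, text: str) -> bool:
    return text in exe.strings

# Symbol use: is the name actually referenced from code or data?
def function_ref(exe: Executable, name: str) -> bool:
    return name in exe.referenced_functions

def string_ref(exe: Executable, text: str) -> bool:
    return text in exe.referenced_strings

# Architecture meta-data.
def architecture(exe: Executable, name: str) -> bool:
    return exe.architecture == name

def endianness(exe: Executable, name: str) -> bool:
    return exe.endianness == name


exe = Executable(
    imports=frozenset({"socket", "system"}),
    strings=frozenset({"/bin/sh"}),
    referenced_functions=frozenset({"socket"}),
    architecture="mips",
    endianness="big",
)
print(import_exists(exe, "system"), function_ref(exe, "system"))  # True False
```

The example shows why existence and use are separate predicates: `system` is imported but never referenced in this invented executable.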

Predicates

  • Quantified predicates over identified function arguments:
    • exists <function-name> (arg0 : type0, …) => …
    • forall <function-name> (arg0 : type0, …) => …
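  • …an exists predicate evaluates to true if the expression after => holds for at least one call to the function, a forall predicate if it holds for every call; in each case the expression is evaluated with the estimated arguments of that call bound in its evaluation context.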

Examples

exists puts(msg: string) => msg == "Hello, World"

…evaluates to true if at least one use of the function puts is passed “Hello, World” as an argument.
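
The quantified predicates can be sketched in Python as follows, assuming static analysis has already produced, for each function, the list of estimated argument tuples at its call sites. The `CallSites` shape and the helper names are assumptions, and treating forall over zero call sites as true is a choice made for this sketch rather than something the slides specify.

```python
from typing import Callable, Dict, List, Tuple

# Function name -> estimated argument tuples, one tuple per call site.
CallSites = Dict[str, List[Tuple]]


def exists(calls: CallSites, function: str, predicate: Callable[..., bool]) -> bool:
    """True if the predicate holds for at least one call to the function."""
    return any(predicate(*args) for args in calls.get(function, []))


def forall(calls: CallSites, function: str, predicate: Callable[..., bool]) -> bool:
    """True if the predicate holds for every call (vacuously true with no calls)."""
    return all(predicate(*args) for args in calls.get(function, []))


calls: CallSites = {
    "puts": [("Hello, World",), ("Goodbye",)],
    "htons": [(80,), (443,)],
}

# exists puts(msg: string) => msg == "Hello, World"
says_hello = exists(calls, "puts", lambda msg: msg == "Hello, World")

# forall htons(x: int) => (x == 80 || x == 443)
only_web_ports = forall(calls, "htons", lambda x: x == 80 or x == 443)

# The same rule fails as soon as one extra port shows up.
calls_with_extra_port = {"htons": [(80,), (31337,)]}
still_only_web_ports = forall(calls_with_extra_port, "htons", lambda x: x in (80, 443))

print(says_hello, only_web_ports, still_only_web_ports)  # True True False
```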

Examples

Profile of a strictly TCP service

import "prelude"

rule tcp_only() = tcp()
               && !udp()

Profile of a (rather odd) UDP-based service

import "prelude"

rule picky_udp() = outgoing_udp()
                && !incoming_udp()

Simple HTTP server

import "prelude"

rule httpd() =
    -- No incoming/outgoing UDP traffic (from prelude)
       !udp()

    -- Incoming/outgoing TCP traffic (from prelude)
    && tcp()

    -- Expect to listen on port 80 and/or port 443 (SSL/TLS)
    && forall htons(x: int) => (x == 80 || x == 443)

    -- May read/write files (from prelude)
    && read_write_filesystem()

Validation

Executable usage

  • Firmware is never executed; how to know if a given executable is even used?
  • Static analysis is imperfect and only an estimation:
    • Overestimate: more false positives – hence more paranoia
    • Underestimate: fewer false positives – might miss obvious cases
    • No concrete guarantees in either case

Estimation of executables used

  1. Scan firmware image for “default” boot/init scripts
  2. Parse shell scripts:
    1. Handle further shell scripts by (2)
    2. Handle executables by scanning for calls to system; handle any shell scripts found by (2) and any executables found by this same step

Assumptions

  • All conditional branches in scripts are possible
  • All calls to system with static strings as arguments in executables are possible
  • All calls to system without static strings are not considered
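
The estimation amounts to a worklist traversal starting from the boot scripts. In the sketch below, script parsing and call extraction are assumed to have happened already: `scripts` maps each script to the commands it invokes on any branch, and `system_calls` maps each executable to the arguments of its calls to system, with None standing for a non-static argument. These data shapes, the function name and the file paths are all invented for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set


def used_executables(
    init_scripts: Iterable[str],
    scripts: Dict[str, List[str]],
    system_calls: Dict[str, List[Optional[str]]],
) -> Set[str]:
    """Estimate which executables can run, starting from the boot scripts."""
    seen: Set[str] = set()
    queue = deque(init_scripts)
    while queue:
        path = queue.popleft()
        if path in seen:
            continue
        seen.add(path)
        if path in scripts:
            # Every conditional branch is assumed possible: follow all commands.
            queue.extend(scripts[path])
            continue
        for argument in system_calls.get(path, []):
            # Non-static arguments (None) are not considered.
            words = argument.split() if argument else []
            if words:
                queue.append(words[0])  # the program named by the command
    return {path for path in seen if path not in scripts}


scripts = {
    "/etc/init.d/rcS": ["/bin/httpd", "/etc/init.d/net.sh"],
    "/etc/init.d/net.sh": ["/sbin/udhcpc"],
}
system_calls = {
    "/bin/httpd": ["/bin/telnetd -l /bin/sh", None],
    "/bin/unused": ["/bin/secret"],
}
used = used_executables(["/etc/init.d/rcS"], scripts, system_calls)
print(sorted(used))  # ['/bin/httpd', '/bin/telnetd', '/sbin/udhcpc']
```

`/bin/unused` and `/bin/secret` never appear in the result because nothing reachable from the boot script refers to them.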

Results

Classifier

  • Accuracy (of chosen classifier):
    • On training data using 10-fold cross-validation: ~96%
    • On separate test data (460 firmware images/~18,000 executables): ~95%
  • Weighted average performance over all classes (correct to 3 d.p.):
    • TP rate: 0.966
    • FP rate: 0.000
    • Precision: 0.956
    • Recall: 0.966
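
For reference, the four measures can be computed from true and predicted labels as below. TP rate and recall are the same quantity, which is why the two figures above coincide. The sketch assumes the weighted average weights each class by its number of instances; the labels and function names are invented.

```python
from collections import Counter
from typing import Dict, Sequence


def class_metrics(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> Dict[str, float]:
    """One-vs-rest metrics for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    tp_rate = tp / (tp + fn) if tp + fn else 0.0
    return {
        "tp_rate": tp_rate,
        "fp_rate": fp / (fp + tn) if fp + tn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp_rate,  # recall is another name for the TP rate
    }


def weighted_average(y_true: Sequence[str], y_pred: Sequence[str], metric: str) -> float:
    """Average a per-class metric, weighting each class by its instance count."""
    support = Counter(y_true)
    return sum(
        class_metrics(y_true, y_pred, label)[metric] * count / len(y_true)
        for label, count in support.items()
    )


y_true = ["web", "web", "web", "ftp", "ftp", "ssh"]
y_pred = ["web", "web", "ftp", "ftp", "ftp", "ssh"]
avg_tp_rate = weighted_average(y_true, y_pred, "tp_rate")
avg_precision = weighted_average(y_true, y_pred, "precision")
print(round(avg_tp_rate, 3), round(avg_precision, 3))  # 0.833 0.889
```

With this weighting the average TP rate equals the overall fraction of correctly classified instances (5 of 6 in the toy data).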

Classifier

Performance on common services (correct to 3 d.p.):

Label           TP rate   FP rate
web-server      0.957     0.001
ftp-server      0.956     0.001
ssh-daemon      0.960     0.000
telnet-daemon   0.976     0.000
busybox         0.996     0.000

Artificial instances

  • Developed new backdoor: embeddable remote control service
  • Inserted into common executables found within firmware:
    • mini_httpd
    • utelnetd
  • Successfully identified modified executables as anomalous even with varying optimisation levels applied and differing target architectures (ARM and MIPS)

Real-world examples

  • Overall, the majority of executables checked conformed to their functionality profile (quite a positive result)
  • The majority of those that didn’t were benign, but contained unexpected functionality:
    • e.g. a web-server with a built-in DNS resolver

Real-world examples

  • Identified previously undocumented functionality in some D-Link web-servers:
    • Allows for unauthenticated device reconfiguration with shell access to the device
  • Identified documented backdoor present in a number of instances of Tenda router firmware

Security Analysis

…or is it possible to evade BackScan?

Sometimes.

  • Statically linking binaries removes certain meta-data:
    • Can no longer identify imported library functions
  • Insertion of superfluous meta-data (e.g. strings) to fool the classifier:
    • Executable may be identified as some other class of executable
  • Both cases increase the final workload for the analyst, but the executable will still be detected by the tool as anomalous (at the cost of more false positives)

Complications

  • Functionality of some executables is very general (e.g. Busybox):
    • Impossible to derive a meaningful functionality profile
  • Misses environmental features that can influence configuration:
    • Listening port numbers for various server software stored within configuration files, etc.
  • Both are beyond the lightweight analysis techniques proposed

Future Work

  • Automatic derivation of functionality profiles based on a given set of known non-anomalous example executables for a given class
  • Construct more complex functionality profiles for more classes of executable
  • Rebuild the classifier using unsupervised learning
    • Possible M.Sc. project? ͡° ͜ʖ ͡°

Questions?