Firmware Backdoors

Sam Thomas » s.l.thomas@cs.bham.ac.uk
2nd year Ph.D. student

University of Birmingham

Outline

  • Introductory points;
  • Backdoors as a class of malware;
  • Problems related to their detection;
  • Current state-of-the-art in backdoor detection.

Embedded devices

  • Routers;
  • IP-cameras;
  • Monitoring systems;
  • …;
  • ECUs in a train’s CAN network.

What is firmware?

The software that provides basic functionality to a given device.

We want to know:

  • Whether we can trust the firmware when the device originates from the other side of the world;
  • How we can establish this trust.

What is a Backdoor?

“…a mechanism surreptitiously introduced into a computer system to facilitate unauthorised access to a system…”

– Zhang et al.

What is a Backdoor?

  • Hard-coded credentials;
  • Hidden, undocumented functionality (extra services, commands, etc.);
  • Cryptographic backdoors (e.g. Dual_EC_DRBG).

Backdoors in Firmware

  • Embedded device firmware security is a disaster:
    • Poor coding practices (laced with vulnerabilities from the 1990s);
    • Internet-facing “debug” interfaces.

Scale of the Problem

  • Consumer router;
  • In some configurations the backdoor is exploitable via WAN (and is therefore Internet-facing);
  • Backdoor facilitates complete control over device.

Scale of the Problem

TCP-32764

  • Backdoor found in routers from many manufacturers (Cisco, Linksys, Netgear, etc.);
  • A fair number of the affected firmware images are exploitable via WAN without any special configuration.

Scale of the Problem

  • Backdoor enables access to device configuration panel without authentication;
  • Magic string: xmlset_roodkcableoj28840ybtide (read backwards: editby04882joelbackdoor_teslmx);
  • Exploiting it is as simple as changing your browser’s user-agent string to this value.
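
The identifier in the list above contains the same 30 characters twice, once forwards and once backwards. The sketch below reverses the forward half and then models, in a deliberately simplified way, how a hard-coded user-agent comparison can short-circuit authentication. The function `is_authenticated` and its parameters are hypothetical illustrations, not code taken from any vendor firmware.

```python
# Forward half of the identifier shown on the slide.
MAGIC_USER_AGENT = "xmlset_roodkcableoj28840ybtide"


def is_authenticated(user_agent: str, has_valid_session: bool) -> bool:
    """Toy model of a flawed check (hypothetical, not real firmware code)."""
    if user_agent == MAGIC_USER_AGENT:
        # Hard-coded bypass: the session is never consulted.
        return True
    return has_valid_session


# Reading the value backwards reveals the author's note.
print(MAGIC_USER_AGENT[::-1])  # editby04882joelbackdoor_teslmx

print(is_authenticated("Mozilla/5.0", has_valid_session=False))     # False
print(is_authenticated(MAGIC_USER_AGENT, has_valid_session=False))  # True
```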

Scale of the Problem

In other words, a lot of devices we assume to be trustworthy are vulnerable by design.

  • We should be worried;
  • We can’t implicitly trust that a device we deploy does only what it is intended to do.

Detection of Malware in Firmware

Overview

  • Lots of devices, lots of firmware;
  • Heterogeneous in their architecture (ARM, MIPS, PPC, etc.);
  • Multiple firmware versions for each device;
  • In general, we can’t run the firmware;
  • Manual analysis takes significant time and a degree of expertise.

My Research

Goals

  • Would like to perform automated, large-scale, lightweight analysis of firmware;
  • But… predicting program behaviour is undecidable in the general case (by reduction from the halting problem);
  • Would like to check firmware for potentially anomalous behaviour (backdoors);
  • Focus on devices with largest market share (embedded Linux-based systems).

What is a classifier?

Given an input and previous knowledge, produce a label corresponding to the input as output.

Input in our case is a program.
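
As a rough illustration of that definition, the sketch below implements a nearest-centroid classifier: "previous knowledge" is a set of labelled feature vectors, and the output is the label whose average vector lies closest to the input. The three features (counts of socket, file and crypto calls) and all values are invented for illustration. Neither the features nor the algorithm are those used in the work presented here.

```python
from math import dist


def train(examples):
    """Average the feature vectors of each label; returns label -> centroid."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features: (socket calls, file calls, crypto calls).
training = [
    ((12, 30, 0), "web-server"),
    ((10, 25, 1), "web-server"),
    ((8, 10, 20), "ssh-daemon"),
    ((9, 12, 18), "ssh-daemon"),
]
centroids = train(training)
print(classify(centroids, (11, 28, 0)))  # web-server
print(classify(centroids, (9, 11, 21)))  # ssh-daemon
```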

Problems

  • Backdoors are hard to find – not many examples;
  • Modern “malware” analysis utilises classifiers derived from machine learning algorithms:
    • These require a sufficiently large data set of benign and malicious examples, which does not exist for backdoors;
    • How can they be applied to backdoor detection?

Proposed System

  • A classifier to infer the “type” of a given piece of software;
  • A domain-specific language to encode expected software behaviours as functionality profiles (that can be validated);
  • A best-effort attempt to establish how software identified as anomalous is actually used (best-effort because the analysis is static).
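
The slides do not show the domain-specific language itself, so the sketch below only illustrates the idea of a functionality profile. It assumes a profile is a set of behaviours a class of program is expected to exhibit, and that anything observed outside that set is worth reporting. The behaviour names and the `PROFILES` table are invented for illustration.

```python
# Hypothetical profiles: label -> behaviours expected for that kind of program.
PROFILES = {
    "web-server": {"listen_tcp", "read_file", "write_log"},
    "telnet-daemon": {"listen_tcp", "spawn_shell"},
}


def unexpected_behaviours(label, observed):
    """Return the observed behaviours that the profile for `label` does not allow."""
    return set(observed) - PROFILES[label]


# A web server that can spawn a shell does not match its profile.
print(unexpected_behaviours("web-server", {"listen_tcp", "read_file", "spawn_shell"}))
# The same behaviour is expected of a telnet daemon, so nothing is reported.
print(unexpected_behaviours("telnet-daemon", {"listen_tcp", "spawn_shell"}))
```

Tying the profile to the classifier's label is what lets the same behaviour count as normal in one program and anomalous in another.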

Prototype: BackScan

Collaborative work with Tom Chothia, Flavio D. Garcia & Christopher Green

Prototype Overview

  1. Obtain firmware from manufacturer;
  2. Unpack firmware (giving us programs);
  3. For each program:
    1. Classify program;
    2. Attempt to match the estimated functionality against the corresponding profile;
    3. If anomalous, attempt to establish how the program is used;
    4. Report anomalies.
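
The per-program loop in step 3 can be sketched as follows. This is a schematic outline only, not the BackScan implementation: it assumes steps 1 and 2 have already produced a mapping from program names to program data, and every helper passed in (`classify`, `observe`, `find_usage`) is a hypothetical stand-in for the corresponding analysis.

```python
def scan_programs(programs, classify, observe, profiles, find_usage):
    """Run steps 3.1 to 3.4 over programs already unpacked from a firmware image."""
    anomalies = []
    for name, program in programs.items():
        label = classify(program)                # 3.1 classify the program
        allowed = profiles.get(label, set())     # unknown label: nothing is expected
        unexpected = observe(program) - allowed  # 3.2 match against the profile
        if unexpected:
            anomalies.append({
                "program": name,
                "label": label,
                "unexpected": sorted(unexpected),
                "used_by": find_usage(name),     # 3.3 establish usage
            })
    return anomalies                             # 3.4 report anomalies


# Toy stand-ins: each "program" simply carries its own label and behaviours.
programs = {
    "httpd": {"kind": "web-server", "does": {"listen_tcp", "read_file", "spawn_shell"}},
    "telnetd": {"kind": "telnet-daemon", "does": {"listen_tcp", "spawn_shell"}},
}
profiles = {
    "web-server": {"listen_tcp", "read_file"},
    "telnet-daemon": {"listen_tcp", "spawn_shell"},
}
startup = {"httpd": ["/etc/init.d/rcS"]}  # invented example of where a program is launched

report = scan_programs(
    programs,
    classify=lambda p: p["kind"],
    observe=lambda p: p["does"],
    profiles=profiles,
    find_usage=lambda name: startup.get(name, []),
)
print(report)
```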

Results

Classifier

  • Performance (of chosen classifier):
    • On training data using 10-fold cross validation: ~96%
    • On separate test data (460 firmware images/~18,000 programs): ~95%
  • Weighted average performance over all classes (correct to 3 d.p.):
    • TP rate: 0.966
    • FP rate: 0.000
    • Precision: 0.956
    • Recall: 0.966
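
For readers unfamiliar with these figures, the sketch below shows one common way to compute them from a confusion matrix, assuming each class is weighted by its number of true instances. Recall is by definition the same quantity as the TP rate, which is why the two values above coincide. The confusion matrix here is a toy example, not data from the study.

```python
def per_class_rates(confusion, labels):
    """confusion[actual][predicted] holds counts.

    Assumes every class has at least one actual and one predicted instance.
    """
    total = sum(sum(row.values()) for row in confusion.values())
    rates = {}
    for c in labels:
        tp = confusion[c].get(c, 0)
        fn = sum(confusion[c].values()) - tp
        fp = sum(confusion[a].get(c, 0) for a in labels if a != c)
        tn = total - tp - fn - fp
        rates[c] = {
            "support": tp + fn,          # number of true instances of class c
            "tp_rate": tp / (tp + fn),   # identical to recall
            "fp_rate": fp / (fp + tn),
            "precision": tp / (tp + fp),
        }
    return rates


def weighted_average(rates, metric):
    """Average a per-class metric, weighting each class by its support."""
    total = sum(r["support"] for r in rates.values())
    return sum(r[metric] * r["support"] for r in rates.values()) / total


# Toy confusion matrix: one web server is mistaken for an FTP server.
labels = ["web-server", "ftp-server"]
confusion = {
    "web-server": {"web-server": 9, "ftp-server": 1},
    "ftp-server": {"web-server": 0, "ftp-server": 10},
}
rates = per_class_rates(confusion, labels)
for metric in ("tp_rate", "fp_rate", "precision"):
    print(metric, round(weighted_average(rates, metric), 3))
```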

Classifier

Performance on common services found within firmware (correct to 3 d.p.):

Label            TP rate   FP rate
web-server       0.957     0.001
ftp-server       0.956     0.001
ssh-daemon       0.960     0.000
telnet-daemon    0.976     0.000
busybox          0.996     0.000

Artificial instances

  • Developed new backdoor: embeddable remote control service;
  • Inserted into common executables found within firmware:
    • mini_httpd
    • utelnetd
  • Successfully identified modified executables as anomalous even with varying optimisation levels applied and differing target architectures (ARM and MIPS).

Real-world examples

  • Overall, the majority of programs checked conformed to their functionality profile (quite a positive result);
  • Most of those that didn’t were benign, but contained unexpected functionality:
    • Web-server with a built-in DNS resolver.

Real-world examples

  • Identified previously undocumented functionality in some D-Link web-servers:
    • Allows for unauthenticated device reconfiguration with shell access to the device.
  • Identified an already documented backdoor present in a number of Tenda router firmware images.

What do these results say?

  • Overall situation is not terrible;
  • But… any device (including those used within trains) has the potential to facilitate backdoor access;
  • Many of the vulnerabilities are trivial for attackers to exploit;
  • Compromised embedded devices can provide an attack vector to compromise part of a larger system.

Future Work

  • Automatic derivation of functionality profiles based on a given set of known non-anomalous example executables for a given class;
  • Construct more complex functionality profiles for more classes of executable;
  • Rebuild the classifier using semi-supervised learning.

Questions?