References
- Tutorial:
- Papers:
- resources:
- https://github.com/GCMiner/GCMiner
Introduction
- 语法分析:也称为词法分析,是将输入的源代码转换为标记(token)序列,并检查它们是否符合语言的语法规则的过程。它主要关注的是源代码的结构是否正确。
- 语义分析:是在语法分析的基础上对标记序列进行进一步的处理,确定它们的含义并进行语言上下文的解释。它主要关注的是源代码的含义是否正确
Zest 算法支持两种:
- 基于覆盖率进行突变
- 利用Junit的 Assume Api 来检验 输入的语法有效性
Requirements
- maven
- java 9+ (果然只有国内项目普遍使用java8)
Build JQF
git clone https://github.com/rohanpadhye/jqf
cd jqf
mvn package
Tutorial
基本是GitHub上的Tutorial Docs, 添加部分自己的理解。
0x01 Fuzzing-with-Zest
source code
- CalendarLogic.java
- 被测试对象
- CalendarGenerator.java
- 测试样板生成器(随机)
- CalendarTest.java
- 测试脚本
源代码,见 https://github.com/rohanpadhye/jqf/wiki/Fuzzing-with-Zest
具体的代码解释,也请见原文 和 官方 Docs
测试框架为 junit-quickcheck
https://pholser.github.io/junit-quickcheck/site/0.8.2/usage/other-types.html
大致源代码含义:
CalendarLogic.java
import java.util.GregorianCalendar;
import static java.util.GregorianCalendar.*;
/* Application logic */
public class CalendarLogic {
// Returns true iff cal is in a leap year
public static boolean isLeapYear(GregorianCalendar cal) {
int year = cal.get(YEAR);
if (year % 4 == 0) {
if (year % 100 == 0) {
return false;
}
return true;
}
return false;
}
// Returns either of -1, 0, 1 depending on whether c1 is <, =, > than c2
public static int compare(GregorianCalendar c1, GregorianCalendar c2) {
int cmp;
cmp = Integer.compare(c1.get(YEAR), c2.get(YEAR));
if (cmp == 0) {
cmp = Integer.compare(c1.get(MONTH), c2.get(MONTH));
if (cmp == 0) {
cmp = Integer.compare(c1.get(DAY_OF_MONTH), c2.get(DAY_OF_MONTH));
if (cmp == 0) {
cmp = Integer.compare(c1.get(HOUR), c2.get(HOUR));
if (cmp == 0) {
cmp = Integer.compare(c1.get(MINUTE), c2.get(MINUTE));
if (cmp == 0) {
cmp = Integer.compare(c1.get(SECOND), c2.get(SECOND));
if (cmp == 0) {
cmp = Integer.compare(c1.get(MILLISECOND), c2.get(MILLISECOND));
}
}
}
}
}
}
return cmp;
}
}
CalendarGenerator.java
import java.util.GregorianCalendar;
import java.util.TimeZone;
import com.pholser.junit.quickcheck.generator.GenerationStatus;
import com.pholser.junit.quickcheck.generator.Generator;
import com.pholser.junit.quickcheck.random.SourceOfRandomness;
import static java.util.GregorianCalendar.*;
public class CalendarGenerator extends Generator<GregorianCalendar> {
Cycles completed CalendarGenerator() {
super(GregorianCalendar.class); // Register the type of objects that we can create
}
// This method is invoked to generate a single test case
@Override
public GregorianCalendar generate(SourceOfRandomness random, GenerationStatus __ignore__) {
// Initialize a calendar object
GregorianCalendar cal = new GregorianCalendar();
cal.setLenient(true); // This allows invalid dates to silently wrap (e.g. Apr 31 ==> May 1).
// Randomly pick a day, month, and year
cal.set(DAY_OF_MONTH, random.nextInt(31) + 1); // a number between 1 and 31 inclusive
cal.set(MONTH, random.nextInt(12) + 1); // a number between 1 and 12 inclusive
cal.set(YEAR, random.nextInt(cal.getMinimum(YEAR), cal.getMaximum(YEAR)));
// Optionally also pick a time
if (random.nextBoolean()) {
cal.set(HOUR, random.nextInt(24));
cal.set(MINUTE, random.nextInt(60));
cal.set(SECOND, random.nextInt(60));
}
// Let's set a timezone
// First, get supported timezone IDs (e.g. "America/Los_Angeles")
String[] allTzIds = TimeZone.getAvailableIDs();
// Next, choose one randomly from the array
String tzId = random.choose(allTzIds);
TimeZone tz = TimeZone.getTimeZone(tzId);
// Assign it to the calendar
cal.setTimeZone(tz);
// Return the randomly generated calendar object
return cal;
}
}
CalendarTest.java
import java.util.*;
import static java.util.GregorianCalendar.*;
import static org.junit.Assert.*;
import static org.junit.Assume.*;
import org.junit.runner.RunWith;
import com.pholser.junit.quickcheck.*;
import com.pholser.junit.quickcheck.generator.*;
import edu.berkeley.cs.jqf.fuzz.*;
@RunWith(JQF.class)
public class CalendarTest {
@Fuzz
public void testLeapYear(@From(CalendarGenerator.class) GregorianCalendar cal) {
// Assume that the date is Feb 29
assumeTrue(cal.get(MONTH) == FEBRUARY);
assumeTrue(cal.get(DAY_OF_MONTH) == 29);
// Under this assumption, validate leap year rules
assertTrue(cal.get(YEAR) + " should be a leap year", CalendarLogic.isLeapYear(cal));
}
@Fuzz
public void testCompare(@Size(max=100) List<@From(CalendarGenerator.class) GregorianCalendar> cals) {
// Sort list of calendar objects using our custom comparator function
Collections.sort(cals, CalendarLogic::compare);
// If they have an ordering, then the sort should succeed
for (int i = 1; i < cals.size(); i++) {
Calendar c1 = cals.get(i-1);
Calendar c2 = cals.get(i);
(c1.equals(c2)); // Assume that we have distinct dates
assertTrue(c1 + " should be before " + c2, c1.before(c2)); // Then c1 < c2
}
}
}
编译
# 编译class文件
javac -cp .:$(../JQF/scripts/classpath.sh) CalendarLogic.java CalendarGenerator.java CalendarTest.java
# 执行测试,但产生以下报错:
# Assumption is too strong; too many inputs discarded
java -cp .:$(../JQF/scripts/classpath.sh) org.junit.runner.JUnitCore CalendarTest
上面会报错,作者卖了个关子,大致是说,CalendarGenerator.generate()
是一个标准的伪随机源,绝大多数数据很有可能不会触发到设置的testLeapYear
启用 Zest 算法
../JQF/bin/jqf-zest -c .:$(../JQF/scripts/classpath.sh) CalendarTest testLeapYear
# mvn 的方式
mvn jqf:fuzz -Dclass=CalendarTest -Dmethod=testLeapYear
很酷!太Auto 优雅了。
部分输出结果:
- Unique failure inputs:
- fuzz-results/failures
- Common successful inputs:
- fuzz-results/corpus
在 Ctrl + C 手动中止后,可以使用如下命令来查看和分析失败的样例。
../JQF/bin/jqf-repro -c .:$(../JQF/scripts/classpath.sh) CalendarTest testLeapYear fuzz-results/failures/id_000000
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
.id_000000 ::= FAILURE (java.lang.AssertionError)
E
Time: 0.062
There was 1 failure:
1) testLeapYear(CalendarTest)
java.lang.AssertionError: 28271600 should be a leap year
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.assertTrue(Assert.java:42)
at CalendarTest.testLeapYear(CalendarTest.java:21)
FAILURES!!!
Tests run: 1, Failures: 1
28271600 是闰年,可被 400 整除
jqf-zest -c .:$(${JQF_HOME}/scripts/classpath.sh) CalendarTest testCompare
# output:
# COOL!
Semantic Fuzzing with Zest
--------------------------
Test name: CalendarTest#testCompare
Instrumentation: Janala
Results directory: /home/fe1w0/SoftwareAnalysis/DynamicAnalysis/tutorial/fuzz-results
Elapsed time: 11s (no time limit)
Number of executions: 6,254 (no trial limit)
Valid inputs: 6,144 (98.24%)
Cycles completed: 1
Unique failures: 1
Queue size: 36 (5 favored last cycle)
Current parent input: 0 (favored) {564/700 mutations}
Execution speed: 776/sec now | 523/sec overall
Total coverage: 32 branches (0.05% of map)
Valid coverage: 32 branches (0.05% of map)
小节
JQF 设置属性的方式,有点类似从反射或者set的方式,对object中的属性进行设置,但example 给的变量有点不太auto,
JQF 的 driver 编写流程:
- 初始化阶段:
- 编写 生成器,需要符合一定规范,以供 test driver 使用。
public class CalendarGenerator extends Generator<GregorianCalendar>
public CalendarGenerator() { super(GregorianCalendar.class); // Register the type of objects that we can create }
@Override public GregorianCalendar generate(SourceOfRandomness random, GenerationStatus __ignore__) { ... ... ret cal // Return the randomly generated calendar object }
- 通过 反射或者set的方式,对需要设置的field进行随机赋值,如
cal.setLenient(true);
、cal.set(MONTH, random.nextInt(12) + 1);
等等
- 编写 生成器,需要符合一定规范,以供 test driver 使用。
- 测试阶段,编写 test driver :
- 基本要求,类注解
@RunWith(JQF.class)
- 测试函数注解,
@Fuzz
- 测试函数参数注解
public void testLeapYear(@From(CalendarGenerator.class) GregorianCalendar cal)
public void testCompare(@Size(max=100) List<@From(CalendarGenerator.class) GregorianCalendar> cals)
- 断言与函数测试脚本编写
- 基本要求,类注解
0x02 Writing a JQF test using MVN
Setup JQF mvn envs
参考Fuzzing a Compiler · rohanpadhye/JQF Wiki (github.com), 一下为 利用mvn 进行JQF的环境部署和fuzzing
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>examples</groupId>
<artifactId>zest-tutorial</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencies>
<!-- JQF: test dependency for @Fuzz annotation -->
<dependency>
<groupId>edu.berkeley.cs.jqf</groupId>
<artifactId>jqf-fuzz</artifactId>
<!-- confirm the latest version at: https://mvnrepository.com/artifact/edu.berkeley.cs.jqf -->
<version>1.8</version>
<scope>test</scope>
</dependency>
<!-- JUnit-QuickCheck: API to write generators -->
<dependency>
<groupId>com.pholser</groupId>
<artifactId>junit-quickcheck-generators</artifactId>
<version>1.0</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<!-- The JQF plugin, for invoking the command `mvn jqf:fuzz` -->
<plugin>
<groupId>edu.berkeley.cs.jqf</groupId>
<artifactId>jqf-maven-plugin</artifactId>
<!-- confirm the latest version at: https://mvnrepository.com/artifact/edu.berkeley.cs.jqf -->
<version>1.8</version>
</plugin>
</plugins>
</build>
</project>
0x03 The Guidance interface
原文: The Guidance interface · rohanpadhye/JQF Wiki (github.com)
One of JQF’s main features is the ability to extend the feedback-directed input generation technique, via the Guidance interface. Fuzzing with AFL is just one of the many ways that this interface can be used. You can implement this interface to provide your own input-generation technique and run JQF using the static method GuidedFuzzing.run()
.
其中最重要的是,如果要执行JQF,可以使用
GuidedFuzzing.run()
要想使用 自定义的 Guidance Class,首先需要再次修改项目的框架,需要从头设置新的Main函数起始处。
参考与推荐: https://github.com/rohanpadhye/JQF/issues/102
本节的知识点也受该issue启发
对于 The Guidance interface
来说,最好的方式是重新编写一个 Driver,类似于 ZestDriver fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/ei/ZestDriver.java
或者 PerfFuzzDriverfuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/afl/PerfFuzzDriver.java
。
In a nutshell, the Guidance interface specifies the following methods:
public interface Guidance {
boolean hasInput();
InputStream getInput();
void handleResult(Result result, Throwable error);
Consumer<TraceEvent> generateCallBack(Thread thread);
}
The first three methods are invoked once per iteration of the main fuzzing. In pseudo-code, JQF runs the following fuzzing loop:
while (guidance.hasInput()) {
// Create a pseudo-random-number generator from the input file
Random rng = new StreamBackedRandom(guidance.getInput());
// Generate the args to the test function using QuickCheck and custom RNG
Object[] args = generateInput(rng);
try {
// Execute the test with generated inputs (i.e. args)
runTest(args); // this generates many trace events
guidance.handleResult(SUCCESS, null);
} catch (AssumptionViolatedException e) {
guidance.handleResult(INVALID, e);
} catch (Throwable t) {
if (isExpected(e)) {
guidance.handleResult(SUCCESS, e);
} else {
guidance.handleResult(FAILURE, e);
}
}
}
Override 函数
boolean hasInput();
函数作用: hasInput 用于检验是否还有 input 用于下一个 run.
InputStream getInput();
函数作用: getInput 函数用于得到下一个 input。
Consumer<TraceEvent> generateCallBack(Thread thread);
函数作用: 对于给定的 thread ,输出 一个 Consumer 函数,该匿名函数需要以
TraceEvent
作为 函数参数。
⚠️ 需要注意: generateCallBack 在 JQF中的 执行顺序上优于 handleResult,处于在 run 过程中。
具体原因在于,generateCallBack 的 处理流程:
1.edu/berkeley/cs/jqf/fuzz/junit/GuidedFuzzing.java#run
中 将guidance::generateCallBack
设置为 edu/berkeley/cs/jqf/instrument/tracing/SingleSnoop.java
中的 callbackGenerator
SingleSnoop.setCallbackGenerator(guidance::generateCallBack);
2.edu/berkeley/cs/jqf/instrument/tracing/SingleSnoop.java
中含有调用 generateCallBack,代码如下:
/** A supplier of callbacks for each thread (does nothing by default). */
static Function<Thread
, Consumer<TraceEvent>> callbackGenerator = (t) -> (e) -> {};
void handleResult(Result result, Throwable error);
函数作用: 在一次 trial 结束后,被调用,一般用于处于 执行状态,
success
,failure
,避免更多的assumption failures,探索更多的 Exceptions.
⚠️需要注意: 在JQF中,run 只有一次,但 trial 存在多次,且有run调用;trial 的数量 可以设置,也会影响到 hasInput 的判断。
Code learning
执行流程
// edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java
插桩思路
@Override
public Class<?> findClass(String name) throws ClassNotFoundException {
byte[] originalBytecode;
// Try to read the class file in as a resource
String internalName = name.replace('.', '/');
String path = internalName.concat(".class");
try (InputStream in = super.getResourceAsStream(path)) {
if (in == null) {
throw new ClassNotFoundException("Cannot find class " + name);
}
originalBytecode = in.readAllBytes();
} catch (IOException e) {
throw new ClassNotFoundException("I/O exception while loading class.", e);
}
assert (originalBytecode != null);
byte[] bytesToLoad;
try {
byte[] instrumented = transformer.transform(this, internalName, null, null, originalBytecode);
if (instrumented != null) {
bytesToLoad = instrumented;
} else {
bytesToLoad = originalBytecode;
}
} catch (IllegalClassFormatException e) {
bytesToLoad = originalBytecode;
}
return defineClass(name, bytesToLoad, 0, bytesToLoad.length);
}
继续跟进 transformer.transform
到 janala.instrument.SnoopInstructionTransformer
Coverage
Event Class
设计思路,类似 LLVM 的思路,可以Trace Event,调用不同的 Listener
package edu.berkeley.cs.jqf.instrument.tracing.events;
import janala.logger.inst.MemberRef;
/**
* An interface representing by a trace event such as CALL, RETURN or BRANCH.
*
*
* @author Rohan Padhye
*/
public abstract class TraceEvent {
protected final int iid;
protected final MemberRef containingMethod;
protected final int lineNumber;
public TraceEvent(int iid, MemberRef method, int lineNumber) {
this.iid = iid;
this.containingMethod = method;
this.lineNumber = lineNumber;
}
public int getIid() {
return iid;
}
public String getFileName() {
if (containingMethod == null) {
return "<unknown>";
}
String owner = containingMethod.getOwner();
int idxOfDollar = owner.indexOf('$');
if (idxOfDollar >= 0) {
return owner.substring(0, idxOfDollar) + ".java";
} else {
return owner + ".java";
}
}
public int getLineNumber() {
return lineNumber;
}
public String getContainingClass() {
if (containingMethod == null) {
return "";
} else {
return containingMethod.getOwner();
}
}
public String getContainingMethodName() {
if (containingMethod == null) {
return "<unknown>";
} else {
return containingMethod.getName();
}
}
public String getContainingMethodDesc() {
if (containingMethod == null) {
return "(?)";
} else {
return containingMethod.getDesc();
}
}
public abstract void applyVisitor(TraceEventVisitor v);
}
Counter Class
Counter Class 中存在 counts 存储 values。
/** The size of the counter map. */
protected final int size;
/** The counter map as an array of integers. */
protected final int[] counts;
其中对 counts赋值或者产生影响的的函数如下:
protected int incrementAtIndex(int index, int delta) {
// return (this.counts[index] = this.counts[index] + delta)
return (this.counts[index] += delta);
}
public int increment(int key, int delta) {
return incrementAtIndex(idx(key), delta);
}
public int increment(int key) {
return incrementAtIndex(idx(key), 1);
}
private int idx(int key) {
return Hashing.hash(key, size);
}
public int increment1(int k1, int k2) {
return incrementAtIndex(idx1(k1, k2), 1);
}
private int idx1(int k1, int k2) {
return Hashing.hash1(k1, k2, size);
}
public void setAtIndex(int idx, int value) {
this.counts[idx] = value;
}
Coverage Class
需要检查子类的问题,ZestGuidance
中存在 Coverage 类
/**
* Updates this coverage with bits from the parameter.
*
* @param that the run coverage whose bits to OR
*
* @return <code>true</code> iff <code>that</code> is not a subset
* of <code>this</code>, causing <code>this</code> to change.
*/
@Override
public boolean updateBits(ICoverage that) {
boolean changed = false;
if (that.getCounter().hasNonZeros()) {
for (int idx = 0; idx < COVERAGE_MAP_SIZE; idx++) {
int before = this.counter.getAtIndex(idx);
int after = before | hob(that.getCounter().getAtIndex(idx));
if (after != before) {
this.counter.setAtIndex(idx, after);
changed = true;
}
}
}
return changed;
}
SampleGeometric
package xyz.fe1w0.java.other.misc;
import java.util.Random;
import static java.lang.Math.ceil;
import static java.lang.Math.log;
public class SampleGeometric {
public static int sampleGeometric(Random random, double mean) {
double p = 1 / mean;
double uniform = random.nextDouble();
return (int) ceil(log(1 - uniform) / log(1 - p));
}
public static void main(String[] args) {
Random random = new Random();
System.out.println(sampleGeometric(random, 4.0));
}
}
sampleGeometric 的大致分布:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def sample_geometric(uniform, mean):
p = 1 / mean
return int(np.ceil(np.log(1 - uniform) / np.log(1 - p)))
# 定义变量范围
uniform_values = np.linspace(0.01, 0.99, 100) # uniform 在0.01到0.99之间取100个点
mean_values = np.linspace(1, 100, 100) # 均值在1到10之间取100个点
# 计算对应的几何分布样本
samples = np.zeros((len(uniform_values), len(mean_values)))
for i, uniform in enumerate(uniform_values):
for j, mean in enumerate(mean_values):
samples[i, j] = sample_geometric(uniform, mean)
# 创建三维图像
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# 绘制三维图像
X, Y = np.meshgrid(mean_values, uniform_values)
ax.plot_surface(X, Y, samples.T, cmap='viridis')
# 设置坐标轴标签和标题
ax.set_xlabel('Mean')
ax.set_ylabel('Uniform')
ax.set_zlabel('Sample')
ax.set_title('Geometric Distribution Sample')
plt.show()
不太准确,对于 Uniform -> 1, 则 Sample -> +∞;对于 Uniform -> 0, 则 Sample -> 0.
Mean 越大,Uniform 与 Sample的 曲线,趋向直接 -> +∞。
FuzzChains(待续)
FuzzChains 是基于JQF的Java模糊测试进行的改进工作,将其改为类似AFL-GO的定向模糊测试工具。FuzzChains 有三个功能,1) 可以根据输入模版构建输入对象;2)将原先的Zest 基于 Total Coverage的覆盖率计算,改为基于目标调用图的覆盖率计算;3)预处理阶段时,对目标危险函数进行静态插桩,以作为Fuzzing的有效性判断。
FuzzChains – Overview
预处理与模糊测试
对象生成算法
运行截图
Great article! I really appreciate the clear and detailed insights you’ve provided on this topic. It’s always refreshing to read content that breaks things down so well, making it easy for readers to grasp even complex ideas. I also found the practical tips you’ve shared to be very helpful. Looking forward to more informative posts like this! Keep up the good work!