References

Tutorial:
Papers:
- JQF:Java 中基于覆盖率引导的属性测试 – 每日头条 (kknews.cc)
- 原 JQF的Readme有介绍
  - [1812.00078] Semantic Fuzzing with Zest (arxiv.org)
resources:
- https://github.com/GCMiner/GCMiner

Introduction

语法分析：也称为词法分析，是将输入的源代码转换为标记（token）序列，并检查它们是否符合语言的语法规则的过程。它主要关注的是源代码的结构是否正确。
语义分析：是在语法分析的基础上对标记序列进行进一步的处理，确定它们的含义并进行语言上下文的解释。它主要关注的是源代码的含义是否正确

Zest 算法支持两种:

基于覆盖率进行突变
利用Junit的 Assume Api 来检验输入的语法有效性

Requirements

maven
java 9+ (是的，对于8版本的组件，可以还需要自己用高版本重新编译)

Build JQF

git clone https://github.com/rohanpadhye/jqf
cd jqf
mvn package

Tutorial

基本是GitHub上的Tutorial Docs, 添加部分自己的理解。

0x01 Fuzzing-with-Zest

source code

CalendarLogic.java
- 被测试对象
CalendarGenerator.java
- 测试样板生成器（随机）
CalendarTest.java
- 测试脚本

源代码，见 https://github.com/rohanpadhye/jqf/wiki/Fuzzing-with-Zest
具体的代码解释，也请见原文和官方 Docs
测试框架为 junit-quickcheck
https://pholser.github.io/junit-quickcheck/site/0.8.2/usage/other-types.html
大致源代码含义:

CalendarLogic.java

import java.util.GregorianCalendar;
import static java.util.GregorianCalendar.*;

/* Application logic */
public class CalendarLogic {
    // Returns true iff cal is in a leap year
    public static boolean isLeapYear(GregorianCalendar cal) {
        int year = cal.get(YEAR);
        if (year % 4 == 0) {
            if (year % 100 == 0) {
                return false;
            }
            return true;
        }
        return false;
    }

    // Returns either of -1, 0, 1 depending on whether c1 is <, =, > than c2
    public static int compare(GregorianCalendar c1, GregorianCalendar c2) {
        int cmp;
        cmp = Integer.compare(c1.get(YEAR), c2.get(YEAR));
        if (cmp == 0) {
            cmp = Integer.compare(c1.get(MONTH), c2.get(MONTH));
            if (cmp == 0) {
                cmp = Integer.compare(c1.get(DAY_OF_MONTH), c2.get(DAY_OF_MONTH));
                if (cmp == 0) {
                    cmp = Integer.compare(c1.get(HOUR), c2.get(HOUR));
                    if (cmp == 0) {
                        cmp = Integer.compare(c1.get(MINUTE), c2.get(MINUTE));
                        if (cmp == 0) {
                            cmp = Integer.compare(c1.get(SECOND), c2.get(SECOND));
                            if (cmp == 0) {
                                cmp = Integer.compare(c1.get(MILLISECOND), c2.get(MILLISECOND));
                            }
                        }
                    }
                }
            }
        }
        return cmp;
    }
}

CalendarGenerator.java

import java.util.GregorianCalendar;
import java.util.TimeZone;

import com.pholser.junit.quickcheck.generator.GenerationStatus;
import com.pholser.junit.quickcheck.generator.Generator;
import com.pholser.junit.quickcheck.random.SourceOfRandomness;

import static java.util.GregorianCalendar.*;

public class CalendarGenerator extends Generator<GregorianCalendar> {

    Cycles completed CalendarGenerator() {
        super(GregorianCalendar.class); // Register the type of objects that we can create
    }

    // This method is invoked to generate a single test case
    @Override
    public GregorianCalendar generate(SourceOfRandomness random, GenerationStatus __ignore__) {
        // Initialize a calendar object
        GregorianCalendar cal = new GregorianCalendar();
        cal.setLenient(true); // This allows invalid dates to silently wrap (e.g. Apr 31 ==> May 1).

        // Randomly pick a day, month, and year
        cal.set(DAY_OF_MONTH, random.nextInt(31) + 1); // a number between 1 and 31 inclusive
        cal.set(MONTH, random.nextInt(12) + 1); // a number between 1 and 12 inclusive
        cal.set(YEAR, random.nextInt(cal.getMinimum(YEAR), cal.getMaximum(YEAR)));

        // Optionally also pick a time
        if (random.nextBoolean()) {
            cal.set(HOUR, random.nextInt(24));
            cal.set(MINUTE, random.nextInt(60));
            cal.set(SECOND, random.nextInt(60));
        }

        // Let's set a timezone
        // First, get supported timezone IDs (e.g. "America/Los_Angeles")
        String[] allTzIds = TimeZone.getAvailableIDs();

        // Next, choose one randomly from the array
        String tzId = random.choose(allTzIds);
        TimeZone tz = TimeZone.getTimeZone(tzId);

        // Assign it to the calendar
        cal.setTimeZone(tz);

	// Return the randomly generated calendar object
        return cal;
    }
}

CalendarTest.java

import java.util.*;
import static java.util.GregorianCalendar.*;
import static org.junit.Assert.*;
import static org.junit.Assume.*;

import org.junit.runner.RunWith;
import com.pholser.junit.quickcheck.*;
import com.pholser.junit.quickcheck.generator.*;
import edu.berkeley.cs.jqf.fuzz.*;

@RunWith(JQF.class)
public class CalendarTest {

    @Fuzz
    public void testLeapYear(@From(CalendarGenerator.class) GregorianCalendar cal) {
        // Assume that the date is Feb 29
        assumeTrue(cal.get(MONTH) == FEBRUARY);
        assumeTrue(cal.get(DAY_OF_MONTH) == 29);
		
        // Under this assumption, validate leap year rules
        assertTrue(cal.get(YEAR) + " should be a leap year", CalendarLogic.isLeapYear(cal));
    }

    @Fuzz
    public void testCompare(@Size(max=100) List<@From(CalendarGenerator.class) GregorianCalendar> cals) {
        // Sort list of calendar objects using our custom comparator function
        Collections.sort(cals, CalendarLogic::compare);

        // If they have an ordering, then the sort should succeed
        for (int i = 1; i < cals.size(); i++) {
            Calendar c1 = cals.get(i-1);
            Calendar c2 = cals.get(i);
            
(c1.equals(c2)); // Assume that we have distinct dates
            assertTrue(c1 + " should be before " + c2, c1.before(c2));  // Then c1 < c2
        }
    }
}

编译

# 编译class文件
javac -cp .:$(../JQF/scripts/classpath.sh) CalendarLogic.java CalendarGenerator.java CalendarTest.java

# 执行测试，但产生以下报错:
# Assumption is too strong; too many inputs discarded
java -cp .:$(../JQF/scripts/classpath.sh) org.junit.runner.JUnitCore CalendarTest

上面会报错，作者卖了个关子，大致是说，CalendarGenerator.generate() 是一个标准的伪随机源，绝大多数数据很有可能不会触发到设置的testLeapYear

启用 Zest 算法

../JQF/bin/jqf-zest -c .:$(../JQF/scripts/classpath.sh) CalendarTest testLeapYear

# mvn 的方式
mvn jqf:fuzz -Dclass=CalendarTest -Dmethod=testLeapYear

很酷！太Auto 优雅了。

部分输出结果:

Unique failure inputs:
- fuzz-results/failures
Common successful inputs:
- fuzz-results/corpus

在 Ctrl + C 手动中止后，可以使用如下命令来查看和分析失败的样例。

../JQF/bin/jqf-repro -c .:$(../JQF/scripts/classpath.sh) CalendarTest testLeapYear fuzz-results/failures/id_000000
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
.id_000000 ::= FAILURE (java.lang.AssertionError)
E
Time: 0.062
There was 1 failure:
1) testLeapYear(CalendarTest)
java.lang.AssertionError: 28271600 should be a leap year
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at CalendarTest.testLeapYear(CalendarTest.java:21)

FAILURES!!!
Tests run: 1,  Failures: 1

28271600 是闰年，可被 400 整除

jqf-zest -c .:$(${JQF_HOME}/scripts/classpath.sh) CalendarTest testCompare

# output:
# COOL!
Semantic Fuzzing with Zest
--------------------------

Test name:            CalendarTest#testCompare
Instrumentation:      Janala
Results directory:    /home/fe1w0/SoftwareAnalysis/DynamicAnalysis/tutorial/fuzz-results
Elapsed time:         11s (no time limit)
Number of executions: 6,254 (no trial limit)
Valid inputs:         6,144 (98.24%)
Cycles completed:     1
Unique failures:      1
Queue size:           36 (5 favored last cycle)
Current parent input: 0 (favored) {564/700 mutations}
Execution speed:      776/sec now | 523/sec overall
Total coverage:       32 branches (0.05% of map)
Valid coverage:       32 branches (0.05% of map)

小节
JQF 设置属性的方式，有点类似从反射或者set的方式，对object中的属性进行设置，但example 给的变量有点不太auto，

JQF 的 driver 编写流程:

初始化阶段:
1. 编写生成器，需要符合一定规范，以供 test driver 使用。
  1. public class CalendarGenerator extends Generator<GregorianCalendar>
  2. public CalendarGenerator() { super(GregorianCalendar.class); // Register the type of objects that we can create }
  3. @Override public GregorianCalendar generate(SourceOfRandomness random, GenerationStatus __ignore__) { ... ... ret cal // Return the randomly generated calendar object }
2. 通过反射或者set的方式，对需要设置的field进行随机赋值，如cal.setLenient(true); 、cal.set(MONTH, random.nextInt(12) + 1);等等
测试阶段，编写 test driver :
1. 基本要求，类注解 @RunWith(JQF.class)
2. 测试函数注解，@Fuzz
3. 测试函数参数注解
  1. public void testLeapYear(@From(CalendarGenerator.class) GregorianCalendar cal)
  2. public void testCompare(@Size(max=100) List<@From(CalendarGenerator.class) GregorianCalendar> cals)
4. 断言与函数测试脚本编写

0x02 Writing a JQF test using MVN

Setup JQF mvn envs

参考Fuzzing a Compiler · rohanpadhye/JQF Wiki (github.com)，一下为利用mvn 进行JQF的环境部署和fuzzing

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>examples</groupId>
    <artifactId>zest-tutorial</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    </properties>

    <dependencies>
        <!-- JQF: test dependency for @Fuzz annotation -->
        <dependency>
            <groupId>edu.berkeley.cs.jqf</groupId>
            <artifactId>jqf-fuzz</artifactId>
            <!-- confirm the latest version at: https://mvnrepository.com/artifact/edu.berkeley.cs.jqf -->
            <version>1.8</version> 
            <scope>test</scope>
        </dependency>

        <!-- JUnit-QuickCheck: API to write generators -->
        <dependency>
            <groupId>com.pholser</groupId>
            <artifactId>junit-quickcheck-generators</artifactId>
            <version>1.0</version>
            <scope>test</scope>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <!-- The JQF plugin, for invoking the command `mvn jqf:fuzz` -->
            <plugin>
                <groupId>edu.berkeley.cs.jqf</groupId>
                <artifactId>jqf-maven-plugin</artifactId>
                <!-- confirm the latest version at: https://mvnrepository.com/artifact/edu.berkeley.cs.jqf -->
                <version>1.8</version>
            </plugin>
        </plugins>
    </build>
</project>

0x03 The Guidance interface

原文: The Guidance interface · rohanpadhye/JQF Wiki (github.com)

One of JQF’s main features is the ability to extend the feedback-directed input generation technique, via the Guidance interface. Fuzzing with AFL is just one of the many ways that this interface can be used. You can implement this interface to provide your own input-generation technique and run JQF using the static method GuidedFuzzing.run().

其中最重要的是，如果要执行JQF，可以使用 GuidedFuzzing.run()

要想使用自定义的 Guidance Class，首先需要再次修改项目的框架，需要从头设置新的Main函数起始处。

参考与推荐: https://github.com/rohanpadhye/JQF/issues/102
本节的知识点也受该issue启发

对于 The Guidance interface来说，最好的方式是重新编写一个 Driver，类似于 ZestDriver fuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/ei/ZestDriver.java 或者 PerfFuzzDriverfuzz/src/main/java/edu/berkeley/cs/jqf/fuzz/afl/PerfFuzzDriver.java。

In a nutshell, the Guidance interface specifies the following methods:

public interface Guidance {
    boolean hasInput();
    InputStream getInput();
    void handleResult(Result result, Throwable error);
    Consumer<TraceEvent> generateCallBack(Thread thread);
}

The first three methods are invoked once per iteration of the main fuzzing. In pseudo-code, JQF runs the following fuzzing loop:

while (guidance.hasInput()) {
    // Create a pseudo-random-number generator from the input file
    Random rng = new StreamBackedRandom(guidance.getInput());
    // Generate the args to the test function using QuickCheck and custom RNG
    Object[] args = generateInput(rng);
    try {
        // Execute the test with generated inputs (i.e. args)
        runTest(args); // this generates many trace events
        guidance.handleResult(SUCCESS, null);
    } catch (AssumptionViolatedException e) {
        guidance.handleResult(INVALID, e);
    } catch (Throwable t) {
        if (isExpected(e)) {
            guidance.handleResult(SUCCESS, e);
        } else {
            guidance.handleResult(FAILURE, e);
        }
    }
}

Override 函数

boolean hasInput();

函数作用: hasInput 用于检验是否还有 input 用于下一个 run.

InputStream getInput();

函数作用: getInput 函数用于得到下一个 input。

`Consumer<TraceEvent> generateCallBack(Thread thread);`

函数作用: 对于给定的 thread ，输出一个 Consumer 函数，该匿名函数需要以 TraceEvent 作为函数参数。
⚠️ 需要注意: generateCallBack 在 JQF中的执行顺序上优于 handleResult，处于在 run 过程中。

具体原因在于，generateCallBack 的处理流程：

1.edu/berkeley/cs/jqf/fuzz/junit/GuidedFuzzing.java#run 中将guidance::generateCallBack 设置为 edu/berkeley/cs/jqf/instrument/tracing/SingleSnoop.java 中的 callbackGenerator

SingleSnoop.setCallbackGenerator(guidance::generateCallBack);

2.edu/berkeley/cs/jqf/instrument/tracing/SingleSnoop.java 中含有调用 generateCallBack，代码如下:

    /** A supplier of callbacks for each thread (does nothing by default). */
    static Function<Thread
    , Consumer<TraceEvent>> callbackGenerator = (t) -> (e) -> {};

void handleResult(Result result, Throwable error);

函数作用: 在一次 trial 结束后，被调用，一般用于处于执行状态，success, failure，避免更多的assumption failures，探索更多的 Exceptions.
⚠️需要注意: 在JQF中，run 只有一次，但 trial 存在多次，且有run调用；trial 的数量可以设置，也会影响到 hasInput 的判断。

Code learning

执行流程

// edu/berkeley/cs/jqf/fuzz/junit/quickcheck/FuzzStatement.java

插桩思路


    @Override
    public Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] originalBytecode;

        // Try to read the class file in as a resource
        String internalName = name.replace('.', '/');
        String path = internalName.concat(".class");
        try (InputStream in = super.getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException("Cannot find class " + name);
            }
            originalBytecode = in.readAllBytes();
        } catch (IOException e) {
            throw new ClassNotFoundException("I/O exception while loading class.", e);
        }

        assert (originalBytecode != null);

        byte[] bytesToLoad;
        try {
            byte[] instrumented = transformer.transform(this, internalName, null, null, originalBytecode);
            if (instrumented != null) {
                bytesToLoad = instrumented;
            } else {
                bytesToLoad = originalBytecode;
            }
        } catch (IllegalClassFormatException e) {
            bytesToLoad = originalBytecode;
        }
        return defineClass(name, bytesToLoad, 0, bytesToLoad.length);
    }

继续跟进 transformer.transform 到 janala.instrument.SnoopInstructionTransformer

Coverage

Event Class

设计思路，类似 LLVM 的思路，可以Trace Event，调用不同的 Listener

package edu.berkeley.cs.jqf.instrument.tracing.events;

import janala.logger.inst.MemberRef;

/**
 * An interface representing by a trace event such as CALL, RETURN or BRANCH.
 *
 *
 * @author Rohan Padhye
 */
public abstract class TraceEvent {

    protected final int iid;
    protected final MemberRef containingMethod;
    protected final int lineNumber;

    public TraceEvent(int iid, MemberRef method, int lineNumber) {
        this.iid = iid;
        this.containingMethod = method;
        this.lineNumber = lineNumber;
    }

    public int getIid() {
        return iid;
    }

    public String getFileName() {
        if (containingMethod == null) {
            return "<unknown>";
        }
        String owner = containingMethod.getOwner();
        int idxOfDollar = owner.indexOf('$');
        if (idxOfDollar >= 0) {
            return owner.substring(0, idxOfDollar) + ".java";
        } else {
            return owner + ".java";
        }
    }

    public int getLineNumber() {
        return lineNumber;
    }

    public String getContainingClass() {
        if (containingMethod == null) {
            return "";
        } else {
            return containingMethod.getOwner();
        }
    }

    public String getContainingMethodName() {
        if (containingMethod == null) {
            return "<unknown>";
        } else {
            return containingMethod.getName();
        }
    }

    public String getContainingMethodDesc() {
        if (containingMethod == null) {
            return "(?)";
        } else {
            return containingMethod.getDesc();
        }
    }

    public abstract void applyVisitor(TraceEventVisitor v);
}

Counter Class

Counter Class 中存在 counts 存储 values。

    /** The size of the counter map. */
    protected final int size;

    /** The counter map as an array of integers. */
    protected final int[] counts;

其中对 counts赋值或者产生影响的的函数如下:

protected int incrementAtIndex(int index, int delta) {  
    // return (this.counts[index] = this.counts[index] + delta)
    return (this.counts[index] += delta);  
}

public int increment(int key, int delta) {  
    return incrementAtIndex(idx(key), delta);  
}

public int increment(int key) {  
    return incrementAtIndex(idx(key), 1);  
}

private int idx(int key) {  
    return Hashing.hash(key, size);  
}

public int increment1(int k1, int k2) {  
    return incrementAtIndex(idx1(k1, k2), 1);  
}

private int idx1(int k1, int k2) {  
    return Hashing.hash1(k1, k2, size);  
}


public void setAtIndex(int idx, int value) {  
    this.counts[idx] = value;  
}

Coverage Class

需要检查子类的问题，ZestGuidance 中存在 Coverage 类

    /**
     * Updates this coverage with bits from the parameter.
     *
     * @param that the run coverage whose bits to OR
     *
     * @return <code>true</code> iff <code>that</code> is not a subset
     *         of <code>this</code>, causing <code>this</code> to change.
     */
    @Override
    public boolean updateBits(ICoverage that) {
        boolean changed = false;
        if (that.getCounter().hasNonZeros()) {
            for (int idx = 0; idx < COVERAGE_MAP_SIZE; idx++) {
                int before = this.counter.getAtIndex(idx);
                int after = before | hob(that.getCounter().getAtIndex(idx));
                if (after != before) {
                    this.counter.setAtIndex(idx, after);
                    changed = true;
                }
            }
        }
        return changed;
    }

SampleGeometric

package xyz.fe1w0.java.other.misc;  
  
import java.util.Random;  
import static java.lang.Math.ceil;  
import static java.lang.Math.log;  
  
public class SampleGeometric {  
    public static int sampleGeometric(Random random, double mean) {  
        double p = 1 / mean;  
        double uniform = random.nextDouble();  
        return (int) ceil(log(1 - uniform) / log(1 - p));  
    }  
  
    public static void main(String[] args) {  
  
        Random random = new Random();  
        System.out.println(sampleGeometric(random, 4.0));  
  
    }  
}

sampleGeometric 的大致分布：

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

def sample_geometric(uniform, mean):
    p = 1 / mean
    return int(np.ceil(np.log(1 - uniform) / np.log(1 - p)))

# 定义变量范围
uniform_values = np.linspace(0.01, 0.99, 100)  # uniform 在0.01到0.99之间取100个点
mean_values = np.linspace(1, 100, 100)  # 均值在1到10之间取100个点

# 计算对应的几何分布样本
samples = np.zeros((len(uniform_values), len(mean_values)))
for i, uniform in enumerate(uniform_values):
    for j, mean in enumerate(mean_values):
        samples[i, j] = sample_geometric(uniform, mean)

# 创建三维图像
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# 绘制三维图像
X, Y = np.meshgrid(mean_values, uniform_values)
ax.plot_surface(X, Y, samples.T, cmap='viridis')

# 设置坐标轴标签和标题
ax.set_xlabel('Mean')
ax.set_ylabel('Uniform')
ax.set_zlabel('Sample')
ax.set_title('Geometric Distribution Sample')

plt.show()

不太准确，对于 Uniform -> 1, 则 Sample -> +∞；对于 Uniform -> 0, 则 Sample -> 0.
Mean 越大，Uniform 与 Sample的曲线，趋向直接 -> +∞。

FuzzChains（待续）

项目链接:
fe1w0/FuzzChains (github.com)

FuzzChains 是基于JQF的Java模糊测试进行的改进工作，将其改为类似AFL-GO的定向模糊测试工具。FuzzChains 有三个功能，1）可以根据输入模版构建输入对象；2）将原先的Zest 基于 Total Coverage的覆盖率计算，改为基于目标调用图的覆盖率计算；3）预处理阶段时，对目标危险函数进行静态插桩，以作为Fuzzing的有效性判断。

FuzzChains – Overview

预处理与模糊测试

对象生成算法

运行截图