Prerequisites
- python3 and docker
- time and patience
This page is a work in progress
This example is for PDFBox. After Tika's harnesses have been contributed to the oss-fuzz project, we'll update this page.
Steps
- grab the repo:
  git clone https://github.com/google/oss-fuzz && cd oss-fuzz
- build the image:
  python3 infra/helper.py build_image pdfbox
- build the project and its fuzzers:
  - from the main branch of pdfbox's github repo:
    python3 infra/helper.py build_fuzzers pdfbox
  - from a local repo:
    python3 infra/helper.py build_fuzzers pdfbox /home/Intellij/my-pdfbox
- run a fuzzer:
  python3 infra/helper.py run_fuzzer pdfbox PDFExtractTextFuzzer
- reproduce a problem:
  python3 infra/helper.py reproduce pdfbox PDFExtractTextFuzzer build/out/pdfbox/timeout-bc0fe673ec0c97982de56ef8ab1ee08eff081a3b
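If you find yourself retyping the steps above, a small wrapper can help. This is only a sketch: `helper_commands` and `as_shell` are hypothetical names, not part of oss-fuzz, and the only helper.py subcommands used are the ones listed in the steps. It just assembles the command lines; it does not run docker.

```python
import shlex
from typing import List, Optional


def helper_commands(project: str, fuzzer: str,
                    local_repo: Optional[str] = None,
                    testcase: Optional[str] = None) -> List[List[str]]:
    """Return the infra/helper.py invocations from the steps, in order."""
    helper = ["python3", "infra/helper.py"]
    build = helper + ["build_fuzzers", project]
    if local_repo:
        # Build from a local checkout instead of the project's github repo.
        build.append(local_repo)
    commands = [
        helper + ["build_image", project],
        build,
        helper + ["run_fuzzer", project, fuzzer],
    ]
    if testcase:
        # Only add the reproduce step when a problem file is given.
        commands.append(helper + ["reproduce", project, fuzzer, testcase])
    return commands


def as_shell(commands: List[List[str]]) -> str:
    """Render the commands as copy-pasteable shell lines."""
    return "\n".join(shlex.join(cmd) for cmd in commands)
```

To actually execute them, each list could be passed to `subprocess.run(cmd, check=True, cwd=...)` from inside the oss-fuzz checkout. Keep in mind that run_fuzzer keeps going until it finds something or is stopped.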
Notes
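- When a problem file has been found, it is written to oss-fuzz/build/out/pdfbox
- While it runs, the fuzzer prints status lines like this:
  172282862 REDUCE cov: 2049 ft: 7874 corp: 1460/766Kb lim: 4096 exec/s: 16587 rss: 1233Mb L: 197/4096 MS-
  This means that the current corpus covers 2049 code blocks or edges (cov), and that the fuzzer has recorded 7874 features (ft), a finer-grained signal that combines edge coverage with things like edge counters.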
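To track cov and ft over a long run, the status lines can be pulled apart with a regular expression. This is a rough sketch, not an official parser: `parse_status` is a hypothetical helper, and the pattern is written against the single status line shown above, so other libFuzzer line shapes may need adjustments.

```python
import re
from typing import Dict, Optional, Union

# Pattern based on the one status line quoted in the notes; the leading
# "#" is optional because it is missing from that sample.
_STATUS = re.compile(
    r"#?(?P<execs>\d+)\s+(?P<event>[A-Za-z]+)\s+"
    r"cov:\s+(?P<cov>\d+)\s+ft:\s+(?P<ft>\d+)\s+"
    r"corp:\s+(?P<corp_units>\d+)/(?P<corp_size>\S+)"
    r".*?exec/s:\s+(?P<exec_s>\d+)\s+rss:\s+(?P<rss_mb>\d+)Mb"
)


def parse_status(line: str) -> Optional[Dict[str, Union[int, str]]]:
    """Return the numeric fields of a status line, or None if it does not match."""
    match = _STATUS.search(line)
    if match is None:
        return None
    fields: Dict[str, Union[int, str]] = dict(match.groupdict())
    for key in ("execs", "cov", "ft", "corp_units", "exec_s", "rss_mb"):
        fields[key] = int(fields[key])
    return fields
```

Feeding it the line above gives cov=2049 and ft=7874; lines that are not status lines come back as None, so it can be run over the whole fuzzer log.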
Seeds are so, so important. PDFBox's PDFExtractTextFuzzer had cov=2049 ft=7874 after several hours with no seeds. When I added 1k PDFs as a seed corpus, I hit cov=13886 ft=58782 within a few minutes.
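One way to package seeds is sketched below. It assumes the OSS-Fuzz convention, as I understand it, of a zip named after the fuzz target with a `_seed_corpus.zip` suffix that build.sh copies next to the fuzzer in $OUT. How the 1k PDFs were actually added in the run above is not recorded here, and `build_seed_corpus` and its arguments are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import Union


def build_seed_corpus(pdf_dir: Union[str, Path], out_dir: Union[str, Path],
                      fuzzer: str = "PDFExtractTextFuzzer") -> Path:
    """Zip every *.pdf directly inside pdf_dir into <fuzzer>_seed_corpus.zip."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming convention: <fuzz target>_seed_corpus.zip
    archive = out_dir / f"{fuzzer}_seed_corpus.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
            if pdf.is_file():
                # Store files flat; the sketch does not keep any directory layout.
                zf.write(pdf, arcname=pdf.name)
    return archive
```

Small, varied files are probably the better choice for seeds, since the status line above shows an input length limit (lim: 4096) early in the run.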
Typical Workflow
- Build the image, fuzz, find bug
- Fix bug in local repo, rebuild image, fuzz again. Find new bugs.
- Repeat.
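For the "find bug" part of the loop, it helps to turn whatever landed in the output directory into reproduce commands. A minimal sketch follows, assuming problem files use libFuzzer's usual artifact prefixes. Only a timeout- file appears on this page, so the other prefixes are an assumption, and `reproduce_commands` is a hypothetical helper.

```python
import shlex
from pathlib import Path
from typing import List, Union

# Assumed libFuzzer artifact prefixes; this page only shows "timeout-".
ARTIFACT_PREFIXES = ("crash-", "timeout-", "oom-", "leak-", "slow-unit-")


def reproduce_commands(out_dir: Union[str, Path], project: str,
                       fuzzer: str) -> List[str]:
    """Build one helper.py reproduce command per problem file in out_dir."""
    commands = []
    for path in sorted(Path(out_dir).iterdir()):
        if path.is_file() and path.name.startswith(ARTIFACT_PREFIXES):
            commands.append(shlex.join([
                "python3", "infra/helper.py", "reproduce",
                project, fuzzer, path.as_posix(),
            ]))
    return commands
```

Calling `reproduce_commands("build/out/pdfbox", "pdfbox", "PDFExtractTextFuzzer")` from the oss-fuzz checkout would print-ready list every finding, which makes it easy to re-check them all after a fix and a rebuild.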
While it is possible to configure the fuzzer to keep going, some bugs are just easier to hit. The fuzzer will often trigger/discover the super easy-to-find bugs and won't reveal the true, rare beauties until after the easier bugs are fixed. In general, I got little benefit from running the fuzzer multiple times... at least to start.
When enrolling a new repo or building a new harness, there will likely be lots of findings initially. Once the initial findings are fixed, the maintenance period is not bad. The startup costs are non-trivial, though, in fixing a repo that was not designed with security as the first goal.
Common problems
Not building from a local repo
There are two different things that can go wrong.
- A number of repos in oss-fuzz do not work out of the box with local builds. The error message looks like this:
  ERROR:__main__:Cannot use local checkout with "WORKDIR: /src".
  The workaround is straightforward: change the WORKDIR to something else, like:
  WORKDIR $SRC/project-parent/pdfbox
  You'll also have to adjust your build.sh slightly to reflect the change in the working directory. Then, obviously, make sure to open a PR to fix these repos in oss-fuzz!
- This may have been unique to our slight mods to oss-fuzz, but there were a number of times when I thought that I was working from a local repo, while the build was silently building from the github repo and ignoring my local repo. I got into the habit of removing a parenthesis in my local repo to see if that would cause the build to fail. I think the fix for this was to rm -rf the oss-fuzz/build/out/pdfbox directory and start from scratch.
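The WORKDIR problem can be spotted before kicking off a build. The sketch below reads a project's Dockerfile text and reports whether the last WORKDIR resolves to /src, which is the condition the error message above complains about. It is my guess at the check rather than a copy of helper.py's logic, it says nothing about Dockerfiles with no WORKDIR line, and both function names are hypothetical.

```python
import re
from typing import Optional


def final_workdir(dockerfile_text: str) -> Optional[str]:
    """Return the last WORKDIR in the Dockerfile with $SRC expanded to /src."""
    workdir = None
    for line in dockerfile_text.splitlines():
        match = re.match(r"\s*WORKDIR\s+(\S+)", line)
        if match:
            value = match.group(1)
            # Assumption: $SRC points at /src inside the oss-fuzz base image.
            value = value.replace("${SRC}", "/src").replace("$SRC", "/src")
            workdir = value.rstrip("/") or "/"
    return workdir


def blocks_local_checkout(dockerfile_text: str) -> bool:
    """True when the final WORKDIR is /src itself (the case the error names)."""
    return final_workdir(dockerfile_text) == "/src"
```

With the workaround applied, `final_workdir` returns /src/project-parent/pdfbox and `blocks_local_checkout` returns False.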
Other issues
- Drive running out of space. I initially started with Docker on Ubuntu installed with snap. The /var/snap/docker directory took up nearly 600GB after a couple of runs because of the way it caches images/containers/something. docker system prune -af worked to clear that, but then I had to rebuild my images. I uninstalled Docker in snap and reinstalled with apt, and everything was instantly better on this front.
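A quick free-space check before each build would have caught this earlier. This is a small sketch using only the standard library; the 50 GiB threshold is an arbitrary number I picked for illustration, and the function names are hypothetical.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Free space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30


def low_on_space(path: str = "/", min_free_gib: float = 50.0) -> bool:
    """True when less than min_free_gib is free (threshold is an assumption)."""
    return free_gib(path) < min_free_gib
```

Point it at whichever directory Docker stores its data in on your machine, for example /var/snap/docker for the snap install described above.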
