Commit bd7e128

Add vendetta doc
1 parent cd5a322 commit bd7e128

1 file changed

+104
-0
lines changed

docs/features/vendetta.md

Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
1+
2+
# Vendetta
3+
4+
Vendetta is the name of FOSSA's vendored dependency identification feature.
5+
6+
Vendetta hashes files in your first party source code, compares them against
7+
FOSSA's knowledge base, and matches them to common open source components before
8+
finally feeding those matches to a special algorithm that deduces a holistic set
9+
of vendored open source dependencies present in your project.
10+
11+
Vendetta can be run as part of `fossa analyze`. To enable it, add the
12+
`--x-vendetta` flag when you run `fossa analyze`:
13+
14+
```sh
15+
fossa analyze --x-vendetta
16+
```
17+
18+
## How Vendetta Works
19+
20+
When `--x-vendetta` is enabled, the CLI:
21+
22+
1. **Hashes Files**: Creates MD5 hashes of the contents of all relevant files.
23+
2. **Filters Content**: By default, skips hidden directories such as `.git/`.
24+
It also skips any paths matched by the `.fossa.yml` setting
25+
`vendoredDependencies.licenseScanPathFilters.exclude`, documented further
26+
below.
27+
3. **Uploads Hashes**: Sends only the hashes to FOSSA's servers.
28+
4. **Receives Matches**: Gets back information about any matching open source
29+
components.
30+
5. **Infers Dependencies**: Feeds the matches to an algorithm that heuristically
31+
identifies the vendored dependencies in your project.
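
The hashing and default filtering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CLI's actual implementation: the function names `md5_of_file` and `hash_tree` are hypothetical, and the sketch prunes hidden directories only (it does not apply `.gitignore` rules or `.fossa.yml` exclude filters).

```python
import hashlib
import os
from pathlib import Path
from typing import Dict


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its MD5 digest.

    Hypothetical helper: skips hidden directories such as .git/ and nothing else.
    """
    hashes: Dict[str, str] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            file_path = Path(dirpath) / name
            hashes[file_path.relative_to(root).as_posix()] = md5_of_file(file_path)
    return hashes
```

In this model only the digest strings in the returned mapping would ever leave the machine; the file contents are read locally and discarded.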
32+
33+
## Data Sent to FOSSA
34+
35+
Vendetta sends _only_ the MD5 hashes of your file contents to FOSSA. The raw
36+
contents are never sent to FOSSA.
37+
38+
## Data Retention
39+
40+
The MD5 hashes are stored permanently in FOSSA.
41+
42+
## Directory Filtering
43+
44+
By default, Vendetta excludes common non-production directories and follows
45+
`.gitignore` patterns:
46+
47+
- Hidden directories.
48+
- Globs as directed by `.gitignore` files.
49+
50+
### Custom Exclude Filtering
51+
52+
You can customize which files and directories are excluded from Vendetta by
53+
configuring exclude filters in your `.fossa.yml` file. Note that Vendetta scans
54+
currently only support exclude patterns, not `only` patterns.
55+
56+
For example:
57+
```yaml
58+
version: 3
59+
vendoredDependencies:
60+
  licenseScanPathFilters:
61+
    exclude:
62+
      - "**/test/**"
63+
      - "**/tests/**"
64+
      - "**/spec/**"
65+
      - "**/node_modules/**"
66+
      - "**/dist/**"
67+
      - "**/build/**"
68+
      - "**/*.test.js"
69+
      - "**/*.spec.ts"
70+
```
71+
72+
**Important Notes:**
73+
74+
- Vendetta scanning only uses the `exclude` filters from `licenseScanPathFilters`;
75+
`only` filters are ignored for this use case.
76+
- Path filters use standard glob patterns (e.g., `**/*` for recursive matching,
77+
`*` for single-directory matching).
78+
- The configuration goes in the
79+
`vendoredDependencies.licenseScanPathFilters.exclude` section.
80+
- These exclude patterns are passed directly to the Ficus scanning engine as
81+
`--exclude` arguments.
82+
- Default exclusions (hidden files, `.gitignore` patterns) are applied in
83+
addition to custom excludes.
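
To make the glob semantics in these notes concrete, here is a small Python sketch of one common interpretation: `**/` spans zero or more directories, while `*` stays within a single path segment. The actual matching is done by the Ficus engine and may differ in edge cases; `glob_to_regex` and `is_excluded` are hypothetical helpers written for this illustration only.

```python
import re
from typing import Iterable


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob into a regex under the assumed semantics above."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")  # anything, including path separators
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")  # anything within one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """Return True if the POSIX-style relative path matches any exclude glob."""
    return any(glob_to_regex(p).fullmatch(path) for p in patterns)
```

Under these assumptions, `is_excluded("src/test/helper.js", ["**/test/**"])` is true, while `src/contest/a.js` is not excluded because `test` must be a whole directory name.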
84+
85+
## A note on scan times
86+
87+
The first time you run Vendetta on a codebase, it may take a long time to scan.
88+
For example, scanning [Linux](https://github.com/torvalds/linux) for the first
89+
time may take upwards of 60 minutes. This is because most of the files in your
90+
codebase will never have been checked against FOSSA's knowledge base for open
91+
source components, and each of those first-time checks takes time.
92+
93+
Once you scan the first time, however, FOSSA will cache the open source component
94+
matches for each MD5 hash Vendetta provides. This means that subsequent scans of
95+
the same project will be drastically faster. For example, scanning the same
96+
revision of Linux twice in a row should result in the second scan only taking a
97+
few seconds.
98+
99+
100+
The time it takes to scan newer versions of your codebase will depend on how
101+
many files in the new version have not been previously scanned. A file has been
102+
previously scanned if the exact same file has ever been scanned by Vendetta.
103+
FOSSA recommends scanning your codebase on a regular basis to keep scan times
104+
low.
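
As a rough mental model of the caching behavior described in this section, the work for a scan scales with the number of hashes FOSSA has not seen before. The sketch below is an assumption-laden illustration: the real cache lives on FOSSA's servers and its structure is not documented here, and `split_by_cache` is a hypothetical name.

```python
from typing import Iterable, Set, Tuple


def split_by_cache(
    current_hashes: Iterable[str], cached_hashes: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split a scan's MD5 hashes into (already matched, needs a fresh lookup)."""
    current = set(current_hashes)
    already_matched = current & cached_hashes  # fast: matches are cached
    needs_lookup = current - cached_hashes  # slow: never scanned before
    return already_matched, needs_lookup
```

Scanning the same revision twice leaves `needs_lookup` empty on the second run, which is consistent with the second scan being much faster.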
