| 1 | + |
| 2 | +# Vendetta |
| 3 | + |
| 4 | +Vendetta is the name of FOSSA's vendored dependency identification feature. |
| 5 | + |
| 6 | +Vendetta hashes files in your first-party source code, compares them against |
| 7 | +FOSSA's knowledge base, and matches them to common open source components before |
| 8 | +finally feeding those matches to a special algorithm that deduces a holistic set |
| 9 | +of vendored open source dependencies present in your project. |
| 10 | + |
| 11 | +Vendetta can be run as part of `fossa analyze`. To enable it, add the |
| 12 | +`--x-vendetta` flag when you run `fossa analyze`: |
| 13 | + |
| 14 | +```sh |
| 15 | +fossa analyze --x-vendetta |
| 16 | +``` |
| 17 | + |
| 18 | +## How Vendetta Works |
| 19 | + |
| 20 | +When `--x-vendetta` is enabled, the CLI: |
| 21 | + |
| 22 | +1. **Hashes Files**: Creates MD5 hashes of the contents of all relevant files. |
| 23 | +2. **Filters Content**: By default, skips hidden directories such as `.git/`. |
| 24 | + It also applies any exclude patterns from `.fossa.yml` under |
| 25 | + `vendoredDependencies.licenseScanPathFilters.exclude`, documented further |
| 26 | + below. |
| 27 | +3. **Uploads Hashes**: Sends only the hashes to FOSSA's servers. |
| 28 | +4. **Receives Matches**: Gets back information about any matching open source |
| 29 | + components. |
| 30 | +5. **Infers Dependencies**: Feeds the matches to an algorithm that heuristically |
| 31 | + identifies the vendored dependencies in your project. |
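
The hashing and default filtering steps above can be illustrated with a minimal sketch. This is not FOSSA's implementation: the function name `hash_tree` is hypothetical, only the hidden-directory rule is modeled (no `.gitignore` or `.fossa.yml` handling), and the upload step is omitted. It only shows that what leaves the machine would be a set of MD5 digests rather than file contents.

```python
import hashlib
import os


def hash_tree(root):
    """Return {relative_path: md5_hex} for files under root.

    Hypothetical helper for illustration; skips hidden directories only.
    """
    hashes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories (e.g. .git/) in place so os.walk
        # does not descend into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.md5()
            with open(path, "rb") as fh:
                # Read in chunks so large files are not loaded at once.
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
            hashes[os.path.relpath(path, root)] = digest.hexdigest()
    return hashes
```

Calling `hash_tree(".")` from a project root returns a mapping of relative paths to hex digests; in this sketch, only the digest values would correspond to what Vendetta uploads.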
| 32 | + |
| 33 | +## Data Sent to FOSSA |
| 34 | + |
| 35 | +Vendetta sends _only_ the MD5 hashes of your file contents to FOSSA. The raw |
| 36 | +contents are never sent to FOSSA. |
| 37 | + |
| 38 | +## Data Retention |
| 39 | + |
| 40 | +The MD5 hashes are stored permanently in FOSSA. |
| 41 | + |
| 42 | +## Directory Filtering |
| 43 | + |
| 44 | +By default, Vendetta excludes common non-production directories and follows |
| 45 | +`.gitignore` patterns: |
| 46 | + |
| 47 | +- Hidden directories. |
| 48 | +- Paths matching the patterns in `.gitignore` files. |
| 49 | + |
| 50 | +### Custom Exclude Filtering |
| 51 | + |
| 52 | +You can customize which files and directories are excluded from Vendetta by |
| 53 | +configuring exclude filters in your `.fossa.yml` file. Note that Vendetta scans |
| 54 | +currently only support exclude patterns, not `only` patterns. |
| 55 | + |
| 56 | +For example: |
| 57 | +```yaml |
| 58 | +version: 3 |
| 59 | +vendoredDependencies: |
| 60 | +  licenseScanPathFilters: |
| 61 | +    exclude: |
| 62 | +      - "**/test/**" |
| 63 | +      - "**/tests/**" |
| 64 | +      - "**/spec/**" |
| 65 | +      - "**/node_modules/**" |
| 66 | +      - "**/dist/**" |
| 67 | +      - "**/build/**" |
| 68 | +      - "**/*.test.js" |
| 69 | +      - "**/*.spec.ts" |
| 70 | +``` |
| 71 | + |
| 72 | +**Important Notes:** |
| 73 | + |
| 74 | +- Vendetta scanning only uses the `exclude` filters from `licenseScanPathFilters`; |
| 75 | + `only` filters are ignored for this use case. |
| 76 | +- Path filters use standard glob patterns (e.g., `**/*` for recursive matching, |
| 77 | + `*` for single-directory matching). |
| 78 | +- The configuration goes in the |
| 79 | + `vendoredDependencies.licenseScanPathFilters.exclude` section. |
| 80 | +- These exclude patterns are passed directly to the Ficus scanning engine as |
| 81 | + `--exclude` arguments. |
| 82 | +- Default exclusions (hidden files, `.gitignore` patterns) are applied in |
| 83 | + addition to custom excludes. |
| 84 | + |
| 85 | +## A Note on Scan Times |
| 86 | + |
| 87 | +The first time you run Vendetta on a codebase, it may take a long time to scan. |
| 88 | +For example, scanning [Linux](https://github.com/torvalds/linux) for the first |
| 89 | +time may take upwards of 60 minutes. This is because most of the files in your |
| 90 | +codebase will have never been checked against FOSSA's knowledge base for open |
| 91 | +source components, which can take time. |
| 92 | + |
| 93 | +After the first scan, however, FOSSA caches the open source component |
| 94 | +matches for each MD5 hash Vendetta provides. This means that subsequent scans of |
| 95 | +the same project will be drastically faster. For example, scanning the same |
| 96 | +revision of Linux twice in a row should result in the second scan only taking a |
| 97 | +few seconds. |
| 98 | + |
| 99 | + |
| 100 | +The time it takes to scan newer versions of your codebase will depend on how |
| 101 | +many files in the new version have not been previously scanned. A file counts as |
| 102 | +previously scanned if a file with identical contents has ever been scanned by Vendetta. |
| 103 | +FOSSA recommends scanning your codebase on a regular basis to keep scan times |
| 104 | +low. |
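
To make the caching behavior described above concrete, here is a small sketch of which hashes would still need a fresh lookup on a later scan. The function name `unseen_hashes` and the sample hash values are hypothetical, and the sketch assumes scan time is dominated by hashes with no cached result; it says nothing about how FOSSA stores its cache.

```python
def unseen_hashes(current, previously_scanned):
    """Return the hashes in `current` that have no cached result yet."""
    return set(current) - set(previously_scanned)


# Placeholder strings stand in for MD5 digests.
first_scan = {"aaa", "bbb", "ccc"}
second_scan = {"aaa", "bbb", "ddd"}  # one file changed between revisions

# Only the changed file's hash needs a fresh lookup.
pending = unseen_hashes(second_scan, first_scan)
```

Under that assumption, rescanning an unchanged revision leaves nothing pending, which is consistent with the second Linux scan finishing in seconds.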