Commit e06faa0

[ANE-2672] Add --x-vendetta flag (#1607)
* Rough draft
* Build source unit
* Return full ficus analysis result
* Propagate full FicusAnalysisResults
* Extract ficus result source unit
* Use proper metadata format
* Insignificant change
* Update to accept specific strategies
* Fix merge mistake
* Improve FicusSpec
* Lint
* Add vendetta doc
* Update changelog
* Update ficus cmd
* Add reference to vendetta docs
* Address PR comments
* Document more accurate timing
* Fix typo
* Use correct snippet scan strategy name
* Only show snippet scan result logs if snippet scan is requested
* Pattern match acc
* Reword error
* Move FicusSpec to unit tests and use Test.Effect
* Add doc about doing an initial local scan
* Simplify spec
1 parent b926d24 commit e06faa0

21 files changed: +447 -160 lines changed

Changelog.md

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,8 @@
 # FOSSA CLI Changelog
 
+## 3.14.0
+- Adds `--x-vendetta` flag for vendored dependency identification ([#1607](https://github.com/fossas/fossa-cli/pull/1607))
+
 ## 3.13.1
 - Add a summary of the snippet scan when the `--x-snippet-scan` flag is used ([#1613](https://github.com/fossas/fossa-cli/pull/1613))
 - Update snippet scanning documentation ([#1615](https://github.com/fossas/fossa-cli/pull/1615))

docs/features/vendetta.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+
+# Vendetta
+
+Vendetta is the name of FOSSA's vendored dependency identification feature.
+
+Vendetta hashes files in your first party source code, compares them against
+FOSSA's knowledge base, and matches them to common open source components before
+finally feeding those matches to a special algorithm that deduces a holistic set
+of vendored open source dependencies present in your project.
+
+Vendetta can be run as part of `fossa analyze`. To enable it, add the
+`--x-vendetta` flag when you run `fossa analyze`:
+
+```sh
+fossa analyze --x-vendetta
+```
+
+## How Vendetta Works
+
+When `--x-vendetta` is enabled, the CLI:
+
+1. **Hashes Files**: Creates MD5 hashes of the contents of all relevant files.
+2. **Filters Content**: By default, skips hidden directories such as `.git/`.
+   It also skips paths matched by the `.fossa.yml` setting
+   `vendoredDependencies.licenseScanPathFilters.exclude`, documented further
+   below.
+3. **Uploads Hashes**: Sends only the hashes to FOSSA's servers.
+4. **Receives Matches**: Gets back information about any matching open source
+   components.
+5. **Infers Dependencies**: Feeds the matches to an algorithm that heuristically
+   identifies the vendored dependencies in your project.
+
+## Data Sent to FOSSA
+
+Vendetta sends _only_ the MD5 hashes of your file contents to FOSSA. The raw
+contents are never sent to FOSSA.
+
+## Data Retention
+
+The MD5 hashes are stored permanently in FOSSA.
+
+## Directory Filtering
+
+By default, Vendetta excludes common non-production directories and follows
+`.gitignore` patterns:
+
+- Hidden directories.
+- Globs as directed by `.gitignore` files.
+
+### Custom Exclude Filtering
+
+You can customize which files and directories are excluded from Vendetta by
+configuring exclude filters in your `.fossa.yml` file. Note that Vendetta scans
+currently only support exclude patterns, not `only` patterns.
+
+For example:
+```yaml
+version: 3
+vendoredDependencies:
+  licenseScanPathFilters:
+    exclude:
+      - "**/test/**"
+      - "**/tests/**"
+      - "**/spec/**"
+      - "**/node_modules/**"
+      - "**/dist/**"
+      - "**/build/**"
+      - "**/*.test.js"
+      - "**/*.spec.ts"
+```
+
+**Important Notes:**
+
+- Vendetta scanning only uses the `exclude` filters from `licenseScanPathFilters`;
+  `only` filters are ignored for this use case.
+- Path filters use standard glob patterns (e.g., `**/*` for recursive matching,
+  `*` for single-directory matching).
+- The configuration goes in the
+  `vendoredDependencies.licenseScanPathFilters.exclude` section.
+- These exclude patterns are passed directly to the Ficus scanning engine as
+  `--exclude` arguments.
+- Default exclusions (hidden files, `.gitignore` patterns) are applied in
+  addition to custom excludes.
+
+## A note on scan times
+
+The first time you run Vendetta on a codebase, it may take a long time to scan.
+For example, scanning [Linux](https://github.com/torvalds/linux) for the first
+time may take upwards of 60 minutes. This is because most of the files in your
+codebase will never have been checked against FOSSA's knowledge base for open
+source components, which can take time.
+
+After the first scan, however, FOSSA caches the open source component
+matches for each MD5 hash Vendetta provides. This means that subsequent scans of
+the same project will be drastically faster. For example, scanning the same
+revision of Linux twice in a row should result in the second scan taking only
+1-2 minutes.
+
+The time it takes to scan newer versions of your codebase will depend on how
+many files in the new version have not been previously scanned. A file has been
+previously scanned if the exact same file has ever been scanned by Vendetta.
+FOSSA recommends scanning your codebase on a regular basis to keep scan times
+low. Additionally, if you intend to run Vendetta as part of your CI
+pipeline, it might be best to do a manual run first on a local machine. That
+way, future automated scans of your project will be able to benefit from the
+initial caching done in the first scan.
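
The caching behavior described under "A note on scan times" can be modeled as a set difference: only hashes that have never been seen before need a full knowledge-base lookup. The sketch below is a simplified illustration under that assumption, not FOSSA's implementation. `Hash`, `uncachedHashes`, and `cacheHitRatio` are hypothetical names, and hashes are represented as plain strings.

```haskell
import qualified Data.Set as Set

-- | Hypothetical stand-in for an MD5 digest rendered as hex text.
type Hash = String

-- | Hashes in the new revision that an assumed server-side cache has not
-- seen yet. Duplicate files collapse to a single entry.
uncachedHashes :: Set.Set Hash -> [Hash] -> Set.Set Hash
uncachedHashes cache revisionHashes =
  Set.fromList revisionHashes `Set.difference` cache

-- | Fraction of distinct hashes in the revision already present in the cache.
-- Returns 1 for an empty revision, since nothing needs a lookup.
cacheHitRatio :: Set.Set Hash -> [Hash] -> Double
cacheHitRatio cache revisionHashes
  | Set.null distinct = 1
  | otherwise = fromIntegral hits / fromIntegral (Set.size distinct)
  where
    distinct = Set.fromList revisionHashes
    hits = Set.size (distinct `Set.intersection` cache)
```

In this model, re-scanning an unchanged revision yields an empty `uncachedHashes` result, which matches the documented observation that a second scan of the same revision is much faster than the first.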

docs/references/subcommands/analyze.md

Lines changed: 19 additions & 0 deletions
@@ -158,6 +158,25 @@ Snippet Scanning must also be enabled for your organization, and is only availab
 
 For more detail about how Snippet Scanning works, how to use file filtering during Snippet Scanning, what information is sent to FOSSA's servers and a description of the Snippet Scan Summary, see [the Snippet Scanning feature documentation](../../features/snippet-scanning.md).
 
+### Vendored Dependency Scanning with Vendetta
+
+Vendetta is a feature that identifies the paths of potential open source code
+dependencies vendored in your project by comparing file hashes against FOSSA's
+knowledge base. This feature helps find dependencies that are included in your
+project directly as source.
+
+#### Enabling Vendetta
+
+| Name | Description |
+|---------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--x-vendetta` | Enable vendored dependency scanning during analysis. This experimental feature hashes your source files and checks them against FOSSA's open source component database. |
+
+#### More detail
+
+For more detail about how Vendetta works, how to use file filtering during
+scanning, or what information is sent to FOSSA's servers, see
+[the Vendetta feature documentation](../../features/vendetta.md).
+
 ### Experimental Options
 
 _Important: For support and other general information, refer to the [experimental options overview](../experimental/README.md) before using experimental options._

integration-test/Analysis/FicusSpec.hs

Lines changed: 0 additions & 74 deletions
This file was deleted.

spectrometer.cabal

Lines changed: 1 addition & 1 deletion
@@ -657,6 +657,7 @@ test-suite unit-tests
     Erlang.Rebar3TreeSpec
     Extra.ListSpec
     Extra.TextSpec
+    Ficus.FicusSpec
     Fortran.FpmTomlSpec
     Fossa.API.TypesSpec
     Go.GlideLockSpec
@@ -758,7 +759,6 @@ test-suite integration-tests
     Analysis.CocoapodsSpec
     Analysis.ElixirSpec
     Analysis.ErlangSpec
-    Analysis.FicusSpec
     Analysis.FixtureExpectationUtils
     Analysis.FixtureUtils
     Analysis.GoSpec

src/App/Fossa/Analyze.hs

Lines changed: 31 additions & 20 deletions
@@ -51,6 +51,7 @@ import App.Fossa.Config.Analyze (
 import App.Fossa.Config.Analyze qualified as Config
 import App.Fossa.Config.Common (DestinationMeta (..), destinationApiOpts, destinationMetadata)
 import App.Fossa.Ficus.Analyze (analyzeWithFicus)
+import App.Fossa.Ficus.Types (FicusAnalysisResults (vendoredDependencyScanResults), FicusStrategy (FicusStrategySnippetScan, FicusStrategyVendetta), FicusVendoredDependencyScanResults (FicusVendoredDependencyScanResults))
 import App.Fossa.FirstPartyScan (runFirstPartyScan)
 import App.Fossa.Lernie.Analyze (analyzeWithLernie)
 import App.Fossa.Lernie.Types (LernieResults (..))
@@ -103,12 +104,12 @@ import Data.Flag (Flag, fromFlag)
 import Data.Foldable (traverse_)
 import Data.Functor (($>))
 import Data.List.NonEmpty qualified as NE
-import Data.Maybe (fromMaybe, isJust, mapMaybe)
+import Data.Maybe (catMaybes, fromMaybe, isJust, mapMaybe, maybeToList)
 import Data.String.Conversion (decodeUtf8, toText)
 import Data.Text.Extra (showT)
 import Data.Traversable (for)
 import Diag.Diagnostic as DI
-import Diag.Result (Result (Success), resultToMaybe)
+import Diag.Result (resultToMaybe)
 import Discovery.Archive qualified as Archive
 import Discovery.Filters (AllFilters, MavenScopeFilters, applyFilters, filterIsVSIOnly, ignoredPaths, isDefaultNonProductionPath)
 import Discovery.Projects (withDiscoveredProjects)
@@ -302,6 +303,7 @@ analyze cfg = Diag.context "fossa-analyze" $ do
       allowedTactics = Config.allowedTacticTypes cfg
       withoutDefaultFilters = Config.withoutDefaultFilters cfg
       enableSnippetScan = Config.xSnippetScan cfg
+      enableVendetta = Config.xVendetta cfg
 
   manualSrcUnits <-
     Diag.errorBoundaryIO . diagToDebug $
@@ -340,27 +342,27 @@ analyze cfg = Diag.context "fossa-analyze" $ do
         if (fromFlag BinaryDiscovery $ Config.binaryDiscoveryEnabled $ Config.vsiOptions cfg)
           then analyzeDiscoverBinaries basedir filters
           else pure Nothing
+  let ficusStrategies =
+        catMaybes
+          [ if enableSnippetScan then Just FicusStrategySnippetScan else Nothing
+          , if enableVendetta then Just FicusStrategyVendetta else Nothing
+          ]
   maybeFicusResults <-
     Diag.errorBoundaryIO . diagToDebug $
-      if not enableSnippetScan
+      if null ficusStrategies || filterIsVSIOnly filters
         then do
-          logInfo "Skipping ficus snippet scanning (--x-snippet-scan not set)"
           pure Nothing
         else
-          if filterIsVSIOnly filters
-            then do
-              logInfo "Running in VSI only mode, skipping snippet-scan"
-              pure Nothing
-            else
-              Diag.context "snippet-scanning"
-                . runStickyLogger SevInfo
-                $ analyzeWithFicus
-                  basedir
-                  maybeApiOpts
-                  revision
-                  (Config.licenseScanPathFilters vendoredDepsOptions)
-                  (orgSnippetScanSourceCodeRetentionDays =<< orgInfo)
-                  (Config.debugDir cfg)
+          Diag.context "ficus-scanning"
+            . runStickyLogger SevInfo
+            $ analyzeWithFicus
+              basedir
+              maybeApiOpts
+              revision
+              ficusStrategies
+              (Config.licenseScanPathFilters vendoredDepsOptions)
+              (orgSnippetScanSourceCodeRetentionDays =<< orgInfo)
+              (Config.debugDir cfg)
   let ficusResults = join $ resultToMaybe maybeFicusResults
 
   maybeLernieResults <-
@@ -378,13 +380,22 @@ analyze cfg = Diag.context "fossa-analyze" $ do
       vsiResults' :: [SourceUnit]
       vsiResults' = fromMaybe [] $ join (resultToMaybe vsiResults)
 
+      ficusResults' :: [SourceUnit]
+      ficusResults' =
+        maybeToList $
+          ficusResults
+            >>= vendoredDependencyScanResults
+            >>= \(FicusVendoredDependencyScanResults maybeSrcUnit) -> maybeSrcUnit
+
       additionalSourceUnits :: [SourceUnit]
-      additionalSourceUnits = vsiResults' <> mapMaybe (join . resultToMaybe) [manualSrcUnits, binarySearchResults, dynamicLinkedResults]
+      additionalSourceUnits = vsiResults' <> ficusResults' <> mapMaybe (join . resultToMaybe) [manualSrcUnits, binarySearchResults, dynamicLinkedResults]
   traverse_ (Diag.flushLogs SevError SevDebug) [manualSrcUnits, binarySearchResults, dynamicLinkedResults]
   -- Flush logs using the original Result from VSI.
   traverse_ (Diag.flushLogs SevError SevDebug) [vsiResults]
   -- Flush logs from lernie
   traverse_ (Diag.flushLogs SevError SevDebug) [maybeLernieResults]
+  -- Flush logs from ficus
+  traverse_ (Diag.flushLogs SevError SevDebug) [maybeFicusResults]
 
   maybeFirstPartyScanResults <-
     Diag.errorBoundaryIO . diagToDebug $
@@ -450,7 +461,7 @@ analyze cfg = Diag.context "fossa-analyze" $ do
       $ analyzeForReachability projectScans
   let reachabilityUnits = onlyFoundUnits reachabilityUnitsResult
 
-  let analysisResult = AnalysisScanResult projectScans vsiResults binarySearchResults (Success [] Nothing) manualSrcUnits dynamicLinkedResults maybeLernieResults reachabilityUnitsResult
+  let analysisResult = AnalysisScanResult projectScans vsiResults binarySearchResults maybeFicusResults manualSrcUnits dynamicLinkedResults maybeLernieResults reachabilityUnitsResult
       isDebugMode = isJust (Config.debugDir cfg)
   renderScanSummary isDebugMode maybeEndpointAppVersion analysisResult cfg
 
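
The control flow these hunks introduce can be read in isolation as the sketch below: build the strategy list from the two flags, skip the scan when the list is empty or the run is VSI-only, and flatten the nested optional result into a list of source units. It uses simplified stand-in types (`Strategy`, `AnalysisResults`, `VendoredResults`, and `String` in place of `SourceUnit`) rather than the real `FicusStrategy` and `FicusAnalysisResults` definitions, which this diff does not show, so treat it as an illustration of the pattern only.

```haskell
import Data.Maybe (catMaybes, maybeToList)

-- Stand-in for FicusStrategy; the real constructors live in App.Fossa.Ficus.Types.
data Strategy = SnippetScan | Vendetta
  deriving (Eq, Show)

-- | Collect one strategy per enabled flag, mirroring the catMaybes list in the diff.
selectStrategies :: Bool -> Bool -> [Strategy]
selectStrategies enableSnippetScan enableVendetta =
  catMaybes
    [ if enableSnippetScan then Just SnippetScan else Nothing
    , if enableVendetta then Just Vendetta else Nothing
    ]

-- | The scan runs only when at least one strategy is requested and the
-- filters are not VSI-only (the negation of the skip condition in the diff).
shouldRunFicus :: [Strategy] -> Bool -> Bool
shouldRunFicus strategies vsiOnly = not (null strategies || vsiOnly)

-- Stand-ins for the result types; String replaces SourceUnit for brevity.
newtype VendoredResults = VendoredResults (Maybe String)

newtype AnalysisResults = AnalysisResults
  { vendoredResults :: Maybe VendoredResults
  }

-- | Flatten three layers of optionality into zero or one source units.
sourceUnits :: Maybe AnalysisResults -> [String]
sourceUnits results =
  maybeToList $
    results
      >>= vendoredResults
      >>= \(VendoredResults unit) -> unit
```

Because every layer is a `Maybe`, a missing result at any level (the scan was skipped, Vendetta was not requested, or no source unit was produced) collapses to an empty list, so appending it to `additionalSourceUnits` needs no special casing.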

src/App/Fossa/Analyze/Types.hs

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ module App.Fossa.Analyze.Types (
 
 import App.Fossa.Analyze.Project (ProjectResult)
 import App.Fossa.Config.Analyze (ExperimentalAnalyzeConfig)
-import App.Fossa.Ficus.Types (FicusSnippetScanResults)
+import App.Fossa.Ficus.Types (FicusAnalysisResults)
 import App.Fossa.Lernie.Types (LernieResults)
 import App.Fossa.Reachability.Types (SourceUnitReachability (..))
 import App.Types (Mode)
@@ -81,7 +81,7 @@ data AnalysisScanResult = AnalysisScanResult
   { analyzersScanResult :: [DiscoveredProjectScan]
   , vsiScanResult :: Result (Maybe [SourceUnit])
   , binaryDepsScanResult :: Result (Maybe SourceUnit)
-  , ficusResult :: Result (Maybe FicusSnippetScanResults)
+  , ficusResult :: Result (Maybe FicusAnalysisResults)
   , fossaDepsScanResult :: Result (Maybe SourceUnit)
   , dynamicLinkingResult :: Result (Maybe SourceUnit)
   , lernieResult :: Result (Maybe LernieResults)
