Commit 9605a3c

Merge pull request #107 from advanced-security/new-vendor-and-pii-patterns
New vendor and PII patterns
2 parents 9b84660 + aef6741 commit 9605a3c

File tree

7 files changed

+561
-2
lines changed

README.md

Lines changed: 16 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -101,6 +101,12 @@ Custom Secret Scanning Patterns repository.
101101
- IBAN
102102

103103
- Norwegian national identity number/D number
104+
105+
- US Social Security number
106+
107+
- US Individual Taxpayer Identification Number (ITIN)
108+
109+
- UK National Insurance Number
104110

105111

106112
### [RSA Keys](./rsa)
@@ -206,4 +212,14 @@ Custom Secret Scanning Patterns repository.
206212
- Azure Shared Access Signature (SAS) Token
207213

208214
- CircleCI API token
215+
216+
- AWS Key ID (standalone)
217+
218+
- Azure generic key
219+
220+
- Azure generic key (legacy)
221+
222+
- AWS Bedrock API Key
223+
224+
- AWS Bedrock API Key (2)
209225

configs/README.md

Lines changed: 1 addition & 1 deletion
@@ -639,7 +639,7 @@ Add these additional matches to the [Secret Scanning Custom Pattern](https://doc
639639
- Not Match:
640640

641641
```regex
642-
^(/|file:///|https?://[A-Za-z]:/)[A-Za-z0-9._-]{3,}+(/[a-z._-]{1,}){2,}/?$
642+
^(/|file:///|https?://[A-Za-z]:/)[A-Za-z0-9._-]{3,}(/[a-z._-]{1,}){2,}/?$
643643
```
644644

645645
</details>
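
The only change in this hunk is dropping the trailing `+` from `{3,}+`, which turns a possessive quantifier into an ordinary greedy one. The diff does not state the motivation; possessive quantifiers are not supported by every regex engine, which is one plausible reason. As a minimal sketch, assuming Python's `re` module as a stand-in for the real scanning engine and using hypothetical sample values, the updated expression can be exercised like this:

```python
import re

# Updated "Not Match" path pattern, copied from the diff (plain greedy {3,}).
PATH_NOT_MATCH = re.compile(
    r"^(/|file:///|https?://[A-Za-z]:/)[A-Za-z0-9._-]{3,}(/[a-z._-]{1,}){2,}/?$"
)


def looks_like_path(candidate: str) -> bool:
    """Return True when the candidate would be excluded as a path."""
    return PATH_NOT_MATCH.search(candidate) is not None


# Hypothetical sample values, not taken from the repository's test data.
samples = ["/usr/local/bin", "file:///home/user/docs", "/etc/passwd", "hunter2secret"]
excluded = [s for s in samples if looks_like_path(s)]
```

Under these assumptions only the first two samples are excluded: `/etc/passwd` has a single lower-case segment after the first component, and the pattern requires at least two.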

configs/patterns.yml

Lines changed: 1 addition & 1 deletion
@@ -327,7 +327,7 @@ patterns:
327327
# non-secret related content
328328
- ^(?i)(true|false|y(es)?|no?|on|off|0|1|nill|null|none|(\\x[a-f0-9]{2})+)$
329329
# a path
330-
- '^(/|file:///|https?://[A-Za-z]:/)[A-Za-z0-9._-]{3,}+(/[a-z._-]{1,}){2,}/?$'
330+
- '^(/|file:///|https?://[A-Za-z]:/)[A-Za-z0-9._-]{3,}(/[a-z._-]{1,}){2,}/?$'
331331
comments:
332332
- "Looks for secrets in the format of `SECRET=secret` at the start of a line, possibly with an `ENV ` or `export ` prefix"
333333
- "Allows no whitespace in the secret, to cut false positives"

pii/README.md

Lines changed: 155 additions & 0 deletions
@@ -233,4 +233,159 @@ Add these additional matches to the [Secret Scanning Custom Pattern](https://doc
233233
1111111111[123]|11112222333|01123456978|410185 ?123 ?45|220676 ?123 ?45|01010202010|01010101023
234234
```
235235

236+
</details>
237+
238+
## US Social Security number
239+
240+
241+
242+
_version: v0.1_
243+
244+
**Comments / Notes:**
245+
246+
247+
- There is no checksum, so where this produces false positives there is no reliable way to filter them out with post-processing
248+
249+
- This can produce false positives, since it doesn't check for all known-invalid numbers
250+
251+
- Examples include 123-45-6789 and 078-05-1120 - the latter is ignored already
252+
253+
254+
<details>
255+
<summary>Pattern Format</summary>
256+
257+
```regex
258+
(?P<area>00[1-9]|0[1-9][0-9]|[1-8][0-9][0-9])-(?P<group>0[1-9]|[1-9][0-9])-(?P<serial>[0-9]{4})
259+
```
260+
261+
</details>
262+
263+
<details>
264+
<summary>Start Pattern</summary>
265+
266+
```regex
267+
\A|[^0-9A-Za-z_-]
268+
```
269+
270+
</details><details>
271+
<summary>End Pattern</summary>
272+
273+
```regex
274+
\z|[^0-9A-Za-z_-]
275+
```
276+
277+
</details>
278+
279+
<details>
280+
<summary>Additional Matches</summary>
281+
282+
Add these additional matches to the [Secret Scanning Custom Pattern](https://docs.github.com/en/enterprise-cloud@latest/code-security/secret-scanning/defining-custom-patterns-for-secret-scanning#example-of-a-custom-pattern-specified-using-additional-requirements).
283+
284+
285+
- Not Match:
286+
287+
```regex
288+
^666-.*$
289+
```
290+
- Not Match:
291+
292+
```regex
293+
^.*-0000$
294+
```
295+
- Not Match:
296+
297+
```regex
298+
^078-05-1120$
299+
```
300+
301+
</details>
302+
303+
## US Individual Taxpayer Identification Number (ITIN)
304+
305+
306+
307+
_version: v0.1_
308+
309+
**Comments / Notes:**
310+
311+
312+
- This can produce false positives, since it doesn't check for all known-invalid numbers
313+
314+
- There is no checksum, so where this produces false positives there is no reliable way to filter them out with post-processing
315+
316+
317+
<details>
318+
<summary>Pattern Format</summary>
319+
320+
```regex
321+
9[0-9][0-9]-(?:5[0-9]|6[0-5]|7[0-9]|8[0-8]|9[0-24-9])-[0-9]{4}
322+
```
323+
324+
</details>
325+
326+
<details>
327+
<summary>Start Pattern</summary>
328+
329+
```regex
330+
\A|[^0-9A-Za-z_-]
331+
```
332+
333+
</details><details>
334+
<summary>End Pattern</summary>
335+
336+
```regex
337+
\z|[^0-9A-Za-z_-]
338+
```
339+
340+
</details>
341+
342+
## UK National Insurance Number
343+
344+
345+
346+
_version: v0.1_
347+
348+
**Comments / Notes:**
349+
350+
351+
- There is no checksum, so where this produces false positives there is no reliable way to filter them out with post-processing
352+
353+
354+
<details>
355+
<summary>Pattern Format</summary>
356+
357+
```regex
358+
[A-Z]{2} ?[0-9]{2} ?[0-9]{2} ?[0-9]{2} ?[A-D]
359+
```
360+
361+
</details>
362+
363+
<details>
364+
<summary>Start Pattern</summary>
365+
366+
```regex
367+
\A|[^0-9A-Za-z]
368+
```
369+
370+
</details><details>
371+
<summary>End Pattern</summary>
372+
373+
```regex
374+
\z|[^0-9A-Za-z]
375+
```
376+
377+
</details>
378+
379+
<details>
380+
<summary>Additional Matches</summary>
381+
382+
Add these additional matches to the [Secret Scanning Custom Pattern](https://docs.github.com/en/enterprise-cloud@latest/code-security/secret-scanning/defining-custom-patterns-for-secret-scanning#example-of-a-custom-pattern-specified-using-additional-requirements).
383+
384+
385+
- Not Match:
386+
387+
```regex
388+
^QQ ?12 ?34 ?56 ?[A-D]$
389+
```
390+
236391
</details>
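
The sections added above each split a pattern into a start pattern, a pattern format, an end pattern and optional "Not Match" rules. The following is a rough sketch of how those parts might fit together, not the scanner's actual implementation. It assumes the three expressions are concatenated in order and that the "Not Match" rules are applied to the matched value alone. The helper name `scan`, the translation of `\z` to Python's `\Z`, and the sample strings are all assumptions for illustration.

```python
import re


def scan(text, pattern, start, end, not_match=()):
    """Return candidates that satisfy start/pattern/end and no Not Match rule."""
    # Assumption: Python's `re` spells end-of-input as \Z, so translate \z.
    end = end.replace(r"\z", r"\Z")
    combined = re.compile(f"(?:{start})(?P<secret>{pattern})(?:{end})")
    hits = []
    for m in combined.finditer(text):
        secret = m.group("secret")
        # Drop the candidate if any exclusion rule matches the value itself.
        if not any(re.search(rule, secret) for rule in not_match):
            hits.append(secret)
    return hits


# Expressions copied from the US Social Security number section.
SSN_PATTERN = (
    r"(?P<area>00[1-9]|0[1-9][0-9]|[1-8][0-9][0-9])"
    r"-(?P<group>0[1-9]|[1-9][0-9])-(?P<serial>[0-9]{4})"
)
SSN_START = r"\A|[^0-9A-Za-z_-]"
SSN_END = r"\z|[^0-9A-Za-z_-]"
SSN_NOT_MATCH = [r"^666-.*$", r"^.*-0000$", r"^078-05-1120$"]

# Expressions copied from the UK National Insurance Number section.
NINO_PATTERN = r"[A-Z]{2} ?[0-9]{2} ?[0-9]{2} ?[0-9]{2} ?[A-D]"
NINO_START = r"\A|[^0-9A-Za-z]"
NINO_END = r"\z|[^0-9A-Za-z]"
NINO_NOT_MATCH = [r"^QQ ?12 ?34 ?56 ?[A-D]$"]

# Hypothetical input text for illustration only.
ssn_hits = scan(
    "ids: 123-45-6789, 666-12-3456, 078-05-1120, 123-45-0000",
    SSN_PATTERN, SSN_START, SSN_END, SSN_NOT_MATCH,
)
nino_hits = scan(
    "ni: AB 12 34 56 C; QQ123456C",
    NINO_PATTERN, NINO_START, NINO_END, NINO_NOT_MATCH,
)
```

In this sketch the start and end expressions consume one boundary character each, so two candidates separated by a single character would not both be found. A real engine may handle boundaries differently.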

pii/patterns.yml

Lines changed: 60 additions & 0 deletions
@@ -257,3 +257,63 @@ patterns:
257257
- With no validation of the checksum this can cause a lot of false positives
258258
- The example test data does not have a valid checksum - it is one of the examples used with one digit in the checksum changed
259259
- You can test using the correct checksum, but it is used as a NOT match here to prevent false positives on other test data
260+
261+
- name: US Social Security number
262+
type: us_ssn
263+
regex:
264+
pattern: |
265+
(?P<area>00[1-9]|0[1-9][0-9]|[1-8][0-9][0-9])-(?P<group>0[1-9]|[1-9][0-9])-(?P<serial>[0-9]{4})
266+
start: |
267+
\A|[^0-9A-Za-z_-]
268+
end: |
269+
\z|[^0-9A-Za-z_-]
270+
additional_not_match:
271+
- ^666-.*$
272+
- ^.*-0000$
273+
- ^078-05-1120$
274+
test:
275+
data: |
276+
123-45-6789
277+
start_offset: 0
278+
end_offset: 11
279+
comments:
280+
- There is no checksum, so where this produces false positives there is no reliable way to filter them out with post-processing
281+
- This can produce false positives, since it doesn't check for all known-invalid numbers
282+
- Examples include 123-45-6789 and 078-05-1120 - the latter is ignored already
283+
284+
- name: US Individual Taxpayer Identification Number (ITIN)
285+
type: us_itin
286+
regex:
287+
pattern: |
288+
9[0-9][0-9]-(?:5[0-9]|6[0-5]|7[0-9]|8[0-8]|9[0-24-9])-[0-9]{4}
289+
start: |
290+
\A|[^0-9A-Za-z_-]
291+
end: |
292+
\z|[^0-9A-Za-z_-]
293+
test:
294+
data: |
295+
912-70-1234
296+
start_offset: 0
297+
end_offset: 11
298+
comments:
299+
- This can produce false positives, since it doesn't check for all known-invalid numbers
300+
- There is no checksum, so where this produces false positives there is no reliable way to filter them out with post-processing
301+
302+
- name: UK National Insurance Number
303+
type: uk_national_insurance_number
304+
regex:
305+
pattern: |
306+
[A-Z]{2} ?[0-9]{2} ?[0-9]{2} ?[0-9]{2} ?[A-D]
307+
start: |
308+
\A|[^0-9A-Za-z]
309+
end: |
310+
\z|[^0-9A-Za-z]
311+
additional_not_match:
312+
- ^QQ ?12 ?34 ?56 ?[A-D]$
313+
test:
314+
data: |
315+
QQ012345C
316+
start_offset: 0
317+
end_offset: 9
318+
comments:
319+
- There is no checksum, so where this produces false positives there is no reliable way to filter them out with post-processing
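
Each new entry carries `test` data with a `start_offset` and `end_offset`. As a small consistency check, assuming those offsets are character positions of the matched value within the test data and that the trailing newline of the YAML block scalars is ignored, the values from the diff can be compared against what the pattern itself matches. The `ENTRIES` list and `offsets_agree` helper are hypothetical and are not the repository's own test harness.

```python
import re

# Values copied from the three new entries in pii/patterns.yml.
ENTRIES = [
    {
        "type": "us_ssn",
        "pattern": r"(?P<area>00[1-9]|0[1-9][0-9]|[1-8][0-9][0-9])"
                   r"-(?P<group>0[1-9]|[1-9][0-9])-(?P<serial>[0-9]{4})",
        "data": "123-45-6789",
        "start_offset": 0,
        "end_offset": 11,
    },
    {
        "type": "us_itin",
        "pattern": r"9[0-9][0-9]-(?:5[0-9]|6[0-5]|7[0-9]|8[0-8]|9[0-24-9])-[0-9]{4}",
        "data": "912-70-1234",
        "start_offset": 0,
        "end_offset": 11,
    },
    {
        "type": "uk_national_insurance_number",
        "pattern": r"[A-Z]{2} ?[0-9]{2} ?[0-9]{2} ?[0-9]{2} ?[A-D]",
        "data": "QQ012345C",
        "start_offset": 0,
        "end_offset": 9,
    },
]


def offsets_agree(entry):
    """Check that the first pattern match spans the declared offsets."""
    m = re.search(entry["pattern"], entry["data"])
    return m is not None and m.span() == (entry["start_offset"], entry["end_offset"])


results = {entry["type"]: offsets_agree(entry) for entry in ENTRIES}
```

Note that the UK test value `QQ012345C` is deliberately different from the excluded `QQ 12 34 56` form, so it is not filtered by the `additional_not_match` rule.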
