Commit f9487d1

Merge pull request #2980 from port-labs/PORT-16644-docs-for-github-repo-search
documentation for ingesting with repository search on Github Ocean
2 parents aba906e + 01313bf commit f9487d1

File tree

1 file changed

+94
-31
lines changed
  • docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean

docs/build-your-software-catalog/sync-data-to-catalog/git/github-ocean/github-ocean.md

Lines changed: 94 additions & 31 deletions
@@ -10,7 +10,6 @@ import GitHubResources from './\_github_exporter_supported_resources.mdx'
1010

1111
Port's GitHub self-hosted integration allows you to model GitHub resources in your software catalog and ingest data into them.
1212

13-
1413
## Overview
1514

1615
Here's what you can do with the GitHub integration:
@@ -36,10 +35,11 @@ organizations:
3635
- org2
3736
# ... rest of your mapping (repositoryType, resources, etc.) ...
3837
```
39-
</details>
4038

39+
</details>
4140

4241
:::caution Authentication and configuration requirements:
42+
4343
- **With classic PAT**:
4444
- Specify organizations in port mapping: `organizations: ["org1", "org2", "org3"]`
4545
- **With GitHub App or Fine-grained PAT**: Specify exactly one organization by setting the `githubOrganization` in the environment variables: `githubOrganization: "my-org"`
@@ -49,20 +49,17 @@ organizations:
4949
**Performance consideration:** Syncing multiple organizations will increase the number of API calls to GitHub and may slow down the integration. The more organizations you sync, the longer the resync time and the higher the API rate limit consumption. Consider syncing only the organizations you need.
5050
:::
5151

52-
5352
### Supported resources
5453

5554
The resources that can be ingested from GitHub into Port are listed below.
5655
It is possible to reference any field that appears in the API responses linked below in the mapping configuration.
5756

5857
<GitHubResources/>
5958

60-
6159
## Setup
6260

6361
To install the integration, see the [installation page](./installation).
6462

65-
6663
## Configuration
6764

6865
Port integrations use a [YAML mapping block](/build-your-software-catalog/customize-integrations/configure-mapping#configuration-structure) to ingest data from the third-party API into Port.
@@ -88,15 +85,15 @@ The `repositoryType` parameter filters which repositories are ingested. It corre
8885
<details>
8986
<summary><b>Possible values (Click to expand)</b></summary>
9087

91-
* `all` (default): All repositories accessible to the provided token.
92-
* `public`: Public repositories.
93-
* `private`: Private repositories.
94-
* `forks`: Only forked repositories.
95-
* `sources`: Only non-forked repositories.
88+
- `all` (default): All repositories accessible to the provided token.
89+
- `public`: Public repositories.
90+
- `private`: Private repositories.
91+
- `forks`: Only forked repositories.
92+
- `sources`: Only non-forked repositories.
9693
</details>
9794

9895
See the default mapping below for a usage example.
99-
96+
10097
### Default mapping configuration
10198

10299
This is the default mapping configuration for this integration:
@@ -105,19 +102,19 @@ This is the default mapping configuration for this integration:
105102
<summary><b>Default mapping configuration (Click to expand)</b></summary>
106103

107104
```yaml showLineNumbers
108-
repositoryType: 'all'
105+
repositoryType: "all"
109106
deleteDependentEntities: true
110107
createMissingRelatedEntities: true
111108
resources:
112109
- kind: organization
113110
selector:
114-
query: 'true'
111+
query: "true"
115112
port:
116113
entity:
117114
mappings:
118115
identifier: .login
119116
title: .login
120-
blueprint: '''githubOrganization'''
117+
blueprint: '"githubOrganization"'
121118
properties:
122119
login: .login
123120
id: .id
@@ -133,7 +130,7 @@ resources:
133130
description: if .description then .description else "" end
134131
- kind: repository
135132
selector:
136-
query: 'true'
133+
query: "true"
137134
port:
138135
entity:
139136
mappings:
@@ -151,7 +148,7 @@ resources:
151148
organization: .owner.login
152149
- kind: pull-request
153150
selector:
154-
query: 'true'
151+
query: "true"
155152
state: "open"
156153
port:
157154
entity:
@@ -171,20 +168,18 @@ resources:
171168
prNumber: ".id"
172169
link: ".html_url"
173170
leadTimeHours: >-
174-
(.created_at as $createdAt | .merged_at as $mergedAt |
175-
($createdAt | sub("\\..*Z$"; "Z") | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime) as $createdTimestamp |
176-
($mergedAt | if . == null then null else sub("\\..*Z$"; "Z") |
177-
strptime("%Y-%m-%dT%H:%M:%SZ") | mktime end) as $mergedTimestamp |
178-
if $mergedTimestamp == null then null else
179-
(((($mergedTimestamp - $createdTimestamp) / 3600) * 100 | floor) / 100) end)
171+
(.created_at as $createdAt | .merged_at as $mergedAt |
172+
($createdAt | sub("\\..*Z$"; "Z") | strptime("%Y-%m-%dT%H:%M:%SZ") | mktime) as $createdTimestamp |
173+
($mergedAt | if . == null then null else sub("\\..*Z$"; "Z") |
174+
strptime("%Y-%m-%dT%H:%M:%SZ") | mktime end) as $mergedTimestamp |
175+
if $mergedTimestamp == null then null else
176+
(((($mergedTimestamp - $createdTimestamp) / 3600) * 100 | floor) / 100) end)
180177
relations:
181178
repository: .__repository
182179
```
183180
184181
</details>
185182
186-
187-
188183
## Capabilities
189184
190185
### Ingest Git objects
@@ -199,7 +194,6 @@ The GitHub integration uses a YAML configuration file to describe the ETL proces
199194
200195
The GitHub integration automatically syncs organization-level data (available from **v3.0.0-beta**).
201196
202-
203197
:::tip Organization as parent entity
204198
Organizations serve as parent entities for repositories, teams, and other GitHub resources, helping you model your organizational structure in Port.
205199
:::
@@ -247,14 +241,14 @@ resources:
247241
```
248242
</details>
249243
250-
251244
:::tip Test your mapping
252-
After adding the `file` kind to your mapping configuration, click on the `Resync` button. When you open the mapping configuration again, you will see real examples of files fetched from your GitHub organization.
245+
After adding the `file` kind to your mapping configuration, click on the `Resync` button. When you open the mapping configuration again, you will see real examples of files fetched from your GitHub organization.
253246

254-
This will help you see what data is available to use in your `jq` expressions.
247+
This will help you see what data is available to use in your `jq` expressions.
255248
Click on the `Test mapping` button to test your mapping against the example data.
256249

257250
In any case, the structure of the available data looks like this:
251+
258252
<details>
259253
<summary><b>Available data example (click to expand)</b></summary>
260254

@@ -734,6 +728,7 @@ In any case, the structure of the available data looks like this:
734728
}
735729
}
736730
```
731+
737732
</details>
738733
:::
739734

@@ -783,7 +778,7 @@ For multi-document YAML files (a single file containing multiple YAML documents
783778

784779
You can use one of these methods to ingest multi-document YAML files:
785780

786-
1. Use the `itemsToParse` key to create multiple entities from such a file (see example above).
781+
1. Use the `itemsToParse` key to create multiple entities from such a file (see example above).
787782
2. Map the result to an `array` property.
788783

789784
:::tip Mixed YAML types
@@ -792,13 +787,13 @@ If you have both single-document and multi-document YAML files in your repositor
792787
```yaml
793788
itemsToParse: .content | if type== "object" then [.] else . end
794789
```
795-
:::
796790

791+
:::
797792

798793
#### Ingest raw file content
799794

800795
If you need to ingest the raw content of a file without parsing it, you can use the `skipParsing` key in your file selector.
801-
This is useful when you want to store the file content as a string or YAML property.
796+
This is useful when you want to store the file content as a string or YAML property.
802797

803798
When `skipParsing` is set to `true`, the file content will be kept in its original string format instead of being parsed into a JSON/YAML object.
804799

@@ -835,6 +830,74 @@ resources:
835830
- Only JSON and YAML formats are automatically parsed.
836831
Other file formats can be ingested as raw files, however, some special characters in the file (such as `\n`) may be processed and not preserved.
837832

833+
### Ingest repositories via search API
834+
835+
Port's GitHub integration allows you to ingest repositories using the [GitHub repository search API](https://docs.github.com/en/search-github/searching-on-github/searching-for-repositories). The search query lets you control precisely which repositories are ingested, for example by name or archived status.
836+
837+
<details>
838+
<summary><b>Example mapping (click to expand)</b></summary>
839+
840+
```yaml showLineNumbers
841+
resources:
842+
- kind: repository
843+
selector:
844+
query: "true"
845+
repoSearch:
846+
query: "dev in:name archived:false"
847+
port:
848+
entity:
849+
mappings:
850+
identifier: .name
851+
title: .name
852+
blueprint: '"githubRepository"'
853+
properties:
854+
description: if .description then .description else "" end
855+
visibility: if .private then "private" else "public" end
856+
defaultBranch: .default_branch
857+
readme: file://README.md
858+
url: .html_url
859+
language: if .language then .language else "" end
860+
- kind: pull-request
861+
selector:
862+
query: "true"
863+
repoSearch:
864+
query: "dev in:name archived:false" # repo search is also supported in pull requests.
865+
state: open
866+
port:
867+
entity:
868+
mappings:
869+
identifier: .head.repo.name + (.id|tostring)
870+
title: .title
871+
blueprint: '"githubPullRequest"'
872+
properties:
873+
creator: .user.login
874+
assignees: "[.assignees[].login]"
875+
reviewers: "[.requested_reviewers[].login]"
876+
status: .state
877+
closedAt: .closed_at
878+
updatedAt: .updated_at
879+
mergedAt: .merged_at
880+
createdAt: .created_at
881+
prNumber: .id
882+
link: .html_url
883+
relations:
884+
repository: .__repository
885+
```
886+
887+
</details>
888+
889+
The repository search feature supports all resource kinds except `team`, `user`, `file`, and `folder`. To learn more about repository search, see the [GitHub documentation](https://docs.github.com/en/search-github/searching-on-github/searching-for-repositories).
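
As a further illustration, the `repoSearch.query` string can combine several of GitHub's repository search qualifiers. The sketch below is a variation of the example mapping above, not a tested configuration: the qualifier values (`language:python`, `topic:backend`) are hypothetical placeholders, and only a minimal subset of properties is mapped.

```yaml
resources:
  - kind: repository
    selector:
      query: "true"
      repoSearch:
        # Hypothetical filter: non-archived Python repositories tagged with the "backend" topic.
        query: "language:python topic:backend archived:false"
    port:
      entity:
        mappings:
          identifier: .name
          title: .name
          blueprint: '"githubRepository"'
          properties:
            url: .html_url
            defaultBranch: .default_branch
```

Qualifiers are separated by spaces and all of them must match, so each one you add narrows the set of ingested repositories.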
890+
891+
#### Benefits
892+
893+
- **Granular filtering**: Precisely control which repositories are ingested.
894+
895+
#### Limitations
896+
897+
The repository search feature is subject to the limitations of the GitHub Search API:
898+
899+
- **Search results are limited to 1,000 items**: You can only ingest a maximum of 1,000 repositories per search query.
900+
- **Strict rate limits**: The Search API allows a maximum of 30 requests per minute for authenticated requests.
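
One possible way to work within the 1,000-item cap is to split a broad search into narrower queries that each return fewer results. The sketch below assumes that the mapping accepts more than one `repository` resource block, each with its own `repoSearch` query; the text above does not confirm this, and the `created:` date ranges are hypothetical values chosen only for illustration.

```yaml
resources:
  # Assumption: two blocks of the same kind, each with its own search query.
  - kind: repository
    selector:
      query: "true"
      repoSearch:
        # Hypothetical split: repositories created before 2022.
        query: "archived:false created:<2022-01-01"
    port:
      entity:
        mappings:
          identifier: .name
          title: .name
          blueprint: '"githubRepository"'
  - kind: repository
    selector:
      query: "true"
      repoSearch:
        # Hypothetical split: repositories created in 2022 or later.
        query: "archived:false created:>=2022-01-01"
    port:
      entity:
        mappings:
          identifier: .name
          title: .name
          blueprint: '"githubRepository"'
```

Each additional query also consumes Search API requests, so splitting trades broader result coverage against the rate limit.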
838901

839902
## Examples
840903
