Commit a46debe

feat: ✨ script
1 parent 150ec53 commit a46debe

2 files changed: +318 -1 lines changed

README.md

Lines changed: 128 additions & 1 deletion
@@ -1 +1,128 @@
1-
# PowerShell-File-Combiner-for-LLM-Input
1+
# PowerShell File Combiner for LLM Input

**PowerShell File Combiner** is a PowerShell script that aggregates a project's text-based files into a single text file, suitable as input for large-context-window LLMs such as Gemini 2.5 Pro or GPT-4. It writes a directory tree and then concatenates file contents, skipping binary files and any directories you exclude, which helps developers prepare a codebase for AI analysis.

## Features

- **Directory Tree Generation**: Writes a text tree of your project’s folder structure.
- **File Content Aggregation**: Combines text-based files into a single output file (`CombinedFile.txt` by default).
- **Binary File Exclusion**: Skips binary files (e.g., `.exe`, `.png`, `.pdf`), detected by extension or by a null byte in the first 4096 bytes.
- **Customizable Exclusions**: Exclude specific directories (e.g., `node_modules`, `bin`) and files by name.
- **Depth Control**: Limit how many subdirectory levels the directory tree descends with the configurable `MaxDepth` setting.
- **Error Handling**: Continues past files that cannot be read and logs the error to the output file.
- **UTF-8 Encoding**: Reads and writes files as UTF-8 for compatibility with diverse character sets.
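
The null-byte detection mentioned in the feature list can be sketched as follows. This is a minimal illustration rather than the script's exact code: the function name `Test-ContainsNullByte` and its `-SampleSize` parameter are hypothetical, chosen for this example. It reads the sample through `[System.IO.File]::OpenRead`, which is available in both Windows PowerShell 5.1 and PowerShell 7+.

```powershell
# Hypothetical helper: returns $true when a null byte appears in the first
# $SampleSize bytes of the file, a common heuristic for binary content.
function Test-ContainsNullByte {
    param (
        [Parameter(Mandatory = $true)][string]$FilePath,
        [int]$SampleSize = 4096
    )

    $Stream = [System.IO.File]::OpenRead($FilePath)
    try {
        $Buffer = New-Object byte[] $SampleSize
        $BytesRead = $Stream.Read($Buffer, 0, $Buffer.Length)
    }
    finally {
        $Stream.Dispose()
    }

    # Only inspect the bytes actually read; the rest of the buffer is zero-filled.
    for ($i = 0; $i -lt $BytesRead; $i++) {
        if ($Buffer[$i] -eq 0) { return $true }
    }
    return $false
}

# Usage: compare a plain-text file with one that embeds a null byte.
$TextFile   = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-text.txt'
$BinaryFile = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-binary.bin'
[System.IO.File]::WriteAllText($TextFile, 'Hello, world!')
[System.IO.File]::WriteAllBytes($BinaryFile, [byte[]](72, 0, 105))

Test-ContainsNullByte -FilePath $TextFile
Test-ContainsNullByte -FilePath $BinaryFile
```

The heuristic is deliberately simple: text saved as UTF-16 contains null bytes and would be classified as binary, which is usually acceptable when preparing LLM input.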

## Use Cases

- Prepare a codebase for AI-driven code analysis or documentation generation.
- Create a single text file for LLM input to summarize or refactor projects.
- Generate a project overview with directory structure and file contents.

## Installation

1. **Clone the Repository**:
   ```bash
   git clone https://github.com/your-username/PowerShell-FileCombiner.git
   ```
2. Ensure PowerShell 5.1 or later is installed (Windows PowerShell on Windows, or PowerShell Core on macOS/Linux).
3. Place the script (`Combine-Files.ps1`) in your project directory.

## Usage

Run the script in PowerShell from your project directory:

```powershell
.\Combine-Files.ps1
```

This will:

- Write a directory tree to `CombinedFile.txt`.
- Append the contents of all text-based files, excluding specified directories and binary files.
- Skip files like the script itself, `CombinedFile.txt`, and others defined in `$ExcludeFiles`.
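
To check which files ended up in the output, the `File:` header lines can be listed after a run. The following is a minimal sketch, assuming the `File: <full path>` header format that the script writes; the function name `Get-CombinedFileEntry` and the sample file are hypothetical and exist only for this illustration.

```powershell
# Hypothetical helper: lists the paths recorded in "File: <path>" header lines.
function Get-CombinedFileEntry {
    param (
        [Parameter(Mandatory = $true)][string]$Path
    )

    Select-String -Path $Path -Pattern '^File: (.+)$' |
        ForEach-Object { $_.Matches[0].Groups[1].Value }
}

# Usage with a small stand-in for CombinedFile.txt.
$Sample = Join-Path ([System.IO.Path]::GetTempPath()) 'CombinedFile-sample.txt'
@(
    '--------------------'
    'File: C:\project\README.md'
    '--------------------'
    '# Project'
    '--------------------'
    'File: C:\project\src\main.py'
    '--------------------'
    'def main():'
) | Set-Content -Path $Sample -Encoding UTF8

$Entries = @(Get-CombinedFileEntry -Path $Sample)
$Entries
```

A source file that itself contains a line starting with `File: ` would also be listed, so treat the result as a quick check rather than an exact inventory.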

### Configuration

Edit the script’s `#region Configuration` section to customize:

- **`$OutputPath`**: Set the output file name (default: `CombinedFile.txt`).
- **`$SkipBinaryFiles`**: Set to `$true` to skip binary files (default: `$true`).
- **`$BinaryFileExtensions`**: List of file extensions to treat as binary (e.g., `.exe`, `.png`).
- **`$ExcludeDirectories`**: Directory names to skip at any level (e.g., `node_modules`, `bin`).
- **`$ExcludeFiles`**: File names to exclude (e.g., `package-log.json`, the script itself).
- **`$MaxDepth`**: Limit the subdirectory depth of the directory tree (`-1` for unlimited). File contents are still collected from all subdirectories.

Example configuration:

```powershell
$OutputPath = "ProjectSummary.txt"
$ExcludeDirectories = @("node_modules", "dist", "build")
$MaxDepth = 3
```
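
Directory exclusion compares whole path components, so an entry such as `build` excludes files under `src\build\` but not under `src\builder\`. The sketch below illustrates that matching rule under stated assumptions: the function name `Test-InExcludedDirectory` is hypothetical, and it takes a relative directory path for brevity, whereas the script derives the relative path from each file's full path.

```powershell
# Hypothetical helper: $true when any component of the relative directory path
# equals one of the excluded directory names.
function Test-InExcludedDirectory {
    param (
        [Parameter(Mandatory = $true)][string]$RelativeDirectory,
        [string[]]$ExcludeDirectories = @()
    )

    # Split on both separator styles so the sketch behaves the same on any OS.
    $Components = $RelativeDirectory -split '[\\/]' | Where-Object { $_.Length -gt 0 }

    foreach ($Component in $Components) {
        # -contains compares strings case-insensitively by default.
        if ($ExcludeDirectories -contains $Component) { return $true }
    }
    return $false
}

$Excluded = @('node_modules', 'dist', 'build')
Test-InExcludedDirectory -RelativeDirectory 'web\node_modules\lodash' -ExcludeDirectories $Excluded
Test-InExcludedDirectory -RelativeDirectory 'src\builder' -ExcludeDirectories $Excluded
Test-InExcludedDirectory -RelativeDirectory 'Dist' -ExcludeDirectories $Excluded
```

Matching on components rather than substrings avoids accidentally dropping directories whose names merely contain an excluded name.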

## Example Output

For a project with structure:

```
project/
├── src/
│   ├── main.py
│   └── utils.py
├── README.md
└── bin/
    └── script.sh
```

Running the script produces a `CombinedFile.txt` similar to:

```
Directory Structure:
--------------------
`-- src
--------------------
File Contents:
--------------------
File: C:\project\README.md
--------------------
# Project
This is a sample project...
--------------------
File: C:\project\src\main.py
--------------------
def main():
    print("Hello, world!")
...
--------------------
File: C:\project\src\utils.py
--------------------
def helper():
    return True
...
```

## Requirements

- PowerShell 5.1 or later (Windows) or PowerShell Core (cross-platform).
- Write permissions in the output directory.
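
A quick way to confirm the version requirement before running the script is to inspect `$PSVersionTable`. This is a minimal sketch; the function name `Test-MinimumPowerShellVersion` is hypothetical, and the 5.1 default simply mirrors the requirement listed above.

```powershell
# Hypothetical pre-flight check: $true when the running PowerShell is at least
# the given major.minor version (defaults to 5.1).
function Test-MinimumPowerShellVersion {
    param (
        [int]$Major = 5,
        [int]$Minor = 1
    )

    $Current = $PSVersionTable.PSVersion
    return ($Current.Major -gt $Major) -or
           ($Current.Major -eq $Major -and $Current.Minor -ge $Minor)
}

if (Test-MinimumPowerShellVersion) {
    Write-Host "PowerShell $($PSVersionTable.PSVersion) meets the 5.1 minimum."
} else {
    Write-Warning "PowerShell $($PSVersionTable.PSVersion) is older than 5.1."
}
```

Comparing the `Major` and `Minor` properties directly keeps the check independent of the version object's type, which differs between Windows PowerShell and PowerShell 7+.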

## Contributing

Contributions are welcome! Please:

1. Fork the repository.
2. Create a feature branch (`git checkout -b feature/YourFeature`).
3. Commit your changes (`git commit -m "Add YourFeature"`).
4. Push to the branch (`git push origin feature/YourFeature`).
5. Open a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Keywords

PowerShell, file combiner, LLM input, code aggregation, project summarizer, text file generator, directory tree, AI code analysis, codebase preparation, text-based file processing

combine-files.ps1

Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
#region Configuration
$OutputPath = "CombinedFile.txt" # The name of the output file
$CurrentDirectory = Get-Location # The current directory where the script is executed
$SkipBinaryFiles = $true # Set to $true to skip binary files
$BinaryFileExtensions = @(".exe", ".dll", ".png", ".jpg", ".jpeg", ".gif", ".bmp", ".zip", ".rar", ".7z", ".pdf", ".mp3", ".mp4", ".avi", ".psd", ".class", ".jar", ".ico") # Extensions to consider as binary
$ExcludeDirectories = @("node_modules", "obj", "bin", "Migrations") # Directories to exclude
$ExcludeFiles = @("package-log.json", "Combine-Files.ps1", $OutputPath) # Files to exclude (including the script itself and the output)
$MaxDepth = -1 # Set to -1 to traverse the entire structure, or to a positive number (e.g., 3) to limit how many subfolder levels the tree descends

#endregion Configuration

#region Helper Functions

# Function to build the directory tree structure
function Build-DirectoryTree {
    param (
        [string]$Path,
        [int]$Depth = 0,
        [string]$Indent = ""
    )

    # Check max depth
    if ($MaxDepth -gt 0 -and $Depth -gt $MaxDepth) {
        return
    }

    # Check if directory is in exclude list
    if ($ExcludeDirectories -contains ([System.IO.Path]::GetFileName($Path))) {
        return
    }

    # Wrap in @() so .Count and indexing behave the same for zero, one, or many results
    $Directories = @(Get-ChildItem -Path $Path -Directory | Where-Object { -not ($ExcludeDirectories -contains $_.Name) })

    for ($i = 0; $i -lt $Directories.Count; $i++) {
        $Directory = $Directories[$i]
        $IsLast = ($i -eq ($Directories.Count - 1))

        $LinePrefix = $Indent
        if ($IsLast) {
            # Single quotes keep the backtick literal; in double quotes it is the escape character and would be dropped
            $LinePrefix += '`-- '
            $NextIndent = $Indent + "    "
        } else {
            $LinePrefix += "|-- "
            $NextIndent = $Indent + "|   "
        }

        Write-Host "$LinePrefix$($Directory.Name)"

        # Append the tree structure line to the output file
        "$LinePrefix$($Directory.Name)" | Out-File -FilePath $OutputPath -Append -Encoding UTF8

        # Recursively call the function for subdirectories
        Build-DirectoryTree -Path $Directory.FullName -Depth ($Depth + 1) -Indent $NextIndent
    }
}

# Function to check if a file is likely a binary file
function Is-BinaryFile {
    param (
        [string]$FilePath
    )

    $Extension = [System.IO.Path]::GetExtension($FilePath)

    if ($BinaryFileExtensions -contains $Extension) {
        return $true
    }

    # Check the first 4096 bytes for null bytes. If present, probably a binary. This is a simplified check.
    # A .NET stream is used instead of `Get-Content -Encoding Byte`, which is not accepted by
    # PowerShell 6+ and would make every file fall through to the catch block below.
    try {
        $Stream = [System.IO.File]::OpenRead($FilePath)
        try {
            $Buffer = New-Object byte[] 4096
            $BytesRead = $Stream.Read($Buffer, 0, $Buffer.Length)
        }
        finally {
            $Stream.Dispose()
        }

        # Only inspect the bytes actually read; the unused part of the buffer is zero-filled.
        for ($i = 0; $i -lt $BytesRead; $i++) {
            if ($Buffer[$i] -eq 0) {
                return $true
            }
        }
        return $false
    }
    catch {
        # If we can't read the file (e.g., permissions issues), assume it's not a text file.
        return $true
    }
}

#endregion Helper Functions
87+
88+
#region Main Script
89+
90+
# Clear the output file if it exists
91+
if (Test-Path -Path $OutputPath) {
92+
Remove-Item -Path $OutputPath
93+
}
94+
95+
# Write a header to the output file
96+
"Directory Structure:" | Out-File -FilePath $OutputPath -Encoding UTF8
97+
"--------------------" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
98+
99+
# Build and write the directory tree structure
100+
Build-DirectoryTree -Path $CurrentDirectory.Path
101+
102+
# Write a separator before the file contents start
103+
"--------------------" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
104+
"File Contents:" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
105+
"--------------------" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
106+
107+
# Get all files in the current directory and subdirectories
108+
$Files = Get-ChildItem -Path $CurrentDirectory.Path -File -Recurse
109+
110+
# Create an array to store the filtered files
111+
$FilteredFiles = @()

# Iterate through each file and apply the name and directory exclusions
foreach ($File in $Files) {
    # Check if the file name is in the exclude list
    if ($ExcludeFiles -contains $File.Name) {
        Write-Host "Excluding file by name: $($File.FullName)"
        continue # Skip to the next file
    }

    # Check if the file is located in an excluded directory
    $IsInExcludedDir = $false
    # Ensure the file's directory is a subdirectory of the current execution path and not the root itself.
    # This check is case-insensitive for paths, standard on Windows.
    if ($File.DirectoryName.StartsWith($CurrentDirectory.Path, [System.StringComparison]::OrdinalIgnoreCase) -and $File.DirectoryName.Length -gt $CurrentDirectory.Path.Length) {
        # Get the part of the file's directory path that is relative to the current execution path.
        # e.g., if $CurrentDirectory.Path is 'C:\Project' and $File.DirectoryName is 'C:\Project\src\bin',
        # $RelativeFileDirPath will be '\src\bin'.
        $RelativeFileDirPath = $File.DirectoryName.Substring($CurrentDirectory.Path.Length)

        # Normalize by removing any leading path separators (e.g., '\src\bin' becomes 'src\bin').
        $NormalizedRelativePath = $RelativeFileDirPath.TrimStart([System.IO.Path]::DirectorySeparatorChar)

        # Split the normalized relative path into its directory components (e.g., 'src', 'bin').
        # Empty components (e.g., from '\\' in the path) are filtered out.
        $PathComponents = $NormalizedRelativePath.Split([System.IO.Path]::DirectorySeparatorChar) | Where-Object { $_.Length -gt 0 }

        # Check if any of these path components match a name in the $ExcludeDirectories list (case-insensitive).
        foreach ($ExcludedDirNameFromList in $ExcludeDirectories) { # e.g., "bin"
            # Perform a case-insensitive check for the excluded directory name within the path components
            $PathContainsExcludedDir = $PathComponents | Where-Object { $_.Equals($ExcludedDirNameFromList, [System.StringComparison]::OrdinalIgnoreCase) } | Select-Object -First 1
            if ($null -ne $PathContainsExcludedDir) {
                Write-Host "Excluding file: $($File.FullName) (Reason: Its path contains an excluded directory component matching '$ExcludedDirNameFromList')"
                $IsInExcludedDir = $true
                break # Found an excluded component; no need to check further for this file.
            }
        }
    }

    if ($IsInExcludedDir) {
        continue # Skip to the next file
    }

    # If the file passes all exclusion checks, add it to the filtered files array
    $FilteredFiles += $File
}

# Write each remaining file to the output
foreach ($File in $FilteredFiles) {

    if ($SkipBinaryFiles -and (Is-BinaryFile -FilePath $File.FullName)) {
        Write-Host "Skipping binary file: $($File.FullName)"
        "Skipping binary file: $($File.FullName)" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
        continue # Skip to the next file
    }

    Write-Host "Processing file: $($File.FullName)"

    # Write the file path to the output file
    "--------------------" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
    "File: $($File.FullName)" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
    "--------------------" | Out-File -FilePath $OutputPath -Append -Encoding UTF8

    try {
        # Read the file content and append it to the output file
        Get-Content -Path $File.FullName -Encoding UTF8 -ErrorAction Stop | Out-File -FilePath $OutputPath -Append -Encoding UTF8

        # Add a blank line
        "" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
    }
    catch {
        Write-Warning "Error reading file $($File.FullName): $($_.Exception.Message)"
        "Error reading file $($File.FullName): $($_.Exception.Message)" | Out-File -FilePath $OutputPath -Append -Encoding UTF8
    }
}

Write-Host "Script completed. Combined file created at: $($OutputPath)"

#endregion Main Script
