Commit 8d57c49

Merge pull request #45 from VectorlyApp/packaging
v1.1.0: Restructure package and make it PyPI publishable
2 parents 5346612 + da0522e commit 8d57c49

38 files changed, 1656 insertions(+), 159 deletions(-)

.github/workflows/tests.yml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ jobs:
           key: ${{ runner.os }}-uv-${{ hashFiles('pyproject.toml') }}

       - name: Install dependencies
-        run: uv sync
+        run: uv sync --extra dev

       - name: Lint
         run: uv run pylint $(git ls-files '*.py')

README.md

Lines changed: 107 additions & 63 deletions
@@ -148,113 +148,151 @@ This substitutes parameter values and injects `auth_token` from cookies. The JSO

 - Python 3.12+
 - Google Chrome (stable)
-- [uv (Python package manager)](https://github.com/astral-sh/uv)
+- [uv (Python package manager)](https://github.com/astral-sh/uv) (optional, for development)
   - macOS/Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`
   - Windows (PowerShell): `iwr https://astral.sh/uv/install.ps1 -UseBasicParsing | iex`
 - OpenAI API key

-## Set up Your Environment 🔧
+## Installation

-### Linux
+### From PyPI (Recommended)
+
+**Note:** We recommend using a virtual environment to avoid dependency conflicts.
+
+```bash
+# Create and activate a virtual environment
+# Option 1: Using uv (recommended - handles Python version automatically)
+uv venv web-hacker-env
+source web-hacker-env/bin/activate # On Windows: web-hacker-env\Scripts\activate
+uv pip install web-hacker
+
+# Option 2: Using python3 (if Python 3.12+ is your default)
+python3 -m venv web-hacker-env
+source web-hacker-env/bin/activate # On Windows: web-hacker-env\Scripts\activate
+pip install web-hacker
+
+# Option 3: Using pyenv (if you need a specific Python version)
+pyenv install 3.12.3 # if not already installed
+pyenv local 3.12.3
+python -m venv web-hacker-env
+source web-hacker-env/bin/activate # On Windows: web-hacker-env\Scripts\activate
+pip install web-hacker
+
+# Troubleshooting: If pip is not found, recreate the venv or use:
+python -m ensurepip --upgrade # Install pip in the venv
+pip install web-hacker
+```
+
+### From Source (Development)
+
+For development or if you want the latest code:

 ```bash
-# 1) Clone and enter the repo
+# Clone the repository
 git clone https://github.com/VectorlyApp/web-hacker.git
 cd web-hacker

-# 2) Create & activate virtual environment (uv)
-uv venv --prompt web-hacker
-source .venv/bin/activate # Windows: .venv\\Scripts\\activate
+# Create and activate virtual environment
+python3 -m venv web-hacker-env
+source web-hacker-env/bin/activate # On Windows: web-hacker-env\Scripts\activate

-# 3) Install exactly what lockfile says
-uv sync
+# Install in editable mode
+pip install -e .

-# 4) Install in editable mode via uv (pip-compatible interface)
+# Or using uv (faster)
+uv venv web-hacker-env
+source web-hacker-env/bin/activate
 uv pip install -e .
-
-# 5) Configure environment
-cp .env.example .env # then edit values
-# or set directly
-export OPENAI_API_KEY="sk-..."
 ```

-### Windows
+## Quickstart (Easiest Way) 🚀

-```powershell
-# 1) Clone and enter the repo
-git clone https://github.com/VectorlyApp/web-hacker.git
-cd web-hacker
-
-# 2) Install uv (if not already installed)
-iwr https://astral.sh/uv/install.ps1 -UseBasicParsing | iex
+The fastest way to get started is using the quickstart script, which automates the entire workflow:

-# 3) Create & activate virtual environment (uv)
-uv venv --prompt web-hacker
-.venv\Scripts\activate
+```bash
+# Make sure web-hacker is installed
+pip install web-hacker

-# 4) Install in editable mode via uv (pip-compatible interface)
-uv pip install -e .
+# Set your OpenAI API key
+export OPENAI_API_KEY="sk-..."

-# 5) Configure environment
-copy .env.example .env # then edit values
-# or set directly
-$env:OPENAI_API_KEY="sk-..."
+# Run the quickstart script
+python quickstart.py
 ```

+The quickstart script will:
+1. ✅ Automatically launch Chrome in debug mode
+2. 📊 Start browser monitoring (you perform actions)
+3. 🤖 Discover routines from captured data
+4. 📝 Show you how to execute the discovered routine
+
+**Note:** The quickstart script is included in the repository. If you installed from PyPI, you can download it from the [GitHub repository](https://github.com/VectorlyApp/web-hacker/blob/main/quickstart.py).
+
 ## Launch Chrome in Debug Mode 🐞

-### Instructions for MacOS
+> 💡 **Tip:** The [quickstart script](#quickstart-easiest-way-🚀) automatically launches Chrome for you. You only need these manual instructions if you're not using the quickstart script.

-```
-# You should see JSON containing a webSocketDebuggerUrl like:
-# ws://127.0.0.1:9222/devtools/browser/*************************************# Create temporary chrome user directory
-mkdir $HOME/tmp
-mkdir $HOME/tmp/chrome
+### macOS
+
+```bash
+# Create temporary Chrome user directory
+mkdir -p $HOME/tmp/chrome

-# Launch Chrome app in debug mode (this exposes websocket for controlling and monitoring the browser)
+# Launch Chrome in debug mode
 "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
   --remote-debugging-address=127.0.0.1 \
   --remote-debugging-port=9222 \
   --user-data-dir="$HOME/tmp/chrome" \
-  '--remote-allow-origins=*' \
+  --remote-allow-origins=* \
   --no-first-run \
   --no-default-browser-check

-
-# Verify chrome is running in debug mode
+# Verify Chrome is running
 curl http://127.0.0.1:9222/json/version
-
-# You should see JSON containing a webSocketDebuggerUrl like:
-# ws://127.0.0.1:9222/devtools/browser/*************************************
 ```

-### Instructions for Windows
+### Windows

-```
+```powershell
 # Create temporary Chrome user directory
-New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\\tmp\\chrome" | Out-Null
+New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\tmp\chrome" | Out-Null

-# Locate Chrome (adjust path if Chrome is installed elsewhere)
-$chrome = "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe"
+# Locate Chrome
+$chrome = "C:\Program Files\Google\Chrome\Application\chrome.exe"
 if (!(Test-Path $chrome)) {
-  $chrome = "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe"
+  $chrome = "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"
 }

-# Launch Chrome in debug mode (exposes DevTools WebSocket)
+# Launch Chrome in debug mode
 & $chrome `
   --remote-debugging-address=127.0.0.1 `
   --remote-debugging-port=9222 `
-  --user-data-dir="$env:USERPROFILE\\tmp\\chrome" `
+  --user-data-dir="$env:USERPROFILE\tmp\chrome" `
   --remote-allow-origins=* `
   --no-first-run `
   --no-default-browser-check

-
-# Verify Chrome is running in debug mode
+# Verify Chrome is running
 (Invoke-WebRequest http://127.0.0.1:9222/json/version).Content
+```
+
+### Linux
+
+```bash
+# Create temporary Chrome user directory
+mkdir -p $HOME/tmp/chrome

-# You should see JSON containing a webSocketDebuggerUrl like:
-# ws://127.0.0.1:9222/devtools/browser/*************************************
+# Launch Chrome in debug mode (adjust path if needed)
+google-chrome \
+  --remote-debugging-address=127.0.0.1 \
+  --remote-debugging-port=9222 \
+  --user-data-dir="$HOME/tmp/chrome" \
+  --remote-allow-origins=* \
+  --no-first-run \
+  --no-default-browser-check
+
+# Verify Chrome is running
+curl http://127.0.0.1:9222/json/version
 ```

 ## HACK (reverse engineer) WEB APPS 👨🏻‍💻
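
The hunk above drops the README comments noting that `/json/version` should return JSON containing a `webSocketDebuggerUrl`. As a rough illustration of that verification step, the Python sketch below performs the same check as the `curl` and `Invoke-WebRequest` commands. The host and port come from the launch flags in the diff; the function names are made up for this example and are not part of `web-hacker`.

```python
import json
import urllib.request

# Host and port match the --remote-debugging-address/--remote-debugging-port flags above.
DEBUG_VERSION_URL = "http://127.0.0.1:9222/json/version"


def websocket_url_from_version(payload: bytes) -> str:
    """Extract webSocketDebuggerUrl from a /json/version response body."""
    info = json.loads(payload)
    url = info.get("webSocketDebuggerUrl")
    if not url:
        raise ValueError("no webSocketDebuggerUrl in response; is Chrome in debug mode?")
    return url


def fetch_websocket_url(version_url: str = DEBUG_VERSION_URL, timeout: float = 3.0) -> str:
    """Query the debug endpoint and return the browser-level WebSocket URL.

    Raises OSError (for example urllib.error.URLError) if nothing is listening.
    """
    with urllib.request.urlopen(version_url, timeout=timeout) as response:
        return websocket_url_from_version(response.read())
```

With Chrome launched as shown in the diff, calling `fetch_websocket_url()` should return a `ws://127.0.0.1:9222/devtools/browser/...` URL; a connection error suggests Chrome is not running with the debug port open.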
@@ -265,6 +303,12 @@ The reverse engineering process follows a simple three-step workflow:
 2. **Discover** — Let the AI agent analyze the captured data and generate a reusable Routine
 3. **Execute** — Run the discovered Routine with different parameters to automate the task

+### Quick Start (Recommended)
+
+**Easiest way:** Use the [quickstart script](#quickstart-easiest-way-🚀) which automates the entire workflow.
+
+### Manual Workflow (Step-by-Step)
+
 Each step is detailed below. Start by ensuring Chrome is running in debug mode (see [Launch Chrome in Debug Mode](#launch-chrome-in-debug-mode-🐞) above).

 ### 0. Legal & Privacy Notice ⚠️
@@ -277,7 +321,7 @@ Use the CDP browser monitor to block trackers and capture network, storage, and
 **Run this command to start monitoring:**

 ```bash
-python scripts/browser_monitor.py --host 127.0.0.1 --port 9222 --output-dir ./cdp_captures --url about:blank --incognito
+web-hacker-monitor --host 127.0.0.1 --port 9222 --output-dir ./cdp_captures --url about:blank --incognito
 ```

 The script will open a new tab (starting at `about:blank`). Navigate to your target website, then manually perform the actions you want to automate (e.g., search, login, export report). Keep Chrome focused during this process. Press `Ctrl+C` and the script will consolidate transactions and produce a HAR automatically.
@@ -313,7 +357,7 @@ Use the **routine-discovery pipeline** to analyze captured data and synthesize a

 **Linux/macOS (bash):**
 ```bash
-python scripts/discover_routines.py \
+web-hacker-discover \
   --task "Recover API endpoints for searching for trains and their prices" \
   --cdp-captures-dir ./cdp_captures \
   --output-dir ./routine_discovery_output \
@@ -323,7 +367,7 @@ python scripts/discover_routines.py \
 **Windows (PowerShell):**
 ```powershell
 # Simple task (no quotes inside):
-python scripts/discover_routines.py --task "Recover the API endpoints for searching for trains and their prices" --cdp-captures-dir ./cdp_captures --output-dir ./routine_discovery_output --llm-model gpt-5
+web-hacker-discover --task "Recover the API endpoints for searching for trains and their prices" --cdp-captures-dir ./cdp_captures --output-dir ./routine_discovery_output --llm-model gpt-5
 ```

 **Example tasks:**
@@ -372,21 +416,21 @@ Run the example routine:
 ```bash
 # Using a parameters file:

-python scripts/execute_routine.py \
+web-hacker-execute \
   --routine-path example_routines/amtrak_one_way_train_search_routine.json \
   --parameters-path example_routines/amtrak_one_way_train_search_input.json

 # Or pass parameters inline (JSON string):

-python scripts/execute_routine.py \
+web-hacker-execute \
   --routine-path example_routines/amtrak_one_way_train_search_routine.json \
   --parameters-dict '{"origin": "BOS", "destination": "NYP", "departureDate": "2026-03-22"}'
 ```

 Run a discovered routine:

 ```bash
-python scripts/execute_routine.py \
+web-hacker-execute \
   --routine-path routine_discovery_output/routine.json \
   --parameters-path routine_discovery_output/test_parameters.json
 ```
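
The inline form above hand-quotes a JSON string for `--parameters-dict`, which is easy to get wrong across shells. The sketch below is one possible way to build the same invocation from Python, using only the command name and flags shown in the diff; `build_execute_command` is a hypothetical helper, not something the package is known to provide.

```python
import json
import shlex


def build_execute_command(routine_path: str, parameters: dict) -> list[str]:
    """Build an argv list for web-hacker-execute with inline JSON parameters."""
    return [
        "web-hacker-execute",
        "--routine-path", routine_path,
        # json.dumps produces the JSON string the --parameters-dict flag expects.
        "--parameters-dict", json.dumps(parameters),
    ]


argv = build_execute_command(
    "example_routines/amtrak_one_way_train_search_routine.json",
    {"origin": "BOS", "destination": "NYP", "departureDate": "2026-03-22"},
)

# shlex.join shows a shell-safe rendering of the same command for copy/paste.
print(shlex.join(argv))
```

Passing `argv` as a list to `subprocess.run(argv, check=True)` would avoid shell quoting entirely, assuming the package is installed and the `web-hacker-execute` command is on `PATH`.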

pyproject.toml

Lines changed: 56 additions & 7 deletions
@@ -6,22 +6,71 @@ build-backend = "hatchling.build"

 [project]
 name = "web-hacker"
-version = "0.1.0"
-description = " Reverse engineer any web app!"
+version = "1.1.0"
+description = "SDK for reverse engineering web apps"
 readme = "README.md"
-requires-python = ">=3.12.3,<3.13" # pinning to 3.12.x
+requires-python = ">=3.12.3,<3.13"
+license = {text = "Apache-2.0"}
+authors = [
+    {name = "Vectorly", email = "contact@vectorly.app"}
+]
+keywords = [
+    "web-scraping",
+    "automation",
+    "cdp",
+    "chrome-devtools",
+    "api-discovery",
+    "reverse-engineering",
+    "browser-automation",
+    "sdk",
+]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Intended Audience :: Developers",
+    "License :: OSI Approved :: Apache Software License",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.12",
+    "Topic :: Software Development :: Libraries :: Python Modules",
+    "Topic :: Internet :: WWW/HTTP :: Browsers",
+    "Topic :: Software Development :: Testing",
+]
 dependencies = [
-    "ipykernel>=6.29.5",
     "openai>=2.6.1",
     "pydantic>=2.11.4",
-    "pylint>=3.0.0",
-    "pytest>=8.3.5",
     "python-dotenv>=1.2.1",
     "requests>=2.31.0",
     "websockets>=15.0.1",
     "websocket-client>=1.6.0",
     "beautifulsoup4>=4.14.2",
 ]

+[project.optional-dependencies]
+dev = [
+    "ipykernel>=6.29.5",
+    "pylint>=3.0.0",
+    "pytest>=8.3.5",
+]
+
+[project.scripts]
+web-hacker-monitor = "web_hacker.scripts.browser_monitor:main"
+web-hacker-discover = "web_hacker.scripts.discover_routines:main"
+web-hacker-execute = "web_hacker.scripts.execute_routine:main"
+
+[project.urls]
+Homepage = "https://www.vectorly.app"
+Documentation = "https://github.com/VectorlyApp/web-hacker#readme"
+Repository = "https://github.com/VectorlyApp/web-hacker"
+Issues = "https://github.com/VectorlyApp/web-hacker/issues"
+
 [tool.hatch.build.targets.wheel]
-packages = ["src"]
+packages = ["web_hacker"]
+
+[tool.hatch.build.targets.sdist]
+include = [
+    "/web_hacker",
+    "/tests",
+    "/example_routines",
+    "README.md",
+    "LICENSE",
+    "pyproject.toml",
+]
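
The `[project.scripts]` table is what turns `python scripts/browser_monitor.py` into the `web-hacker-monitor` command used in the README hunks: each value is a `module:attribute` reference that the installer wraps in a small launcher. The sketch below approximates how such a reference resolves to a callable. It is not the installer's actual code, and it uses a standard-library target (`json:dumps`) as a stand-in so that it runs without `web-hacker` installed.

```python
import importlib
from typing import Callable


def resolve_entry_point(spec: str) -> Callable:
    """Resolve a 'package.module:attr' reference to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    if not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    target = importlib.import_module(module_name)
    # The attribute part may be dotted, e.g. "SomeClass.method".
    for part in attr_path.split("."):
        target = getattr(target, part)
    return target


# Stand-in target from the standard library; with the package installed,
# "web_hacker.scripts.browser_monitor:main" would be resolved the same way.
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))
```

A generated console script typically does little more than resolve the reference this way and exit with the return value of the callable, which is why each of the three modules needs to expose a `main` function.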
