33 changes: 33 additions & 0 deletions IMPLEMENTATION_SUMMARY.md
@@ -0,0 +1,33 @@
# Implementation Summary: Multi-Tab/Window Support

This document summarizes the implementation of multi-tab/window support in the MCP Selenium server.

## Changes

1. **Added five new tools to `src/lib/server.js` for window management:**
   * `get_window_handles`: Retrieves all active window handles.
   * `get_current_window_handle`: Gets the handle of the currently focused window.
   * `switch_to_window`: Switches focus to a specific window by its handle.
   * `switch_to_latest_window`: Switches to the most recently opened window.
   * `close_current_window`: Closes the currently active window without ending the session.

2. **Created `docs/MULTI_TAB_USAGE.md`:**
   * Provides detailed usage examples and best practices for the new window management tools.

3. **Created `docs/CHANGELOG_TAB_SUPPORT.md`:**
   * Documents the new features and explains how they remain backward compatible.

4. **Updated `README.md`:**
   * Added a new section documenting the multi-tab/window management tools.

## Testing Guidance

To ensure the new tools function correctly, follow these testing steps:

1. **Start a browser session** using the `start_browser` tool.
2. **Open a new tab/window** by clicking a link that opens in a new tab (e.g., `<a href="..." target="_blank">`).
3. **Use `get_window_handles`** to verify that multiple handles are returned.
4. **Use `switch_to_latest_window`** to switch to the new tab.
5. **Perform an action** (e.g., `get_element_text`) to confirm the context has switched.
6. **Use `close_current_window`** to close the new tab.
7. **Switch back to the original tab** with `switch_to_window`, then verify that it is still active and responsive.
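
The steps above can be sketched as a single async function. This is only an illustration under stated assumptions: `checkTabRoundTrip` and the `openNewTab` callback are hypothetical names, and `driver` is assumed to expose the same selenium-webdriver methods that `src/lib/server.js` calls (`getWindowHandle`, `getAllWindowHandles`, `switchTo().window`, `close`).

```javascript
// Hypothetical smoke check; not part of the server.
// `driver` is assumed to behave like a selenium-webdriver WebDriver.
// `openNewTab` is an assumed callback that triggers the new tab,
// for example by clicking a target="_blank" link.
async function checkTabRoundTrip(driver, openNewTab) {
  const original = await driver.getWindowHandle();
  const before = await driver.getAllWindowHandles();

  await openNewTab();
  const after = await driver.getAllWindowHandles();
  if (after.length !== before.length + 1) {
    throw new Error(`Expected one new window, got ${after.length - before.length}`);
  }

  // Mirrors switch_to_latest_window: take the last handle in the list.
  const latest = after[after.length - 1];
  if (latest === original) {
    throw new Error('Last handle is the original tab; pick the new handle explicitly');
  }
  await driver.switchTo().window(latest);

  // Mirrors close_current_window, then returns to the original tab.
  await driver.close();
  await driver.switchTo().window(original);

  const remaining = await driver.getAllWindowHandles();
  return { original, closed: latest, remaining };
}
```

The guard on `latest === original` is there because the function relies on the new tab being last in the handle list, which may not hold for every driver.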
73 changes: 73 additions & 0 deletions README.md
@@ -19,6 +19,7 @@ A Model Context Protocol (MCP) server implementation for Selenium WebDriver, ena
- Handle keyboard input
- Take screenshots
- Upload files
- Window Management
- Support for headless mode

## Supported Browsers
@@ -433,6 +434,78 @@ None required
}
```

### get_window_handles
Gets all window handles.

**Parameters:**
None required

**Example:**
```json
{
    "tool": "get_window_handles",
    "parameters": {}
}
```
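
In this change, `src/lib/server.js` returns the handles as a single text line of the form `Window handles: <h1>, <h2>`. A client that needs the individual handles could split that text roughly as follows. `parseWindowHandles` is a hypothetical client-side helper, not something the server provides.

```javascript
// Hypothetical client-side helper; not part of the server.
// Assumes the reply text has the form "Window handles: <h1>, <h2>, ...".
function parseWindowHandles(text) {
  const prefix = 'Window handles: ';
  if (!text.startsWith(prefix)) {
    // Covers error replies such as "Error getting window handles: ...".
    throw new Error(`Unexpected reply: ${text}`);
  }
  return text
    .slice(prefix.length)
    .split(', ')
    .filter((handle) => handle.length > 0);
}

const handles = parseWindowHandles('Window handles: CDwindow-ABC, CDwindow-DEF');
console.log(handles); // [ 'CDwindow-ABC', 'CDwindow-DEF' ]
```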

### get_current_window_handle
Gets the current window handle.

**Parameters:**
None required

**Example:**
```json
{
    "tool": "get_current_window_handle",
    "parameters": {}
}
```

### switch_to_window
Switches to a window by its handle.

**Parameters:**
- `handle` (required): The handle of the window to switch to
  - Type: string

**Example:**
```json
{
    "tool": "switch_to_window",
    "parameters": {
        "handle": "CDwindow-ABC"
    }
}
```

### switch_to_latest_window
Switches to the last window handle reported by the driver, which the server treats as the most recently opened window.

**Parameters:**
None required

**Example:**
```json
{
    "tool": "switch_to_latest_window",
    "parameters": {}
}
```

### close_current_window
Closes the currently active window without ending the session. Switch to another window afterward to continue working.

**Parameters:**
None required

**Example:**
```json
{
    "tool": "close_current_window",
    "parameters": {}
}
```

## License

19 changes: 19 additions & 0 deletions docs/CHANGELOG_TAB_SUPPORT.md
@@ -0,0 +1,19 @@
# Changelog: Multi-Tab/Window Support

## New Features

- **Added five new tools for multi-tab/window management:**
  - `get_window_handles`: Retrieves all active window handles.
  - `get_current_window_handle`: Gets the handle of the currently focused window.
  - `switch_to_window`: Switches focus to a specific window by its handle.
  - `switch_to_latest_window`: Switches to the most recently opened window.
  - `close_current_window`: Closes the currently active window.

## Backward Compatibility

This update is fully backward compatible. Existing tools are unaffected.

- The `close_session` tool still closes the entire browser session, including all tabs.
- All element interaction tools (`click_element`, `send_keys`, etc.) operate on the currently focused tab, preserving existing behavior.

Workflows that do not involve multiple tabs will continue to function as before without any changes.
78 changes: 78 additions & 0 deletions docs/MULTI_TAB_USAGE.md
@@ -0,0 +1,78 @@
# Multi-Tab/Window Usage Guide

This guide provides examples and best practices for using the new multi-tab/window management tools.

## Available Tools

- `get_window_handles`: Retrieves all active window handles.
- `get_current_window_handle`: Gets the handle of the currently focused window.
- `switch_to_window`: Switches focus to a specific window by its handle.
- `switch_to_latest_window`: Switches to the most recently opened window.
- `close_current_window`: Closes the currently active window.

## Example Workflow

Here’s a common workflow for handling multiple tabs:

1. **Start a browser and navigate to a page.**
```json
{
    "tool": "start_browser",
    "parameters": {
        "browser": "chrome"
    }
}
```

```json
{
    "tool": "navigate",
    "parameters": {
        "url": "https://example.com"
    }
}
```

2. **Click a link that opens a new tab.**
```json
{
    "tool": "click_element",
    "parameters": {
        "by": "css",
        "value": "a[target='_blank']"
    }
}
```

3. **Get all window handles to see the new tab's handle.**
```json
{
    "tool": "get_window_handles",
    "parameters": {}
}
```
*Output might look like: `Window handles: CDwindow-ABC, CDwindow-DEF`*

4. **Switch to the new tab.**
You can either switch by the specific handle or use `switch_to_latest_window`.
```json
{
    "tool": "switch_to_latest_window",
    "parameters": {}
}
```

5. **Perform actions in the new tab.**
```json
{
    "tool": "get_element_text",
    "parameters": {
        "by": "css",
        "value": "h1"
    }
}
```

6. **Close the new tab and switch back to the original.**
```json
{
    "tool": "close_current_window",
    "parameters": {}
}
```

```json
{
    "tool": "switch_to_window",
    "parameters": {
        "handle": "CDwindow-ABC"
    }
}
```

## Best Practices

- **Always get handles after opening a new tab:** Don't assume the handle format. Call `get_window_handles` to get the correct identifiers.
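- **Use `switch_to_latest_window` for simplicity:** It switches to the last handle returned by the driver, so you don't need to manage handles manually. The WebDriver specification does not guarantee the order of handles, so if the wrong tab is selected, compare the handles from before and after opening the tab and use `switch_to_window` instead.
- **Be mindful of context:** After closing a tab, the driver does not switch to another window automatically, and further commands fail with a "no such window" error. Always switch back to a valid window handle to continue working.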
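
One way to follow the first two practices without relying on handle order is to compare the handle lists from before and after the tab opens. The sketch below assumes the handles have already been collected as arrays of strings. `findNewHandles` is a hypothetical helper name, not a tool exposed by the server.

```javascript
// Hypothetical helper: returns the handles that appear in `after`
// but not in `before`, regardless of the order the driver reports them.
function findNewHandles(before, after) {
  const known = new Set(before);
  return after.filter((handle) => !known.has(handle));
}

// Example with made-up handle values:
const before = ['CDwindow-ABC'];
const after = ['CDwindow-DEF', 'CDwindow-ABC'];
console.log(findNewHandles(before, after)); // [ 'CDwindow-DEF' ]
```

The resulting handle can then be passed to `switch_to_window`.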
108 changes: 107 additions & 1 deletion src/lib/server.js
@@ -422,6 +422,112 @@ server.tool(
}
);

// Window Management Tools
server.tool(
    "get_window_handles",
    "gets all window handles",
    {},
    async () => {
        try {
            const driver = getDriver();
            const handles = await driver.getAllWindowHandles();
            return {
                content: [{ type: 'text', text: `Window handles: ${handles.join(', ')}` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error getting window handles: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "get_current_window_handle",
    "gets the current window handle",
    {},
    async () => {
        try {
            const driver = getDriver();
            const handle = await driver.getWindowHandle();
            return {
                content: [{ type: 'text', text: `Current window handle: ${handle}` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error getting current window handle: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "switch_to_window",
    "switches to a window by its handle",
    {
        handle: z.string().describe("The handle of the window to switch to")
    },
    async ({ handle }) => {
        try {
            const driver = getDriver();
            await driver.switchTo().window(handle);
            return {
                content: [{ type: 'text', text: `Switched to window: ${handle}` }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error switching to window: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "switch_to_latest_window",
    "switches to the most recently opened window",
    {},
    async () => {
        try {
            const driver = getDriver();
            const handles = await driver.getAllWindowHandles();
            if (handles.length > 0) {
                const latestHandle = handles[handles.length - 1];
                await driver.switchTo().window(latestHandle);
                return {
                    content: [{ type: 'text', text: `Switched to latest window: ${latestHandle}` }]
                };
            } else {
                return {
                    content: [{ type: 'text', text: 'No windows available to switch to' }]
                };
            }
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error switching to latest window: ${e.message}` }]
            };
        }
    }
);

server.tool(
    "close_current_window",
    "closes the currently active window",
    {},
    async () => {
        try {
            const driver = getDriver();
            await driver.close();
            return {
                content: [{ type: 'text', text: 'Current window closed' }]
            };
        } catch (e) {
            return {
                content: [{ type: 'text', text: `Error closing current window: ${e.message}` }]
            };
        }
    }
);

server.tool(
"close_session",
"closes the current browser session",
@@ -477,4 +583,4 @@ process.on('SIGINT', cleanup);

// Start the server
const transport = new StdioServerTransport();
await server.connect(transport);