Commit fdf6cdb

when transcribing, append to the chat box

Author: ci-bot
1 parent 954d98b commit fdf6cdb

2 files changed: +188 -11 lines changed
Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
# Audio Transcription UX Improvement

## Summary

Modified the audio transcription behavior in the Remix AI Assistant to append transcribed text to the input box instead of immediately executing it as a prompt. This allows users to review and edit the transcription before sending.

## Changes Made

### File Modified
- `libs/remix-ui/remix-ai-assistant/src/components/remix-ui-remix-ai-assistant.tsx`

### Previous Behavior
When audio transcription completed, the transcribed text was immediately sent as a prompt to the AI assistant:

```typescript
onTranscriptionComplete: async (text) => {
  if (sendPromptRef.current) {
    await sendPromptRef.current(text)
    trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPrompt', name: 'SpeechToTextPrompt', isClick: true })
  }
}
```

### New Behavior
When audio transcription completes, the behavior depends on whether the transcription ends with "run":

- **If transcription ends with "run":** The prompt is automatically executed (with "run" removed)
- **If transcription does NOT end with "run":** The text is appended to the input box for review

```typescript
onTranscriptionComplete: async (text) => {
  // Check if transcription ends with "run" (case-insensitive)
  const trimmedText = text.trim()
  const endsWithRun = /\brun\b\s*$/i.test(trimmedText)

  if (endsWithRun) {
    // Remove "run" from the end and execute the prompt
    const promptText = trimmedText.replace(/\brun\b\s*$/i, '').trim()
    if (promptText) {
      await sendPrompt(promptText)
      trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPromptAutoRun', name: 'SpeechToTextPromptAutoRun', isClick: true })
    }
  } else {
    // Append transcription to the input box for user review
    setInput(prev => prev ? `${prev} ${text}`.trim() : text)
    // Focus the textarea so user can review/edit
    if (textareaRef.current) {
      textareaRef.current.focus()
    }
    trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPrompt', name: 'SpeechToTextPrompt', isClick: true })
  }
}
```

### Code Cleanup
Removed unused code that was only needed for the previous immediate execution behavior:

1. **Removed ref declaration:**
   ```typescript
   // Ref to hold the sendPrompt function for audio transcription callback
   const sendPromptRef = useRef<((prompt: string) => Promise<void>) | null>(null)
   ```

2. **Removed useEffect that updated the ref:**
   ```typescript
   // Update ref for audio transcription callback
   useEffect(() => {
     sendPromptRef.current = sendPrompt
   }, [sendPrompt])
   ```

## Benefits

1. **User Control:** Users can now review and edit the transcription before sending
2. **Error Correction:** If the speech-to-text makes mistakes, users can fix them
3. **Better UX:** Users can append multiple transcriptions or combine voice with typing
4. **Flexibility:** Transcriptions can be modified to add context or clarification
5. **Voice Command Execution:** Users can say "run" at the end to immediately execute the prompt
6. **Hands-free Submission:** Once recording stops, the "run" command submits the prompt without the keyboard or the send button

## User Flow

### Standard Flow (Without "run")
1. User clicks the microphone button to start recording
2. User speaks their prompt (e.g., "Explain how to create an ERC-20 token")
3. User clicks the microphone button again to stop recording
4. **Transcribing status** is shown while processing
5. **Transcribed text appears in the input box** (NEW)
6. Input textarea is automatically focused (NEW)
7. User can review, edit, or append to the transcription (NEW)
8. User clicks send button or presses Enter to submit the prompt

### Auto-Execute Flow (With "run")
1. User clicks the microphone button to start recording
2. User speaks their prompt ending with "run" (e.g., "Explain how to create an ERC-20 token run")
3. User clicks the microphone button again to stop recording
4. **Transcribing status** is shown while processing
5. **Prompt is automatically executed** with "run" removed (NEW)
6. AI response begins streaming immediately (hands-free execution)

## Implementation Details

### "Run" Detection Logic
The implementation uses a word-boundary regex to detect if the transcription ends with "run":

```typescript
const endsWithRun = /\brun\b\s*$/i.test(trimmedText)
```

Key features:
- **Case-insensitive:** Matches "run", "Run", "RUN", etc.
- **Word boundary:** Only matches "run" as a complete word, not as part of another word
- **Trailing whitespace:** Ignores any spaces after "run"

Examples:
- ✅ "Explain ERC-20 tokens run" → Auto-executes
- ✅ "Help me debug this run" → Auto-executes
- ✅ "Create a contract RUN" → Auto-executes (case-insensitive)
- ❌ "Explain running contracts" → Does NOT auto-execute (word boundary)
- ❌ "Tell me about runtime" → Does NOT auto-execute (word boundary)
122+
### Smart Text Appending
123+
The implementation intelligently handles existing input (when NOT auto-executing):
124+
- If input is empty: Sets the transcription as the input
125+
- If input exists: Appends the transcription with a space separator
126+
- Always trims whitespace for clean formatting
127+
128+
```typescript
129+
setInput(prev => prev ? `${prev} ${text}`.trim() : text)
130+
```
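
To make the two branches of that expression concrete, here is a small sketch that lifts the updater into a standalone function. `appendTranscription` is a hypothetical name used only for this illustration; the expression inside it is the one from the change.

```typescript
// Hypothetical pure version of the updater passed to setInput in the change.
function appendTranscription(prev: string, text: string): string {
  // Same expression as the component: join with one space, then trim the ends.
  return prev ? `${prev} ${text}`.trim() : text
}

// JSON.stringify makes leading and trailing spaces visible in the output.
console.log(JSON.stringify(appendTranscription('', 'Explain ERC-20')))
console.log(JSON.stringify(appendTranscription('Explain', 'ERC-20 tokens')))
console.log(JSON.stringify(appendTranscription('Explain ', ' ERC-20 ')))
console.log(JSON.stringify(appendTranscription('', ' ERC-20 ')))
```

Two details follow from the expression as written: `trim()` only affects the ends of the combined string, so extra spaces between the existing input and the transcription are kept, and when the input is empty the transcription is stored without trimming.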
131+
132+
### Auto-focus
133+
After transcription (when NOT auto-executing), the textarea is automatically focused so the user can immediately start editing:
134+
135+
```typescript
136+
if (textareaRef.current) {
137+
textareaRef.current.focus()
138+
}
139+
```
140+
141+
## Testing Recommendations
142+
143+
### Standard Transcription (Without "run")
144+
1. Test basic transcription flow - text appears in input box
145+
2. Test appending multiple transcriptions
146+
3. Test transcription with existing text in input
147+
4. Test keyboard navigation after transcription
148+
5. Test error handling (transcription failures)
149+
6. Verify textarea focus behavior
150+
151+
### Auto-Execute with "run"
152+
1. Test transcription ending with "run" - should auto-execute
153+
2. Test case-insensitivity - "run", "Run", "RUN" should all work
154+
3. Test word boundary - "running" or "runtime" should NOT trigger auto-execute
155+
4. Test "run" removal - verify the word "run" is removed from the prompt
156+
5. Test empty prompt after "run" removal - should not execute
157+
6. Verify prompt execution starts immediately after transcription
158+
159+
### Edge Cases
160+
1. Test "run" with trailing spaces - "prompt run " should work
161+
2. Test "run" as the only word - should not execute (empty prompt)
162+
3. Test transcription with "run" in the middle - "run a test" should NOT auto-execute
163+
4. Test multiple spaces before "run" - "prompt run" should work
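
The "run" cases listed above lend themselves to a table-driven check. The sketch below assumes a hypothetical `route` function that mirrors the branching of the callback without React or the real `sendPrompt`. It throws plain errors instead of using a test framework, since the change does not say which one the project uses.

```typescript
// Hypothetical stand-in for the callback's branching, for illustration only.
type Route =
  | { kind: 'send'; prompt: string } // auto-execute branch
  | { kind: 'append'; text: string } // review-in-input branch
  | { kind: 'ignore' }               // "run" alone: nothing left to send

const RUN_SUFFIX = /\brun\b\s*$/i

function route(text: string): Route {
  const trimmed = text.trim()
  if (!RUN_SUFFIX.test(trimmed)) return { kind: 'append', text }
  const prompt = trimmed.replace(RUN_SUFFIX, '').trim()
  return prompt ? { kind: 'send', prompt } : { kind: 'ignore' }
}

// Each row pairs a transcription with the branch it is expected to take.
const cases: Array<[string, Route]> = [
  ['prompt run ', { kind: 'send', prompt: 'prompt' }],
  ['prompt   run', { kind: 'send', prompt: 'prompt' }],
  ['prompt RUN', { kind: 'send', prompt: 'prompt' }],
  ['run', { kind: 'ignore' }],
  ['run a test', { kind: 'append', text: 'run a test' }],
  ['Explain running contracts', { kind: 'append', text: 'Explain running contracts' }]
]

function runCases(): number {
  for (const [input, expected] of cases) {
    const actual = route(input)
    if (JSON.stringify(actual) !== JSON.stringify(expected)) {
      throw new Error(`route(${JSON.stringify(input)}) returned ${JSON.stringify(actual)}`)
    }
  }
  return cases.length
}

console.log(`${runCases()} cases checked`)
```

Keeping the branching in a pure function like this is one possible design choice; it lets the edge cases be checked without recording audio or mounting the component.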

## Related Files

- Main component: `libs/remix-ui/remix-ai-assistant/src/components/remix-ui-remix-ai-assistant.tsx`
- Transcription hook: `libs/remix-ui/remix-ai-assistant/src/hooks/useAudioTranscription.tsx`
- Prompt input area: `libs/remix-ui/remix-ai-assistant/src/components/prompt.tsx`

libs/remix-ui/remix-ai-assistant/src/components/remix-ui-remix-ai-assistant.tsx

Lines changed: 19 additions & 11 deletions
@@ -78,20 +78,33 @@ export const RemixUiRemixAiAssistant = React.forwardRef<
   const userHasScrolledRef = useRef(false)
   const lastMessageCountRef = useRef(0)
 
-  // Ref to hold the sendPrompt function for audio transcription callback
-  const sendPromptRef = useRef<((prompt: string) => Promise<void>) | null>(null)
-
   // Audio transcription hook
   const {
     isRecording,
     isTranscribing,
-    error: transcriptionError,
+    error: _transcriptionError, // Handled in onError callback
     toggleRecording
   } = useAudioTranscription({
     model: 'whisper-v3',
     onTranscriptionComplete: async (text) => {
-      if (sendPromptRef.current) {
-        await sendPromptRef.current(text)
+      // Check if transcription ends with "run" (case-insensitive)
+      const trimmedText = text.trim()
+      const endsWithRun = /\brun\b\s*$/i.test(trimmedText)
+
+      if (endsWithRun) {
+        // Remove "run" from the end and execute the prompt
+        const promptText = trimmedText.replace(/\brun\b\s*$/i, '').trim()
+        if (promptText) {
+          await sendPrompt(promptText)
+          trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPrompt', name: 'SpeechToTextPrompt', isClick: true })
+        }
+      } else {
+        // Append transcription to the input box for user review
+        setInput(prev => prev ? `${prev} ${text}`.trim() : text)
+        // Focus the textarea so user can review/edit
+        if (textareaRef.current) {
+          textareaRef.current.focus()
+        }
         trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPrompt', name: 'SpeechToTextPrompt', isClick: true })
       }
     },
@@ -524,11 +537,6 @@ export const RemixUiRemixAiAssistant = React.forwardRef<
     [isStreaming, props.plugin]
   )
 
-  // Update ref for audio transcription callback
-  useEffect(() => {
-    sendPromptRef.current = sendPrompt
-  }, [sendPrompt])
-
   const handleGenerateWorkspaceWithPrompt = useCallback(async (prompt: string) => {
     dispatchActivity('button', 'generateWorkspace')
     if (prompt && prompt.trim()) {
