| 1 | +# Audio Transcription UX Improvement |
| 2 | + |
| 3 | +## Summary |
| 4 | + |
| 5 | +Modified the audio transcription behavior in the Remix AI Assistant: transcribed text is now appended to the input box instead of being executed immediately as a prompt, so users can review and edit it before sending. A transcription that ends with the spoken word "run" is still executed automatically, with the trailing "run" removed. |
| 6 | + |
| 7 | +## Changes Made |
| 8 | + |
| 9 | +### File Modified |
| 10 | +- `libs/remix-ui/remix-ai-assistant/src/components/remix-ui-remix-ai-assistant.tsx` |
| 11 | + |
| 12 | +### Previous Behavior |
| 13 | +When audio transcription completed, the transcribed text was immediately sent as a prompt to the AI assistant: |
| 14 | + |
| 15 | +```typescript |
| 16 | +onTranscriptionComplete: async (text) => { |
| 17 | +  if (sendPromptRef.current) { |
| 18 | +    await sendPromptRef.current(text) |
| 19 | +    trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPrompt', name: 'SpeechToTextPrompt', isClick: true }) |
| 20 | +  } |
| 21 | +} |
| 22 | +``` |
| 23 | + |
| 24 | +### New Behavior |
| 25 | +When audio transcription completes, the behavior depends on whether the transcription ends with the word "run": |
| 26 | + |
| 27 | +- **If the transcription ends with "run":** The prompt is executed automatically, with the trailing "run" removed. |
| 28 | +- **If the transcription does NOT end with "run":** The text is appended to the input box for review. |
| 29 | + |
| 30 | +```typescript |
| 31 | +onTranscriptionComplete: async (text) => { |
| 32 | +  // Check if transcription ends with "run" (case-insensitive) |
| 33 | +  const trimmedText = text.trim() |
| 34 | +  const endsWithRun = /\brun\b\s*$/i.test(trimmedText) |
| 35 | + |
| 36 | +  if (endsWithRun) { |
| 37 | +    // Remove "run" from the end and execute the prompt |
| 38 | +    const promptText = trimmedText.replace(/\brun\b\s*$/i, '').trim() |
| 39 | +    if (promptText) { |
| 40 | +      await sendPrompt(promptText) |
| 41 | +      trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPromptAutoRun', name: 'SpeechToTextPromptAutoRun', isClick: true }) |
| 42 | +    } |
| 43 | +  } else { |
| 44 | +    // Append transcription to the input box for user review |
| 45 | +    setInput(prev => prev ? `${prev} ${text}`.trim() : text) |
| 46 | +    // Focus the textarea so user can review/edit |
| 47 | +    if (textareaRef.current) { |
| 48 | +      textareaRef.current.focus() |
| 49 | +    } |
| 50 | +    trackMatomoEvent({ category: 'ai', action: 'SpeechToTextPrompt', name: 'SpeechToTextPrompt', isClick: true }) |
| 51 | +  } |
| 52 | +} |
| 53 | +``` |
| 54 | + |
| 55 | +### Code Cleanup |
| 56 | +Removed unused code that was only needed for the previous immediate execution behavior: |
| 57 | + |
| 58 | +1. **Removed ref declaration:** |
| 59 | + ```typescript |
| 60 | + // Ref to hold the sendPrompt function for audio transcription callback |
| 61 | + const sendPromptRef = useRef<((prompt: string) => Promise<void>) | null>(null) |
| 62 | + ``` |
| 63 | + |
| 64 | +2. **Removed useEffect that updated the ref:** |
| 65 | + ```typescript |
| 66 | + // Update ref for audio transcription callback |
| 67 | + useEffect(() => { |
| 68 | + sendPromptRef.current = sendPrompt |
| 69 | + }, [sendPrompt]) |
| 70 | + ``` |
| 71 | + |
| 72 | +## Benefits |
| 73 | + |
| 74 | +1. **User Control:** Users can now review and edit the transcription before sending |
| 75 | +2. **Error Correction:** If the speech-to-text makes mistakes, users can fix them |
| 76 | +3. **Better UX:** Users can append multiple transcriptions or combine voice with typing |
| 77 | +4. **Flexibility:** Transcriptions can be modified to add context or clarification |
| 78 | +5. **Voice Command Execution:** Users can say "run" at the end to immediately execute the prompt |
| 79 | +6. **Hands-free Submission:** The "run" command sends the prompt without touching the keyboard or send button (recording is still started and stopped with the microphone button) |
| 80 | + |
| 81 | +## User Flow |
| 82 | + |
| 83 | +### Standard Flow (Without "run") |
| 84 | +1. User clicks the microphone button to start recording |
| 85 | +2. User speaks their prompt (e.g., "Explain how to create an ERC-20 token") |
| 86 | +3. User clicks the microphone button again to stop recording |
| 87 | +4. **Transcribing status** is shown while processing |
| 88 | +5. **Transcribed text appears in the input box** (NEW) |
| 89 | +6. Input textarea is automatically focused (NEW) |
| 90 | +7. User can review, edit, or append to the transcription (NEW) |
| 91 | +8. User clicks send button or presses Enter to submit the prompt |
| 92 | + |
| 93 | +### Auto-Execute Flow (With "run") |
| 94 | +1. User clicks the microphone button to start recording |
| 95 | +2. User speaks their prompt ending with "run" (e.g., "Explain how to create an ERC-20 token run") |
| 96 | +3. User clicks the microphone button again to stop recording |
| 97 | +4. **Transcribing status** is shown while processing |
| 98 | +5. **Prompt is automatically executed** with "run" removed (NEW) |
| 99 | +6. AI response begins streaming without a manual send step |
| 100 | + |
| 101 | +## Implementation Details |
| 102 | + |
| 103 | +### "Run" Detection Logic |
| 104 | +The implementation uses a word-boundary regex to detect if the transcription ends with "run": |
| 105 | + |
| 106 | +```typescript |
| 107 | +const endsWithRun = /\brun\b\s*$/i.test(trimmedText) |
| 108 | +``` |
| 109 | + |
| 110 | +Key features: |
| 111 | +- **Case-insensitive:** Matches "run", "Run", "RUN", etc. |
| 112 | +- **Word boundary:** Only matches "run" as a complete word, not as part of another word |
| 113 | +- **Trailing whitespace:** Ignores any spaces after "run" |
| 114 | + |
| 115 | +Examples: |
| 116 | +- ✅ "Explain ERC-20 tokens run" → Auto-executes |
| 117 | +- ✅ "Help me debug this run" → Auto-executes |
| 118 | +- ✅ "Create a contract RUN" → Auto-executes (case-insensitive) |
| 119 | +- ❌ "Explain running contracts" → Does NOT auto-execute (word boundary) |
| 120 | +- ❌ "Tell me about runtime" → Does NOT auto-execute (word boundary) |
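The detection and stripping steps can be exercised outside the component. The sketch below is only an illustration: `classifyTranscription`, `TranscriptionAction`, and `RUN_SUFFIX` are hypothetical names that do not exist in the codebase, and the function simply mirrors the documented regex and the three outcomes (execute, append, do nothing).

```typescript
// Hypothetical helper, not part of the component: mirrors the documented regex.
type TranscriptionAction =
  | { kind: 'execute'; prompt: string }
  | { kind: 'append'; text: string }
  | { kind: 'ignore' }

const RUN_SUFFIX = /\brun\b\s*$/i

function classifyTranscription(text: string): TranscriptionAction {
  const trimmed = text.trim()
  if (!RUN_SUFFIX.test(trimmed)) {
    // No trailing "run": the original text goes to the input box for review
    return { kind: 'append', text }
  }
  // Strip the trailing "run" and any whitespace left in front of it
  const prompt = trimmed.replace(RUN_SUFFIX, '').trim()
  // "run" on its own leaves nothing to send
  return prompt ? { kind: 'execute', prompt } : { kind: 'ignore' }
}

console.log(classifyTranscription('Explain ERC-20 tokens run'))
console.log(classifyTranscription('Tell me about runtime'))
```

Keeping the decision in a pure function like this would make the examples above straightforward to cover with unit tests, independent of React state.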
| 121 | + |
| 122 | +### Smart Text Appending |
| 123 | +When NOT auto-executing, the implementation handles existing input as follows: |
| 124 | +- If input is empty: Sets the transcription as the input, exactly as received |
| 125 | +- If input exists: Appends the transcription with a space separator |
| 126 | +- When appending, trims leading and trailing whitespace from the combined result |
| 127 | + |
| 128 | +```typescript |
| 129 | +setInput(prev => prev ? `${prev} ${text}`.trim() : text) |
| 130 | +``` |
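To make the appending rules concrete, here is a small sketch that pulls the updater out into a pure function. `appendTranscription` is a hypothetical name that reproduces the expression above unchanged. `appendTranscriptionTrimmed` is an assumed variation, not something in this change, showing one way to normalize whitespace in both branches.

```typescript
// Hypothetical pure version of the updater passed to setInput
function appendTranscription(prev: string, text: string): string {
  return prev ? `${prev} ${text}`.trim() : text
}

// Assumed variation (not in this change): trim both parts before joining
function appendTranscriptionTrimmed(prev: string, text: string): string {
  return [prev.trim(), text.trim()].filter(Boolean).join(' ')
}

console.log(JSON.stringify(appendTranscription('', 'hello')))             // "hello"
console.log(JSON.stringify(appendTranscription('Explain', 'ERC-20')))     // "Explain ERC-20"
console.log(JSON.stringify(appendTranscription('', ' hello ')))           // " hello " (empty input: no trim)
console.log(JSON.stringify(appendTranscriptionTrimmed('', ' hello ')))    // "hello"
console.log(JSON.stringify(appendTranscriptionTrimmed('Explain ', ' x'))) // "Explain x"
```

The third call shows that the current expression leaves the transcription untrimmed when the input box is empty. Whether that matters depends on whether the transcription service ever returns padded text.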
| 131 | + |
| 132 | +### Auto-focus |
| 133 | +After transcription (when NOT auto-executing), the textarea is automatically focused so the user can immediately start editing: |
| 134 | + |
| 135 | +```typescript |
| 136 | +if (textareaRef.current) { |
| 137 | +  textareaRef.current.focus() |
| 138 | +} |
| 139 | +``` |
| 140 | + |
| 141 | +## Testing Recommendations |
| 142 | + |
| 143 | +### Standard Transcription (Without "run") |
| 144 | +1. Test basic transcription flow - text appears in input box |
| 145 | +2. Test appending multiple transcriptions |
| 146 | +3. Test transcription with existing text in input |
| 147 | +4. Test keyboard navigation after transcription |
| 148 | +5. Test error handling (transcription failures) |
| 149 | +6. Verify textarea focus behavior |
| 150 | + |
| 151 | +### Auto-Execute with "run" |
| 152 | +1. Test transcription ending with "run" - should auto-execute |
| 153 | +2. Test case-insensitivity - "run", "Run", "RUN" should all work |
| 154 | +3. Test word boundary - "running" or "runtime" should NOT trigger auto-execute |
| 155 | +4. Test "run" removal - verify the word "run" is removed from the prompt |
| 156 | +5. Test empty prompt after "run" removal - should not execute |
| 157 | +6. Verify prompt execution starts immediately after transcription |
| 158 | + |
| 159 | +### Edge Cases |
| 160 | +1. Test "run" with trailing spaces - "prompt run " should work |
| 161 | +2. Test "run" as the only word - should not execute (empty prompt), and nothing is added to the input box |
| 162 | +3. Test transcription where "run" is not the last word - "run a test" should NOT auto-execute |
| 163 | +4. Test multiple spaces before "run" - "prompt   run" (several spaces) should work |
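The edge cases above can be checked directly against the regex with a small table-driven sketch. The names `runPattern`, `endsWithRun`, and `edgeCases` are illustrative only. The last two rows are not in the list above: they follow from how `\b` and `$` behave in JavaScript regular expressions, and may be worth testing if the transcription service adds punctuation or hyphens.

```typescript
// Same pattern as in the component; everything else here is illustrative
const runPattern = /\brun\b\s*$/i

const endsWithRun = (text: string): boolean => runPattern.test(text.trim())

// [transcription, expected auto-execute match]
const edgeCases: Array<[string, boolean]> = [
  ['prompt run   ', true],  // trailing spaces
  ['run', true],            // matches, but the remaining prompt is empty
  ['run a test', false],    // "run" is not the last word
  ['prompt    run', true],  // multiple spaces before "run"
  ['prompt run.', false],   // trailing punctuation blocks the match
  ['do a dry-run', true],   // "-" counts as a word boundary
]

for (const [input, expected] of edgeCases) {
  const actual = endsWithRun(input)
  console.log(JSON.stringify(input), actual === expected ? 'ok' : `expected ${expected}, got ${actual}`)
}
```

Note that a match on `'run'` alone does not lead to execution, because the component skips sending when the stripped prompt is empty.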
| 164 | + |
| 165 | +## Related Files |
| 166 | + |
| 167 | +- Main component: `libs/remix-ui/remix-ai-assistant/src/components/remix-ui-remix-ai-assistant.tsx` |
| 168 | +- Transcription hook: `libs/remix-ui/remix-ai-assistant/src/hooks/useAudioTranscription.tsx` |
| 169 | +- Prompt input area: `libs/remix-ui/remix-ai-assistant/src/components/prompt.tsx` |