Fix cursor rendering for Arabic connected characters #248
This commit improves the block cursor behavior for Arabic text, where
connected characters were being broken by the cursor overlay.
## Problems Fixed
1. **Character Breaking in Arabic**: The block cursor used an opaque
background that covered characters, breaking visual continuity of
connected Arabic letters. In Arabic, letters change shape based on
their position in a word (isolated/initial/medial/final forms), and
the cursor was disrupting these connections.
2. **Incorrect Width Calculation**: The cursor width was based on the
isolated form of characters placed inside the cursor div, not the
actual rendered width in connected text. This caused misalignment
where narrow connected forms appeared in wide cursor boxes.
3. **Newline Cursor Issues**:
- Wide cursor boxes appeared at end of lines
- In normal mode, cursor could be positioned on newline characters
(inconsistent with Vim behavior where $ positions on last character)
## Solutions Implemented
1. **Transparent Cursor with Outline**: Changed from opaque background
to transparent background with box-shadow outline, allowing underlying
text to show through naturally without breaking character connections.
2. **DOM-Based Width Measurement**: Calculate actual character width by
measuring the rendered glyph using Range.getBoundingClientRect(). This
captures the true width of characters after browser text shaping,
including Arabic contextual forms.
3. **Smart Newline Handling**:
- Use narrow cursor (15% of font size) for newline characters
- In normal mode, automatically adjust cursor position to last real
character when on end-of-line newline (matching Vim $ behavior)
- Preserve cursor on empty lines (consecutive newlines)
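The newline rules in item 3 can be sketched as pure functions over the document text. This is a minimal sketch, not the code from this PR: `newlineCursorWidth` and `clampToLastCharacter` are hypothetical names, and the document is assumed to be a plain string addressed by UTF-16 offset.

```typescript
// Hypothetical helpers illustrating the newline rules; names are assumptions.

/** Width of the narrow cursor drawn on a newline: 15% of the font size. */
function newlineCursorWidth(fontSizePx: number): number {
  return fontSizePx * 0.15;
}

/**
 * In normal mode, move a cursor that sits on an end-of-line newline back to
 * the last real character. A newline at the start of the document, or one
 * that directly follows another newline, marks an empty line, so the cursor
 * is left where it is.
 */
function clampToLastCharacter(doc: string, pos: number): number {
  if (doc[pos] !== "\n") return pos; // not on a newline
  if (pos === 0 || doc[pos - 1] === "\n") return pos; // empty line: keep
  return pos - 1; // end-of-line newline: step back one character
}
```

With the text `"ab\n\ncd\n"`, a cursor on the first newline (offset 2) moves to offset 1, while a cursor on the second newline (offset 3) stays put because that newline represents an empty line.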
## Technical Details
- Added `width` property to Piece class for explicit width control
- Save original DOM position before traversal for accurate measurement
- Use Range API to measure individual character width from text nodes
- Force transparent letter rendering to avoid covering underlying text
- Distinguish between end-of-line newlines and empty line newlines
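The Range-based measurement listed above might look roughly like the following. It is a sketch under assumptions rather than the PR's implementation: `measureRenderedWidth` and the `RangeLike` interface are hypothetical, and the range factory is injected so the function does not depend on DOM typings. In a browser the factory would be `() => document.createRange()`.

```typescript
// Minimal structural types (assumptions) so the sketch compiles without DOM typings.
interface RectLike {
  width: number;
}
interface RangeLike<N> {
  setStart(node: N, offset: number): void;
  setEnd(node: N, offset: number): void;
  getBoundingClientRect(): RectLike;
}

/**
 * Measure the rendered width of `length` UTF-16 code units starting at
 * `offset` inside a text node. Because the range covers text that is already
 * laid out, the result reflects the shaped (contextual) glyph, not the
 * isolated form of the character.
 */
function measureRenderedWidth<N>(
  createRange: () => RangeLike<N>,
  textNode: N,
  offset: number,
  length = 1
): number {
  const range = createRange();
  range.setStart(textNode, offset);
  range.setEnd(textNode, offset + length);
  return range.getBoundingClientRect().width;
}

// Browser usage (not executed here):
//   measureRenderedWidth(() => document.createRange(), textNode, offset)
```

A caller would probably want a fallback when the measured width is zero, for example the narrow newline width described above.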
## Impact
This fixes a major usability issue for Arabic language users, making the
Vim mode cursor behavior work correctly with Arabic's connected writing
system while properly handling complex text shaping.
Fixes visual character breaking in Arabic text editing.
src/block-cursor.ts
  style.fontFamily, style.fontSize, style.fontWeight, style.color,
  primary ? "cm-fat-cursor cm-cursor-primary" : "cm-fat-cursor cm-cursor-secondary",
- letter, hCoeff != 1)
+ letter, true) // Always use transparent letter to preserve RTL character connections
To preserve intra-word "letter connection" in connected scripts like Arabic (this fix has nothing to do with RTL per se so this is an inaccurate comment)
Overall this seems to make the experience with non-connected scripts worse, so maybe we can decide behavior based on the character under the cursor?
I agree that there is no difference between
Keep what? Keep the text? Keep the cursor? I do not understand what you mean by this question.
You are absolutely right. It is also harder to spot the cursor when the fill is gone. I am contemplating how to improve this while maintaining function in connected scripts like Arabic.
It is not necessary. The problem, which used to happen without some of the code introduced here, is that someone could use the mouse to select the newline character at the end of a line. That led to a discrepancy, since pressing $ moves the cursor to the last character and cannot reach this newline character, let alone place the cursor on it.
Genuinely good idea. Let me think about it. There might be a better way to do this that works with connected scripts like Arabic.
Implements Phase 1 of the dual-cursor system architecture.
This adds utilities to detect the script type (Latin, Arabic, connected
scripts) of characters based on Unicode ranges.
The detection is used to determine appropriate cursor rendering strategies:
- Latin text: standard opaque Vim cursor
- Arabic/connected scripts: dual-layer cursor (word block + char outline)
Features:
- detectScriptType(): Detects script from Unicode ranges
- isNeutralChar(): Identifies neutral characters (spaces, numbers, punctuation)
- detectScriptTypeWithContext(): Context-aware detection for neutral chars
Supported scripts:
- Arabic (U+0600–U+06FF and related ranges)
- Syriac (U+0700–U+074F) - connected RTL
- N'Ko (U+07C0–U+07FF) - connected RTL
- Hebrew (U+0590–U+05FF) - RTL but not connected, uses standard cursor
Performance: O(1) Unicode range checks, suitable for per-keystroke execution.
Related to replit#248
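The range checks described in this commit could be sketched as follows. Only the function names `detectScriptType()` and `isNeutralChar()` and the listed Unicode ranges come from the commit message; the `ScriptType` union, the `usesConnectedCursor` helper, and the ASCII-only definition of "neutral" are assumptions made for illustration, and the "related ranges" for Arabic are left out.

```typescript
// Assumed return type; the commit message does not show the real one.
type ScriptType = "arabic" | "syriac" | "nko" | "hebrew" | "neutral" | "other";

// Inclusive UTF-16 code unit ranges taken from the commit message.
const SCRIPT_RANGES: Array<[number, number, ScriptType]> = [
  [0x0590, 0x05ff, "hebrew"],
  [0x0600, 0x06ff, "arabic"],
  [0x0700, 0x074f, "syriac"],
  [0x07c0, 0x07ff, "nko"],
];

/** Spaces, digits and ASCII punctuation (a simplification for this sketch). */
function isNeutralChar(ch: string): boolean {
  return /^[\s0-9!-\/:-@\[-`{-~]$/.test(ch);
}

/** Classify a single character with constant-time range checks. */
function detectScriptType(ch: string): ScriptType {
  if (ch.length === 0 || isNeutralChar(ch)) return "neutral";
  const code = ch.charCodeAt(0);
  for (let i = 0; i < SCRIPT_RANGES.length; i++) {
    const [lo, hi, script] = SCRIPT_RANGES[i];
    if (code >= lo && code <= hi) return script;
  }
  return "other"; // Latin and everything else fall through here
}

/** Hypothetical helper: Hebrew is RTL but not connected, so it is excluded. */
function usesConnectedCursor(script: ScriptType): boolean {
  return script === "arabic" || script === "syriac" || script === "nko";
}
```

A context-aware variant such as `detectScriptTypeWithContext()` would presumably look at neighbouring characters when this function returns `"neutral"`; that step is not shown here.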
Implements Phase 2 of the dual-cursor system architecture.
This adds utilities to find word boundaries in connected scripts like
Arabic. Word boundaries are defined by transitions:
- FROM: non-Arabic TO: Arabic (word starts)
- FROM: Arabic TO: non-Arabic (word ends)
For example, in "TOOمودا", the word "مودا" has clear boundaries at
the transition points.
Features:
- findArabicWordBoundaries(): Finds start/end positions of connected word
- Expands from cursor position until non-Arabic characters
- Performance optimized with MAX_WORD_SEARCH_RANGE = ±50 characters
Algorithm:
1. Start from cursor position
2. Expand leftward while on Arabic/connected characters
3. Expand rightward while on Arabic/connected characters
4. Return {start, end, text} with absolute document positions
Used for rendering word-block layer of dual cursor in Arabic text.
Performance: O(n) where n ≤ 100 characters, suitable for real-time rendering.
Related to replit#248
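The expansion algorithm above can be sketched as a standalone function. The name `findArabicWordBoundaries`, the `MAX_WORD_SEARCH_RANGE` limit, and the `{start, end, text}` result shape come from the commit message; treating `end` as exclusive, returning `null` when the cursor is not on a connected character, and the `isConnectedChar` helper are assumptions of this sketch.

```typescript
const MAX_WORD_SEARCH_RANGE = 50; // characters searched on each side of the cursor

interface WordBoundaries {
  start: number; // absolute offset of the first character of the word
  end: number; // absolute offset just past the last character (exclusive, assumed)
  text: string;
}

/** Hypothetical helper: Arabic, Syriac and N'Ko basic blocks only. */
function isConnectedChar(ch: string | undefined): boolean {
  if (!ch) return false;
  const c = ch.charCodeAt(0);
  return (
    (c >= 0x0600 && c <= 0x06ff) ||
    (c >= 0x0700 && c <= 0x074f) ||
    (c >= 0x07c0 && c <= 0x07ff)
  );
}

function findArabicWordBoundaries(doc: string, pos: number): WordBoundaries | null {
  if (!isConnectedChar(doc[pos])) return null;

  // Expand leftward while the previous character is connected.
  let start = pos;
  while (start > 0 && pos - start < MAX_WORD_SEARCH_RANGE && isConnectedChar(doc[start - 1])) {
    start--;
  }

  // Expand rightward while the next character is connected.
  let end = pos + 1;
  while (end < doc.length && end - pos <= MAX_WORD_SEARCH_RANGE && isConnectedChar(doc[end])) {
    end++;
  }

  return { start, end, text: doc.slice(start, end) };
}
```

For the example string `"TOOمودا"`, a cursor on any of the four Arabic letters yields `start` 3 and `end` 7, and a cursor on a Latin letter yields `null`.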
Implements Phase 3 of the dual-cursor system architecture.
This modifies the cursor rendering to detect script type and apply
appropriate visual treatment:
- Latin/non-connected scripts (focused): Opaque text with solid background
(restores standard Vim block cursor behavior)
- Arabic/connected scripts (focused): Transparent text with solid background
(preserves visual character connections in RTL)
- Any script (unfocused): Transparent text with outline only
Changes:
- Import detectScriptTypeWithContext() from script-detection module
- Add script detection and focus state checking in measureCursor()
- Set partial parameter based on script type and focus state
This addresses maintainer feedback on replit#248 about restoring standard
Vim cursor behavior for Latin text while maintaining special handling for
Arabic connected characters.
Performance: Adds single O(1) script detection per cursor render.
Tested: ✅ Latin letters show white text in cursor (opaque)
Tested: ✅ Arabic letters are invisible in cursor (transparent)
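The three cases in this commit reduce to a small decision function. The sketch below is illustrative only: `cursorStyleFor`, `ScriptClass`, and `CursorStyle` are hypothetical names, and `transparentLetter` stands in for the `partial` parameter the commit says is set from script type and focus state.

```typescript
type ScriptClass = "connected" | "non-connected";

interface CursorStyle {
  transparentLetter: boolean; // assumed to correspond to the `partial` parameter
  solidBackground: boolean; // false means outline only
}

/** Pick the visual treatment for the block cursor. */
function cursorStyleFor(script: ScriptClass, focused: boolean): CursorStyle {
  if (!focused) {
    // Any script, unfocused: transparent text with outline only.
    return { transparentLetter: true, solidBackground: false };
  }
  // Focused: solid background; the letter is hidden only for connected scripts.
  return { transparentLetter: script === "connected", solidBackground: true };
}
```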
Implements Phase 4 of the dual-cursor system architecture.
This adds hierarchical dual-cursor rendering for Arabic text:
- Word-level block: Semi-transparent pink background covering entire
connected word
- Character-level outline: White 1px outline on specific letter under cursor
Changes:
- Add CursorLayerType enum for different cursor rendering strategies
- Extend Piece class with layerType parameter
- Modify measureCursor() to return Piece[] for multi-layer rendering
- Add measureArabicDualCursor() function for dual-layer measurement
- Update CSS theme with Arabic-specific cursor styles
- Refine script detection to exclude only punctuation (not diacritics)
- Ensure spaces/whitespace always treated as word boundaries
- Fix neutral character detection: inherit script type but not special cursor
- Only show dual-cursor for connected words (2+ Arabic characters)
- Fix character positioning using coordsForChar for accurate RTL placement
Visual design:
- Focused Arabic (connected word): Semi-transparent pink word block + white
char outline
- Focused Arabic (isolated char): Standard transparent cursor
- Focused Latin: Solid pink block with white text (opaque)
- Focused neutral (punctuation, numbers): Standard transparent cursor
- Unfocused: Pink outline for all (character outline hidden for Arabic)
Performance: Word boundary detection O(n) where n ≤ 100 characters
Tested: ✅ Dual-cursor renders correctly on Arabic connected words
Tested: ✅ Word boundaries respect punctuation and spaces
Tested: ✅ Navigation (hjkl) tracks correctly through Arabic words
Tested: ✅ Single isolated Arabic characters use standard cursor
Tested: ✅ Neutral characters (# punctuation) use standard cursor
Tested: ✅ Character outline positioned correctly within word block
Related to replit#248
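One way to read the layer rules in this commit is as a function from cursor context to a list of layers. This is a sketch under assumptions: the commit names a `CursorLayerType` enum but does not list its members, so the string union and its values here are hypothetical, as is the `planCursorLayers` function. The unfocused Arabic case is interpreted as "word layer only, character outline hidden".

```typescript
// The PR describes an enum; a string union is used here for brevity (assumption).
type CursorLayerType = "standard" | "word-block" | "char-outline";

/**
 * Decide which layers to draw.
 * - Non-connected scripts, neutral characters and isolated Arabic letters
 *   get the single standard cursor.
 * - A connected word (2+ characters) gets a word block, plus a character
 *   outline when the editor is focused.
 */
function planCursorLayers(
  isConnectedScript: boolean,
  wordLength: number,
  focused: boolean
): CursorLayerType[] {
  const dual = isConnectedScript && wordLength >= 2;
  if (!dual) return ["standard"];
  return focused ? ["word-block", "char-outline"] : ["word-block"];
}
```

Returning a list mirrors the change of `measureCursor()` to return `Piece[]`: each entry would map to one `Piece` with its own `layerType`.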
Alternative: Dual-cursor concept for Arabic
What do you think? @nightwing
For more context: Zettlr/Zettlr#6004
Just to clarify: this PR is a working prototype to demonstrate functionality, not a merge request. Once we agree on an approach that works well for Arabic users and fits the project, I'm happy to do a proper implementation following your architectural and style guidelines.












Problem
The block cursor in Vim mode breaks the visual continuity of Arabic connected characters, making text editing confusing and difficult for Arabic users.
Issue 1: Character Breaking
The opaque cursor background covers Arabic letters, disrupting their connected forms. Arabic letters change shape based on position (isolated/initial/medial/final), and these visual connections are essential for readability.
When the cursor is on a character like ن in the middle of a word, the character appears in its connected (medial) form in the actual text, but the cursor div contains the isolated form, creating visual disruption.
Issue 2: Incorrect Width
The cursor width was based on the isolated form of characters rather than their actual rendered width in connected text.
Issue 3: Newline Selection
In normal mode, the cursor could be positioned on newline characters at the end of lines (unlike standard Vim behavior, where `$` positions on the last character).
Solution
1. Transparent Cursor with Outline
- Changed from `background: #ff9696` to `background: transparent` with a `box-shadow: 0 0 0 1px #ff9696` outline
- Forced transparent letter rendering (`partial: true`) to avoid covering the actual text
2. DOM-Based Width Measurement
- Used `Range.getBoundingClientRect()` to measure actual rendered character width
3. Smart Newline Handling
Technical Changes
- Added `width` property to `Piece` class for explicit width control
- [...] `elt.style.width` in `adjust()` method
- Saved original `domAtPos` before DOM traversal for accurate measurement
Testing
Tested with Arabic text in various scenarios:
- [...] (`$`, `h`, `l`)
All cursor positioning now matches standard Vim behavior while correctly handling Arabic character shaping.
Impact
This fixes a major usability issue for Arabic language users, making the Vim mode cursor work correctly with Arabic's connected writing system. The solution properly handles complex text shaping without breaking visual character connections.