This repository was archived by the owner on Dec 12, 2022. It is now read-only.

Document Notation

Derk Norton edited this page Jul 30, 2021 · 53 revisions

Bali Document Notation™

The following sections define the formal language specification for the Bali Document Notation™.

Document

This section defines the rule that is used to define a document.

// The optional EOL must be included since a document read in from a file will
// include one but a document generated by the formatter will not.
document: component EOL? EOF;

Components

This section defines the rules that are used to define components. For a graphical view of these rules click here.

component: value parameters? note?;

value: element | range | sequence | procedure;

range: element? '..' element?;

sequence: '[' collection ']';

collection: list | catalog;

parameters: '(' catalog ')';

list:
    component (',' component)* |
    EOL (component EOL)* |
    /* no items */
;

catalog:
    association (',' association)* |
    EOL (association EOL)* |
    ':' /* no associations */
;

association: element ':' component;

procedure: '{' activity '}';

activity:
    statement (';' statement)* |
    EOL (statement EOL)* |
    /* no statements */
;

note: NOTE;
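
The `list` and `catalog` rules differ in two visible ways: catalog items are `element ':' component` associations, and an empty catalog is written `[:]` while an empty list is written `[]`. The Python sketch below illustrates that distinction for inline (comma separated) sequences only. The helper names `split_top_level` and `classify_sequence` are hypothetical, the EOL separated form is not handled, and the sketch assumes that no element contains a bare `:` outside of a text string (so moments and references are out of scope, as are escaped quotes inside text).

```python
def split_top_level(body: str, separator: str) -> list[str]:
    """Split on separator, ignoring any nested in brackets or inside text."""
    parts, depth, in_text, start = [], 0, False, 0
    for index, char in enumerate(body):
        if char == '"':
            in_text = not in_text          # entering or leaving a text string
        elif in_text:
            continue                       # ignore everything inside text
        elif char in "[({":
            depth += 1
        elif char in "])}":
            depth -= 1
        elif char == separator and depth == 0:
            parts.append(body[start:index])
            start = index + 1
    parts.append(body[start:])
    return parts


def classify_sequence(source: str) -> tuple[str, int]:
    """Return ('list' | 'catalog', item count) for an inline sequence."""
    source = source.strip()
    if not (source.startswith("[") and source.endswith("]")):
        raise ValueError("a sequence must be wrapped in '[' and ']'")
    body = source[1:-1].strip()
    if body == "":
        return ("list", 0)                 # list: /* no items */
    if body == ":":
        return ("catalog", 0)              # catalog: ':' /* no associations */
    items = [item.strip() for item in split_top_level(body, ",")]
    has_key = [len(split_top_level(item, ":")) > 1 for item in items]
    if all(has_key):
        return ("catalog", len(items))
    if not any(has_key):
        return ("list", len(items))
    raise ValueError("a collection cannot mix components and associations")
```

For example, `classify_sequence('[$a: 1, $b: [2, 3]]')` reports a catalog with two associations, because the comma inside the nested list is not a top-level separator.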

Elements

This section defines the rules that are used to define elements. For a graphical view of these rules click here.

element:
    angle |
    binary |
    duration |
    moment |
    name |
    number |
    pattern |
    percent |
    probability |
    reference |
    symbol |
    tag |
    text |
    version
;

angle: ANGLE;

binary: BINARY;

duration: DURATION;

moment: MOMENT;

name: NAME;

number:
    'undefined' |
    '0' |
    '∞' |
    'infinity' |
    REAL |
    IMAGINARY |
    '(' REAL (',' IMAGINARY | 'e^' ANGLE 'i') ')'
;

pattern: 'none' | REGEX | 'any';

percent: PERCENT;

probability: 'false' | FRACTION | 'true';

reference: RESOURCE;

symbol: SYMBOL;

tag: TAG;

text: TEXT | NARRATIVE;

version: VERSION;
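
The `number` rule accepts several literal forms: special values, real and imaginary constants, and complex numbers in rectangular `(REAL, IMAGINARY)` or polar `(REAL e^ANGLE i)` form. The Python sketch below shows one way those forms could be mapped onto Python's `complex` type. It is only an illustration: the function names are hypothetical, the choice to represent `undefined` as NaN and `∞` as a real infinity is an assumption, and no validation beyond the basic shape is attempted.

```python
import cmath
import math

# Named constants allowed by the REAL token.
CONSTANTS = {
    "e": math.e,
    "pi": math.pi, "π": math.pi,
    "phi": (1 + math.sqrt(5)) / 2, "φ": (1 + math.sqrt(5)) / 2,
    "tau": math.tau, "τ": math.tau,
}


def real(text: str) -> float:
    """Convert a REAL literal (a float or a named constant)."""
    return CONSTANTS[text] if text in CONSTANTS else float(text)


def imaginary(text: str) -> float:
    """Convert an IMAGINARY literal such as '4i' or 'pi i'."""
    return real(text[:-1].strip())         # drop the trailing 'i'


def parse_number(text: str) -> complex:
    text = text.strip()
    if text == "undefined":
        return complex(math.nan, math.nan)  # assumption: NaN stands for undefined
    if text in ("∞", "infinity"):
        return complex(math.inf, 0)         # assumption: a single real infinity
    if text.startswith("(") and text.endswith(")"):
        body = text[1:-1]
        if "e^" in body:                    # polar form: (REAL e^ANGLE i)
            magnitude, _, phase = body.partition("e^")
            phase = phase.strip()
            if not (phase.startswith("~") and phase.endswith("i")):
                raise ValueError("expected an angle followed by 'i'")
            angle = real(phase[1:-1].strip())
            return cmath.rect(real(magnitude.strip()), angle)
        real_part, imaginary_part = body.split(",")  # rectangular form
        return complex(real(real_part.strip()), imaginary(imaginary_part.strip()))
    if text in CONSTANTS:                   # check before the 'i' suffix: 'pi', 'phi'
        return complex(real(text), 0)
    if text.endswith("i"):
        return complex(0, imaginary(text))
    return complex(real(text), 0)
```

Note that the constant check must precede the imaginary check, since `pi` and `phi` themselves end in the letter `i`.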

Statements

This section defines the rules that are used to define statements. For a graphical view of these rules click here.

statement: comment | mainClause handleClause?;

comment: NOTE | COMMENT;

mainClause:
    evaluateClause |
    checkoutClause |
    saveClause |
    discardClause |
    commitClause |
    publishClause |
    postClause |
    retrieveClause |
    rejectClause |
    acceptClause |
    ifClause |
    selectClause |
    withClause |
    whileClause |
    continueClause |
    breakClause |
    returnClause |
    throwClause
;

handleClause: 'handle' symbol (('with' block) | ('matching' expression 'with' block)+);

block: '{' activity '}';

evaluateClause: (recipient ':=')? expression;

// checkout level 2 of $document from /acme/reports/Q3/v1.3.6
checkoutClause: 'checkout' ('level' expression 'of')? recipient 'from' expression;

// save document as $citation
saveClause: 'save' expression ('as' recipient)?;

// discard document
discardClause: 'discard' expression;

// commit document to /acme/reports/Q3/v1.4
commitClause: 'commit' expression 'to' expression;

// publish event
publishClause: 'publish' expression;

// post message to /acme/blogs/v3.2
postClause: 'post' expression 'to' expression;

// retrieve $message from /acme/blogs/v3.2
retrieveClause: 'retrieve' recipient 'from' expression;

// reject message
rejectClause: 'reject' expression;

// accept message
acceptClause: 'accept' expression;

ifClause: 'if' expression 'then' block ('else' 'if' expression 'then' block)* ('else' block)?;

selectClause: 'select' expression 'from' (expression 'do' block)+ ('else' block)?;

withClause: 'with' ('each' symbol 'in')? expression 'do' block;

whileClause: 'while' expression 'do' block;

continueClause: 'continue' 'loop';

breakClause: 'break' 'loop';

returnClause: 'return' expression?;

throwClause: 'throw' expression;

recipient: symbol | attribute;

attribute: variable '[' indices ']';
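
Every alternative of `mainClause` except `evaluateClause` begins with a distinct keyword, so the clause type can be determined from the first word of a statement. The Python sketch below illustrates that observation. The table and the function name `clause_kind` are hypothetical, and the sketch only inspects the leading word; it does not parse the rest of the statement or any trailing `handle` clause.

```python
# Leading keyword of each mainClause alternative, taken from the rules above.
CLAUSE_KEYWORDS = {
    "checkout": "checkoutClause",
    "save": "saveClause",
    "discard": "discardClause",
    "commit": "commitClause",
    "publish": "publishClause",
    "post": "postClause",
    "retrieve": "retrieveClause",
    "reject": "rejectClause",
    "accept": "acceptClause",
    "if": "ifClause",
    "select": "selectClause",
    "with": "withClause",
    "while": "whileClause",
    "continue": "continueClause",
    "break": "breakClause",
    "return": "returnClause",
    "throw": "throwClause",
}


def clause_kind(statement: str) -> str:
    """Guess the rule that a single statement would match."""
    words = statement.split()
    if not words:
        raise ValueError("empty statement")
    if words[0].startswith("--") or words[0].startswith("/*"):
        return "comment"                   # NOTE or COMMENT token
    # Anything without a leading keyword falls through to evaluateClause.
    return CLAUSE_KEYWORDS.get(words[0], "evaluateClause")
```

Applied to the example from the grammar comments, `clause_kind("checkout level 2 of $document from /acme/reports/Q3/v1.3.6")` yields `"checkoutClause"`, while an assignment such as `$total := $total + 1` yields `"evaluateClause"`.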

Expressions

This section defines the rules that are used to define expressions. For a graphical view of these rules click here.

expression:                  // Precedence (highest to lowest)
    component                                                      #componentExpression     |
    variable                                                       #variableExpression      |
    function '(' arguments ')'                                     #functionExpression      |
    '(' expression ')'                                             #precedenceExpression    |
    '@' expression                                                 #dereferenceExpression   |
    expression op=('.' | '<-') message '(' arguments ')'           #messageExpression       |
    expression '[' indices ']'                                     #attributeExpression     |
    expression '&' expression                                      #concatenationExpression |
    expression '!'                                                 #factorialExpression     |
    <assoc=right> expression '^' expression                        #exponentialExpression   |
    op=('-' | '/' | '*') expression                                #inversionExpression     |
    expression op=('*' | '/' | '//' | '+' | '-') expression        #arithmeticExpression    |
    '|' expression '|'                                             #magnitudeExpression     |
    expression op=('<' | '=' | '>' | 'IS' | 'MATCHES') expression  #comparisonExpression    |
    'NOT' expression                                               #complementExpression    |
    expression op=('AND' | 'SANS' | 'XOR' | 'OR') expression       #logicalExpression       |
    expression '?' expression                                      #defaultExpression
;

variable: IDENTIFIER;

function: IDENTIFIER;

message: IDENTIFIER;

arguments:
    expression (',' expression)* |
    /* no expressions */
;


indices: expression (',' expression)*;
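
In an ANTLR 4 left-recursive rule, alternatives listed earlier bind more tightly, and all operators inside one `op=(...)` set share a single precedence level. Read literally, the `expression` rule therefore gives `*`, `/`, `//`, `+` and `-` the same precedence with left associativity, while `^` binds tighter and is right associative. The Python sketch below mimics just those two levels with a small precedence-climbing parser so the resulting grouping can be inspected. It is an illustration of the grammar as written, not the Bali implementation; the names `BINARY`, `tokenize`, `parse` and `parse_operand` are hypothetical, and unary operators, messages and the other alternatives are left out.

```python
import re

# operator -> (precedence, right_associative); a larger number binds tighter.
BINARY = {
    "^": (2, True),
    "*": (1, False), "/": (1, False), "//": (1, False),
    "+": (1, False), "-": (1, False),
}

TOKEN = re.compile(r"\s*(//|[-+*/^()]|\d+|[A-Za-z][A-Za-z0-9]*)")


def tokenize(source: str) -> list[str]:
    tokens, position = [], 0
    source = source.rstrip()
    while position < len(source):
        match = TOKEN.match(source, position)
        if match is None:
            raise ValueError(f"unexpected character at {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


def parse_operand(tokens: list[str]) -> str:
    token = tokens.pop(0)
    if token == "(":
        inner = parse(tokens)
        tokens.pop(0)                      # consume the closing ')'
        return inner
    return token


def parse(tokens: list[str], minimum: int = 1) -> str:
    """Return the expression with explicit parentheses showing the grouping."""
    left = parse_operand(tokens)
    while tokens and tokens[0] in BINARY and BINARY[tokens[0]][0] >= minimum:
        operator = tokens.pop(0)
        precedence, right_associative = BINARY[operator]
        # A right associative operator lets the same level recurse on the right.
        right = parse(tokens, precedence if right_associative else precedence + 1)
        left = f"({left} {operator} {right})"
    return left
```

Under these assumptions `1 + 2 * 3` groups as `((1 + 2) * 3)`, which differs from conventional arithmetic, so explicit parentheses are the safe choice when mixing arithmetic operators.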

Tokens

This section defines the rules that are used to define tokens. For a graphical view of these rules click here.

/* TOKEN RULES
 It's important to remember that the lexer returns the longest
 matching token regardless of how many others might match. When
 several tokens match the same longest text, the one declared
 first is returned. Also, prefix any tokens that are just used
 as subtokens with the "fragment" keyword.
*/

ANGLE: '~' ('0' | REAL);

BINARY: '\'' (BASE64 | SPACE)* ('=' ('=')?)? SPACE* '\'';

DURATION: '~' '-'? 'P' (SPAN 'W' | (SPAN 'Y')? (SPAN 'M')? (SPAN 'D')? ('T' (SPAN 'H')? (SPAN 'M')? (SPAN 'S')?)?);

FRACTION: '.' ('0'..'9')+;

IMAGINARY: FLOAT 'i' | 'e i' | 'pi i' | 'π i' | 'phi i' | 'φ i' | 'tau i' | 'τ i';

MOMENT: '<' YEARS ('-' MONTHS ('-' DAYS ('T' HOURS (':' MINUTES (':' SECONDS FRACTION?)?)?)?)?)? '>';

NAME: ('/' TYPE)+;

PERCENT: ('0' | REAL) '%';

RESOURCE: '<' TYPE ':' CONTEXT '>';

// NOTE: We cannot define negative constants here because the scanner would scan
//       a negative variable like '-exponent' as a single '-e' token rather than
//       two tokens '-' and 'exponent'.
REAL: FLOAT | 'e' | 'pi' | 'π' | 'phi' | 'φ' | 'tau' | 'τ';

REGEX: TEXT '?';

SYMBOL: '$' IDENTIFIER ('-' NUMBER)?;

TAG: '#' BASE32*;

// a narrative takes precedence over a regular text string
NARRATIVE: '"' EOL CHARACTER*? EOL SPACE* '"';

TEXT: '"' (ESCAPE | '\\"' | ~["\r\n])*? '"';

// a version like v123 takes precedence over an identifier
VERSION: 'v' NUMBER ('.' NUMBER)*;

IDENTIFIER: ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*;

NOTE: '--' ~[\r\n]*;

COMMENT: '/*' EOL (COMMENT | CHARACTER)*? EOL SPACE* '*/';

EOL: '\r'? '\n';

SPACE: ('\t'..'\r' | ' ') -> channel(HIDDEN);

fragment
CHARACTER: .;

fragment
NUMBER: '1'..'9' ('0'..'9')*;

fragment
FLOAT: '-'? (NUMBER FRACTION? | '0' FRACTION) ('E' '-'? NUMBER)?;

fragment
INTEGER: '0' | '-'? NUMBER;

fragment
SPAN: INTEGER FRACTION?;

fragment
TYPE: ('a'..'z'|'A'..'Z'|'0'..'9'|'+'|'-'|'.')+;

fragment
CONTEXT: ('!'..'=' | '?'..'~')*;

fragment
YEARS: INTEGER;

fragment
MONTHS: (('0' '0'..'9') | ('1' '0'..'2'));

fragment
DAYS: (('0'..'2' '0'..'9') | ('3' '0'..'1'));

fragment
HOURS: (('0'..'1' '0'..'9') | ('2' '0'..'3'));

fragment
MINUTES: ('0'..'5' '0'..'9');

// must include 60 to handle leap seconds
fragment
SECONDS: (('0'..'5' '0'..'9') | '60');

fragment
BASE16: '0'..'9' | 'A'..'F';

// avoid confusion and offensive strings by eliminating 'E', 'I', 'O', and 'U'
fragment
BASE32: '0'..'9' | 'A'..'D' | 'F'..'H' | 'J'..'N' | 'P'..'T' | 'V'..'Z';

fragment
BASE64: '0'..'9' | 'A'..'Z' | 'a'..'z' | '+' | '/';

// replaced with actual characters when read
fragment
ESCAPE: '\\' ('u' BASE16+ | 'b' | 'f' | 'r' | 'n' | 't' | '\\');
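
The comment at the top of the token rules explains why declaration order matters: the longest match wins, and ties go to the token declared first. This is why `VERSION` is declared before `IDENTIFIER`. The Python sketch below imitates that selection policy for three of the tokens, using the standard `re` module as a stand-in for the ANTLR lexer. The `RULES` table and the `next_token` function are hypothetical, and the regular expressions are hand translations of the `SYMBOL`, `VERSION` and `IDENTIFIER` rules.

```python
import re

# Token rules in declaration order; earlier entries win ties.
RULES = [
    ("SYMBOL", re.compile(r"\$[A-Za-z][A-Za-z0-9]*(-[1-9][0-9]*)?")),
    ("VERSION", re.compile(r"v[1-9][0-9]*(\.[1-9][0-9]*)*")),
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
]


def next_token(source: str, position: int = 0) -> tuple[str, str]:
    """Return (token name, matched text) for the token starting at position."""
    best = None
    for name, pattern in RULES:
        match = pattern.match(source, position)
        # Strictly longer matches replace the current best, so on a tie
        # the rule declared first is kept.
        if match and (best is None or match.end() > best[2]):
            best = (name, match.group(), match.end())
    if best is None:
        raise ValueError(f"no token matches at position {position}")
    return best[0], best[1]
```

With this policy `v123` is a `VERSION` (a tie with `IDENTIFIER`, resolved by order), whereas `v12a` is an `IDENTIFIER` because that rule matches one more character.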
