UniversalBlockParser

This parser recognizes custom block tags (!type / !endtype), builds CustomTagNode AST nodes, and hands inner content off to the normal CommonMark pipeline. It lives in Docara core.


Location & signature

  • Namespace: Simai\Docara\CustomTags
  • Class: UniversalBlockParser
  • Constructed with CustomTagRegistry.
public function __construct(private CustomTagRegistry $registry) {}

Role in the pipeline

  • Installed by CustomTagsExtension.
  • Runs during block start detection for each line.
  • For every registered CustomTagSpec, it attempts to match the open regex on the current line. On success it:
    1. Builds a CustomTagNode with merged attributes and meta.
    2. Returns a container continue parser which:
      • Accepts any child blocks/inline content.
      • Finishes when the close regex is matched (or immediately for single-line tags).

tryStart() - step by step

public function tryStart(Cursor $cursor, MarkdownParserStateInterface $state): ?BlockStart
  1. Read the current line
$line = $cursor->getLine();
  2. Try each spec in the registry and match its open marker
if (! preg_match($spec->openRegex, $line, $m)) continue;
  3. Apply the nesting rule for same-type blocks
    If the active block is a CustomTagNode with the same type and allowNestingSame === false, the new start is suppressed:
return BlockStart::none();
  4. Consume the whole line
$cursor->advanceToEnd();
  5. Parse attributes from the named group attrs and merge them with the spec defaults
$attrStr   = $m['attrs'] ?? '';
$userAttrs = Attrs::parseOpenLine($attrStr);
$attrs     = Attrs::merge($spec->baseAttrs, $userAttrs);
  6. Create the node with meta
$node = new CustomTagNode($spec->type, $attrs, [
    'openMatch' => $m,
    'attrStr'   => $attrStr,
]);
  7. Apply early attribute filtering (optional)
    If the spec provides attrsFilter, it is applied immediately to the node's attributes:
if ($spec->attrsFilter instanceof \Closure) {
    $node->setAttrs(($spec->attrsFilter)($node->getAttrs(), $node->getMeta()));
}

Signature note: attrsFilter is called with two arguments ($attrs, $meta), where meta contains at least openMatch and attrStr.

  8. Return a container parser (anonymous AbstractBlockContinueParser) in which:
    • getBlock() returns the CustomTagNode.
    • isContainer() is true - inner content is parsed as normal Markdown.
    • canContain() returns true - accepts any children.
    • canHaveLazyContinuationLines() is false - explicit lines only.
    • tryContinue() finishes when the close regex matches the current line (and consumes the line); otherwise continues.
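
The flow above can be approximated without the CommonMark classes. The following is a minimal plain-PHP sketch, not the Docara implementation: specs are modeled as arrays whose keys mirror the CustomTagSpec properties named above, the open pattern is a hypothetical example, and attribute parsing is replaced by a stand-in that only keeps the raw string.

```php
<?php
declare(strict_types=1);

/**
 * Plain-PHP stand-in for the tryStart() flow (illustrative only).
 * $specs: list of arrays with keys type, openRegex, baseAttrs,
 * allowNestingSame, attrsFilter. $activeType: type of the currently
 * active custom tag, or null when none is open.
 */
function tryStartSketch(string $line, array $specs, ?string $activeType): ?array
{
    foreach ($specs as $spec) {
        if (preg_match($spec['openRegex'], $line, $m) !== 1) {
            continue; // this spec does not open on the current line
        }
        if ($activeType === $spec['type'] && !$spec['allowNestingSame']) {
            return null; // same-type nesting suppressed, line stays ordinary text
        }
        $attrStr = $m['attrs'] ?? '';
        // Stand-in for Attrs::parseOpenLine()/Attrs::merge(): keep the raw string only.
        $user  = $attrStr === '' ? [] : ['raw' => $attrStr];
        $attrs = array_merge($spec['baseAttrs'], $user);
        $meta  = ['openMatch' => $m, 'attrStr' => $attrStr];
        if (($spec['attrsFilter'] ?? null) instanceof \Closure) {
            $attrs = ($spec['attrsFilter'])($attrs, $meta);
        }
        return ['type' => $spec['type'], 'attrs' => $attrs, 'meta' => $meta];
    }
    return null; // no registered tag starts here
}

$specs = [[
    'type'             => 'note',
    'openRegex'        => '/^\s*!note(?:\s+(?<attrs>.*))?$/u', // hypothetical pattern
    'baseAttrs'        => ['class' => 'note'],
    'allowNestingSame' => false,
    'attrsFilter'      => null,
]];

$node       = tryStartSketch('!note #tip', $specs, null);
$suppressed = tryStartSketch('!note', $specs, 'note');
```

Here $node describes a started block, while $suppressed is null because a note is already active and same-type nesting is disabled for it.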

Closing behavior

if ($this->spec->closeRegex === null) {
    return BlockContinue::finished(); // single-line tag
}

if (preg_match($this->spec->closeRegex, $line)) {
    $cursor->advanceToEnd();
    return BlockContinue::finished();
}
  • Single-line tags: if closeRegex is null, the block ends immediately after the open line (no inner content).
  • Standard blocks: the block remains open until a line matches closeRegex (!end<type> by default). The close line is consumed and not emitted as content.
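
To make the two cases concrete, here is a small plain-PHP sketch that splits the lines following an open marker into block body and remaining document. The function name and the close pattern are assumptions for illustration; the real parser works line by line through tryContinue().

```php
<?php
declare(strict_types=1);

/**
 * Splits the lines that follow an open marker into the block body and the rest.
 * A null $closeRegex models a single-line tag (hypothetical helper, not Docara API).
 */
function collectBody(array $lines, ?string $closeRegex): array
{
    if ($closeRegex === null) {
        // Single-line tag: the block ends at once, the next line is not consumed.
        return ['body' => [], 'rest' => $lines];
    }
    $body = [];
    foreach ($lines as $i => $line) {
        if (preg_match($closeRegex, $line) === 1) {
            // The close line is consumed and never emitted as content.
            return ['body' => $body, 'rest' => array_slice($lines, $i + 1)];
        }
        $body[] = $line;
    }
    return ['body' => $body, 'rest' => []]; // unclosed: body runs to end of input
}

$close    = '/^\s*!endexample\s*$/u'; // assumed default-style close pattern
$closed   = collectBody(['Inner *markdown* content.', '  !endexample', 'After'], $close);
$single   = collectBody(['Next paragraph'], null);
$unclosed = collectBody(['a', 'b'], $close);
```

Note the asymmetry: a matching close line disappears from the output, whereas the line after a single-line tag is left for the normal parsers.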

Nesting semantics

  • If allowNestingSame is false and the active block is the same type, an inner open is ignored (treated as text until the outer close).
  • Different tag types may still nest freely, since canContain() returns true.
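
A rough way to reason about these rules is to track the open tags on a stack. The sketch below is a simplified plain-PHP model with hypothetical tag names and patterns; it assumes that open containers are checked outermost-first for a close line, as CommonMark container parsers generally are, and it ignores attributes entirely.

```php
<?php
declare(strict_types=1);

/**
 * Labels each line as open:<type>, close:<type> or text.
 * $allowNestingSame maps tag type => flag (illustrative model only).
 */
function classifyLines(array $lines, array $allowNestingSame): array
{
    $stack = []; // open tag types, outermost first
    $out   = [];
    foreach ($lines as $line) {
        // Ask each open container, outermost first, whether this line closes it.
        foreach ($stack as $i => $type) {
            if (preg_match('/^\s*!end' . preg_quote($type, '/') . '\s*$/', $line) === 1) {
                $stack = array_slice($stack, 0, $i); // closes it and its descendants
                $out[] = "close:$type";
                continue 2;
            }
        }
        $top = $stack === [] ? null : $stack[count($stack) - 1];
        if (preg_match('/^\s*!(\w+)/', $line, $m) === 1 && isset($allowNestingSame[$m[1]])) {
            if ($top === $m[1] && !$allowNestingSame[$m[1]]) {
                $out[] = 'text'; // suppressed same-type open
                continue;
            }
            $stack[] = $m[1];
            $out[] = "open:{$m[1]}";
            continue;
        }
        $out[] = 'text';
    }
    return $out;
}

$labels = classifyLines(
    ['!note', '!note', '!example', 'body', '!endexample', '!endnote', '!endnote'],
    ['note' => false, 'example' => true]
);
```

In this model the second !note is reported as text, the example nests inside the note, and the final stray !endnote is also text because nothing is open to claim it.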

Attributes and meta

  • Attributes are parsed from the open line via Attrs::parseOpenLine() supporting:
    • Key-value pairs: key="value", key:'value', unquoted tokens
    • Shorthands: .class appends to class, #id sets id
    • Unicode spaces and smart quotes are normalized; classes are deduplicated
  • Merging is performed with Attrs::merge($spec->baseAttrs, $userAttrs).
  • Meta stored on the node:
    • openMatch - full regex match array for openRegex
    • attrStr - raw attribute substring from the opening line
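
As an illustration of the syntax listed above, here is a simplified plain-PHP stand-in for the parse and merge steps. It is not the code behind Attrs::parseOpenLine() or Attrs::merge(): it skips Unicode-space and smart-quote normalization, and the merge rule (classes appended and deduplicated, other keys overridden by user values) is an assumption inferred from the example output at the end of this page.

```php
<?php
declare(strict_types=1);

/** Simplified parser for key="v", key:'v', key=token, .class and #id (sketch only). */
function parseAttrsSketch(string $attrStr): array
{
    $pattern = '/(?<key>[\w-]+)[=:](?:"(?<dq>[^"]*)"|\'(?<sq>[^\']*)\'|(?<bare>\S+))'
             . '|\.(?<class>[^\s.#]+)|#(?<id>[\w-]+)/u';
    preg_match_all($pattern, $attrStr, $matches, PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL);

    $attrs   = [];
    $classes = [];
    foreach ($matches as $m) {
        if (isset($m['class'])) {
            $classes[] = $m['class'];          // .class shorthand appends to class
        } elseif (isset($m['id'])) {
            $attrs['id'] = $m['id'];           // #id shorthand sets id
        } else {
            $value = $m['dq'] ?? $m['sq'] ?? $m['bare'] ?? '';
            if ($m['key'] === 'class') {
                foreach (preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY) as $c) {
                    $classes[] = $c;
                }
            } else {
                $attrs[$m['key']] = $value;
            }
        }
    }
    if ($classes !== []) {
        $attrs['class'] = implode(' ', array_unique($classes)); // deduplicated
    }
    return $attrs;
}

/** Assumed merge rule: classes are concatenated, other keys take the user value. */
function mergeAttrsSketch(array $base, array $user): array
{
    $all     = trim(($base['class'] ?? '') . ' ' . ($user['class'] ?? ''));
    $classes = preg_split('/\s+/', $all, -1, PREG_SPLIT_NO_EMPTY);
    $merged  = array_merge($base, $user);
    if ($classes !== []) {
        $merged['class'] = implode(' ', array_unique($classes));
    }
    return $merged;
}

$user   = parseAttrsSketch('class:"mb-4 border" .demo #intro data-x=1');
$merged = mergeAttrsSketch(['class' => 'example overflow-hidden'], $user);
```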

Implement your attrsFilter as fn(array $attrs, array $meta): array to leverage this information (e.g., derive attributes from capture groups).
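
For instance, a filter could turn a capture group into a CSS class. The sketch below assumes a hypothetical !alert tag whose open pattern has an extra named group level; neither the tag nor the pattern comes from Docara, and the meta array is built by hand here to mimic what the parser stores on the node.

```php
<?php
declare(strict_types=1);

// Hypothetical open pattern: "!alert-warning #w1" captures level=warning, attrs="#w1".
$openRegex = '/^\s*!alert(?:-(?<level>\w+))?(?:\s+(?<attrs>.*))?$/u';

// Filter with the documented shape fn(array $attrs, array $meta): array.
$attrsFilter = function (array $attrs, array $meta): array {
    $level = $meta['openMatch']['level'] ?? '';
    if ($level !== '') {
        // Derive an extra class from the capture group.
        $attrs['class'] = trim(($attrs['class'] ?? '') . ' alert-' . $level);
    }
    return $attrs;
};

preg_match($openRegex, '!alert-warning #w1', $m);
$meta     = ['openMatch' => $m, 'attrStr' => $m['attrs'] ?? ''];
$filtered = $attrsFilter(['class' => 'alert', 'id' => 'w1'], $meta);

preg_match($openRegex, '!alert', $plain);
$unchanged = $attrsFilter(['class' => 'alert'], ['openMatch' => $plain, 'attrStr' => '']);
```

Because the filter receives the whole match array, it can stay a pure function of its two arguments and needs no access to the parser or cursor.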


Container responsibilities

The inner anonymous continue parser:

  • Does not collect text lines itself (addLine() is a no-op).
  • Delegates inner content parsing to the standard block/inline parsers since it is a container.
  • Signals close via tryContinue() when closeRegex matches.

Edge cases & behavior

  • Unclosed block at EOF: CommonMark closes the container at end of document; no explicit cleanup is needed.
  • Extra !end<type>: a close line without a matching open will not be claimed by this parser; it is treated as plain text by other parsers.
  • Mixed indentation / leading spaces: open/close regexes are built to tolerate leading whitespace via ^\s* (see BaseTag::openRegex()/closeRegex()).
  • Suppressed same-type nesting: when allowNestingSame=false, authors will see the inner !<type> rendered as text; document this in authoring guidelines if needed.

Performance notes

  • The parser iterates specs and executes a single anchored preg_match per spec for the current line.
  • Keep openRegex anchored at start (^) and avoid overly expensive subpatterns.
  • The number of specs is typically small; if it grows large, consider grouping common prefixes or precomputing a faster dispatch in the registry.
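
One possible shape for such a precomputed dispatch is sketched below in plain PHP. It assumes every open marker has the form !<type> followed by whitespace or end of line; the function names and the array layout are hypothetical and not part of CustomTagRegistry.

```php
<?php
declare(strict_types=1);

/**
 * Builds one anchored prefix pattern over all tag types plus a lookup table,
 * so ordinary lines are rejected with a single preg_match (sketch only).
 */
function buildDispatch(array $openRegexByType): array
{
    $types = array_map(
        fn ($t): string => preg_quote((string) $t, '/'),
        array_keys($openRegexByType)
    );
    return [
        'prefix' => '/^\s*!(?<type>' . implode('|', $types) . ')(?=\s|$)/u',
        'byType' => $openRegexByType,
    ];
}

/** Returns [type, match array] or null when no tag opens on the line. */
function dispatchOpen(string $line, array $dispatch): ?array
{
    if ($dispatch['byType'] === []) {
        return null; // nothing registered
    }
    if (preg_match($dispatch['prefix'], $line, $p) !== 1) {
        return null; // cheap rejection for lines that are not open markers
    }
    $type = $p['type'];
    // Only the one matching spec runs its full open regex.
    return preg_match($dispatch['byType'][$type], $line, $m) === 1 ? [$type, $m] : null;
}

$dispatch = buildDispatch([
    'note'     => '/^\s*!note(?:\s+(?<attrs>.*))?$/u',     // hypothetical patterns
    'notebook' => '/^\s*!notebook(?:\s+(?<attrs>.*))?$/u',
]);
$hit = dispatchOpen('!notebook .wide', $dispatch);
```

Whether this pays off depends on the number of specs; with a handful of tags the per-spec loop is likely just as fast.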

Testing suggestions

Create fixtures covering:

  1. Simple block: open -> text -> close.
  2. Attribute parsing: quoted/unquoted, .class, #id.
  3. Nested different tags vs suppressed same-type nesting.
  4. Single-line tags with closeRegex = null.
  5. Unclosed block at EOF.
  6. Lines that resemble markers inside code fences (should be parsed correctly by CommonMark since inner content is delegated).

Troubleshooting

  • Open not detected: verify the tag's openRegex; ensure the line starts with !type and there are no trailing characters after attributes.
  • Close not detected: confirm closeRegex and that the close line has no extra content.
  • Attributes missing: check the named capture (?<attrs>...) in the open regex and your Attrs logic.
  • Unexpected inner text: you might be hitting same-type nesting suppression; have allowNestingSame() return true in your tag if same-type nesting is intended.

Example

Markdown

!example class:"mb-4 border" .demo
Inner *markdown* content.
!endexample

Result (simplified)

<div class="example overflow-hidden radius-1/2 overflow-x-auto mb-4 border demo">
  <p>Inner <em>markdown</em> content.</p>
</div>