UniversalBlockParser
This parser recognizes custom block tags (!type / !endtype), builds CustomTagNode AST nodes, and hands inner content off to the normal CommonMark pipeline. It lives in Docara core.
Location & signature
- Namespace: Simai\Docara\CustomTags
- Class: UniversalBlockParser
- Constructed with CustomTagRegistry.
public function __construct(private CustomTagRegistry $registry) {}
Role in the pipeline
- Installed by CustomTagsExtension.
- Runs during block start detection for each line.
- For every registered CustomTagSpec, it attempts to match the open regex on the current line. On success it:
  - Builds a CustomTagNode with merged attributes and meta.
  - Returns a container continue parser which:
    - Accepts any child blocks/inline content.
    - Finishes when the close regex is matched (or immediately for single-line tags).
tryStart() - step by step
public function tryStart(Cursor $cursor, MarkdownParserStateInterface $state): ?BlockStart
- Read current line
$line = $cursor->getLine();
- Try each spec in the registry, match the open marker
if (! preg_match($spec->openRegex, $line, $m)) continue;
- Nesting rule for same-type blocks
If the active block is a CustomTagNode with the same type and allowNestingSame === false, the new start is suppressed:
return BlockStart::none();
- Consume the whole line
$cursor->advanceToEnd();
- Parse attributes from the named group attrs and merge with spec defaults
$attrStr = $m['attrs'] ?? '';
$userAttrs = Attrs::parseOpenLine($attrStr);
$attrs = Attrs::merge($spec->baseAttrs, $userAttrs);
- Create node with meta
$node = new CustomTagNode($spec->type, $attrs, [
    'openMatch' => $m,
    'attrStr' => $attrStr,
]);
- Early attribute filtering (optional)
If the spec provides attrsFilter, it is applied immediately to the node's attributes:
if ($spec->attrsFilter instanceof \Closure) {
    $node->setAttrs(($spec->attrsFilter)($node->getAttrs(), $node->getMeta()));
}
Signature note: attrsFilter is called with two arguments ($attrs, $meta) where meta contains at least openMatch and attrStr.
- Return a container parser (anonymous AbstractBlockContinueParser) which:
  - getBlock() returns the CustomTagNode.
  - isContainer() is true - inner content is parsed as normal Markdown.
  - canContain() returns true - accepts any children.
  - canHaveLazyContinuationLines() is false - explicit lines only.
  - tryContinue() finishes when the close regex matches the current line (and consumes the line); otherwise continues.
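The first steps above (read the line, match the open marker, pull out the attrs group) can be tried in plain PHP without any CommonMark classes. This is a minimal sketch: the regex and the helper name matchOpenLine() are assumptions made for illustration; the real pattern comes from BaseTag::openRegex() and may differ.

```php
<?php
// Hypothetical open regex for a tag named "example" (assumption, not the
// pattern Docara actually generates).
$openRegex = '/^\s*!example(?:\s+(?<attrs>.+?))?\s*$/';

/**
 * Returns the raw attribute string when $line opens the tag, or null when
 * the line is not an open marker for this regex.
 */
function matchOpenLine(string $openRegex, string $line): ?string
{
    if (! preg_match($openRegex, $line, $m)) {
        return null; // the parser would move on to the next spec
    }

    // The named group is missing from $m when the line carries no attributes.
    return $m['attrs'] ?? '';
}

$attrStr = matchOpenLine($openRegex, '  !example class:"mb-4 border" .demo');
// $attrStr now holds everything after the tag name, ready for attribute parsing.
```

Returning an empty string for a bare open line mirrors the `$m['attrs'] ?? ''` fallback in the step list.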
Closing behavior
if ($this->spec->closeRegex === null) {
    return BlockContinue::finished(); // single-line tag
}
if (preg_match($this->spec->closeRegex, $line)) {
    $cursor->advanceToEnd();
    return BlockContinue::finished();
}
- Single-line tags: if closeRegex is null, the block ends immediately after the open line (no inner content).
- Standard blocks: the block remains open until a line matches closeRegex (!end<type> by default). The close line is consumed and not emitted as content.
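The two closing rules can be simulated over a plain array of lines. This is a simplified stand-in, not the parser itself: collectBlock() is a hypothetical helper, it ignores nesting and attributes, and the regexes in the usage lines are assumptions.

```php
<?php
/**
 * Collects the inner lines of the first block opened by $openRegex.
 *
 * @param string[] $lines
 * @return string[]|null inner lines, or null when no open line is found
 */
function collectBlock(array $lines, string $openRegex, ?string $closeRegex): ?array
{
    $inner = [];
    $open = false;

    foreach ($lines as $line) {
        if (! $open) {
            if (preg_match($openRegex, $line)) {
                if ($closeRegex === null) {
                    return []; // single-line tag: no inner content
                }
                $open = true;
            }
            continue;
        }

        if (preg_match($closeRegex, $line)) {
            return $inner; // the close line is consumed, not emitted
        }

        $inner[] = $line;
    }

    // Input ended while the block was still open: it is closed implicitly.
    return $open ? $inner : null;
}

$lines = ['!example', 'Inner *markdown* content.', '!endexample', 'after'];
$inner = collectBlock($lines, '/^\s*!example\b/', '/^\s*!endexample\s*$/');
```

In the real parser the inner lines are not collected as text; they are handed to the standard block parsers because the node is a container.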
Nesting semantics
- If allowNestingSame is false and the active block is the same type, an inner open is ignored (treated as text until the outer close).
- Different tag types may still nest freely, since canContain() returns true.
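The nesting rule reduces to a single comparison against the active block. The sketch below assumes that "active block" means the innermost open custom block; mayStartBlock() is a hypothetical helper, not part of Docara.

```php
<?php
/**
 * Decides whether an open marker for $type may start a new block.
 *
 * @param string|null $activeType type of the innermost open custom block,
 *                                or null when none is open
 */
function mayStartBlock(string $type, ?string $activeType, bool $allowNestingSame): bool
{
    if ($activeType === $type && ! $allowNestingSame) {
        return false; // same-type nesting suppressed: the line stays as text
    }

    return true; // different type, or same-type nesting is allowed
}

// A "note" opened inside another "note" is suppressed unless nesting is allowed.
$suppressed = ! mayStartBlock('note', 'note', false);
```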
Attributes and meta
- Attributes are parsed from the open line via Attrs::parseOpenLine(), supporting:
  - Key-value pairs: key="value", key:'value', unquoted tokens
  - Shorthands: .class appends to class, #id sets id
  - Unicode spaces and smart quotes are normalized; classes are deduplicated
- Merging is performed with Attrs::merge($spec->baseAttrs, $userAttrs).
- Meta stored on the node:
  - openMatch - full regex match array for openRegex
  - attrStr - raw attribute substring from the opening line
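To make the attribute syntax concrete, here is a rough sketch of parsing and merging. Both functions are hypothetical stand-ins for Attrs::parseOpenLine() and Attrs::merge(), whose real implementations are not shown here. The sketch skips Unicode normalization, treats "unquoted tokens" as unquoted values, and assumes that merging concatenates class lists while user values override other keys.

```php
<?php
/** @return array<string, string> */
function parseAttrsSketch(string $attrStr): array
{
    // key="v" | key:'v' | key=v | .class | #id
    $pattern = '/(?<key>[\w-]+)[=:](?:"(?<dq>[^"]*)"|\'(?<sq>[^\']*)\'|(?<raw>\S+))'
             . '|\.(?<class>[\w-]+)|#(?<id>[\w-]+)/';
    preg_match_all($pattern, $attrStr, $matches, PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL);

    $attrs = [];
    $classes = [];
    foreach ($matches as $m) {
        if ($m['class'] !== null) {
            $classes[] = $m['class']; // .class shorthand appends to class
        } elseif ($m['id'] !== null) {
            $attrs['id'] = $m['id'];  // #id shorthand sets id
        } else {
            $value = $m['dq'] ?? $m['sq'] ?? $m['raw'];
            if ($m['key'] === 'class') {
                array_push($classes, ...preg_split('/\s+/', $value, -1, PREG_SPLIT_NO_EMPTY));
            } else {
                $attrs[$m['key']] = $value;
            }
        }
    }

    if ($classes !== []) {
        $attrs['class'] = implode(' ', array_unique($classes)); // deduplicated
    }

    return $attrs;
}

/** @return array<string, string> */
function mergeAttrsSketch(array $base, array $user): array
{
    $merged = array_merge($base, $user); // user values win for ordinary keys
    $all = trim(($base['class'] ?? '') . ' ' . ($user['class'] ?? ''));
    $classes = preg_split('/\s+/', $all, -1, PREG_SPLIT_NO_EMPTY);
    if ($classes !== []) {
        $merged['class'] = implode(' ', array_unique($classes));
    }

    return $merged;
}

$user = parseAttrsSketch('class:"mb-4 border" .demo');
$attrs = mergeAttrsSketch(['class' => 'example'], $user);
```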
Implement your attrsFilter as fn(array $attrs, array $meta): array to leverage this information (e.g., derive attributes from capture groups).
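As an illustration of that signature, the filter below derives an attribute from a capture group. The group name level, the data-level attribute, and the sample meta array are all assumptions; they only apply if your tag's open regex defines such a group.

```php
<?php
// Hypothetical filter: copies a named capture group "level" from the open
// match into a data-level attribute.
$attrsFilter = function (array $attrs, array $meta): array {
    $level = $meta['openMatch']['level'] ?? null;
    if ($level !== null && $level !== '') {
        $attrs['data-level'] = $level;
    }

    return $attrs;
};

// Stand-in meta, shaped like what the parser stores on the node.
$meta = [
    'openMatch' => [0 => '!note2 .tip', 'level' => '2', 'attrs' => '.tip'],
    'attrStr'   => '.tip',
];

$result = $attrsFilter(['class' => 'note tip'], $meta);
```

Because the parser checks `instanceof \Closure`, both anonymous functions and arrow functions qualify; a plain callable string or array would not.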
Container responsibilities
The inner anonymous continue parser:
- Does not collect text lines itself (addLine() is a no-op).
- Delegates inner content parsing to the standard block/inline parsers since it is a container.
- Signals close via tryContinue() when closeRegex matches.
Edge cases & behavior
- Unclosed block at EOF: CommonMark closes the container at end of document; no explicit cleanup is needed.
- Extra !end<type>: a close line without a matching open will not be claimed by this parser; it is treated as plain text by other parsers.
- Mixed indentation / leading spaces: open/close regexes are built to tolerate leading whitespace via ^\s* (see BaseTag::openRegex() / closeRegex()).
- Suppressed same-type nesting: when allowNestingSame=false, authors will see the inner !<type> rendered as text; document this in authoring guidelines if needed.
Performance notes
- The parser iterates specs and executes a single anchored preg_match per spec for the current line.
- Keep openRegex anchored at start (^) and avoid overly expensive subpatterns.
- The number of specs is typically small; if it grows large, consider grouping common prefixes or precomputing a faster dispatch in the registry.
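One possible shape for such a precomputed dispatch is sketched below. OpenDispatchSketch is hypothetical and not part of Docara; it assumes every open marker has the form !<type> followed by whitespace or end of line, and that at most one spec exists per type.

```php
<?php
/**
 * Finds the candidate tag type with one combined pattern, then runs only
 * that type's own open regex instead of looping over every spec.
 */
final class OpenDispatchSketch
{
    private string $combined;

    /** @param array<string, string> $openRegexByType tag type => open regex */
    public function __construct(private array $openRegexByType)
    {
        $names = array_map(
            fn (string $type): string => preg_quote($type, '/'),
            array_keys($openRegexByType)
        );
        // The lookahead keeps "note" from matching the start of "notebook".
        $this->combined = '/^\s*!(' . implode('|', $names) . ')(?=\s|$)/';
    }

    /** @return array{type: string, openMatch: array}|null */
    public function find(string $line): ?array
    {
        if ($this->openRegexByType === [] || ! preg_match($this->combined, $line, $m)) {
            return null; // cheap rejection for ordinary text lines
        }

        $type = $m[1];
        if (! preg_match($this->openRegexByType[$type], $line, $open)) {
            return null;
        }

        return ['type' => $type, 'openMatch' => $open];
    }
}

$dispatch = new OpenDispatchSketch([
    'note'    => '/^\s*!note(?:\s+(?<attrs>.+?))?\s*$/',
    'example' => '/^\s*!example(?:\s+(?<attrs>.+?))?\s*$/',
]);
$hit = $dispatch->find('!example .demo');
```

This trades the per-spec loop for two preg_match calls per candidate line. It is only worth considering if profiling shows the loop matters, and it would change behavior if several specs could legitimately match the same line.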
Testing suggestions
Create fixtures covering:
- Simple block: open -> text -> close.
- Attribute parsing: quoted/unquoted, .class, #id.
- Nested different tags vs suppressed same-type nesting.
- Single-line tags with closeRegex = null.
- Unclosed block at EOF.
- Lines that resemble markers inside code fences (should be parsed correctly by CommonMark since inner content is delegated).
Troubleshooting
- Open not detected: verify the tag's openRegex; ensure the line starts with !type and there are no trailing characters after attributes.
- Close not detected: confirm closeRegex and that the close line has no extra content.
- Attributes missing: check the named capture (?<attrs>...) in the open regex and your Attrs logic.
- Unexpected inner text: you might be hitting same-type nesting suppression; set allowNestingSame() to true in your tag.
Example
Markdown
!example class:"mb-4 border" .demo
Inner *markdown* content.
!endexample
Result (simplified)
<div class="example overflow-hidden radius-1/2 overflow-x-auto mb-4 border demo">
<p>Inner <em>markdown</em> content.</p>
</div>