Asciidoctor :: Discussion

Suggest: Create inline node in parse time instead of convert time


chloerei

Suggest: Create inline node in parse time instead of convert time

Asciidoctor currently does not create inline nodes during parsing, which means inline nodes cannot be retrieved after the document has been parsed.

For example, I want to create a multipage converter. After splitting the document into several sub-documents, I cannot tell at convert time whether a given sub-document has any footnotes.

Currently, I convert the document to HTML first, using a specific marker to represent each inline footnote, and then postprocess the HTML to extract the footnotes.

Another issue is that Asciidoctor creates duplicate footnotes when the same document is converted repeatedly.
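A minimal sketch of that postprocessing step, using only the Ruby standard library. The marker markup (`<span class="fn-marker">`) and the helper name `extract_footnotes` are assumptions made for illustration; the marker actually used in my converter is not shown in this post.

```ruby
# Hypothetical marker emitted in place of each inline footnote (an assumption).
MARKER_RX = %r{<span class="fn-marker">(.*?)</span>}m

# Replaces each marker in one page's HTML with a numbered reference and
# returns the rewritten HTML together with the collected footnote texts.
def extract_footnotes(page_html)
  notes = []
  html = page_html.gsub(MARKER_RX) do
    notes << Regexp.last_match(1)
    n = notes.size
    %(<sup class="footnote">[<a href="#_footnotedef_#{n}">#{n}</a>]</sup>)
  end
  [html, notes]
end

page = '<p>A claim.<span class="fn-marker">Source A</span></p>'
html, notes = extract_footnotes(page)
puts notes.inspect
puts html
```

Because numbering restarts for every page passed in, each sub-document gets its own footnote list. The non-greedy regex would break on footnote text that itself contains a `</span>`, so a real implementation would likely need an HTML parser.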

For example:

irb> doc = Asciidoctor.load 'footnote:[text]'
irb> doc.convert
=> "<div class=\"paragraph\">\n<p><sup class=\"footnote\">[<a id=\"_footnoteref_1\" class=\"footnote\" href=\"#_footnotedef_1\" title=\"View footnote.\">1</a>]</sup></p>\n</div>\n<div id=\"footnotes\">\n<hr>\n<div class=\"footnote\" id=\"_footnotedef_1\">\n<a href=\"#_footnoteref_1\">1</a>. text\n</div>\n</div>"
irb> doc.convert
=> "<div class=\"paragraph\">\n<p><sup class=\"footnote\">[<a id=\"_footnoteref_2\" class=\"footnote\" href=\"#_footnotedef_2\" title=\"View footnote.\">2</a>]</sup></p>\n</div>\n<div id=\"footnotes\">\n<hr>\n<div class=\"footnote\" id=\"_footnotedef_1\">\n<a href=\"#_footnoteref_1\">1</a>. text\n</div>\n<div class=\"footnote\" id=\"_footnotedef_2\">\n<a href=\"#_footnoteref_2\">2</a>. text\n</div>\n</div>"

This is weird because the document itself has not changed.

This is not an urgent issue, but it would be helpful if an API for finding inline nodes were provided in the future.

mojavelinux

Re: Suggest: Create inline node in parse time instead of convert time

Administrator

This limitation of Asciidoctor has been known for a very long time, covered by https://github.com/asciidoctor/asciidoctor/issues/61, and frequently discussed. It's not something that's going to be solved here. Changing the parsing strategy will very likely have an impact on the grammar and thus has to be dealt with in the specification process. In fact, it will be one of the key mandates of the spec. Essentially, the document should be parsed in one phase and converted in another. But right now there are still aspects of the parser that rely on conversion happening while parsing. Those dependencies will need to get teased apart.

There are two approaches you can use as a workaround. One is to convert the document fully as a dry run to extract data from it, then go back and convert it a second time armed with the information you gathered. Another is to do what Asciidoctor Mathematical does and use a treeprocessor to walk the DOM, convert blocks one by one, and extract and/or manipulate the information. See https://github.com/asciidoctor/asciidoctor-mathematical/blob/master/lib/asciidoctor-mathematical/extension.rb for an example. Or you can combine the two approaches. You'll need to get creative.

Best,

-Dan

In reply to chloerei's message of Mon, Jan 20, 2020 at 12:02 AM.

Dan Allen | @mojavelinux | https://twitter.com/mojavelinux

chloerei

Re: Suggest: Create inline node in parse time instead of convert time

Got it, thanks for your reply.