custom tags for nested inline_quoted text (bold+italics, bold monospace, etc)

stallio
I'm working on custom ERB templates. Unfortunately the environment I'm exporting to (Adobe InDesign) is not very smart about nested tags, so I need to export different tags for different combinations of inline formatting.

So if text is bold + italics, I need it inside some sort of <bolditalics> tag instead of just nesting an italics tag inside a bold tag. The same goes for bold monospace, italics monospace, etc.

Normally I would handle this using some variation of @parent.context and then return different tags based on the parent element. This works fine in most cases, but it doesn't work here: @parent seems to always return the block element containing the quoted text, even if the text is inside two or three layers of inline tags.

I've tried checking @type (only returns one value, the type of the current element) and role with no luck.

The only way I've been able to get this to work is by assigning a custom role to the text (such as [bold-italics]#text#), but this is far from ideal because it would require us to reformat our documents rather than converting the standard formatting, as I would expect.

Is what I'm trying to do even possible in the current version? Do I need to wait until the inline parser has been replaced?

Re: custom tags for nested inline_quoted text (bold+italics, bold monospace, etc)

mojavelinux
Administrator
@stallio,

You're correct in thinking that the behavior you want is a feature of the planned inline parser, not the current one.

Currently, Asciidoctor uses a flat (regexp-based) inline parser / converter. That means the parent of any inline node is the containing block, as you have observed. In other words, one inline element has no awareness of another inline element.

This is one of the rare cases I'd suggest using a Postprocessor. Once the inline content is converted, it has a proper node structure. You can then use an HTML (or XML) parser to locate nested inline elements and rewrite them. And it will be perfectly accurate.

To process only the "article" content, start at the node with the ID "content".

Here's how you might go about it:

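```ruby
require 'nokogiri'

# Parse the HTML file that Asciidoctor produced.
content = File.read 'test.html'
html_doc = Nokogiri::HTML::Document.parse content
# Restrict the rewrite to the article body (the div with the ID "content").
article_node = html_doc.at_xpath '//div[@id="content"]'
# Select each <strong> that has an <em> child. The leading "." keeps the
# query relative to article_node; a bare "//" searches the whole document.
(article_node.xpath './/strong[em]').each do |node|
  # Close the <strong> around each <em> and emit <bolditalic> in its place.
  node.replace node.to_html.gsub('<em>', '</strong><bolditalic>').gsub('</em>', '</bolditalic><strong>')
end
File.write 'test-output.html', html_doc.to_html
```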

Using XPath, you can dive into pretty much any combination of nodes and rewrite the elements. Nokogiri also lets you manipulate the document quickly in many different ways.

Cheers,

-Dan

On Fri, May 5, 2017 at 12:37 PM, stallio [via Asciidoctor :: Discussion] <[hidden email]> wrote:

--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux