Re: How to get rid of paragraph duplication?

mojavelinux
Administrator
I'm relocating the issue in $subject (the contents of which can be found in the quoted text below) to this list in order to answer and further discuss it. If we decide there is a necessary change, then we'll create a follow-up issue in the issue tracker.

In an effort to be compliant with AsciiDoc, Asciidoctor faithfully implements the HTML output from AsciiDoc's html5 backend. As Alex points out, this produces the following markup for a simple paragraph:

<div class="paragraph">
<p>Paragraph text here</p>
</div>

The question is, why is the <p> element wrapped in a <div>?

At first, this may seem unnecessary. However, to understand why the <div> is used, you must consider the full capabilities of an AsciiDoc paragraph. True to its DocBook (and publishing) heritage, a paragraph in AsciiDoc can have a title. In DocBook, this is known as a "formal paragraph".

Consider the following AsciiDoc source:

.Paragraph title
Paragraph text

AsciiDoc and Asciidoctor both produce the following HTML from this source:

<div class="paragraph">
  <div class="title">Paragraph title</div>
  <p>Paragraph text</p>
</div>

and the following DocBook:

<formalpara>
  <title>Paragraph title</title>
  <para>Paragraph text</para>
</formalpara>

Thus, the reason for the <div> is to adjoin the paragraph and its title.

One could argue, then, that the <div> should be excluded if the paragraph has no title, to simply produce:

<p>Paragraph text</p>

This makes it slightly harder to write CSS that pads the paragraph correctly both with and without the <div> wrapper, which is why I think Stuart decided to use the <div> wrapper in all cases when he set up the templates. I can think of ways it could be done, though, so the <div> is not strictly necessary.

It's important, at least for now, that Asciidoctor be as consistent as possible with AsciiDoc...as we are still gaining credibility among AsciiDoc adopters.

There are two paths to take without having to change the html5 backend as it is today.

1. A custom backend template

You can easily override the template by creating a Slim (recommended), Haml, or ERB template for block_paragraph that excludes the <div> when there is no title. Here's an example of the Slim template, block_paragraph.html.slim:

- if title?
  .paragraph id=@id class=role
    .title=title
    p=content
- else
  p=content

You place this file in the slim/html5 subdirectory in a directory you reserve for templates. You then pass the templates directory to the asciidoctor command as follows:

 asciidoctor -T /path/to/templates/slim sample.ad

or, in Asciidoctor 0.1.4:

 asciidoctor -T /path/to/templates -E slim sample.ad

The template you provided will be used in place of the built-in paragraph template, producing a sole <p> element when the paragraph has no title.

Of course, you'd need to adjust the default stylesheet to pad the <p> element appropriately.
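
The setup steps above can be sketched as a short shell session. This is only a sketch under a few assumptions: the ./templates location (and the TEMPLATES_DIR variable holding it) is a hypothetical choice, and the .title=title line is included so that the title is still rendered inside the wrapper when one is present.

```shell
# Hypothetical location for custom templates; any directory works.
TEMPLATES_DIR="${TEMPLATES_DIR:-./templates}"

# Slim templates for the html5 backend live under slim/html5.
mkdir -p "$TEMPLATES_DIR/slim/html5"

# Write the paragraph template: keep the <div> wrapper only when
# the paragraph has a title, otherwise emit a bare <p>.
cat > "$TEMPLATES_DIR/slim/html5/block_paragraph.html.slim" <<'EOF'
- if title?
  .paragraph id=@id class=role
    .title=title
    p=content
- else
  p=content
EOF

# Then render with the custom templates, as described above:
#   asciidoctor -T "$TEMPLATES_DIR/slim" sample.ad
# or, in Asciidoctor 0.1.4:
#   asciidoctor -T "$TEMPLATES_DIR" -E slim sample.ad
```

The asciidoctor invocations are left as comments so the snippet only prepares the directory layout; run whichever form matches your version once the template is in place.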

2. A new HTML backend

Another approach is to complete issue #242 (https://github.com/asciidoctor/asciidoctor/issues/242), which proposes to create an HTML 5 backend that leverages the full semantics of HTML 5. We could perhaps call it html5 and rename the existing one to html5-legacy. This is where we could tune the output to be something we're all happy with (being careful not to overthink it and get stuck on matters of little importance).

Thoughts?

-Dan

p.s. If you need more help setting up the custom backend, please don't hesitate to ask. Once we get the docs repository set up, I'll be sure to add proper docs for creating and using backend templates for this sort of thing.


Is there a way to get rid of the semantic duplication of a paragraph in AsciiDoc's generated HTML?

Example:
Running echo lorem ipsum | asciidoctor - creates HTML output containing the following string:

<div class="paragraph">
<p>lorem ipsum</p>
</div>

Does Asciidoctor provide a way to get rid of this duplication (obviously introduced by AsciiDoc)?

The desired output would be <p>lorem ipsum</p>

Thanks in advance for your help!

- @aheusingfeld


--

Re: How to get rid of paragraph duplication?

aheusingfeld
Though I understand the assumption behind this behaviour, IMO it is way too complex. Not even in a book does each and every paragraph have a title.

I'd rather argue that "div.paragraph" is not correct because, semantically, it is actually marking a "section" (http://www.w3.org/html/wg/drafts/html/master/dom.html#paragraph). And of course a <section> may have a title or heading declared as <h1> to <h6> (http://www.w3.org/html/wg/drafts/html/master/sections.html#the-section-element). But I guess this would have to be filed against AsciiDoc itself.

It seems I don't know DocBook in detail, because I wonder why AsciiDoc differentiates between headings (##heading) and titles (.title). See the following example for details:

Source:
-----
### My heading

lorem ipsum
lorem ipsum

lorem ipsum
-----

Output:
-----
<body class="article">
<div id="header">
</div>
<div id="content">
<div class="sect2">
<h3 id="_my_heading">My heading</h3>
<div class="paragraph">
<p>lorem ipsum
lorem ipsum</p>
</div>
<div class="paragraph">
<p>lorem ipsum</p>
</div>

</div>

</div>
<div id="footer">
<div id="footer-text">
Last updated 2013-08-20 10:57:02 CEST
</div>
</div>
</body>
-----


Expected output:
-----
<body class="article">
<div id="header"></div>
<main id="content">
<section>
<h3 id="_my_heading">My heading</h3>
<p>lorem ipsum<br>
lorem ipsum</p>
<p>lorem ipsum</p>
</section>
</main>
<footer>
<p>Last updated <time>2013-08-20 10:57:02+0200</time></p>
</footer>
</body>
-----
Of course one could argue that a line break shouldn't produce a <br>, and that the <p> should instead be closed and a new one opened, but a <br> is my understanding of a simple line break. Argue now! ;)

I guess we all agree that DocBook is way too complex for everyday use, so my question is how we can make the output of these assumptions (or conventions) as simple as possible without breaking with AsciiDoc. Is there a way to discuss the complexity problem with them?