Newbie question


Newbie question

Happenstance
Hi all,

I am trying to figure out where line breaks are controlled when
converting from Asciidoctor to DocBook. I have spent a while trying to find
it and can't.

All I want is for the hard returns shown in the preview pane not to be
output when I do a straight conversion to DocBook.

Can someone help? Or at least explain where this is controlled?

Thanks in advance,
Maureen

Re: Newbie question

Andrew Carver
Hi, that's an excellent question. I haven't seen it documented, but I know what I've actually seen happen with hard line breaks in a straight conversion to DocBook: they get replaced by an XML processing instruction like this:

<?asciidoc-br?>

That sort of processing-instruction will do nothing but sit there in your DocBook, unless the DocBook XSLT stylesheets you use are customized to do something with it. For example, in converting DocBook to XSL-FO, some folks customize their stylesheet to replace a linebreak processing-instruction with an <fo:block /> element: http://www.sagehill.net/docbookxsl/PageBreaking.html#HardPageBreaks

Does that help?

Re: Newbie question

Andrew Carver
Oops, sorry, that link was to a discussion about page breaks, not line breaks... here's the one I was trying for:

http://www.sagehill.net/docbookxsl/LineBreaks.html

Re: Newbie question

mojavelinux
Administrator
In reply to this post by Andrew Carver
Perfectly explained.

Here's an example of how that processing instruction might get handled.


DocBook doesn't have a hard line break tag (to my knowledge), so the interpretation must be specific to the output format / toolchain.

-Dan


If you reply to this email, your message will be added to the discussion below:
http://discuss.asciidoctor.org/Newbie-question-tp6244p6245.html

Re: Newbie question

mojavelinux
Administrator
In reply to this post by Happenstance
Out of curiosity, what are you using to process the DocBook?


Re: Newbie question

Happenstance
Thanks everyone for your links and advice. I will take a look and let you know if I make any progress.

Re: Newbie question

Happenstance
In reply to this post by mojavelinux
More on this:

We’re using AsciidocFX to save out to DocBook XML. It’s a pretty simple stand-alone app that we can open files in and save out as DocBook XML. We also ran the same AsciiDoc files through Asciidoctor with the same results.

We’re trying to import these AsciiDoc files into InDesign, so we’re saving out to DocBook and then running some XSLT stylesheets to format them into InDesign’s setup before we import them. Usually no big deal, but we also usually work with .docx files.

This is our first run-in with AsciiDoc, and it’s great because the DocBook XML is structured and easy to work with. The only thing we can’t figure out is why the DocBook conversion puts a line feed character (&#xA;) seemingly wherever the preview pane breaks the line of text. So when we import the file, there are all these extra paragraph breaks in the middle of paragraphs that aren’t there in the preview.

For the moment we’re running a text translate in the XSLT stylesheets to remove the line feeds. But that’s not a great solution. When there are lines of code in <programlisting> we need to honor the line feeds, so we don’t want to remove them willy-nilly.

So the question is: Are these line-feed characters dropped in there by whatever preview program is rendering them, on the assumption that the character will be ignored in HTML? Or is that character just part of how AsciiDoc operates? I mean, InDesign isn’t exactly the target here.

We’re wondering if there’s some function to control line feeds in the DocBook conversion, especially in particular elements (such as literal code). It’s less that we need to control for line breaks and more that we need to figure out how to ignore them in certain elements. But so far, DocBook honors them for reasons we can’t grasp.

Re: Newbie question

mojavelinux
Administrator
Thank you for this additional information. I now understand the problem you are experiencing. In fact, I've run into this problem myself.

Runs of whitespace characters (including newlines) inside elements that are not preformatted / literal text are supposed to be interpreted as a single space. However, in my experience, the InDesign importer incorrectly interprets newlines as hard returns regardless of the element.

> Are these line-feed characters dropped in there by whatever preview program is rendering them, assuming that the character will be ignored in html?

The newlines are coming from the source document. If you have newlines in a paragraph in AsciiDoc (following the sentence-per-line pattern, for example), Asciidoctor does not remove these newlines because they are not significant. In other words, Asciidoctor isn't adding or removing anything. It's just passing the text through as written. It's InDesign that is giving these newlines meaning.
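
To make the whitespace rule concrete, here is a small Ruby sketch of what a consumer is expected to do with ordinary paragraph text: treat any run of spaces, tabs, and newlines as one space. The `collapse_whitespace` helper and the sample text are made up for illustration; they are not part of Asciidoctor or InDesign.

```ruby
# Hypothetical helper: collapse each run of whitespace (spaces, tabs,
# newlines) into a single space and trim the ends.
def collapse_whitespace(text)
  text.gsub(/\s+/, ' ').strip
end

# A paragraph written one sentence per line, as it might appear in the
# text content of a DocBook <simpara> element.
para = "First sentence.\nSecond sentence.\nThird sentence."

puts collapse_whitespace(para)
# prints: First sentence. Second sentence. Third sentence.
```

Preformatted elements such as <programlisting> are the exception: their newlines are significant and should be left alone.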

> For the moment we’re running a text translate to remove the line feeds in the xslt’s.

If the behavior of InDesign cannot be changed, you can solve this in Asciidoctor using an extension. One way is to use a TreeProcessor to visit only the paragraphs in the document and remove the newlines. I have created a sample extension to demonstrate how this is done:


You can use this extension as follows:

asciidoctor -b docbook -r ./fold-lines-tree-processor.rb document.adoc
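
A minimal sketch of a TreeProcessor along these lines, written for illustration and not necessarily identical to the sample extension mentioned above. It assumes a recent Asciidoctor release; the `LineFolder` module name is made up, and the `begin`/`rescue` guard is only there so the helper can be tried without the Asciidoctor gem installed.

```ruby
# fold-lines sketch (illustrative, not the original sample extension)
begin
  require 'asciidoctor/extensions'
rescue LoadError
  # Asciidoctor is not installed; LineFolder below can still be used alone.
end

# Hypothetical helper: join the source lines of one paragraph with spaces
# so the paragraph is emitted as a single line.
module LineFolder
  def self.fold(lines)
    [lines.join(' ')]
  end
end

if defined?(Asciidoctor::Extensions)
  Asciidoctor::Extensions.register do
    tree_processor do
      process do |doc|
        # Visit paragraph blocks only; listing and literal blocks are not
        # touched, so newlines inside <programlisting> are preserved.
        doc.find_by(context: :paragraph).each do |para|
          para.lines = LineFolder.fold(para.lines)
        end
        doc
      end
    end
  end
end
```

One caveat with this approach, as far as I can tell: a hard line break marker (a space followed by `+` at the end of a line) ends up in the middle of the folded line, so it will probably no longer be recognized as a hard break.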

> We’re wondering if there’s some function to control line feeds in the DocBook conversion, especially in particular elements (such as literal code)

Again, the problem is really with InDesign. It's doing the wrong thing here. Fortunately, Asciidoctor provides the tools necessary to work around the problem.

I hope that helps you on your journey.

-Dan


--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux

Re: Newbie question

Happenstance
Hi Dan,

Thanks for the explanation, especially about the newlines. The thing is, these newline characters seem to be coming in when Asciidoctor converts to DocBook XML. They are there well before we ever bring anything into InDesign.

That said, your plug-in and advice will hopefully solve this.

So again, THANKS so much.

Cheers,
Maureen

Re: Newbie question

mojavelinux
Administrator
Maureen,

Do the newlines you see in the DocBook match newlines that are present in the AsciiDoc source? In other words, can you map them 1-to-1?

-Dan




Re: Newbie question

Happenstance
Your plug-in totally solved those newlines! Thanks so much!!!

Re: Newbie question

Jeff Lytle
In reply to this post by mojavelinux
Hi Dan --

Colleague of Maureen's at Happenstance, here. Thanks a million for this, it's a huge help! I thought there had to be a method for this, but couldn't even figure out how to google it. As to your question: yes, the &#xa; characters are a 1-to-1 match. I'm assuming that they originate from whatever text editor the author used, and they just get passed through because they're relatively vestigial... unless you're importing to Adobe.

We figured out how to run asciidoctor with your extension and it worked as described. Looks like we're going to need to dig into these extensions and see what else we can do. Thank you for pointing us in the right direction!

Quick follow-up if you don't mind. I got the extension to run from the desktop, but I'm wondering if we can install it in the root /lib/ directory the way it's depicted in the "Using an extension" section. Most of us are drag-and-drop folks, so although I installed Asciidoctor through Homebrew, I don't understand how the thing's actually running. Is there some way to move an extension into a common directory so we can call it there instead of dragging it to Terminal from wherever? We're trying to automate this step as much as possible.

Thanks again for all your help.
--Jeff