Asciidoctor :: Discussion

Single PDF and multiple HTML from same sources

4 messages

afrey

Feb 13, 2020; 10:05am

Single PDF and multiple HTML from same sources

Hi!

I have a set of .adoc sources that together make a book (chapters, subsections, etc). I need to generate both HTML and PDF from them, but the output I'm trying to achieve is:
- a single PDF output file
- one HTML file per source .adoc, as I have found that this is the best user experience

Has anyone already achieved that?

If not, I'm considering the following solution:

- implement an include processor to support a syntax like this:
  include::xref:<path to subdoc>[]
- this include will be expanded in two different ways depending on the backend:
  - for HTML, to "xref:<path to subdoc>[]"
  - for PDF, to "include::<path to subdoc>[]"
- this would allow me to write my top-level and chapter documents using the special include::xref: syntax to reference the subsections, which will be transformed into links for HTML and into the actual section content for PDF.
- additionally, the regular cross-document xrefs would also have to be adjusted somehow, using a macro preprocessor, so that the anchors are "globalized" with the document name when I generate the single PDF document.
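
To make the proposed expansion concrete, here is a minimal sketch written as a standalone text pre-pass in Python rather than as a real Asciidoctor include processor. The include::xref: syntax is only the proposal above, not existing Asciidoctor syntax, and the function name and backend strings ("pdf", anything else treated as HTML) are assumptions made for illustration.

```python
import re

# Matches one whole line of the PROPOSED directive: include::xref:<path>[]
# (hypothetical syntax from this post, not something Asciidoctor understands today).
SPECIAL_INCLUDE = re.compile(r"^include::xref:(?P<path>[^\[\]\s]+)\[\]\s*$")


def expand_special_includes(source: str, backend: str) -> str:
    """Rewrite each include::xref: line according to the target backend.

    backend == "pdf" -> a regular include directive (content is inlined)
    anything else    -> a regular xref macro (content stays a separate page)
    """
    out = []
    for line in source.splitlines():
        match = SPECIAL_INCLUDE.match(line)
        if match is None:
            out.append(line)  # ordinary line, leave untouched
        elif backend == "pdf":
            out.append(f"include::{match['path']}[]")
        else:
            out.append(f"xref:{match['path']}[]")
    return "\n".join(out)


if __name__ == "__main__":
    book = "= My Book\n\ninclude::xref:chapters/intro.adoc[]\n"
    print(expand_special_includes(book, "pdf"))
    print(expand_special_includes(book, "html5"))
```

The rewritten source could then be handed to the normal converter for each backend. A real include processor would hook into the parser instead of rewriting text up front, so this only shows the intended mapping.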

Do you think I'm on the right path? Also, should this kind of flow be supported in the core?

Best regards,
-- Alex

David Jencks

Feb 13, 2020; 5:28pm

Re: Single PDF and multiple HTML from same sources

This is pretty much the path I’ve taken with what I’m calling “antora-pdf”, a branch of Guillaume’s asciidoctor-pdf.js. It’s JavaScript-based and highly experimental at this point.

https://github.com/djencks/asciidoctor-pdf.js/tree/antora

Since it’s an Antora extension, it assumes your content is arranged in an Antora layout.

With no extra configuration, it produces one pdf per html page.

With configuration of explicit pdfs to generate, it produces only those.

Typically you’d take your Antora nav files and convert them into “book” documents with basically the xrefs changed to includes.
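
As a rough illustration of that nav-to-book conversion (this is not the antora-pdf code), the Python sketch below turns each nav list item into an include directive, using the list depth as the leveloffset. The function name, the book title argument, and the assumption that xref targets can be used directly as include paths are all hypothetical; targets with module coordinates or fragments are not handled.

```python
import re

# One nav list item such as "** xref:install.adoc[Installing]".
NAV_ITEM = re.compile(
    r"^(?P<stars>\*+)\s+xref:(?P<target>[^\[\]\s]+)\[(?P<title>[^\]]*)\]\s*$"
)


def nav_to_book(nav_source: str, book_title: str) -> str:
    """Build a "book" document whose body includes every page listed in the nav.

    Assumption: the xref target is usable as-is as an include path.
    """
    lines = [f"= {book_title}", ""]
    for line in nav_source.splitlines():
        match = NAV_ITEM.match(line)
        if match is None:
            continue  # skip list titles, plain-text entries, and blank lines
        depth = len(match["stars"])  # "*" -> 1, "**" -> 2, ...
        lines.append(f"include::{match['target']}[leveloffset=+{depth}]")
        lines.append("")  # blank line keeps included pages from running together
    return "\n".join(lines)


if __name__ == "__main__":
    nav = "* xref:index.adoc[Overview]\n** xref:install.adoc[Installing]\n"
    print(nav_to_book(nav, "Manual"))
```

Using the nesting depth as the leveloffset keeps the page hierarchy from the nav visible as section levels in the combined document.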

I do some xref and other link processing, but it might not be complete. Generally inter-page links within the pdf document get transformed into intra-pdf links successfully, and links to pages outside the pdf get transformed into links to the published site URL successfully.
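
The link handling described above could be sketched roughly as follows, again as a standalone Python text pass and not the actual antora-pdf implementation. The anchor-prefix scheme, the function names, and the mapping from page.adoc to <site URL>/page.html are assumptions for illustration; a real Antora site URL also depends on component, version, and module.

```python
import re

# xref:<page>.adoc[#fragment][text]
XREF = re.compile(
    r"xref:(?P<target>[^\[\]\s#]+\.adoc)(?:#(?P<fragment>[^\[\]\s]*))?\[(?P<text>[^\]]*)\]"
)


def page_prefix(target: str) -> str:
    """Hypothetical scheme: derive a document-wide ID prefix from the page path."""
    stem = target[: -len(".adoc")]
    return re.sub(r"[^A-Za-z0-9]+", "_", stem).strip("_")


def rewrite_xrefs(source: str, pages_in_pdf: set, site_url: str) -> str:
    """Turn page xrefs into internal references or published-site links."""

    def replace(match):
        target, fragment, text = match["target"], match["fragment"], match["text"]
        if target in pages_in_pdf:
            # Page is part of the PDF: point at a "globalized" internal anchor.
            anchor = page_prefix(target)
            if fragment:
                anchor = f"{anchor}_{fragment}"
            return f"<<{anchor},{text}>>" if text else f"<<{anchor}>>"
        # Page is outside the PDF: link to the published site instead.
        stem = target[: -len(".adoc")]
        url = f"{site_url.rstrip('/')}/{stem}.html"
        if fragment:
            url = f"{url}#{fragment}"
        return f"link:{url}[{text}]"

    return XREF.sub(replace, source)


if __name__ == "__main__":
    src = "See xref:install.adoc#prereqs[prerequisites] and xref:faq.adoc[the FAQ]."
    print(rewrite_xrefs(src, {"install.adoc"}, "https://docs.example.org/"))
```

For the internal references to resolve, the section IDs inside each included page would need the same prefix applied, which this sketch leaves out.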

It’s based on paged.js to do the pdf conversion.

I think it’s producing fairly OK results for many documents, but there are often severe problems with tables. paged.js’s table pagination looks completely broken to me, but I’m investigating.

Any feedback would be most welcome. Let me reiterate that although I’ve managed to produce some large manuals with this (e.g. the uyuni-docs manuals) it’s highly experimental and unlikely to produce beautiful results without tweaking.

thanks

David Jencks


afrey

Feb 14, 2020; 3:06pm

Re: Single PDF and multiple HTML from same sources

Hi David!

Thanks for your reply!

I'll have a look at antora-pdf to see if I can re-use parts of your extensions. I am not currently using Antora. I focus mainly on PDF generation, but also need some HTML on the side and multi-document is clearly better for HTML browsing.

There was some discussion on Gitter about epub and "subdoc::". Is this related?

BR
Alex

David Jencks

Feb 14, 2020; 4:15pm

Re: Single PDF and multiple HTML from same sources

Hi Alex,

If you have more than a very small number of documents to deal with I think the organization Antora provides is worth considering. If you want an “unwrapped” appearance it’s easy to make a “blank UI” that just puts the generated doc into minimal html.

You might be able to translate the ideas from my include processor back into Ruby and use that with the Ruby asciidoctor-pdf; this is apt to be more reliable than the JavaScript version for some time. I haven’t looked at my code in a few weeks, and what the include processor tracks might be incomplete.

The subdoc discussion isn’t likely to result in any usable code anytime soon. I’m not entirely sure how such a thing would work, it seems sort of like an “exclude processor” rather than an “include processor”.

Good luck, and ask if you have any questions!

David Jencks
