API to references / list of external links


riddochc
Hi!

Before making changes and pull requests, I'm writing to see if I'm missing something important in how things are implemented and why.

I'd like an API for the links found in a document.  It looks like the way asciidoctor currently processes references gives convenient access to footnotes and internal references, but not to external links.

At a high level, it looks like links and references are identified with LinkInlineRx and LinkInlineMacroRx in lib/asciidoctor/substitutors.rb, then registered with the document instance, and then Inline objects are created to represent them in the result variable.  The document has +@references+ which stores information about internal references - first initialized as a hash at lib/asciidoctor/document.rb:193.  But the register method, at line 532, seems only concerned with :ids, :footnotes, and :indexterms.  Specifically, I don't see any code anywhere that assigns anything to the document's @references[:links] array.

I'd like this list of external links for use in a personal wiki - at the bottom of each page, I'd like to have a "This page is linked from: " section.

I realize that for links within the wiki I could already use internal links, because those *are* stored in @references[:ids]. But that would require putting an [#id] at the top of each page (which I'd rather not do), using an internal ref like <<wiki-page.adoc#ref>>, and then re-rendering all the pages each time I change one.  Part of the purpose of a wiki link is to be a little more flexible about things like capitalization, and to not require knowing complete filenames when editing a page, so I'm not sure internal links in their current form suit a wiki.

I'm open to suggestions, but I'm currently planning to contribute a change to populate @references[:links].

Re: API to references / list of external links

mojavelinux
Administrator
Capturing links is supported, but in a very particular way.

To capture references other than ids, footnotes, or indexterms, you must enable the catalog_assets option when invoking the Asciidoctor API:

----
require 'asciidoctor'

doc = Asciidoctor.convert_file ARGV[0], to_file: true, safe: :safe, catalog_assets: true
----

Then, you'll see that the links are captured.

----
require 'pp'

pp doc.references
----

However, since inline parsing (where the links are handled) is done during the convert phase, the catalog is populated only after convert is complete (actually, it's after each link is found and converted). You could use a custom inline macro to hook into the conversion step and insert the links at the bottom of the page. Something like:

----
page-links:[]
----

You could also use a Postprocessor if you want something less explicit. You can get creative :)

If you want to extract the links from a page, you have to use the above code (you can't simply load the document; you have to convert it). You can write to /dev/null to avoid writing the file, or you can do:

----
require 'asciidoctor'

doc = Asciidoctor.load_file ARGV[0], safe: :safe, catalog_assets: true
doc.convert
links = doc.references[:links]
----
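
Once each page's links have been collected this way, the "This page is linked from:" list is just an inverted index. The sketch below is plain Ruby and only illustrates that step; the `backlinks` helper, the page names, and the shape of `links_by_page` (page name mapped to its link targets, for example taken from `doc.references[:links]`) are assumptions for the example, not part of Asciidoctor.

```ruby
# Invert a page => targets mapping into a target => linking-pages mapping.
# links_by_page is a hypothetical Hash built by converting each wiki page
# and reading its link catalog.
def backlinks(links_by_page)
  index = Hash.new { |hash, key| hash[key] = [] }
  links_by_page.each do |page, targets|
    # uniq so a page that links to the same target twice is listed once;
    # self-links are skipped.
    targets.uniq.each { |target| index[target] << page unless target == page }
  end
  index.each_value(&:sort!)
  index
end

links_by_page = {
  'index.adoc' => ['setup.adoc', 'https://asciidoctor.org'],
  'setup.adoc' => ['index.adoc'],
  'notes.adoc' => ['setup.adoc', 'setup.adoc']
}

index = backlinks(links_by_page)
puts "This page is linked from: #{index['setup.adoc'].join(', ')}"
# prints "This page is linked from: index.adoc, notes.adoc"
```

Note that every page still has to be converted once to fill the mapping, and changing one page only requires re-reading that page's links before rebuilding the index.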

Of course, you are still welcome to improve this if you have ideas to make it better.

Cheers,

-Dan

(In reply to riddochc's post of Thu, Apr 9, 2015, 5:21 PM.)