Asciidoctor AST output

Asciidoctor AST output

Jeremie Bresson
Is there an Asciidoctor (or AsciidoctorJ) backend that outputs the AST as structured text, such as XML or JSON?

It would be useful for understanding how things work.

Re: Asciidoctor AST output

mojavelinux
Administrator
Not currently. The closest thing we have is the DocBook XML output (which is pretty close to the AST). I've often talked about creating a JSON converter. That would be especially useful for indexing, search and analysis tools such as Elasticsearch.

If such a converter were created, I think it would live in its own repository, such as asciidoctor-json.

Cheers,

-Dan

--
Dan Allen | @mojavelinux | http://google.com/profiles/dan.j.allen

Re: Asciidoctor AST output

Robert.Panzer
To give interested people an idea of how the Asciidoctor _source_ maps to the AST, I've tried to document it here:
https://github.com/asciidoctor/asciidoctorj/blob/asciidoctorj-1.6.0/docs/integrator-guide.adoc#understanding-the-ast-classes

It would certainly be cool to write a converter that dumps the tree. This would help a lot when writing extensions.

Cheers
Robert

Re: Asciidoctor AST output

Jeremie Bresson
I have started a very naive implementation here:
https://github.com/jmini/asciidoctorj-experiments/blob/master/ast-json/src/main/java/ast/json/AstJsonConverter.java

It uses:
* org.asciidoctor:asciidoctorj:1.6.0-alpha.3
* org.json:json:20160212

I call every getter for each type of block. Only the getters that navigate backward in the graph are not called (getParent(), getDocument(), getTable() on columns, …). I have the feeling that there is some redundancy in the getters.
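
The approach described above (read each node's properties, recurse into its child blocks, and never follow the back-references) can be sketched without any Asciidoctor dependency. The `Node` class below is a hypothetical stand-in and not the AsciidoctorJ API, and the string escaping is a hand-rolled minimum rather than what org.json does:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AstDumpSketch {

    /** Hypothetical stand-in for an AST node; NOT an AsciidoctorJ class. */
    static class Node {
        final String context;                                   // e.g. "document", "paragraph"
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<Node> blocks = new ArrayList<>();
        Node parent;                                            // back-reference, never serialized

        Node(String context) {
            this.context = context;
        }

        Node property(String key, String value) {
            properties.put(key, value);
            return this;
        }

        Node append(Node child) {
            child.parent = this;
            blocks.add(child);
            return this;
        }
    }

    /** Serializes a node and its descendants; parent links are skipped to avoid cycles. */
    static String toJson(Node node) {
        StringBuilder sb = new StringBuilder("{\"context\":").append(quote(node.context));
        for (Map.Entry<String, String> e : node.properties.entrySet()) {
            sb.append(',').append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        sb.append(",\"blocks\":[");
        for (int i = 0; i < node.blocks.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(toJson(node.blocks.get(i)));
        }
        return sb.append("]}").toString();
    }

    /** Minimal JSON string escaping: double quotes, backslashes and control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        Node doc = new Node("document")
            .append(new Node("paragraph").property("source", "Lorem ipsum"));
        System.out.println(toJson(doc));
    }
}
```

Because `parent` is never written out, the recursion only ever moves downward and terminates on any tree, which is the reason for leaving the backward-navigating getters out.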

I wanted to test it with different examples. For the moment only my examples 01 and 04 work.

Example 01 (source, output):
I think the example is so simple that it works as expected.

Example 04 (source, output):
The difference from the previous example is that I have introduced some styles in my paragraph.
The output is in my opinion wrong: the String content line of the paragraph contains an "inline_quoted" block. This gets escaped by the JSON library I use.
I have the feeling that I am not able to navigate inside the paragraph blocks programmatically.

The other examples do not work yet: I get some errors and exceptions, and I have not yet had time to analyze what is going on.

Re: Asciidoctor AST output

Robert.Panzer
Hi Jeremie,

Yes, you're right.
Inline content is not parsed by Asciidoctor, so there are no nodes in the AST for anything inside a paragraph.
This is a current limitation that you will hit regardless of whether you use AsciidoctorJ or Asciidoctor.

I really appreciate that you are doing this. If you hit any glitches, please don't hesitate to file issues :-)

Cheers
Robert

Re: Asciidoctor AST output

Jeremie Bresson
@Robert:

Thanks a lot for your answer. I am not sure I follow what you mean.

When I try to convert a document containing this single line:

Lorem ipsum dolor sit amet, *consectetur* adipiscing elit

The "convert(ContentNode node, String transform, Map<Object, Object> o)" method is called 3 times:

* org.asciidoctor.ast.impl.DocumentImpl@3bc891f2

* org.asciidoctor.ast.impl.BlockImpl@362be0cd
     => ((Block) node).getSource(); //returns: Lorem ipsum dolor sit amet, *consectetur* adipiscing elit.
   
* org.asciidoctor.ast.impl.PhraseNodeImpl@57a982f9
     => ((PhraseNode) node).getText(); //returns: consectetur

So at some point there is a node representing the bold inline text. Is this outside of the AST?


Re: Asciidoctor AST output

mojavelinux
Administrator
In reply to this post by Jeremie Bresson
JSON output seems particularly well suited as input for a search index. I'm sure it will help other tooling as well.

I would like to see the JSON keys start with a lowercase letter, personally. I think that causes less confusion. Even better if they are converted to lowercase and snake case (but I'd be okay with camelcase too).
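
One way to get such keys, assuming they are derived from Java getter names, is to strip the `get`/`is` prefix and then re-case the remainder. This is only a sketch: the class and method names are hypothetical, and `getContentModel` is used purely as an illustrative multi-word getter name:

```java
class JsonKeyNames {

    /** Drops a leading "get" or "is" when it is followed by an uppercase letter. */
    static String stripPrefix(String name) {
        if (name.length() > 3 && name.startsWith("get") && Character.isUpperCase(name.charAt(3))) {
            return name.substring(3);
        }
        if (name.length() > 2 && name.startsWith("is") && Character.isUpperCase(name.charAt(2))) {
            return name.substring(2);
        }
        return name;
    }

    /** "getContentModel" becomes "content_model". */
    static String toSnakeCase(String getterName) {
        String name = stripPrefix(getterName);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // Start a new word at an uppercase letter that follows a non-uppercase one.
            if (i > 0 && Character.isUpperCase(c) && !Character.isUpperCase(name.charAt(i - 1))) {
                sb.append('_');
            }
            sb.append(Character.toLowerCase(c));
        }
        return sb.toString();
    }

    /** "getContentModel" becomes "contentModel". */
    static String toCamelCase(String getterName) {
        String name = stripPrefix(getterName);
        return name.isEmpty() ? name : Character.toLowerCase(name.charAt(0)) + name.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("getContentModel"));
        System.out.println(toCamelCase("getContentModel"));
    }
}
```

Either form starts with a lowercase letter; the snake case variant additionally avoids mixed case inside the key.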

-Dan

--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux