How to extract the raw content of a section

How to extract the raw content of a section

elmicka
Hi,

First of all, I'm new to Asciidoctor and Ruby, so maybe I'm missing something easy here.

I'm trying to write a script to extract the content from a document.
I came up with this script, but "puts level2_block.content" prints HTML-formatted content
instead of the AsciiDoc content.

Is it possible to get the AsciiDoc content?

Thanks

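require 'asciidoctor'

document = Asciidoctor.load_file("demo.adoc")

# The top-level blocks are the level-1 sections ("== Subtitle 1", ...)
document.blocks.each do |level1_block|
  puts level1_block.title
  # Their child blocks are the level-2 sections ("=== My section 1.1", ...)
  level1_block.blocks.each do |level2_block|
    page_name = level2_block.title.tr(" ", "-")
    puts "===============================================>"
    puts "\t #{page_name}"
    puts level2_block.content # prints converted HTML, not the AsciiDoc source
  end
end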


My test file.

= Title 1

== Subtitle 1

=== My section 1.1
My content goes here.

== Subtitle 2

=== My section 2.1
|===
|header1|header2
|cell1|cell2
|===

Re: How to extract the raw content of a section

David Jencks
Looking at the JavaScript translation, I think that .content calls .convert.

You might try .text or .lines (which will be an array of lines).

David Jencks

On Mar 7, 2020, at 3:02 PM, elmicka [via Asciidoctor :: Discussion] <[hidden email]> wrote:

Is it possible to get the asciidoc content ?


Re: How to extract the raw content of a section

elmicka
David,

Thank you for your answer.

I tried using 'level2_block.text' and 'level2_block.lines',
but it seems there is no such method on this object (undefined method error).


Re: How to extract the raw content of a section

mojavelinux
Administrator
The API does not make the raw source of a section available. The source is only stored at the block level.

What I've done in the past to extract the source of a section is to leverage the sourcemap. By enabling the sourcemap option, you get the file and line number for each section. Then, you take that information, go back to the original source of the document, and use it to cut out the source for the section.

Here's some really rough code to show you what I'm talking about:

require 'asciidoctor'

source = <<~'EOS'
= Document Title

== First Section

content

== Second Section

content
EOS

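# sourcemap: true records the file and line number of each section
doc = Asciidoctor.load source, sourcemap: true

# lineno is 1-based, so the first section runs from its own title line
# up to the line just before the title of the next section
first, second = doc.sections
section_source = doc.source_lines[(first.lineno - 1)..(second.lineno - 2)].join "\n"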
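
The same slicing idea can be extended to every section, including nested ones and the last one in the file. What follows is only a minimal sketch in plain Ruby: section_sources is a hypothetical helper, not part of the Asciidoctor API, and the line numbers and levels are written out by hand for the test file from the first post. In a real script they would presumably be collected from each section's lineno and level once the sourcemap is enabled.

```ruby
# Hypothetical helper, not part of the Asciidoctor API.
# source_lines: the document source as an array of lines
# headings:     [lineno, level] pairs in document order, lineno being 1-based
# Returns a hash that maps each heading's lineno to the raw source of its section.
def section_sources(source_lines, headings)
  headings.each_with_index.map do |(lineno, level), index|
    # A section ends right before the next heading at the same or a shallower
    # level, or at the end of the document if there is no such heading.
    following = headings.drop(index + 1).find { |_, other_level| other_level <= level }
    last_index = following ? following[0] - 2 : source_lines.length - 1
    [lineno, source_lines[(lineno - 1)..last_index].join("\n")]
  end.to_h
end

# The test file from the first post. The [lineno, level] pairs are filled in
# by hand here (an assumption about what the sourcemap would report).
source_lines = [
  '= Title 1', '',
  '== Subtitle 1', '',
  '=== My section 1.1', 'My content goes here.', '',
  '== Subtitle 2', '',
  '=== My section 2.1', '|===', '|header1|header2', '|cell1|cell2', '|==='
]
headings = [[3, 1], [5, 2], [8, 1], [10, 2]]

sources = section_sources(source_lines, headings)
puts sources[10] # raw AsciiDoc of "My section 2.1", table included
```

Comparing levels means a parent section such as "Subtitle 1" keeps its nested subsections in its source, while a leaf section stops at the next heading of any equal or shallower level.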

Best Regards,

-Dan

On Sat, Mar 7, 2020 at 5:27 PM elmicka [via Asciidoctor :: Discussion] <[hidden email]> wrote:
I tried to use 'level2_block.text' and 'level2_block.lines',
but it seems that there is no such method for this object (undefined method error)






--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux

Re: How to extract the raw content of a section

elmicka
Dan,

Thank you so much, I managed to make it work with your snippet.


Regards