Hyperlinks with multiple underscores

neontapir
I'm an Asciidoc newbie, so I apologize if my question seems basic.

I'm having trouble creating a link to this URL: http://www.migrations.fr/la_guerre__de__sept__ans.htm

When I try, I end up with
<a href="http://www.migrations.fr/la_guerre">http://www.migrations.fr/la_guerre</a><em>de</em>sept__ans.htm

How can I escape the double underscores while still generating a link?

Re: Hyperlinks with multiple underscores

LightGuardjp
Administrator
How are you creating the link, just with the bare URL? Well, I guess either way this is a bug. 

Re: Hyperlinks with multiple underscores

neontapir
I can reproduce it with a document containing:

Link: http://www.migrations.fr/la_guerre__de__sept__ans.htm

I'll open an issue on GitHub. Thanks for the quick reply!

Re: Hyperlinks with multiple underscores

mojavelinux
Administrator
Chuck,

This is one of the many reasons I'm researching PEG-based grammars for parsing inline markup. The problem we're facing here is that there's a limit to how much context a regular expression can see...and it messes up when the markup looks perfectly valid from a pattern-matching perspective. Most of the lightweight markup languages out there have this problem...until they make the move to a grammar-based parser.

I'm really looking forward to this improvement in Asciidoctor because it's going to make inline-formatting much more predictable...and we can get access to it in the AST.
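To make the context limit concrete, here is a minimal sketch in Python. It is a toy stand-in for a regex-driven inline formatter, not Asciidoctor's actual pattern; the names `UNCONSTRAINED_EMPHASIS` and `naive_emphasis` are made up for illustration. The pattern only sees a pair of double underscores, so it has no way to know it is sitting in the middle of a URL.

```python
import re

# Hypothetical pattern: treat anything between double underscores as emphasis.
# This is an illustration only, not the expression Asciidoctor uses.
UNCONSTRAINED_EMPHASIS = re.compile(r"__(.+?)__")


def naive_emphasis(text: str) -> str:
    """Wrap every __span__ in <em>, with no knowledge of the surrounding context."""
    return UNCONSTRAINED_EMPHASIS.sub(r"<em>\1</em>", text)


url = "http://www.migrations.fr/la_guerre__de__sept__ans.htm"
mangled = naive_emphasis(url)
print(mangled)
```

From the pattern's point of view `__de__` is perfectly valid emphasis markup, which is why the middle of the URL gets wrapped in `<em>`.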

Fortunately, AsciiDoc provides many different ways to control substitution to work around issues like this one. I'll present the solutions in the order that I recommend using them (as not all solutions are good practice).

Solution A ::

The simplest and easiest way to get a link to behave itself is to stick it into an attribute.

```asciidoc
:link-with-underscores: http://www.migrations.fr/la_guerre__de__sept__ans.htm

This URL has repeating underscores {link-with-underscores} but AsciiDoc won't process them.
```

This works because quotes are substituted before attributes, so the URL remains "hidden" while the text in the line is being formatted (strong, emphasis, monospace, etc).
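The ordering argument can be sketched with a toy pipeline in Python. This is a simplified model under stated assumptions, not the processor's real code: the function names (`apply_quotes`, `apply_attributes`, `apply_macros`, `convert`) and the three regular expressions are invented for illustration, and only the order quotes, then attributes, then macros is taken from the explanation above.

```python
import re

URL = "http://www.migrations.fr/la_guerre__de__sept__ans.htm"


def apply_quotes(text):
    # Toy "quotes" step: __text__ becomes emphasis.
    return re.sub(r"__(.+?)__", r"<em>\1</em>", text)


def apply_attributes(text, attributes):
    # Toy "attributes" step: replace {name} with its value, leave unknown names alone.
    return re.sub(r"\{([\w-]+)\}",
                  lambda m: attributes.get(m.group(1), m.group(0)),
                  text)


def apply_macros(text):
    # Toy "macros" step: turn bare URLs into links.
    return re.sub(r"https?://[^\s<>]+",
                  lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)),
                  text)


def convert(text, attributes):
    # Order matters: quotes run while the URL is still hidden behind {name}.
    text = apply_quotes(text)
    text = apply_attributes(text, attributes)
    return apply_macros(text)


bare = convert("See " + URL + " for details.", {})
hidden = convert("See {link-with-underscores} for details.",
                 {"link-with-underscores": URL})
print(bare)
print(hidden)
```

In this model the bare URL is cut short at the first `<em>`, much like the output reported in the original question, while the attribute reference comes through as one intact link.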

Solution B ::

Another way to solve formatting glitches is to explicitly specify the formatting you want to have applied to a span of text using the inline pass macro. If you want to display a URL, and have it be completely preserved, you can put it inside a pass macro and enable only macros (which is what substitutes links).

```asciidoc

This URL has repeating underscores pass:macros[http://www.migrations.fr/la_guerre__de__sept__ans.htm] but AsciiDoc won't process them.

```

This works because the pass macro removes the content from the line of text while substitutions are performed, applies the explicit substitutions to that text while it's on the sidelines, then restores it to the original location.
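The extract, substitute, restore cycle can be sketched in Python as follows. Treat it as a rough model only: the placeholder format (a NUL-delimited index), the helper names, and the decision to support only the `macros` substitution are assumptions made for this example, not how the processor is actually implemented.

```python
import re

# Hypothetical matcher for the inline pass macro: pass:<subs>[<content>]
PASS_MACRO = re.compile(r"pass:(\w*)\[(.*?)\]")


def emphasize(text):
    # Toy "quotes" substitution.
    return re.sub(r"__(.+?)__", r"<em>\1</em>", text)


def autolink(text):
    # Toy "macros" substitution: link bare URLs.
    return re.sub(r"https?://[^\s<>\[\]]+",
                  lambda m: '<a href="{0}">{0}</a>'.format(m.group(0)),
                  text)


def convert_with_passthroughs(text):
    stash = []

    def extract(match):
        subs, content = match.group(1), match.group(2)
        if subs == "macros":
            # Apply only the explicitly requested substitution on the sidelines.
            content = autolink(content)
        stash.append(content)
        # Leave a placeholder that the normal substitutions will not touch.
        return "\x00{0}\x00".format(len(stash) - 1)

    text = PASS_MACRO.sub(extract, text)       # 1. pull passthroughs out
    text = autolink(emphasize(text))           # 2. format the remaining text
    return re.sub(r"\x00(\d+)\x00",            # 3. put the stashed text back
                  lambda m: stash[int(m.group(1))],
                  text)


source = ("Link: pass:macros[http://www.migrations.fr/la_guerre__de__sept__ans.htm]"
          " is __really__ useful")
result = convert_with_passthroughs(source)
print(result)
```

The text outside the macro is still formatted as usual (`__really__` becomes emphasis), while the URL inside it only ever sees the link substitution.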

Solutions C and D ::

The final two solutions I'll mention are related, but I don't recommend using them. It's possible to escape individual characters or a range of characters inside the URL.

You can isolate the part of the URL causing problems using the double dollar escape:

```asciidoc

This URL has repeating underscores http://www.migrations.fr/$$la_guerre__de__sept__ans$$.htm but AsciiDoc won't process them.

```

Like the pass macro, it pulls the text out during substitution, but it doesn't offer a way to apply substitutions to that text. You tend to use the double dollar when you want to prevent the processor from detecting a URL, like:

```asciidoc

This URL won't be recognized by the processor $$http://www.migrations.fr/la_guerre__de__sept__ans.htm$$

```

It's also possible to escape the underscores:


```asciidoc

This URL won't be recognized by the processor http://www.migrations.fr/la\_guerre__de__sept__ans.htm

```

However, escaping is not consistent between AsciiDoc and Asciidoctor (mostly because Ruby 1.8.7, which we still support, doesn't have look-behind capabilities in its regex engine).
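To show why look-behind matters for escaping, here is a small sketch using Python's `re` module purely for illustration. The pattern and the `emphasize` helper are hypothetical, not what either processor uses. A negative look-behind lets the pattern say "an opening `__` only counts when no backslash comes right before it" without consuming the preceding character.

```python
import re

# Hypothetical pattern: the opening __ is ignored when a backslash precedes it.
EMPHASIS_UNLESS_ESCAPED = re.compile(r"(?<!\\)__(.+?)__")


def emphasize(text):
    # Format unescaped spans, then drop the backslash from escaped markers.
    text = EMPHASIS_UNLESS_ESCAPED.sub(r"<em>\1</em>", text)
    return text.replace("\\__", "__")


formatted = emphasize("a __b__ c")
escaped = emphasize(r"a \__b__ c")
print(formatted)
print(escaped)
```

Without look-behind, an engine has to match the character before the marker as part of the pattern and then put it back, which is one way escaping rules can drift apart between implementations.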

I think you'll be happiest with Solution A. It's best practice to pull all your links into attributes anyway, and by doing so you get the bonus that they aren't mangled.

Hope that helps!

Cheers,

-Dan

