Problem with NFD normalized Unicode in PDF

Miki Tebeka
Hi All,

I'm writing a book on programming and want to show NFC & NFD normalized Unicode. The NFD version does not render correctly in the PDF.
Here's a short example:

I once went to Kraków, it's a nice city. (NFC)

I once went to Kraków, it's a nice city. (NFD)
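For reference, the two lines above look identical but differ in their code points. A minimal Ruby sketch of the difference, assuming the word "Kraków" from the example; the variable names are illustrative, and the strings use explicit escapes so that an editor or mail client cannot silently re-normalize them:

```ruby
# Build both forms with explicit escapes so copy and paste cannot change them.
nfc = "Krak\u00F3w"   # U+00F3 LATIN SMALL LETTER O WITH ACUTE (precomposed)
nfd = "Krako\u0301w"  # "o" followed by U+0301 COMBINING ACUTE ACCENT

puts nfc.length                          # 6 code points
puts nfd.length                          # 7 code points
puts nfc == nfd                          # false: same appearance, different code points
puts nfc.unicode_normalize(:nfd) == nfd  # true
puts nfd.unicode_normalize(:nfc) == nfc  # true
```

A renderer that maps each code point to a glyph on its own has to find a glyph for U+0301 in the NFD case, which is where the two forms can diverge in the PDF.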

You can view "asciidoctor-pdf city.adoc" result at https://www.dropbox.com/s/9luy50tveup7zxh/adoc-nfd.png?dl=0 (for some reason I can't insert an image).

I've tried to change the code font using a custom theme but it didn't help. Any idea on how to solve this?

Thanks,
Miki
Miki Tebeka | miki@353solutions.com | @tebeka
Re: Problem with NFD normalized Unicode in PDF

mojavelinux
Administrator
Miki,

I was not aware that these two forms even existed. I did a little digging and it turns out this is something that the PDF generator, Prawn, needs to handle. Asciidoctor PDF isn't the one responsible for interpreting the text. It passes the text through to the PDF generator, which encodes it into PDF objects. So something's going wrong at that level. You'll need to report it here: https://github.com/prawnpdf/prawn

We could add a test for this to track the situation.

Since the visible result is the same, the workaround I propose is to use an ifdef conditional to output the NFC form in both cases when converting to PDF. At least then you see the right glyph instead of the notdef glyph.

Best Regards,

-Dan

--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux
Re: Problem with NFD normalized Unicode in PDF

mojavelinux
Administrator
In reply to this post by Miki Tebeka
Btw, the font isn't going to matter. The problem is not the font. It's the translation of the character sequence to the font glyph. The code that handles that is only considering the NFC form.

-Dan

--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux
Re: Problem with NFD normalized Unicode in PDF

mojavelinux
Administrator
It looks like we can simply call the unicode_normalize method on the text to automatically translate to NFC.

text.unicode_normalize

I see a place where we could do this in Asciidoctor PDF. If you open an issue, I will consider doing that. https://github.com/asciidoctor/asciidoctor-pdf/issues
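As a rough sketch of what that normalization step might look like (the `to_nfc` helper is hypothetical and is not part of Asciidoctor PDF or Prawn; only `String#unicode_normalize` and `String#unicode_normalized?` are real Ruby methods):

```ruby
# Hypothetical helper, not taken from Asciidoctor PDF or Prawn.
def to_nfc(text)
  # unicode_normalize defaults to :nfc; skip the work if the text is already NFC.
  text.unicode_normalized? ? text : text.unicode_normalize
end

nfd = "Krako\u0301w"  # "o" followed by U+0301 COMBINING ACUTE ACCENT
nfc = to_nfc(nfd)

# Print the code points to confirm that "o" + U+0301 was composed into U+00F3.
puts nfc.codepoints.map { |cp| format('U+%04X', cp) }.join(' ')
# prints: U+004B U+0072 U+0061 U+006B U+00F3 U+0077
```

Note that normalizing this way means the PDF no longer contains the NFD sequence, so it helps with rendering but not with preserving NFD text in the output.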

-Dan

--
Dan Allen | @mojavelinux | https://twitter.com/mojavelinux
Re: Problem with NFD normalized Unicode in PDF

Miki Tebeka
In reply to this post by mojavelinux
Thanks. I've submitted https://github.com/prawnpdf/prawn/issues/1154
Miki Tebeka | miki@353solutions.com | @tebeka
Re: Problem with NFD normalized Unicode in PDF

Miki Tebeka
In reply to this post by mojavelinux
That could be a nice workaround. My original intent was for the text in the PDF to be in NFD as well, so I'll print a notice for now.
Miki Tebeka | miki@353solutions.com | @tebeka
Re: Problem with NFD normalized Unicode in PDF

Miki Tebeka
FWIW I've found out that the IBM Plex Mono font does the job.

(You can view the book at https://gum.co/Qkmou)
Miki Tebeka | miki@353solutions.com | @tebeka