Discussion:
[hakyll] Weird EOL behaviour on Linux
James Mansion
2017-04-12 21:47:53 UTC
I have been using hakyll on Windows and it's working well,
see http://www.mansionfamily.plus.com/

So, I check out onto Xubuntu and try working on the train on my portable.
I find that the site I build locally is quite different - in a bad way.

Checkout of the sources is via a Mercurial store, on a FreeBSD server,
using HTTP.
The sources are rst formatted.

On Windows, the generated output is a fairly sane HTML document, with
<CRLF> at the end of every line. That's what Notepad++ shows, and the
site renders OK locally; if I deploy it, it renders OK on all the
platforms I have.

On Linux, however, it's weird.

I have a source line that looks like this (in rst, with EOL markup per
Notepad++, changed to {} to differentiate from HTML):

Suppose also that I:{CRLF}

Then in Windows the generated HTML is:

<dt>Suppose also that I:</dt>{CRLF}

While on Linux I get:

<dt>Suppose also that I:{CR}</dt>{LF}


In fact the source file I view on Linux is still (via Scite explicit EOL):

Suppose also that I:{CRLF}

So it seems that I might need to adjust the CRLF handling in my Mercurial
setup for this.

But it also seems to me that this should ideally be handled by the system,
since in effect CRLF is the de facto standard EOL for text based systems
like HTML.
I would hope to process text that is in either LF or CRLF format (or even
CR format) and generate output that is CRLF for the web.
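
To be concrete about the behaviour I am hoping for, here is a rough sketch
in plain Haskell. The names normalizeEol and toCrlf are my own hypothetical
helpers, not anything that exists in Hakyll, and I have not wired them into
a site build.

```haskell
-- Hypothetical helper (not part of Hakyll): rewrite CRLF and lone CR as LF.
normalizeEol :: String -> String
normalizeEol ('\r' : '\n' : rest) = '\n' : normalizeEol rest
normalizeEol ('\r' : rest)        = '\n' : normalizeEol rest
normalizeEol (c : rest)           = c : normalizeEol rest
normalizeEol []                   = []

-- Hypothetical helper: accept any of the three EOL styles on input and
-- emit CRLF everywhere on output.
toCrlf :: String -> String
toCrlf = concatMap expand . normalizeEol
  where
    expand '\n' = "\r\n"
    expand c    = [c]
```

Normalizing to LF first means the rst parser would only ever see one kind of
line ending, and the choice of output EOL becomes a separate, final step.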

The results are definitely a bit weird.

Have I missed something in terms of a setting I can use to sort this out?
--
You received this message because you are subscribed to the Google Groups "hakyll" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hakyll+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Daniel Gnoutcheff
2017-04-13 17:37:27 UTC
Post by James Mansion
> So it seems that I might need to adjust the CRLF handling in my Mercurial
> setup for this.
>
> But it also seems to me that this should ideally be handled by the system,
> since in effect CRLF is the de facto standard EOL for text based systems
> like HTML.

While it's true that CRLF is the canonical line terminator for IETF
network protocols, this is widely ignored on the web. For example,
Wikipedia, Amazon, and Facebook all serve HTML with unix-style (LF) line
terminators.

Furthermore, the IETF mandate only applies to text in transit. Text
stored on disk is always "supposed" to have the line terminator style
that's standard for the OS. In theory, network applications will
convert line terminators before sending text over the wire.

Ref: https://www.rfc-editor.org/old/EOLstory.txt

Of course, as we've observed, this doesn't work well in practice, which
is why we're still worrying about line terminators in 2017. :P

I agree that, ideally, a text processing utility should handle all line
terminator styles. But as of now, the only line terminator that's
guaranteed to work is the one used by the currently running OS.
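
For what it's worth, this matches how GHC's text-mode file handles behave by
default: they use the platform's native newline convention, so CRLF is
translated to LF on input on Windows but passed through untouched on Linux.
I haven't checked how Hakyll itself opens source files, so treat the
following as a sketch of the general technique rather than a patch. The
function name readFileUniversal is made up for this example; the System.IO
calls are from base.

```haskell
import System.IO

-- Hypothetical helper: read a text file, translating CRLF to LF on input
-- regardless of which OS we are running on.
readFileUniversal :: FilePath -> IO String
readFileUniversal path = do
  h <- openFile path ReadMode
  -- universalNewlineMode accepts CRLF (and LF) on input on every platform.
  hSetNewlineMode h universalNewlineMode
  contents <- hGetContents h
  -- Force the whole string before closing, since hGetContents is lazy.
  length contents `seq` hClose h
  return contents
```

Note that this mode does not handle bare CR line endings; those would still
need an explicit replacement pass.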

So yes, I think getting Mercurial to do line terminator conversion for
us would be a good solution. As a network application, it arguably
*should* do that. I know git can do it[1].
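
For Mercurial, I believe the bundled eol extension is the equivalent. A
minimal sketch, assuming you want the rst sources stored with LF in the
repository and checked out with whatever is native on each machine (the
file pattern is a guess at your layout, so adjust it):

```ini
# In your hgrc (or Mercurial.ini on Windows): enable the bundled extension.
[extensions]
eol =

# In a file named .hgeol at the root of the repository:
[patterns]
**.rst = native

[repository]
native = LF
```

Files that already contain CRLF in the repository history would likely need
a one-time conversion commit before the checkout on Linux comes out clean.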

HTH!

[1] http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/