Automatic Code Cleanup

By Cory LaViska on April 24, 2019

Surreal CMS works hard to keep your code as tidy as possible. The editor has always produced semantic markup, but now we've taken things a step further to make sure your code is easy to read even when you view the source.

One thing that always bothered me after viewing the source of a published page was how content regions would lose indentation. For example, this is what you'd normally see after publishing a page.

<section>
  <!-- Content region -->
  <div id="content" class="cms-editable"><h2>Lorem Ipsum Dolor</h2>
<p>Vivamus nec odio ac ligula congue feugiat at vitae leo. Aenean sem justo, finibus sed dui eu, accumsan facilisis dolor. Fusce quis dui eget odio iaculis aliquam vel sed velit. Nulla pellentesque posuere semper. Nulla eu sagittis lorem, a auctor nulla. Sed ac condimentum orci, ac varius ante. Nunc blandit quam sit amet sollicitudin sodales.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras at dignissim augue, in iaculis neque. Etiam bibendum felis ac vulputate pellentesque. Cras non blandit quam. Nunc porta, est non posuere sagittis, neque nunc pellentesque diam, a iaculis lacus urna vitae purus. In non dui vel est tempor faucibus. Aliquam erat volutpat.</p></div>
</section>

Notice how the indentation is wrong inside the content region? Now imagine a page with multiple content regions...yikes. Things can get messy quickly.

This happens because of the way code from the editor gets merged with existing code when the page is published. Ideally, the resulting code would look like this.

<section>
  <!-- Content region -->
  <div id="content" class="cms-editable">
    <h2>Lorem Ipsum Dolor</h2>
    <p>Vivamus nec odio ac ligula congue feugiat at vitae leo. Aenean sem justo, finibus sed dui eu, accumsan facilisis dolor. Fusce quis dui eget odio iaculis aliquam vel sed velit. Nulla pellentesque posuere semper. Nulla eu sagittis lorem, a auctor nulla. Sed ac condimentum orci, ac varius ante. Nunc blandit quam sit amet sollicitudin sodales.</p>
    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras at dignissim augue, in iaculis neque. Etiam bibendum felis ac vulputate pellentesque. Cras non blandit quam. Nunc porta, est non posuere sagittis, neque nunc pellentesque diam, a iaculis lacus urna vitae purus. In non dui vel est tempor faucibus. Aliquam erat volutpat.</p>
  </div>
</section>

That's a lot better, but how can I fix this programmatically?


Tabs vs Spaces

The trouble with indenting code is that everybody does it differently. Some people use tabs, others use spaces. Those who use spaces have to decide how many spaces to use. In some cases, they even prefer different levels of indentation depending on the file type (e.g. HTML, CSS, or JavaScript).

Personally, I've settled on two spaces for indentation in my own projects. I find it easy to read and I see consistent indentation when I view the source in diff tools and GitHub. (I prefer two spaces, but I'll gladly use a different convention if the project I'm working on calls for it.)

This is a heated debate among developers, so forcing my own preference onto everyone else's code clearly wasn't an option. I thought about adding a setting, but that would be tricky because of the variety of ways users might want to configure indentation.

Then it hit me. Why not just detect the type of indentation automatically? 🤔


A Stroke of Genius

Automatically detecting indentation turned out to be pretty easy, thanks to an open source package called detect-indent. It's widely used and performed flawlessly in all my tests. Plus, it allowed me to detect indentation on a per-file basis. Whether you use tabs, two spaces, four spaces, eight spaces — it doesn't matter — your indentation preference will be detected.
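To give a feel for how this kind of detection can work, here's a minimal sketch in TypeScript. It is not the detect-indent algorithm and not the code Surreal CMS runs; it's a simplified heuristic, and `guessIndent`, `IndentGuess`, and `Candidate` are hypothetical names made up for illustration. The idea: look at how far the leading whitespace moves from one line to the next, and assume the most common step is the file's indentation unit.

```typescript
// Simplified indentation detector. Illustrative only: this is NOT the
// detect-indent algorithm, and every name here is hypothetical.

type IndentType = "tab" | "space";

interface IndentGuess {
  type: IndentType | null; // null when no indentation was found
  amount: number;          // characters per indentation level
  indent: string;          // the literal unit, e.g. "  " or "\t"
}

interface Candidate {
  type: IndentType;
  amount: number;
  count: number;
}

function guessIndent(source: string): IndentGuess {
  const candidates = new Map<string, Candidate>();
  let prevType: IndentType | null = null;
  let prevWidth = 0;

  for (const line of source.split(/\r?\n/)) {
    if (line.trim() === "") continue; // blank lines carry no information

    // Leading run of tabs OR spaces (a mixed run is cut at the first switch).
    const match = /^(?:\t+| +)/.exec(line);
    const leading = match ? match[0] : "";
    const type: IndentType | null =
      leading === "" ? null : leading.charAt(0) === "\t" ? "tab" : "space";
    const width = leading.length;

    // Tally how far the indentation moved relative to the previous line.
    const mixed = type !== null && prevType !== null && type !== prevType;
    const delta = Math.abs(width - prevWidth);
    const kind = type ?? prevType;
    if (!mixed && delta > 0 && kind !== null) {
      const key = kind + ":" + delta;
      const seen = candidates.get(key);
      if (seen) seen.count += 1;
      else candidates.set(key, { type: kind, amount: delta, count: 1 });
    }

    prevType = type;
    prevWidth = width;
  }

  // The most frequent step wins; on a tie, the first one seen is kept.
  let best: Candidate | null = null;
  for (const candidate of candidates.values()) {
    if (best === null || candidate.count > best.count) best = candidate;
  }
  if (best === null) return { type: null, amount: 0, indent: "" };

  const unit = best.type === "tab" ? "\t" : " ";
  return { type: best.type, amount: best.amount, indent: unit.repeat(best.amount) };
}

// Usage: a tab-indented list is reported as one tab per level.
const guess = guessIndent("<ul>\n\t<li>One</li>\n\t<li>Two</li>\n</ul>");
console.log(guess.type, guess.amount); // tab 1
```

A heuristic this simple can be fooled, for example by a file that jumps two levels at once more often than it steps one, which is a good reason to lean on a well-tested package rather than rolling your own.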

The next step was to apply this to a beautifier, then run the full source of each page through it when a page is published. With a bit of tweaking, I'm happy to report that all code published through Surreal CMS will be properly indented!
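For the curious, here's a rough idea of what "applying an indentation unit" means. This is a deliberately naive TypeScript sketch, not the beautifier Surreal CMS uses, and `reindentHtml` and `VOID_TAGS` are hypothetical names. It assumes simple, well-formed markup: it puts adjacent tags on their own lines, then indents each line by its nesting depth using whatever unit was detected. It will mangle whitespace-sensitive content such as inline elements or `<pre>` blocks, and it doesn't understand attributes that contain `>`.

```typescript
// Naive HTML re-indenter. Illustrative only: a real beautifier handles
// inline elements, <pre>, scripts, and many other cases this ignores.

// Elements that never have a closing tag, so they must not add depth.
const VOID_TAGS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

function reindentHtml(html: string, indentUnit: string): string {
  // Put adjacent tags on separate lines: "><" becomes ">\n<".
  const lines = html.replace(/>\s*</g, ">\n<").split(/\r?\n/);

  const out: string[] = [];
  let depth = 0;

  for (const raw of lines) {
    const line = raw.trim();
    if (line === "") continue;

    // Count the opening and closing tags on this line.
    // Comments start with "<!" and are not matched by this pattern.
    let opens = 0;
    let closes = 0;
    const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)[^>]*?(\/?)>/g;
    let tag: RegExpExecArray | null;
    while ((tag = tagPattern.exec(line)) !== null) {
      const isClosing = tag[1] === "/";
      const name = tag[2].toLowerCase();
      const selfClosing = tag[3] === "/";
      if (isClosing) closes++;
      else if (!selfClosing && !VOID_TAGS.has(name)) opens++;
    }

    // A line that begins with a closing tag belongs one level up.
    const printDepth = line.startsWith("</") ? Math.max(depth - 1, 0) : depth;
    out.push(indentUnit.repeat(printDepth) + line);
    depth = Math.max(depth + opens - closes, 0);
  }

  return out.join("\n");
}

// Usage: a shortened version of the messy content region from earlier.
const sample = [
  "<section>",
  "  <!-- Content region -->",
  '  <div id="content" class="cms-editable"><h2>Lorem Ipsum Dolor</h2>',
  "<p>Vivamus nec odio.</p>",
  "<p>Lorem ipsum dolor.</p></div>",
  "</section>",
].join("\n");

console.log(reindentHtml(sample, "  "));
```

With a two-space unit, the sample comes out nested the way the "ideal" example shows: the heading and paragraphs sit one level inside the content region, and the closing `</div>` lines up with its opening tag. Pass `"\t"` instead and the same structure is produced with tabs.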

You might be thinking, "Why does this matter?"

Well, if you don't spend much time editing code, it may not matter that much to you. For those who do, it makes code a lot easier to read. When code is indented properly, you can tell where HTML tags start and end, so you'll be less prone to making errors. It also gives you a better understanding of the page's structure at a glance.

And, if you're anything like me, you'll agree that having beautiful code just feels good — even if most of the world won't actually see it!