Best Practices for Filenames on the Web

By Cory LaViska on May 15, 2019

Some filenames are better than others. We enforce a certain pattern to ensure your website follows best practices for naming files on the web.

Back in the DOS days, filenames had some pretty strict rules. They couldn't be longer than eight characters plus a three-character extension. This was called the 8.3 filename, and it brings back a lot of memories for this ol' programmer.

In modern operating systems, filenames are much more flexible. These days, it's perfectly valid to have a file called Bob's Résumé.pages. Note the capital letters, apostrophe, space, and accented characters. This is obviously better than bobsres.pgs, but it's probably not something you want to do on your website.

Because of the way URLs are encoded, your pretty filenames will end up looking like this.

It would be much easier to read and type if your URL looked like this instead.
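As a rough illustration of the difference, here is a small Python sketch that percent-encodes a filename the way it would appear in a URL path. The helper name encode_filename is made up for this example, and the exact output varies a little by tool (browsers, for instance, usually leave the apostrophe alone while Python's quote encodes it).

```python
import unicodedata
from urllib.parse import quote


def encode_filename(name: str) -> str:
    """Percent-encode a filename for use as a URL path segment (illustrative only)."""
    # Normalize to NFC so "é" is a single code point before UTF-8 encoding.
    normalized = unicodedata.normalize("NFC", name)
    # safe="" means every reserved character is encoded, including "/".
    return quote(normalized, safe="")


if __name__ == "__main__":
    print(encode_filename("Bob's Résumé.pages"))  # Bob%27s%20R%C3%A9sum%C3%A9.pages
    print(encode_filename("bobs-resume.pages"))   # bobs-resume.pages
```

The web-safe name passes through untouched because it only contains characters that never need encoding.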

Web-safe filenames

To keep things consistent, we enforce certain rules to create web-safe filenames when working with files in Surreal CMS.

  • Filenames must be lowercase
  • Filenames must not contain spaces
  • Filenames must only contain a-z, 0-9, dashes, and dots

Yuck. This is starting to sound like those crazy password requirements you see on banking websites. Don't worry: we make this seamless for users by automatically:

  • Lowercasing the filename
  • Converting spaces to dashes
  • "Latinizing" accented characters (e.g. "é" becomes "e")
  • Removing symbols and similar characters

So if a user enters Bob's Résumé.pages, we'll convert it to bobs-resume.pages for them.
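The four steps above can be sketched in a few lines of Python. This is a minimal approximation written for this article, not the actual Surreal CMS code; the function name web_safe_filename is hypothetical, and the "latinizing" step here only handles characters that Unicode can decompose into a base letter plus an accent (letters such as "ß" or "ø" would simply be dropped).

```python
import re
import unicodedata


def web_safe_filename(name: str) -> str:
    """Approximate the lowercase / dash / latinize / strip-symbols rules."""
    # Split accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # "Latinize" by discarding the combining marks (the accents).
    latinized = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase the filename.
    lowered = latinized.lower()
    # Convert each run of whitespace to a single dash.
    dashed = re.sub(r"\s+", "-", lowered.strip())
    # Remove anything that isn't a-z, 0-9, a dot, or a dash.
    return re.sub(r"[^a-z0-9.-]", "", dashed)


if __name__ == "__main__":
    print(web_safe_filename("Bob's Résumé.pages"))  # bobs-resume.pages
```

A production version would likely also handle edge cases this sketch ignores, such as names that end up empty or collide after conversion.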

Why is this considered a best practice?

In reality, you can name a file just about anything you want and browsers will be able to handle it, albeit with a really ugly encoded URL. It's not just about aesthetics, though.

Some operating systems use case-sensitive file systems (e.g. most Linux servers) while others are case insensitive by default (e.g. Windows and macOS). That means, on some systems, FILE.EXT is the same as file.ext, and on others they are two different files. If you ever move your website to a server running a case-sensitive OS, links that used to work have the potential to break. There's also the possibility of duplicate content issues. For this reason, it's a lot safer to stick with lowercase filenames.
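If you want to check an existing site for this problem before a move, a short script can flag names that differ only by case. This is just a sketch with a made-up helper name, find_case_collisions, and it assumes you already have a list of filenames from one folder.

```python
from collections import defaultdict


def find_case_collisions(filenames):
    """Group filenames that would be the same file on a case-insensitive system."""
    groups = defaultdict(list)
    for name in filenames:
        # casefold() is a more thorough lowercase meant for comparisons.
        groups[name.casefold()].append(name)
    # Keep only the groups where more than one spelling exists.
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    print(find_case_collisions(["FILE.EXT", "file.ext", "about.html"]))
    # [['FILE.EXT', 'file.ext']]
```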

Spaces aren't pretty in URLs because they're encoded as %20, so we replace them with dashes. Dashes are simply less scary-looking in URLs, so users are more comfortable seeing them. We prefer dashes over underscores per Google's recommendation.

Accented characters get encoded in URLs too, so we convert them to their Latin equivalents. This makes URLs easier to read and type. Search engines do a really good job of understanding these conversions, so SEO points aren't deducted. (Your "crème brûlée" recipe will still appear in searches for "creme brulee.")

We don't enforce a length for filenames, but it's possible that search engines might frown upon excessively long filenames, especially if you're stuffing them with keywords. Short and concise is a good rule of thumb.

We follow these best practices and enforce them in Surreal CMS so our users don't have to worry about them. It's one of many behind-the-scenes things we do to ensure a great experience for both them and their visitors.