JavaScript Crunchinator

See the demo page for the source code and an example of how to use this tool.

This utility can be used to reduce the size of JavaScript source code in a file. It uses some simple parsing and regular expressions to remove comments and unnecessary white space in the script code. Depending on your style of coding, it can produce significant savings in terms of file size.

Using the Script

The demo shows an easy way to set up and use the tool. A form is provided so you can simply copy and paste the input and output code.

  1. First, always save a backup of the original code. With no comments or formatting, the condensed version is very difficult to read and edit.
  2. Open the source file in a text editor, highlight the code you wish to process and copy it.
  3. Paste the code into the area marked 'Input' on the form.
  4. Press the 'Crunch' button. The script will start processing the code. Various status messages will display as it works through each step. When the status displays 'Done.' the condensed code will appear in the 'Output' area of the form.
  5. Use your cursor or press the 'Select' button to highlight the output code. You can then copy and paste it back into your page or .js file, replacing the original code.

The original and compressed code sizes will be displayed. Some scripts will compress better than others, depending on how many comments and how much extraneous spacing the original contains.

Caveats

One common problem with the condensed code the script produces is caused by not using semicolons (';') at the end of a line. While it's perfectly legitimate to leave them off when statements are separated by a newline:

x = 3.141592
y = 2.718281

the script would produce:

x=3.141592 y=2.718281

which would generate an error message like "expected ';'" or "missing ; before statement" when executed.
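
One way to catch this problem before deploying crunched code is to ask the JavaScript engine to parse it without running it. The following is a minimal sketch, not part of the Crunchinator: parsesCleanly is a hypothetical helper, and it assumes the environment allows the Function constructor (a strict Content Security Policy may block it).

```javascript
// Hypothetical helper, not part of the Crunchinator: returns true when the
// source text parses as a function body. The code is compiled, never executed.
function parsesCleanly(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else (e.g. a blocked Function constructor) is not a parse result
  }
}

var original = "x = 3.141592\ny = 2.718281";
var crunched = "x=3.141592 y=2.718281";

console.log(parsesCleanly(original)); // true
console.log(parsesCleanly(crunched)); // false
```

Because the text is parsed as a function body, a stray top-level return statement would be accepted here even though it is not valid in a script.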

The process can be slow. For long scripts, it's often better to break the input script up, running a hundred lines or so at a time and concatenating the output. This technique may also help when trying to identify errors, such as those caused by a missing semicolon as noted above.

Coding Details

The script performs an ordered series of steps to condense the input code. In each step, the input is processed and passed on to the next step.

Literal Strings

The first step is to identify any literal strings, i.e., string constants delimited by quotation marks. Being constant values, they should not be altered. They can also make the rest of the process difficult since they may contain character sequences that look like JavaScript source. For example:

var s = "The total is calculated using c = a + 3 * b.";
var t = "In C, comments are indicated by '/*' and '*/'.";

The function replaceLiteralStrings() does this by scanning each source line one character at a time. When it finds a quotation mark (double or single) it then starts looking for the matching, ending quote.

It also checks for escape sequences (a quotation mark preceded by a backslash character) so that it properly handles strings like the following:

var a = "here is a string with \"embedded\" quotes.";
var b = "here is a string with 'embedded' quotes.";
var c = 'here is a string with "embedded" quotes.';
var d = 'here is a string with \'embedded\' quotes.';

Once the ending quote has been found, the entire string, including quotes, is saved in a global array called literalStrings and replaced in the source with a character sequence in the form __n__ (an integer surrounded by a pair of double underline characters). The value n corresponds to the array index where the original string was saved.
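
The scan described above might look roughly like the following sketch. It is not the tool's actual replaceLiteralStrings() code: maskLiteralStrings is a hypothetical name, it works on the whole input rather than line by line, and it returns the array instead of using a global.

```javascript
// Sketch only: masks quoted strings with __n__ placeholders.
// maskLiteralStrings is a hypothetical name; the tool's own function is
// replaceLiteralStrings() and may differ in detail.
function maskLiteralStrings(source) {
  var literalStrings = [];
  var out = "";
  var i = 0;
  while (i < source.length) {
    var ch = source.charAt(i);
    if (ch === '"' || ch === "'") {
      var j = i + 1;
      // Scan for the matching quote, skipping any backslash-escaped character.
      while (j < source.length && source.charAt(j) !== ch) {
        if (source.charAt(j) === "\\") j++;
        j++;
      }
      // Save the whole string, quotes included, and leave a placeholder.
      literalStrings.push(source.substring(i, j + 1));
      out += "__" + (literalStrings.length - 1) + "__";
      i = j + 1;
    } else {
      out += ch;
      i++;
    }
  }
  return { text: out, literalStrings: literalStrings };
}

var masked = maskLiteralStrings('var a = "say \\"hi\\""; var c = \'it is "ok"\';');
console.log(masked.text);              // var a = __0__; var c = __1__;
console.log(masked.literalStrings[0]); // "say \"hi\""
console.log(masked.literalStrings[1]); // 'it is "ok"'
```

This sketch treats every quotation mark as the start of a string, so an apostrophe inside a comment or a quote inside a regular expression literal would confuse it.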

Comments

Next, comments are removed. The input is broken up into individual lines (i.e., lines ending with a newline character) and a regular expression is used to remove end-of-line comments ("//...").

Then all newline characters are converted to single spaces. This allows multi-line comments ("/*...*/") to be removed. Again, regular expressions are used.
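
The two passes can be sketched as follows. This is an illustration under assumptions, not the tool's code: stripComments is a hypothetical name, and the regular expressions are a plausible guess at the kind the script uses. It assumes literal strings have already been replaced by placeholders.

```javascript
// Sketch only; stripComments is a hypothetical name. It assumes string
// literals have already been masked, so "//" or "/*" can only be comment
// markers here (regular expression literals aside).
function stripComments(source) {
  var lines = source.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    // Remove an end-of-line comment: "//" and everything after it.
    lines[i] = lines[i].replace(/\/\/.*$/, "");
  }
  // Join with spaces so a multi-line comment sits on a single line.
  var joined = lines.join(" ");
  // Non-greedy match, so each comment ends at its own "*/".
  return joined.replace(/\/\*.*?\*\//g, "");
}

var sample = "var a = 1; // first\n/* spans\n   two lines */\nvar b = 2;";
console.log(stripComments(sample));
```

The result still contains runs of spaces where the comments were; those are left for the white space step. Edge cases such as a "//" inside a "/*...*/" comment are not handled by this sketch.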

White Space

Extraneous white space is removed next, in the compressWhiteSpace() function. It first converts any sequence of multiple white space characters (tabs, spaces, etc.) to a single space.

Then, it looks for special characters such as operators, parentheses, curly braces, etc. and removes any leading or trailing white space. For example, in the code:

function myAverage(x, y, z) { return ((x + y + z) / 3); }

You can remove every space except the one between "function" and "myAverage" while keeping correct syntax:

function myAverage(x,y,z){return((x+y+z)/3);}

These operations all make use of regular expressions.
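
As a rough illustration of those regular expressions, here is a sketch that reproduces the example above. squeezeWhiteSpace is a hypothetical name, and the list of special characters is an assumption rather than the list compressWhiteSpace() actually uses.

```javascript
// Sketch only; squeezeWhiteSpace is a hypothetical name and the
// character list is an assumption, not the tool's actual list.
function squeezeWhiteSpace(source) {
  // Collapse any run of white space into a single space.
  var s = source.replace(/\s+/g, " ");
  // Drop the space on either side of punctuation and operators.
  s = s.replace(/ ?([(){}\[\];,=+\-*\/<>!&|?:]) ?/g, "$1");
  // Trim a leading or trailing space.
  return s.replace(/^ | $/g, "");
}

console.log(squeezeWhiteSpace("function myAverage(x, y, z) { return ((x + y + z) / 3); }"));
// function myAverage(x,y,z){return((x+y+z)/3);}
```

A rule this simple is not safe for every input: removing the space in "a + +b" yields "a++b", which means something different, so a production version would need to treat adjacent plus and minus signs with more care.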

Combining Literal Strings

Often, for readability, programmers will break up long string constants to fit them on multiple lines, using the + operator to concatenate them:

var message = "This is a long line of text that has been broken up into "
            + "two lines in order to fit without scrolling.";

The combineLiteralStrings() function looks for this pattern of quote-plus-quote and simply removes it.
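
The text does not say whether the pattern is matched against real quotation marks or against the __n__ placeholders. The sketch below assumes the latter, since the strings are still masked at this stage, and it only merges two strings that use the same kind of quote. mergeAdjacentStrings is a hypothetical name; this is one possible approach, not the tool's combineLiteralStrings() code.

```javascript
// Sketch only; mergeAdjacentStrings is a hypothetical name. It assumes white
// space has already been removed, so adjacent strings appear as __a__+__b__.
function mergeAdjacentStrings(text, literalStrings) {
  // The lookahead skips cases like __0__+__1__.length, where merging
  // would change the result.
  var pattern = /__(\d+)__\+__(\d+)__(?![.\[(])/g;
  var previous;
  do {
    previous = text;
    text = text.replace(pattern, function (match, a, b) {
      var left = literalStrings[a];
      var right = literalStrings[b];
      // Leave 'single' + "double" pairs alone; their quotes cannot be fused.
      if (left.charAt(0) !== right.charAt(0)) return match;
      // Drop the closing quote of the left string and the opening quote of the right.
      literalStrings[a] = left.slice(0, -1) + right.slice(1);
      return "__" + a + "__";
    });
  } while (text !== previous); // repeat to handle chains of three or more strings
  return text;
}

var saved = ['"broken up into "', '"two lines."', "'a'", '"b"'];
var merged = mergeAdjacentStrings("var message=__0__+__1__;var k=__2__+__3__;", saved);
console.log(merged);   // var message=__0__;var k=__2__+__3__;
console.log(saved[0]); // "broken up into two lines."
```

The entry for the absorbed string stays in the array but is no longer referenced, so restoring the strings afterwards is unaffected.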

Restoring Literal Strings

At this point, the literal strings saved in the first step can be restored. The restoreLiteralStrings() function handles this by looping through the string array and using the replace() method to put them back where they belong.

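for (var i = 0; i < literalStrings.length; i++) {
  // Use a function so "$" in a saved string is not read as a replacement pattern.
  s = s.replace(new RegExp("__" + i + "__"), function () { return literalStrings[i]; });
}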

That should leave us with a working, albeit smaller, version of the original script.