Resolving Drupal Data Compression (or PHP Served Data) Corruption Issues

The main symptoms of this problem are browsers complaining that "[they] Can't Open the Page (cannot decode raw data)" (Safari), "[there is a] Content Encoding Error (invalid or unsupported form of compression)" (Firefox), or "[they] can't reach this page (INET_E_DATA_NOT_AVAILABLE)" (Internet Explorer and Edge). The cause is typically an error in the compressed data stream output by Drupal 7 when the setting "Compress cached pages." is selected in the Performance section of the website settings and a version of the current page has not yet been cached. A simple refresh of the page should let users see the website without issue, because by then the cache for the page will have been generated.

A compressed HTML data stream can be obtained by running:

wget -S --header="accept-encoding: gzip" example.com

Open two versions of the page in a text editor that displays binary data as ASCII, such as nano: one that has the problem (with Compress cached pages enabled and no cached version) and one that does not (a page that is already cached and/or served with Compress cached pages disabled). In the version that has the problem, one may find interruptions in the data, such as properly structured words, HTML or line breaks, in what should be nothing but seemingly random characters. These interruptions are corruption of the compressed data stream and, as a result, break a web browser's ability to uncompress the data and display the webpage.
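
Instead of inspecting the data by eye, the integrity of a saved stream can also be tested with gzip itself. The following is a minimal sketch, not part of the original troubleshooting: the function name check_gzip and the file names are made up for illustration, and the "bad" file is built locally by putting one stray line break in front of a valid stream to imitate the corruption described above.

```shell
#!/bin/sh
# check_gzip FILE: print "ok" if FILE is an intact gzip stream, else "corrupt".
# (check_gzip is a hypothetical helper name used only for this example.)
check_gzip() {
    # gzip -t decompresses to nowhere and only reports whether it succeeded.
    if gzip -t < "$1" 2>/dev/null; then
        echo "ok"
    else
        echo "corrupt"
    fi
}

# Demonstration with local files, so nothing depends on a live site.
tmpdir=$(mktemp -d)

# A clean compressed page.
printf '<html><body>Hello</body></html>' | gzip -c > "$tmpdir/good.gz"

# The same bytes preceded by one stray line break, as if something had
# been written to the output before the compressed data.
{ printf '\n'; cat "$tmpdir/good.gz"; } > "$tmpdir/bad.gz"

echo "good.gz: $(check_gzip "$tmpdir/good.gz")"
echo "bad.gz:  $(check_gzip "$tmpdir/bad.gz")"
```

To check a real page, the output of the wget command above can be saved to a file with wget's -O option and that file passed to check_gzip.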

When Drupal serves compressed data from the cache, it uses PHP to alter the headers so that they announce gzip-compressed HTML, and then serves the compressed data. For PHP to output any kind of special content, especially content whose data is binary, the output buffer must be completely clear of any earlier output; otherwise that stray output is mixed into the data and corrupts it.

To resolve this problem one must examine modules to ensure that they aren't outputting anything prematurely. One must also take Drupal's load process into consideration. The very first files it loads that are also commonly modified by a developer or systems administrator are the settings.php files. Only after these files have been loaded is Drupal initialized, and only then does it start serving data. Anything written to the output buffer before the initialization will break Drupal's ability to serve data, especially binary data. If settings.php (or an associated file) contains an echo, or even a closing PHP tag followed by a space or a line break, that data is written to the output buffer before Drupal has a chance to write anything. It is because of this consequence of PHP's "inline with HTML" nature that the Drupal coding standards strongly suggest never adding PHP closing tags to PHP files that contain only code.
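
The check described above can be partly automated. The sketch below is an assumption-laden illustration rather than an official Drupal tool: the helper names ends_with_close_tag and starts_with_open_tag are invented here, the sample files are generated on the spot, and the test is deliberately strict in that it flags any closing tag at the end of a file, in line with the coding standard, even in cases where PHP would not actually emit anything.

```shell
#!/bin/sh
# ends_with_close_tag FILE: succeed if the last non-blank line of FILE
# ends with "?>", ignoring trailing spaces, tabs and carriage returns.
# (Hypothetical helper name, used only for this example.)
ends_with_close_tag() {
    last=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1 | sed 's/[[:space:]]*$//')
    case "$last" in
        *'?>') return 0 ;;
        *)     return 1 ;;
    esac
}

# starts_with_open_tag FILE: succeed if the first five bytes are "<?php",
# i.e. nothing (blank line, space, byte order mark) precedes the tag.
starts_with_open_tag() {
    [ "$(head -c 5 "$1")" = "<?php" ]
}

# Sample files standing in for settings.php variants (contents made up).
tmpdir=$(mktemp -d)
printf '<?php\n$conf["cache"] = 1;\n' > "$tmpdir/clean.php"
printf '<?php\n$conf["cache"] = 1;\n?>\n\n' > "$tmpdir/closed.php"
printf '\n<?php\n$conf["cache"] = 1;\n' > "$tmpdir/leading.php"

# Report every file that could write to the output buffer prematurely.
for f in "$tmpdir"/*.php; do
    name=$(basename "$f")
    if ! starts_with_open_tag "$f"; then
        echo "$name: output before the opening tag"
    fi
    if ends_with_close_tag "$f"; then
        echo "$name: closing tag at end of file"
    fi
done
```

On a real site the same loop could be pointed at the settings.php files and at custom module files instead of the generated samples; which paths to scan depends on the installation.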

The takeaway here is that in any PHP project, including one using Drupal, you should not close the PHP tag in files that contain only PHP code, settings files included. The only exception is a template in which you expect to output HTML right after the closing tag.

This article was written for Red Sky IT Solutions.
