EC3: The fourth column in Table VI shows that on
average only about 6% of the necessary repairs were out
of scope for our approach. Examples of such repairs are
missing include statements and faulty vprints. The high
number of non-cprint patches in timeclock is due to a single
patch involving a vprint that is required in every script.
While most of the invalid HTML generated by our benchmarks
would be silently corrected by a browser, we found
three errors that resulted in visible layout problems, two of
which were automatically fixed by PHPRepair.
C. Threats to Validity
The subject programs used in our evaluation may not be
representative of other PHP programs. We did not specifically
select the benchmarks to suit our approach; many of
them have been used in our previous research [2]. Some
PHP programs (such as phpBB2 [2]) use custom templating
mechanisms to generate their output, whereby a template of
the page to generate is read from a file and subjected to
some string processing to generate the actual output page.
Our approach does not work well on such programs, which
typically contain few cprints.
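As a rough illustration of why such programs give a cprint-based repair little to work with, the sketch below mimics a template-driven page generator. It is written in Python rather than PHP, the template strings and the function name `render_page` are invented for this example, and the templates are inlined here although a real program would read them from a file.

```python
from string import Template

# Hypothetical templates; in a real program these would be loaded from files.
PAGE_TEMPLATE = "<ul>$items</ul>"
ITEM_TEMPLATE = "<li>$name"  # the HTML error lives here: </li> is missing


def render_page(names):
    """Build the page by string processing over templates."""
    items = "".join(Template(ITEM_TEMPLATE).substitute(name=n) for n in names)
    # The only output statement emits a computed string, not a string literal,
    # so there is no constant print whose literal could be patched.
    return Template(PAGE_TEMPLATE).substitute(items=items)


page = render_page(["a", "b"])
```

In this sketch the malformed markup originates in template data, so a repair would have to edit the template rather than a print statement in the program.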
The bugs we detected and fixed may not be representative
since the test suites we use do not cover all of a program’s
behavior. However, the test suites achieve high coverage
and were generated using algorithms that are completely
unrelated to the repair techniques studied in this paper.
Finally, there is often more than one way to fix a given
HTML generation error, but in our evaluation we had to
pick a single fix. When constructing the corrected HTML
output and the golden versions of our subject programs, we
have attempted to choose “sensible” repairs that disturb the
original structure as little as possible.
VI. RELATED WORK
Static analysis of strings in web applications has been
used to validate HTML output from web applications [8],
[9], to ensure that only XML documents meeting a given
DTD are generated [10], and to detect security vulnerabilities
[11], [12], [13]. Our PHPQuickFix tool also performs a
static analysis, but only handles the special case of HTML
errors within an individual string literal. Since PHPQuickFix
is neither sound nor complete, it cannot guarantee the
absence of all errors; similarly, PHPRepair is only sound
up to the given test suite. However, our tools can automatically
repair HTML generation errors, rather than simply
identifying them. Due to its dynamic approach, PHPRepair
does not incur false positives as a static tool might.
Nguyen et al. [4] tackle the same problem of repairing
HTML generation errors in PHP code, but in a very different
way. They use a heuristic algorithm to map HTML output
back to the program, while we use instrumentation to get
a precise mapping. Like us, they focus on constant prints,
but their heuristic repair algorithm does not appear to
ensure soundness, completeness or minimality. Finally, their
evaluation only considers fixes found by HTML Tidy; we
also consider more complicated manual fixes.
Weimer et al. [3] use genetic programming to repair C
programs, whereby repairs are found by adapting statements
from other locations in a program. Like ours, their approach
requires a test suite, uses instrumentation to record execution
paths, and guarantees correctness up to that suite. Our focus
on constant prints allows us to perform exhaustive search for
repairs, ensuring both completeness and minimality. Genetic
programming approaches support more complex repairs but
rely on heuristics and hence lack these important properties.
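The link between exhaustive search and these two properties can be seen in a toy model. The sketch below is not PHPRepair's implementation, which solves string constraints. It assumes instead a small, hand-written table of candidate replacement literals per constant print, and all names (`render`, `minimal_repair`, `candidates`) are hypothetical. Patch sets are enumerated in order of increasing size, so the first patch that passes every test is minimal, and a patch is always found if one exists within the candidate table.

```python
from itertools import combinations, product


def render(cprints, path):
    """Concatenate the literals of the constant prints executed on a path."""
    return "".join(cprints[i] for i in path)


def minimal_repair(cprints, tests, candidates):
    """Return a smallest patch {cprint_id: new_literal} passing all tests.

    `tests` is a list of (path, expected_output) pairs. `candidates` maps a
    cprint id to the replacement literals considered for it (an assumption
    of this sketch; a real tool would derive replacements by solving).
    Returns None if no combination of candidates passes every test.
    """
    ids = sorted(candidates)
    for size in range(len(ids) + 1):            # smallest patch sets first
        for subset in combinations(ids, size):
            for values in product(*(candidates[i] for i in subset)):
                patched = dict(cprints)
                patched.update(zip(subset, values))
                if all(render(patched, p) == want for p, want in tests):
                    return dict(zip(subset, values))
    return None


# Toy program: "item_a" forgets to close its <li> element.
cprints = {"open": "<ul>", "item_a": "<li>a",
           "item_b": "<li>b</li>", "close": "</ul>"}
tests = [
    (["open", "item_a", "close"], "<ul><li>a</li></ul>"),
    (["open", "item_b", "close"], "<ul><li>b</li></ul>"),
]
candidates = {"close": ["</li></ul>"], "item_a": ["<li>a</li>"]}

patch = minimal_repair(cprints, tests, candidates)
```

Patching `close` would fix the first test but break the second, so the search rejects it and settles on the single-literal patch to `item_a`. The second test is what rules out the tempting but wrong fix, which mirrors the sense in which a repair is correct only up to the given test suite.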
There has also been work on synthesizing programs
that meet a given specification. Closest to our work are
approaches that require the user to provide an initial program
template with “holes” to be filled in [14], [15]. PHPRepair
implicitly allows any cprint as a “hole”, uses tests to
identify which ones to modify, and applies cost minimization
to avoid unnecessary patches. Finally, Gulwani [16]
described a tool to synthesize Excel spreadsheet macros.
Like PHPRepair, that approach is based on input-output
examples and synthesizes a program that generates strings.
However, programs are synthesized in a specialized domain-
specific language, while we repair arbitrary PHP programs.
Angelic debugging [17], like our approach, uses constraint
solving over a test suite to identify erroneous expressions.
While it can handle more general errors, angelic debugging
is in general not able to suggest source-level repairs.
Several projects use constraint solving for automatic program
transformations, often in the form of refactorings. Examples
include type-related refactorings [18], refactorings for inferring
generic types in Java [19], and refactorings that manipulate
access modifiers [20].
VII. CONCLUSIONS AND FUTURE WORK
We have presented a novel approach to automatically
repair HTML generation errors in PHP programs, targeting
a common class of repairs based on adding, modifying,
and removing statements that print string literals. We have
developed a simple static tool, PHPQuickFix, for repairing
errors local to a single print statement, and a test-based tool,
PHPRepair, for repairing more complex errors by solving a
system of string constraints. Our experiments show that these
tools are able to efficiently repair most HTML generation
bugs in a variety of open-source benchmark programs.
There are several avenues for further research. We would
like to experiment with different cost metrics incorporating
knowledge of the program’s structure (e.g., to encourage
solutions where all fixes are localized in the same script). To
improve performance, we may be able to leverage the highly
structured form of our constraints to aggressively optimize
our SAT-based encoding, rather than relying on Kodkod’s
built-in encoding. Finally, we would like to generalize our
approach to handle more complex repairs.