Regression test suite for cairo.

How to use cairo's test suite
=============================
Using this test suite should be as simple as running:

	make test

assuming that the cairo distribution in the directory above has been
configured and built. The test suite here goes through some effort to
run against the locally compiled library rather than any installed
version, but those efforts may fall short depending on the level of your
libtool madness.

The results of the test suite run are summarized in an index.html
file which, when viewed in a web browser, makes it easy to compare
any failed rendering with the corresponding reference image and a
diff image.

The test suite needs to be run before any code is committed and before
any release. See below for hints and rules governing the use of the suite.

The test suite is built as a single binary, which allows you to choose
individual tests or categories of tests to run. For example, to run
specific tests:

    ./cairo-test-suite record-neg-extents-unbounded record-neg-extents-bounded

Or if you want to run all paint.* related tests you can use:

    ./cairo-test-suite paint

Or if you want to check the current status of known failures:

    ./cairo-test-suite XFAIL

Or to run a subset of tests, use the -k option to run only the tests
that include the given keyword:

    ./cairo-test-suite -k downscale

The binary also permits controlling which backend is used via the
CAIRO_TEST_TARGET environment variable, so for instance:

    CAIRO_TEST_TARGET=gl ./cairo-test-suite -k blur

This binary should be backwards-compatible with all library versions,
allowing you to compare current versus past behaviour for any test.
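
Because the backend is selected through CAIRO_TEST_TARGET, the same
keyword can be run against several backends in turn. The following is
only a sketch: the function name run_on_targets and the CAIRO_TEST_SUITE
override are hypothetical, and it assumes that the binary exits with a
non-zero status when a test fails.

```shell
# Hypothetical helper: run the tests matching one keyword on several backends.
# CAIRO_TEST_SUITE is an assumed override for the path to the test binary.
run_on_targets() {
    keyword=$1
    shift
    failed=""
    for target in "$@"; do
        # Assumption: the binary exits non-zero when a test fails.
        if ! CAIRO_TEST_TARGET=$target "${CAIRO_TEST_SUITE:-./cairo-test-suite}" -k "$keyword"; then
            failed="$failed $target"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failing backends:$failed"
        return 1
    fi
    echo "all backends behaved as expected"
}
```

For example, "run_on_targets blur image pdf" would run the blur tests
first on the image backend and then on the pdf backend.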

Tailoring test runs
-------------------
There are some mechanisms to limit the tests run during "make test".
These come in very handy when doing development, but should not be used
to circumvent the "pass" requirements listed below.

make's TARGETS variable can be used to limit the backends when
running the tests. It should contain a space- or comma-separated list of
backends. The CAIRO_TESTS environment variable, which is also a space- or
comma-separated list, can be used to limit the tests run.
For example:

  CAIRO_TESTS="zero-alpha" make test TARGETS=image,ps

make's FORMAT variable can also be used to limit the content formats when
running the tests. It should contain a space- or comma-separated list of
content formats to test.
For example:

  CAIRO_TESTS="zero-alpha" make test TARGETS=image,ps FORMAT="rgb,rgba"

Another very handy mechanism when trying to fix bugs is:

  make retest

This will re-run the test suite, but only on tests that failed on the
last run. So this is a much faster way of checking if changes actually
fix bugs rather than running the entire test suite again.

The test suite first compares the output from the current run against the
previous run in order to skip more expensive image comparisons. If you think
this is interfering with the results, you can clear the cached results using:

  make clean-caches

Running tests under modified environments or tools
--------------------------------------------------
To run tests under a tool like gdb, one can use the run target and
the TOOL variable.  For example:

  CAIRO_TESTS=user-font make run TOOL=gdb TARGETS=pdf

If you want to run under valgrind, there is a dedicated target
that also sets a bunch of useful valgrind options.  Try:

  CAIRO_TESTS=user-font make check-valgrind

To run tests under a modified environment, you can use the ENV
make variable.  However, that environment will also affect the libtool
wrapper of the tests.  To only affect the actual test binaries, pass
such environment as TOOL:

  CAIRO_TESTS=user-font make run TOOL="LD_PRELOAD=/path/to/something.so"

Getting the elusive zero failures
---------------------------------
It's generally been very difficult to achieve a test run with zero
failures. The difficulties stem from the various versions of the many
libraries that the test suite depends on, (it depends on a lot more
than cairo itself), as well as fonts and other system-specific
settings. If your system differs significantly from the system on
which the reference images were generated, then you will likely see
the test suite reporting "failures", (even if cairo is working just
fine).

We are constantly working to reduce the number of variables that need
to be tweaked to get a clean run, (for example, by bundling fonts with
the test suite itself), and also working to more carefully document
the software configuration used to generate the reference images.

Here are some of the relevant details:

  * Your system must have a copy of the DejaVu fonts; the sha1sums of
    the versions used are listed in [...].  These are
      "DejaVu Sans" (DejaVuSans.ttf) [e9831ee4fd2e1d0ac54508a548c6a449545eba3f];
      "DejaVu Sans Mono" (DejaVuSansMono.ttf) [25d854fbd0450a372615a26a8ef9a1024bd3efc6];
      "DejaVu Serif" (DejaVuSerif.ttf) [78a81850dc7883969042cf3d6dfd18eea7e43e2f];
      [the DejaVu fonts can be installed from the fonts-dejavu-core 2.34-1 Debian package]
    and also
      "Nimbus Sans L" (n019003l.pfb)
      [which can be found in the gsfonts Debian package].

  * Currently, you must be using a build of cairo using freetype
    (cairo-ft) as the default font backend. Otherwise all tests
    involving text are likely to fail.

  * To test the pdf backend, you will want the very latest version of
    poppler as made available via git:

	git clone git://anongit.freedesktop.org/git/poppler/poppler

    As of this writing, no released version of poppler contains all
    the fixes you will need to avoid false negatives from the test
    suite.

  * To test the ps backend, you will need ghostscript version 9.06.

  * Testing the xlib backend is problematic since many X server
    drivers have bugs that are exercised by the test suite. (Or, if
    not actual bugs, differ slightly in their output in such a way
    that the test suite will report errors.) This can be quite handy
    if you want to debug an X server driver, but since most people
    don't want to do that, another option is to run against a headless
    X server that uses only software for all rendering. One such X
    server is Xvfb which can be started like this:

	Xvfb -screen 0 1680x1024x24 -ac -nolisten tcp :2

    after which the test suite can be run against it like so:

	DISPLAY=:2 make test

    We have been using Xvfb for testing cairo releases and ensuring
    that all tests behave as expected with this X server.
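
The DejaVu checksums listed above can be compared against the installed
fonts with a small script. This is a sketch under stated assumptions: the
helper names are hypothetical, FONT_DIR is only a guess at where your
distribution installs the fonts, and the sha1sum tool is assumed to be
available.

```shell
# Hypothetical helper: compare a file's SHA-1 with an expected value.
check_sha1() {
    file=$1
    expected=$2
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    actual=$(sha1sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "ok: $file"
    else
        echo "mismatch: $file"
        return 1
    fi
}

# Assumed install location; adjust to where your system keeps the fonts.
FONT_DIR=${FONT_DIR:-/usr/share/fonts/truetype/dejavu}

# Check the three DejaVu fonts against the checksums given in this file.
check_dejavu_fonts() {
    status=0
    check_sha1 "$FONT_DIR/DejaVuSans.ttf" e9831ee4fd2e1d0ac54508a548c6a449545eba3f || status=1
    check_sha1 "$FONT_DIR/DejaVuSansMono.ttf" 25d854fbd0450a372615a26a8ef9a1024bd3efc6 || status=1
    check_sha1 "$FONT_DIR/DejaVuSerif.ttf" 78a81850dc7883969042cf3d6dfd18eea7e43e2f || status=1
    return $status
}
```

Running check_dejavu_fonts prints one line per font and returns non-zero
if any font is missing or differs from the version used for the
reference images.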

What if I can't make my system match?
-------------------------------------
For one reason or another, you may be unable to get a clean run of the
test suite even if cairo is working properly, (for example, you might
be on a system without freetype). In this case, it's still useful to
be able to determine if code changes you make to cairo result in any
regressions to the test suite. But it's hard to notice regressions if
there are many failures both before and after your changes.

For this scenario, you can capture the output of a run of the test
suite before your changes, and then use the CAIRO_REF_DIR environment
variable to use that output as the reference images for a run after
your changes. The process looks like this:

        # Before code change there may be failures we don't care about
        make test

        # Let's save those output images
        mkdir /some/directory/
        cp -r test/output /some/directory/

        # hack, hack, hack

        # Now check that nothing changed:
        CAIRO_REF_DIR=/some/directory/ make test

Best practices for cairo developers
===================================
If we all follow the guidelines below, then both the test suite and
cairo itself will stay much healthier, and we'll all have a lot more
fun hacking on cairo.

Before committing
-----------------
All tests should return a result of PASS or XFAIL. The XFAIL results
indicate known bugs. The final message should be one of the following:

	All XX tests behaved as expected (YY expected failures)
	All XX tests passed

If any tests have a status of FAIL, then the new code has caused a
regression error which should be fixed before the code is committed.
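
If the output of a test run has been saved to a file, the check for one
of the two acceptable summary messages can be scripted. This is a
minimal sketch: the function name summary_ok is hypothetical, and it
assumes the summary line appears in the log exactly in one of the two
forms shown above, with XX and YY replaced by numbers.

```shell
# Hypothetical helper: succeed only if the given log file contains one of
# the two acceptable summary messages.
summary_ok() {
    grep -E -q '^[[:space:]]*All [0-9]+ tests (passed|behaved as expected \([0-9]+ expected failures\))[[:space:]]*$' "$1"
}
```

For example, after saving the output of "make test" to a file of your
choosing, "summary_ok that-file" returns zero only when it is safe to
commit according to the rule above.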

When a new bug is found
-----------------------
A new test case should be added by imitating the style of an existing
test. This means adding the following files:

	new-bug.c
	reference/new-bug.ref.png
	reference/new-bug.xfail.png

Here new-bug.c is a minimal program that demonstrates the bug, following
the style of existing tests. The new-bug.ref.png image should contain
the desired result of new-bug.c if the bug were fixed, while
new-bug.xfail.png contains the current results of the test.

Makefile.sources should be edited by adding new-bug.c to test_sources.
And last but not least, don't forget to "git add" the new files.
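
The checklist above can be verified mechanically before committing. The
following sketch is meant to be run from the test directory; the
function name check_new_test is hypothetical, and the Makefile.sources
check is a plain text search for the file name rather than a check of
the test_sources variable itself.

```shell
# Hypothetical helper: check that a new test has all the pieces
# described above. Run it from the test directory.
check_new_test() {
    name=$1
    status=0
    for f in "$name.c" "reference/$name.ref.png" "reference/$name.xfail.png"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f"
            status=1
        fi
    done
    # Plain text search; does not verify which variable lists the file.
    if ! grep -q -F -- "$name.c" Makefile.sources; then
        echo "$name.c is not listed in Makefile.sources"
        status=1
    fi
    return $status
}
```

For example, "check_new_test new-bug" reports any of the three files
that is still missing. It does not check whether the files have been
added to git.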

When a new feature is added
---------------------------
It's important for the regression suite to keep pace with development
of the library. So a new test should be added for each new feature.
The work involved is similar to the work described above for new bugs.
The only distinction is that the test is expected to pass so it
should not need a new-bug.xfail.png file.

While working on a test
-----------------------
Before a bugfix or feature is ready, it may be useful to compare
output from different builds. For convenience, you can set
CAIRO_REF_DIR to point at a previous test directory, relative
to the current test directory, and any previous output will be
used by preference as reference images.

When a bug is fixed
-------------------
The fix should be verified by running the test suite, which should
result in an "unexpected pass" for the test of interest. Rejoice as
appropriate, then remove the relevant xfail.png file from git.

Before releasing
----------------
All tests should return a result of PASS for all supported backends
(those enabled by default), meaning all known bugs are fixed, resulting
in the happy message:

	All XX tests passed

Some notes on limitations in poppler
====================================
One of the difficulties of our current test infrastructure is that we
rely on external tools to convert cairo's vector output (PDF,
PostScript, and SVG), into an image that can be used for the image
comparison. This means that any bugs in that conversion tool will
result in false negatives in the test suite.

We've identified several such bugs in the poppler library which is
used to convert PDF to an image. This is particularly discouraging
because 1) poppler is free software that will be used by *many* cairo
users, and 2) poppler calls into cairo for its rendering so it should
be able to do a 100% faithful conversion.

So we have an interest in ensuring that these poppler bugs get fixed
sooner rather than later. As such, we're trying to be good citizens by
reporting all such poppler bugs that we identify to the poppler
bugzilla. Here's a tracking bug explaining the situation:

	Poppler does not yet handle everything in the cairo test suite
	https://bugs.freedesktop.org/show_bug.cgi?id=12143

Here's the rule: If a cairo-pdf test reports a failure, but viewing
the resulting PDF file with acroread suggests that the PDF itself is
correct, then there's likely a bug in poppler. In this case, we can
simply report the poppler bug, (making it block 12143 above), post the
PDF result from the test suite, and list the bug in this file. Once
we've done this, we can capture poppler's buggy output as a
pdf-specific reference image (as reference/*.xfail.png) so that the
test suite will regard the test as passing, (and we'll ensure there
is no regression).

Once the poppler bug gets fixed, the test suite will start reporting a
false negative again, and this will be easy to fix by simply removing
the pdf-specific reference image.

Here are the reported poppler bugs and the tests they affect:

[Newest was closed in 2009.]