i3 testsuite
============
Michael Stapelberg <michael+i3@stapelberg.de>
September 2011

This document explains how the i3 testsuite works, how to use it and how to
extend it. It is targeted at developers who have not necessarily done any
testing before, or have not written tests in Perl before. In general, the
testsuite is not of interest to end users.


== Introduction

The i3 testsuite is a collection of files which contain testcases for various
i3 features. Some of them test if a certain workflow works correctly (moving
windows, focus behaviour, …). Others are regression tests and contain code
which previously made i3 crash or led to unexpected behaviour. They then check
if i3 still runs (meaning it did not crash) and if it handled everything
correctly.

The goal of having these tests is to automatically find problems and to
automatically get a feel for whether a change in the source code breaks any
existing feature. After every modification of the i3 source code, the developer
should run the full testsuite. If one of the tests fails, the corresponding
problem should be fixed (or, in some cases, the testcase has to be modified).
For every bug report, a testcase should be written to test the correct
behaviour. Initially, it will fail, but after fixing the bug, it will pass.
This ensures (or at least increases the chance) that a bug which has been fixed
once does not come back unnoticed.

Also, when implementing a new feature, a testcase is a good way to easily check
that the feature works correctly. Many developers test manually whether
everything works. Having a testcase not only helps you with that, but it will
also be useful for every future change.

== Implementation

For several reasons, the i3 testsuite has been implemented in Perl:

1. Perl has a long tradition of testing. Every popular/bigger Perl module which
   you can find on CPAN will not only come with documentation, but also with
   tests. Therefore, the available infrastructure for tests is comprehensive.
   See for example the excellent http://search.cpan.org/perldoc?Test::More
   and the referenced http://search.cpan.org/perldoc?Test::Tutorial.

2. Perl is widely available and has a well-working package infrastructure.
3. The author is familiar with Perl :).
4. It is a good idea to use a different language for the tests than the
   implementation itself.

Please do not start programming language flamewars at this point.

=== Mechanisms

==== Script: complete-run

The testcases are run by a script called +complete-run.pl+. It runs all
testcases by default, but you can be more specific and let it only run one or
more testcases. Also, it takes care of starting up a separate instance of i3
with an appropriate configuration file and creates a folder for each run
containing the appropriate i3 logfile for each testcase. The latest folder can
always be found under the symlink +latest/+. Unless told differently, it will
run the tests on a separate X server instance (using the Xdummy script).

.Example invocation of +complete-run.pl+
---------------------------------------
$ cd ~/i3/testcases

$ ./complete-run.pl
# output omitted because it is very long
All tests successful.
Files=78, Tests=734, 27 wallclock secs ( 0.38 usr  0.48 sys + 17.65 cusr  3.21 csys = 21.72 CPU)
Result: PASS

$ ./complete-run.pl t/04-floating.t
[:3] i3 startup: took 0.07s, status = 1
[:3] Running t/04-floating.t with logfile testsuite-2011-09-24-16-06-04-4.0.2-226-g1eb011a/i3-log-for-04-floating.t
[:3] t/04-floating.t finished
[:3] killing i3
output for t/04-floating.t:
ok 1 - use X11::XCB::Window;
ok 2 - The object isa X11::XCB::Window
ok 3 - Window is mapped
ok 4 - i3 raised the width to 75
ok 5 - i3 raised the height to 50
ok 6 - i3 did not map it to (0x0)
ok 7 - The object isa X11::XCB::Window
ok 8 - i3 let the width at 80
ok 9 - i3 let the height at 90
ok 10 - i3 mapped it to x=1
ok 11 - i3 mapped it to y=18
ok 12 - The object isa X11::XCB::Window
ok 13 - i3 let the width at 80
ok 14 - i3 let the height at 90
1..14

All tests successful.
Files=1, Tests=14,  0 wallclock secs ( 0.01 usr  0.00 sys +  0.19 cusr  0.03 csys =  0.23 CPU)
Result: PASS

$ less latest/i3-log-for-04-floating.t
---------------------------------------

==== IPC interface

The testsuite makes extensive use of the IPC (Inter-Process Communication)
interface which i3 provides. It is used for the startup process of i3, for
terminating it cleanly and (most importantly) for modifying and getting the
current state (layout tree).

See [http://i3wm.org/docs/ipc.html] for documentation on the IPC interface.

==== X11::XCB

In order to open new windows, change attributes, get events, etc., the
testsuite uses X11::XCB, a new (and quite specific to i3 at the moment) Perl
module which uses the XCB protocol description to generate Perl bindings to
X11. They work in a very similar way to libxcb (which i3 uses) and provide
relatively high-level interfaces (objects such as +X11::XCB::Window+) as well
as access to the low-level interface, which is very useful when testing a
window manager.

=== Filesystem structure

In the git root of i3, the testcases live in the folder +testcases+. This
folder contains the +complete-run.pl+ and +Xdummy+ scripts and a base
configuration file which will be used for the tests. The different testcases
(their file extension is .t, not .pl) themselves can be found in the
conventionally named subfolder +t+:

.Filesystem structure
--------------------------------------------
├── testcases
│   ├── complete-run.pl
│   ├── i3-test.config
│   ├── lib
│   │   ├── i3test.pm
│   │   ├── SocketActivation.pm
│   │   └── StartXDummy.pm
│   ├── t
│   │   ├── 00-load.t
│   │   ├── 01-tile.t
│   │   ├── 02-fullscreen.t
│   │   ├── ...
│   │   ├── omitted for brevity
│   │   ├── ...
│   │   └── 74-regress-focus-toggle.t
│   └── Xdummy
--------------------------------------------

== Anatomy of a testcase

Learning by example is definitely a good strategy when you are wondering how to
write a testcase. Let's take +t/11-goto.t+ as an easy example and go through it
step by step:

.t/11-goto.t: Boilerplate
----------------------
#!perl
# vim:ts=4:sw=4:expandtab

use i3test;
use File::Temp;

my $x = X11::XCB::Connection->new;
----------------------

This is what we call boilerplate. It exists at the top of every test file (to
some extent). The first line is the shebang, which specifies that this file is
a Perl script. The second line contains VIM specific settings on how to
edit/format this file (use spaces instead of tabs, indent using 4 spaces).
Afterwards, the +i3test+ module is used. This module contains i3 testsuite
specific functions which you are strongly encouraged to use. They make writing
testcases a lot easier and will make it easier for other people to read your
tests.

The next line uses the +File::Temp+ module. This is specific to this testcase,
because it needs to generate a temporary name during the test. Many testcases
use only the +i3test+ module.

The last line opens a connection to X11. You might or might not need this in
your testcase, depending on whether you are going to open windows (etc.) or
only use i3 commands.

.t/11-goto.t: Setup
----------------------
my $tmp = fresh_workspace;

cmd 'split h';
----------------------

The first line calls i3test's +fresh_workspace+ function which looks for a
currently unused workspace, switches to it, and returns its name. The variable
+$tmp+ will end up having a value such as +"/tmp/87kBVcHbA9"+. Note that this
is not (necessarily) a valid path, it's just a random workspace name.

So, now that we are on a new workspace, we ensure that the workspace uses
horizontal orientation by issuing the +split h+ command (see the i3 User's
Guide for a list of commands). This is not strictly necessary, but good style.
In general, the +cmd+ function executes the specified i3 command by using the
IPC interface and returns once i3 has acknowledged the command.

.t/11-goto.t: Open windows
----------------------
#####################################################################
# Create three windows and make sure focus switching works
#####################################################################

my $top = open_window($x);
my $mid = open_window($x);
my $bottom = open_window($x);
----------------------

In every major section of a testcase, you should put a comment like the one
above. This makes it immediately clear how the file is structured.

The +open_window+ function opens a standard window, which will then be put into
tiling mode by i3. If you want a floating window, use the
+open_floating_window+ function. These functions accept the same parameters as
+X11::XCB::Window->new+, see the i3test documentation at TODO.

.t/11-goto.t: Helper function
----------------------
#
# Returns the input focus after sending the given command to i3 via IPC
# and syncing with i3
#
sub focus_after {
    my $msg = shift;

    cmd $msg;
    sync_with_i3 $x;
    return $x->input_focus;
}
----------------------

This section defines a helper function which will be used over and over in this
testcase. If you have code which gets executed more than once or twice
(depending on the length of your test, use your best judgement), please put it
in a function. Tests should be short, concise and clear.

The +focus_after+ function executes a command and returns the X11 focus after
the command was executed. The +sync_with_i3+ function makes sure that i3 has
pushed its state to X11. See <<i3_sync>> to learn how this works exactly.

.t/11-goto.t: Test assumptions
----------------------
my $focus = $x->input_focus;
is($focus, $bottom->id, "Latest window focused");

$focus = focus_after('focus left');
is($focus, $mid->id, "Middle window focused");
----------------------

Now, we run the first two real tests. They use +Test::More+'s +is+ function,
which compares two values and prints the differences if they are not the same.
After the arguments, we supply a short comment to indicate what we are testing
here. This makes it much easier for the developer to spot which testcase is
the problem in case one fails.
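
To try +is+ without a running i3, the following minimal sketch uses only
+Test::More+. The +fake_input_focus+ function and the value it returns are
made up for illustration; in a real testcase the value would come from X11.

.Standalone sketch of +is+ and +done_testing+ (not part of t/11-goto.t)
----------------------
#!perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a value we would normally query from X11.
sub fake_input_focus { return 42 }

my $expected = 42;

# is(got, expected, description): passes if both values are equal,
# otherwise reports both values together with the description.
is(fake_input_focus(), $expected, 'expected window focused');

# ok(condition, description) is the simpler variant for boolean checks.
ok($expected > 0, 'window id is positive');

done_testing;
----------------------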

The first test checks that the most recently opened window is focused.
Afterwards, the command +focus left+ is issued and it is verified that the
middle window now has focus.

Note that this is not a comprehensive test of the +focus+ command -- we would
have to test wrapping, focus when using a more complex layout, focusing the
parent/child containers, etc. But that is not the point of this testcase.
Instead, we just want to know if +$x->input_focus+ corresponds with what we are
expecting. If not, something is completely wrong with the test environment and
this trivial test will fail.

.t/11-goto.t: Test that the feature does not work (yet)
----------------------
#####################################################################
# Now goto a mark which does not exist
#####################################################################

my $random_mark = mktemp('mark.XXXXXX');

$focus = focus_after(qq|[con_mark="$random_mark"] focus|);
is($focus, $mid->id, "focus unchanged");
----------------------

Syntax hint: The qq keyword is the interpolating quote operator. It lets you
choose a quote character (in this case the +|+ character, a pipe). This makes
having double quotes in our string easy.
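
The following standalone sketch shows both pieces used above: +mktemp+ from
+File::Temp+ to generate a random name, and +qq+ with a pipe as the delimiter.
The variable names +$with_qq+ and +$with_dq+ are only for illustration.

.Standalone sketch of +mktemp+ and +qq+ (not part of t/11-goto.t)
----------------------
#!perl
use strict;
use warnings;
use File::Temp qw(mktemp);

# mktemp only generates a name from the template, it does not create a file.
my $random_mark = mktemp('mark.XXXXXX');

# With | as the delimiter, the double quotes need no escaping and
# $random_mark is still interpolated.
my $with_qq = qq|[con_mark="$random_mark"] focus|;

# The same string written with ordinary double quotes requires backslashes.
my $with_dq = "[con_mark=\"$random_mark\"] focus";

print "$with_qq\n";
print "both forms are identical\n" if $with_qq eq $with_dq;
----------------------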

In this new major section, a random mark (mark is an identifier for a window,
see "VIM-like marks" in the i3 User’s Guide) will be generated. Afterwards, we
test that trying to focus that mark will not do anything. This is important: Do
not only test that using a feature has the expected outcome, but also test that
using it without properly initializing it does no harm. This command could for
example have changed focus anyway (a bug) or crashed i3 (obviously a bug).

.t/11-goto.t: Test that the feature does work
----------------------
cmd "mark $random_mark";

$focus = focus_after('focus left');
is($focus, $top->id, "Top window focused");

$focus = focus_after(qq|[con_mark="$random_mark"] focus|);
is($focus, $mid->id, "goto worked");
----------------------

Remember: Focus was on the middle window (we verified that earlier in "Test
assumptions"). We now mark the middle window with our randomly generated mark.
Afterwards, we switch focus away from the middle window to be able to tell if
focusing it via its mark will work. If the test passes, the goto command seems
to be working.

.t/11-goto.t: Test corner case
----------------------
# check that we can specify multiple criteria

$focus = focus_after('focus left');
is($focus, $top->id, "Top window focused");

$focus = focus_after(qq|[con_mark="$random_mark" con_mark="$random_mark"] focus|);
is($focus, $mid->id, "goto worked");
----------------------

Now we test the same feature, but specifying the mark twice in the command.
This should have no effect, but let’s be sure: test it and see if things go
wrong.

.t/11-goto.t: Test second code path
----------------------
#####################################################################
# Check whether the focus command will switch to a different
# workspace if necessary
#####################################################################

my $tmp2 = fresh_workspace;

is(focused_ws(), $tmp2, 'tmp2 now focused');

cmd qq|[con_mark="$random_mark"] focus|;

is(focused_ws(), $tmp, 'tmp now focused');
----------------------

This part of the test checks that focusing windows by mark works across
workspaces. It uses i3test's +focused_ws+ function to get the current
workspace.

.t/11-goto.t: End of the testcase
----------------------
done_testing;
----------------------

The end of every testcase has to contain the +done_testing+ line. This tells
+complete-run.pl+ that the test was finished successfully. If it does not
occur, the test might have crashed during execution -- some of the reasons why
that could happen are bugs in the used modules, bugs in the testcase itself or
an i3 crash resulting in the testcase being unable to communicate with i3 via
IPC anymore.

[[i3_sync]]
== Appendix A: The i3 sync protocol

Consider the following situation: You open two windows in your testcase, then
you use +focus left+ and want to verify that the X11 focus has been updated
properly. Sounds simple, right? Let’s assume you use this straightforward
implementation:

.Racy focus testcase
-----------
my $left = open_window($x);
my $right = open_window($x);
cmd 'focus left';
is($x->input_focus, $left->id, 'left window focused');
-----------

However, the test fails. Sometimes. Apparently, there is a race condition in
your test. If you think about it, this is because you are using two different
pieces of software: You tell i3 to update focus, i3 confirms that, and then you
ask X11 to give you the current focus. There is a certain time i3 needs to
update the X11 state. If the testcase gets CPU time before X11 has processed
i3's requests, the test will fail.

image::i3-sync.png["Diagram of the race condition", title="Diagram of the race condition"]

One way to "solve" this would be to add +sleep 0.5;+ after the +cmd+ call.
After 0.5 seconds it should be safe to assume that focus has been updated,
right?

In practice, this usually works. However, it has several problems:

1. This is obviously not a clean solution, but a workaround. Ugly.
2. On very slow machines, this might not work. Unlikely, but in different
   situations (a delay to wait for i3 to start up) the necessary time is much
   harder to guess, even for fast machines.
3. This *wastes a lot of time*. Usually, your computer is much faster than 0.5s
   to update the status. However, sometimes, it might take 0.4s, so we can’t
   make it +sleep 0.1+.

To illustrate how grave the problem with wasting time actually is: Before
removing all sleeps from the testsuite, a typical run using 4 separate X
servers took around 50 seconds on my machine. After removing all the sleeps,
we achieved times of about 25 seconds. This is very significant and influences
the way you think about tests -- the faster they are, the more likely you are
to check whether everything still works quite often (which you should).

What I am trying to say is: Delays add up quickly and make the test suite
less robust.

The real solution for this problem is a mechanism which I call "the i3 sync
protocol". The idea is to send a request (which does not modify state) via X11
to i3 which will then be answered. Due to the request's position in the event
queue (*after* all previous events), you can be sure that by the time you
receive the reply, all other events have been dealt with by i3 (and, more
importantly, X11).

image::i3-sync-working.png["Diagram of the i3 sync solution", title="Diagram of the i3 sync solution"]

=== Implementation details

The client which wants to sync with i3 initiates the protocol by sending a
ClientMessage to the X11 root window:

.Send ClientMessage
-------------------
# Generate a ClientMessage, see xcb_client_message_t
my $msg = pack "CCSLLLLLLL",
    CLIENT_MESSAGE, # response_type
    32,     # format
    0,      # sequence
    $root,  # destination window
    $x->atom(name => 'I3_SYNC')->id, # type: the I3_SYNC atom

    $_sync_window->id,    # data[0]: our own window id
    $myrnd, # data[1]: a random value to identify the request
    0,
    0,
    0;

# Send it to the root window -- since i3 uses the SubstructureRedirect
# event mask, it will get the ClientMessage.
$x->send_event(0, $root, EVENT_MASK_SUBSTRUCTURE_REDIRECT, $msg);
-------------------

i3 will then reply with the same ClientMessage, sent to the window specified in
+data[0]+. In the reply, +data[0]+ and +data[1]+ are exactly the same as in the
request. You should use a random value in +data[1]+ and check that you received
the same one when getting the reply.
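
The check on the reply can be sketched without an X11 connection by packing a
request and a simulated reply and comparing the fields. All ids below are
made-up placeholders, and the reply is constructed by hand here instead of
being received from i3. The mask with +0x7f+ accounts for X11 setting the high
bit of the event code on events delivered via SendEvent.

.Standalone sketch of validating a sync reply (simulated, no X11 involved)
-------------------
#!perl
use strict;
use warnings;

# X11 event code of a ClientMessage.
use constant CLIENT_MESSAGE => 33;

# Made-up placeholder values; a real test gets these from the X11 connection.
my $root        = 0x0000013b;   # root window id
my $sync_atom   = 0x00000155;   # id of the I3_SYNC atom
my $sync_window = 0x00a00001;   # our own window id
my $myrnd       = int(rand(2**31));

my $template = "CCSLLLLLLL";

my $request = pack $template,
    CLIENT_MESSAGE, 32, 0, $root, $sync_atom,
    $sync_window, $myrnd, 0, 0, 0;

# Simulated reply: same data[0] and data[1], addressed to our window.
# The high bit (0x80) marks an event which was sent via SendEvent.
my $reply = pack $template,
    CLIENT_MESSAGE | 0x80, 32, 0, $sync_window, $sync_atom,
    $sync_window, $myrnd, 0, 0, 0;

my ($type, $format, $seq, $window, $atom, @data) = unpack $template, $reply;

# Only accept the reply if it is a ClientMessage of type I3_SYNC which
# carries our window id and the random value we generated.
my $synced = ($type & 0x7f) == CLIENT_MESSAGE
          && $atom == $sync_atom
          && $data[0] == $sync_window
          && $data[1] == $myrnd;

print $synced ? "reply matches our request\n"
              : "unrelated event, keep waiting\n";
-------------------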

== Appendix B: Socket activation

Socket activation is a mechanism which was made popular by systemd, an init
replacement. It basically describes creating a listening socket before starting
a program.  systemd will invoke the program only when an actual connection to
the socket is made, hence the term socket activation.

The interesting part of this (in the i3 context) is that you can very precisely
detect when the program is ready (finished its initialization).

=== Preparing the listening socket

+complete-run.pl+ will create a listening UNIX socket which it will then pass
to i3. This socket will be used by i3 as an additional IPC socket, just like
the one it will create on its own. Passing the socket happens implicitly
because children will inherit the parent’s sockets when fork()ing and sockets
will continue to exist after an exec() call (unless CLOEXEC is set of course).

The only things +complete-run.pl+ has to do explicitly are setting the
+LISTEN_FDS+ environment variable to the number of sockets which exist (1 in
our case) and setting the +LISTEN_PID+ environment variable to the current
process ID. Both variables are necessary so that the program (i3) knows how
many sockets it should use and if the environment variable is actually intended
for it. i3 will then start looking for sockets at file descriptor 3 (since 0, 1
and 2 are used for stdin, stdout and stderr, respectively).

The actual Perl code which sets up the socket, fork()s, makes sure the socket
has file descriptor 3 and sets up the environment variables follows (shortened
a bit):


.Setup socket and environment
-----------------------------
use IO::Socket::UNIX;
use POSIX ();

# %args holds the parameters of the surrounding function (not shown here).
my $socket = IO::Socket::UNIX->new(
    Listen => 1,
    Local => $args{unix_socket_path},
);

my $pid = fork;
if ($pid == 0) {
    # In the child, $$ is the PID which i3 will have after exec.
    $ENV{LISTEN_PID} = $$;
    $ENV{LISTEN_FDS} = 1;

    # Only pass file descriptors 0 (stdin), 1 (stdout),
    # 2 (stderr) and 3 (socket) to the child.
    $^F = 3;

    # If the socket does not use file descriptor 3 by chance
    # already, we close fd 3 and dup2() the socket to 3.
    if (fileno($socket) != 3) {
        POSIX::close(3);
        POSIX::dup2(fileno($socket), 3);
    }

    exec "/usr/bin/i3";
}
-----------------------------
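
For illustration, the check which the started program performs on these
variables can be sketched as follows. This is not i3's code (i3 is not written
in Perl); the helper +passed_fds+ is hypothetical and only mirrors the rules
described above: +LISTEN_PID+ has to match the own process ID, and the sockets
start at file descriptor 3.

.Sketch of the receiving side (hypothetical helper, not i3's implementation)
-----------------------------
#!perl
use strict;
use warnings;

# First passed file descriptor, directly after stdin, stdout and stderr.
my $FIRST_FD = 3;

# Returns the inherited socket file descriptors, or an empty list if the
# variables are missing or meant for a different process.
sub passed_fds {
    my ($env, $own_pid) = @_;
    my $pid   = $env->{LISTEN_PID};
    my $count = $env->{LISTEN_FDS};

    return () unless defined $pid   && $pid   =~ /^\d+$/ && $pid == $own_pid;
    return () unless defined $count && $count =~ /^\d+$/ && $count > 0;

    return ($FIRST_FD .. $FIRST_FD + $count - 1);
}

# Variables intended for us: one socket was passed.
my @fds = passed_fds({ LISTEN_PID => $$, LISTEN_FDS => 1 }, $$);
print "file descriptors to use: @fds\n";

# Variables left over from another process: they have to be ignored.
my @none = passed_fds({ LISTEN_PID => $$ + 1, LISTEN_FDS => 1 }, $$);
print "descriptors used for a foreign LISTEN_PID: ", scalar(@none), "\n";
-----------------------------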

=== Waiting for a reply

In the parent process, we want to know when i3 is ready to answer our IPC
requests and handle our windows. Therefore, after forking, we immediately close
the listening socket (i3 will handle this side of the socket) and connect to it
(remember, we are talking about a named UNIX socket) as a client. This connect
call will immediately succeed because the kernel buffers it. Then, we send a
request (of type GET_TREE, but that is not really relevant). Writing data to
the socket will also succeed immediately because, again, the kernel buffers it
(only up to a certain amount of data of course).

Afterwards, we simply block until we get an answer. In the child process, i3
will set up the listening socket in its event loop. Immediately after actually
starting the event loop, it will notice a new client connecting (the parent
process) and handle its request. Since all initialization has been completed
successfully by the time the event loop is entered, we can now assume that i3
is ready.
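
The handshake can be tried out in isolation with the following sketch, in which
a forked child stands in for i3. The socket path, the one second delay and the
line-based messages are assumptions made for this example; the real IPC
messages use a different format. The parent connects and writes before the
child calls accept, and then blocks until the child answers.

.Standalone sketch of connecting before the child is ready (child simulates i3)
-----------------------------
#!perl
use strict;
use warnings;
use IO::Socket::UNIX;
use File::Temp qw(tempdir);
use POSIX ();

# Hypothetical socket path inside a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/ipc.sock";

my $listener = IO::Socket::UNIX->new(
    Listen => 1,
    Local  => $path,
) or die "cannot listen: $!";

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: simulate a slow initialization, then accept the connection
    # which is already pending and answer the request.
    sleep 1;
    my $client = $listener->accept or die "accept failed: $!";
    my $request = <$client>;
    print $client "ready\n";
    close $client;
    POSIX::_exit(0);
}

# Parent: close our copy of the listening socket, the child owns it now.
close $listener;

# Both calls return right away, although the child is still sleeping.
my $conn = IO::Socket::UNIX->new(Peer => $path)
    or die "cannot connect: $!";
print $conn "GET_TREE\n";   # placeholder, not the real IPC message format
$conn->flush;

# This read blocks until the child has finished its initialization.
my $reply = <$conn>;
die "no reply received" unless defined $reply;
chomp $reply;
waitpid $pid, 0;

print "child says: $reply\n";
-----------------------------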

=== Timing and conclusion

A beautiful feature of this mechanism is that it does not depend on timing. It
does not matter when the child process gets CPU time or when the parent process
gets CPU time. On heavily loaded machines (or machines with multiple CPUs,
cores or unreliable schedulers), this makes waiting for i3 much more robust.

Before using socket activation, we typically used a +sleep(1)+ and hoped that
i3 was initialized by that time. Of course, this breaks on some (slow)
computers and wastes a lot of time on faster computers. By using socket
activation, we decreased the total amount of time necessary to run all tests
(72 files at the time of writing) from > 100 seconds to 16 seconds. This makes
it significantly more attractive to run the test suite more often (or at all)
during development.

An alternative approach to using socket activation is polling for the existence
of the IPC socket and connecting to it. While this might be slightly easier to
implement, it wastes CPU time and is considerably uglier than this solution
:). After all, +lib/SocketActivation.pm+ contains only 54 SLOC.