Speaker Martin Pool
Time 2004-01-15 14:45
Conference LCA2004

front end to gcc

“speed is the one truly modern pleasure.” Aldous Huxley

previous work

  • NFS
  • same libraries
  • synchronised clocks
  • single administrator
  • a lot of hassle

Requirements

  • run jobs in parallel
  • same output
  • must be correct

Phases of compiling

  • preprocess, include files
  • compile preprocessed source
  • link together object files
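
The three phases above can be run one at a time with plain gcc. This is a minimal sketch, assuming a hypothetical source file hello.c; build_in_phases is an illustrative helper and not part of distcc. In distcc's usual mode it is the middle step that gets shipped to another machine, while preprocessing and linking stay on the client.

```shell
CC="${CC:-gcc}"   # compiler to drive; override to experiment

# build_in_phases FILE.c: run each phase as a separate command,
# stopping at the first phase that fails.
build_in_phases() {
    src=$1
    base=${src%.c}
    "$CC" -E "$src" -o "$base.i" &&     # 1. preprocess: expand #include and macros
    "$CC" -c "$base.i" -o "$base.o" &&  # 2. compile the preprocessed source
    "$CC" "$base.o" -o "$base"          # 3. link object files into an executable
}
```

Calling build_in_phases hello.c leaves hello.i, hello.o and hello next to the source, which makes it easy to see what each phase produces.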

      export DISTCC_HOSTS="a b c ..."
      make CC=distcc -j6
    

is gain > overhead time?

  • kernel efficient at sending bulk data
  • some projects, e.g. OpenOffice, assume sequential execution
  • good candidates: code that is expensive to compile, e.g. C++
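
One way to frame the question above is a back-of-envelope model. This is an assumed simplification, not a measurement from the talk: preprocessing runs serially on the client, while compile time plus a per-file transfer overhead is divided evenly across the hosts. The function name and all numbers are made up for illustration.

```shell
# distributed_gain FILES CPP_MS COMPILE_MS TRANSFER_MS HOSTS
# Prints the estimated time saved in milliseconds (negative means a loss).
distributed_gain() {
    files=$1 cpp=$2 compile=$3 xfer=$4 hosts=$5
    local_total=$(( files * (cpp + compile) ))
    # Preprocessing stays on the client; compile + transfer is shared out.
    dist_total=$(( files * cpp + files * (compile + xfer) / hosts ))
    printf '%s\n' "$(( local_total - dist_total ))"
}

distributed_gain 100 200 1800 100 4   # slow C++ files: prints 132500
distributed_gain 100 50 20 100 2      # tiny C files:   prints -4000
```

Under these assumptions the gain comes almost entirely from the compile step, which is why files that are cheap to compile can end up slower when distributed.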

problems

  • object files from gcc 2.96 won’t work with gcc 3.2
  • checking the gcc version is not enough; check vendor-specific patches?
  • C++ ABI issues
  • Makefile bugs, race conditions, etc
  • kernel bugs detected, TCP round trip timer, now fixed; time doesn’t go backwards even if going too fast
  • precompiled headers - especially for C++ templates
  • integrated headers
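
For the compiler mismatch items above, a rough check is to compare the full version banner rather than only the version number, since the banner often carries the vendor's release string. This is a sketch only: compiler_id and same_compiler are hypothetical helpers, and a matching banner is still no proof that two compilers carry the same patches.

```shell
# compiler_id CC: print the first line of the compiler's --version output.
compiler_id() {
    "$1" --version 2>/dev/null | head -n 1
}

# same_compiler BANNER_A BANNER_B: succeed only on a non-empty exact match.
same_compiler() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

For example, same_compiler "$(compiler_id gcc)" "$(ssh buildbox1 gcc --version | head -n 1)" compares the local compiler with the one on buildbox1, a hypothetical host name.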

scheduling

  • make doesn’t expose global information, distcc doesn’t know what it needs to know
  • best throughput desired

compression

  • LZO fast compressor
  • compression ratio of 1:2 to 1:3
  • good when network is the limiting factor
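
In distcc, compression is switched on per host by appending ,lzo to an entry in DISTCC_HOSTS. A sketch with hypothetical host names (fastlan, slowlink):

```shell
# slowlink is reached over a slow link, so traffic to it is compressed;
# fastlan is on the local network and is left uncompressed.
export DISTCC_HOSTS="localhost fastlan slowlink,lzo"
```

Leaving compression off for hosts on a fast network avoids spending CPU time where the network is not the bottleneck.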

SSH

  • a bit slow, but feasible
  • access, privacy, integrity all taken care of
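
distcc uses SSH for any host entry written as @host or user@host. A sketch with hypothetical host and user names, assuming SSH logins that need no password prompt are already set up:

```shell
# buildbox1 is reached over SSH as the current user,
# buildbox2 over SSH as the user "martin".
export DISTCC_HOSTS="localhost @buildbox1 martin@buildbox2"
```

Plain TCP and SSH entries can be mixed in the same list, so only the hosts that need it pay the SSH cost.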

performance hacks

  • preforked daemons
  • sendfile to send the file over the network
  • mmap
  • TCP corks

scalability

  • recursive make considered possible

KISS = Keep it simple & safe

  • No central coordinator
  • minarchy
  • clients remember hints about down/busy hosts
  • fall back to running gcc
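
The last two points can be sketched without any central coordinator: each client keeps its own hints as marker files and falls back to the local machine when no remote host looks usable. This is an illustration only, not distcc's actual implementation; HINT_DIR, mark_down, is_down and pick_host are hypothetical names.

```shell
# Directory where this client keeps its private hints about hosts.
HINT_DIR="${HINT_DIR:-${TMPDIR:-/tmp}/host-hints}"

# mark_down HOST: remember that HOST was unreachable or busy.
mark_down() { mkdir -p "$HINT_DIR" && : > "$HINT_DIR/$1.down"; }

# is_down HOST: succeed if a hint says HOST should be skipped.
is_down() { [ -e "$HINT_DIR/$1.down" ]; }

# pick_host HOST...: print the first host without a "down" hint,
# or "localhost" when every remote host is marked down.
pick_host() {
    for h in "$@"; do
        if ! is_down "$h"; then
            printf '%s\n' "$h"
            return 0
        fi
    done
    printf '%s\n' localhost
}
```

A real tool would presumably also expire old hints so that a host that comes back is tried again; that part is left out here to keep the sketch short.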