Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

This page describes the current Quickstep coding guidelines. This information is targeted towards new and existing contributors on this project. All developers of Quickstep should subscribe to the developer's list, which is dev@quickstep.incubator.apache.org. To subscribe, send an empty email to dev-subscribe@quickstep.incubator.apache.org.

General Code Style

The Quickstep coding style is closely based on the Google C++ style guide, which you can read at https://google.github.io/styleguide/cppguide.html

The Google style guide is an excellent guide to writing C++ code that is both high-performance and highly readable and maintainable. We do depart from the style guide in a few areas, as listed below:+ code that is both high-performance and highly readable and maintainable. We do depart from the style guide in a few areas, as listed below:

  1. Code Documentation: We require documenting the logic in your `.cpp` files (in addition to doxygen comments in the `.hpp` files) so that it is easy for readers to understand the logic that you have implemented. This approach helps maintain the code in the long run, making it easier for other developers to work with your code (if done right, it should also help you down the line if you have to go back to your code at a later time). This approach also makes it easier for another developer to examine your code when merging your pull request as s/he can more easily match your intent (declared in your comments) with your actual action (encapsulated in the code).
  2. File naming: Header files have the suffix `.hpp`, source files have the suffix `.cpp`. Files should be named after the main class defined in the file, in `CamelCase`, exactly as the class name is in the source. For example, if you write a class named `MyNewClass`, its sources would be `MyNewClass.hpp` and `MyNewClass.cpp`.
  3. Line length: lines may be up to 120 characters long (up from Google's limit of 80). However, it is preferred to keep lines below 80 characters unless readability/aesthetics would be negatively affected by doing so. It is *strongly* preferred to keep lines below 100 characters unless you can't possibly avoid it. Comments should generally be kept to a limit of 80 character lines. See the Google style guide for rules and advice about how to break up a statement across multiple lines and how to indent continuations (continuations of a line must be indented at least 4 spaces, or they can be indented further to "line up" with related things vertically). Our preference is to put each parameter to a function call on its own line when we have to break up a call statement, and to align "groups" of related things vertically (e.g. parameters separated by commas, or parts of an expression with the same level of precedence).
  4. Method naming: Instance methods of classes are should have camelcase names starting with a lowercase letter `likeThis()`. Static methods of classes, and constructors, should have capitalized camelcase names `LikeThis()`.
  5. Exceptions: we do use exceptions to report errors in some cases, but we are trying to cut down on that. If you detect an error which is clearly the result of the caller misusing the API, you should intentionally crash the program with the `LOG(FATAL)` macro in `glog/logging.h`. Where possible, you should report other errors with return values that should be checked by the caller, not exceptions (be sure to note the requirement to check return values in your doxygen comments for the method in question). Exceptions are generally reserved for externally-generated weirdness (like a file loaded from disk which appears to be corrupted). If you really think you need to add a new exception, you should run it by other Quickstep core contributors, and make it inherit from `std::exception`.
  6. Pointer and reference sigils: the Google style guide does not have an exact recommendation for where to place sigil characters for pointers (\*) and references (&), but merely says to be consistent across a project. In Quickstep, we place the sigil next to the variable name rather than the type name, like so: `void *ptr;`. For functions which return a pointer or reference, we place the sigil next to the type name, like so: ` void* someMethod(); `.

...

We are very selective about adding external library dependencies to Quickstep. Currently there are 4 hard dependencies on external libraries, and 2 soft dependencies on external libraries that are used for optional features. The hard dependencies are on farmhash (a fast and high-quality suite of hash functions for strings and other byte arrays), glog (a flexible logging library from Google), googletest (our unit-testing framework), and protocol buffers (a portable and high performance data serialization framework). The soft dependencies are on linenoise, an interactive command-line editing library (like GNU readline, but much smaller & simpler and released under a more permissive BSD license), and on tcmalloc, a very fast memory allocator which is especially well-suited to multi-threaded programs (part of Google perftools).

Adding a new library dependency is a big deal which affects the portability of Quickstep, and is something that should always be discussed with other developers on the developer's list before it happens. We prefer to use libraries with BSD/MIT/X11/Apache-style licenses, which are very permissive in what they allow us to do with the code. We also prefer libraries which are lightweight and provide a limited, specific set of features that we need, as big libraries lead to long build times and overly-large statically linked executables. Finally, we want to use libraries which are highly portable across different systems so that Quickstep can be portable too (for example, all of our hard dependencies work fine with many different compilers across different flavors of UNIX and Windows). We also have a preference for libraries that can be built with CMake (either using CMake build files provided by the original project or written by us), since that makes it much easier to integrate with our build system.

...

We do use C++11 and require a C++11 compiler and standard library to build Quickstep. Be judicious about using C++11 features, though, and don't use them "just because" (for instance, don't use auto unless it's really needed in a template context, don't implement a move constructor for a class unless move semantics are actually needed, etc.). Feel free to use all the approved C++11 features listed in the Google style guide, plus the chrono library, and in circumstances where it's actually needed, move semantics. You should not use C++11 threads directly, but instead use our own threads module (which uses a platform-appropriate choice of C++11 threads, POSIX threads, or Windows threads under the covers). If you want to use any other C++11 features, you should consult the rest of the team email the developer's list first so that we're the active developers all aware of the rationale and familiar enough with the features in question that we the current developers won't be confused by your code.

...

Info

Quickstep is coded using C++-11. If you are new to C++, you will need to get familiar with the language. The language is powerful, but also vast. We try to use C++ “correctly” as per the guidelines at: https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md That page has a LOT of information in there, so it is best to browse it once and then use it as a reference to go to specific sections when you have questions.

A few other helpful items that you may want to pick up (pulled from an email to new members of the Quickstep group in early 2015):

#1: Move Semantics
— Another related intersting reading is at: https://en.wikipedia.org/wiki/Rule_of_three_%28C%2B%2B_programming%29

#2: Templates
— A gentle introduction is at: https://en.wikipedia.org/wiki/Template_%28C%2B%2B%29
— Don’t miss the related topic of Variadic templates at: https://en.wikipedia.org/wiki/Variadic_template
— The more technical guide is at: http://en.cppreference.com/w/cpp/language/templates This is probably more digestible after the first two. 

#3: Dependent Names
Ok: I only said two issues above, to not scare you too much, but you probably want to see this too:
— Sooner or later (probably sooner if you are in the guts of the execution engine code) you will hit the issue of dependent names. Here is a gentle starting point: 
http://aszt.inf.elte.hu/~gsd/halado_cpp/ch06s03.html#Dependent-name (see the top part and read on if you are ambitious).

WARNING: Templates are a powerful component in C++ and key way to do metaprogramming. But it takes a while to get comfortable with it. You can go overboard with it and make the code completely unreadable (even to you a few months after you have written it). So, use it carefully, and when you can, please leave ample comments in the code to help others — with templates you will be very unlikely to be too verbose in your comments. 

Of course as you go through the code and see spots that could benefit from additional comments, please add them and open a PR. Comments-only PR on existing code are welcome! 

...