C++ Toolchain

To be a good programmer, you need good tools to help you craft your code. Your choice of tools and your ability to use them well are critical to working effectively. If all you have is a hammer, even tasks that should be easy become difficult or impossible.

This page reflects some of the strong preferences of Professor Pisan. Check with your instructors in case they have specific preferences. This short video demonstrates some of the tools from this toolchain.

The information below explores several tools. Not all of the tools are needed. If you'd like a concise set of instructions to set up your Visual Studio Code environment, see these detailed instructions.

Choosing a good set of tools is the first step, and then you have to learn how to use them well. I don't like switching between Mac and Windows, because every time I do, it takes me extra effort to remember how to do things in that environment. I use Emacs as my text editor, so a lot of the shortcuts are at my fingertips. I do not have to think about them. The less you have to think about the tool, the more energy and focus you have for the actual task you are trying to accomplish.

Tools change. Each company has its own toolchain for its software developers. Each new version of the same program can even introduce slight differences. You need to be flexible enough to learn and use different tools.

Different courses will require you to use different sets of tools. This page describes one possible way. Unless you are a seasoned programmer with a well-established toolchain, you should follow the recommendations on this page. Even if you are a seasoned programmer, there are probably a few things you can learn from the approach described.

We will use several tools, including Visual Studio Code, the g++ compiler, GitHub for storing code and tracking changes, clang-tidy to ensure our programs adhere to C++ standards, clang-format to check our code style, the CSS Linux Lab as a shared environment to test our programs, valgrind to check for memory leaks, llvm to check for code coverage, and others.

Learning to use the tools effectively takes time, but it is time that will come back to you 100-fold in the future. Every time you do something the hard way, you are wasting time, potentially introducing errors, and reducing your ability to iterate. For example, learning the keyboard command to save a document (CTRL-S for most editors) can take some time, but ends up saving much more in the long run. StackExchange has a discussion with references to several research papers showing that, for frequent tasks, the keyboard is much more efficient than the mouse.

Compiler

A program starts out as a text file, by convention a file with a .cpp extension for C++ programs. The compiler, which is just another program, converts this text file into an executable file. On Windows, executables have a .exe extension. On MacOS and Unix, the executable files do not have a specific extension.

Compilation Stages

Compiling a C++ program happens in several distinct steps:

  1. Preprocessing - Handle special preprocessing commands such as #include, #define, #ifndef, etc.
  2. Compilation - C++ source code is parsed and turned into assembly code. A binary object file with a .o extension is created. Each source file is converted into a separate object file. If you have a large project with 100 files and you change one file, then only a single file needs to be recompiled into object code.
  3. Linking - The object files and system libraries are linked together to create an executable file.

Most of the time we do not have to think of the different compilation stages. We compile the project to create an executable, but knowing the different stages can be useful when debugging since you get different types of error messages in each stage.

Different error messages:

  1. Preprocessing - Missing #include file, misspelled file name, incorrect name for a system library
  2. Compilation - Syntax error (such as missing semicolon), use of an undeclared function or class, etc.
  3. Linking - Duplicate definitions or missing definition for a function
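
As a rough illustration of where each kind of error surfaces, the sketch below marks in comments what each stage looks at. The names GREETING, greet, square and squarePlusOne are made up for this example.

```cpp
// Preprocessing: #include and #define are handled before compilation.
// A misspelled header name, e.g. <strnig>, is reported at this stage.
#include <string>

#define GREETING "Hello" // hypothetical macro, replaced textually

// Compilation: the compiler checks syntax and that every name used
// has been declared. Deleting a semicolon below is a compile error.
int square(int n); // declaration only: enough to compile a call

std::string greet(const std::string &name) {
  return std::string(GREETING) + ", " + name;
}

int squarePlusOne(int n) {
  return square(n) + 1; // compiles because square is declared above
}

// Linking: the linker needs exactly one definition of square.
// If this definition is removed, the file still compiles with -c,
// but linking fails (g++ typically reports an undefined reference).
int square(int n) { return n * n; }
```

Calling squarePlusOne(3) from a main function would return 10, and greet("World") would return "Hello, World".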

g++, the GNU C++ Compiler, is one of the most commonly used compilers. clang++ is a newer compiler from the LLVM project that is designed for easy experimentation. For our purposes, g++ and clang++ are often indistinguishable.

Installing a Compiler

You can often complete all your programming projects using the CSS Linux Lab. The instructions for installing different compilers are included below for your convenience. For most courses, you do not have to install a compiler on your own machine, but being able to develop on your local machine without needing an internet connection has its advantages.

Linux: CSS Linux Lab already has the g++ and clang++ compilers installed. Once you log in, you can execute g++ (or clang++) from the command line. See Connecting to the Linux lab machines for details of how to connect. If you are running Ubuntu Linux, or another flavor of Linux, on your own machine, search for instructions specific to your operating system.

Windows: Several options are available:

  • Install MSYS2, which creates a unix-like environment on your Windows machine. Install llvm in that environment using pacman -S --needed mingw-w64-x86_64-clang mingw-w64-x86_64-clang-analyzer mingw-w64-x86_64-clang-tools-extra mingw-w64-x86_64-lldb mingw-w64-x86_64-llvm
  • Install Visual Studio, which comes with MSVC, a compiler for Windows.
  • Install MinGW, MinGW-w64 or Cygwin. All of these provide minimal unix-like environments.
  • If you are running Windows 10, install the Windows Subsystem for Linux.

Mac: You can download Xcode and then install the Xcode Command Line tools using xcode-select --install. A better option is to use the brew package manager to download g++ and additional tools. brew is a flexible package manager and will make it much easier to download and install other programs.

g++ on CSS Linux Lab

Let's walk through the process of compiling a program on CSS Linux Lab.

List the files in the current directory by executing the unix program ls with the flags -al. The -a flag displays all files, including hidden files whose names start with a dot. The -l flag lists the files in long format, showing permissions and file sizes.

$ ls -al
total 4
drwxrwxr-x  2 pisan pisan   10 Aug  7 19:48 .
drwx------ 32 pisan pisan 4096 Aug  7 19:48 ..

Execute the whoami command, which displays the currently logged in user.

$ whoami
pisan

Execute the which command, which displays the location of the g++ executable.

$ which g++
/usr/local/bin/g++

Start the very basic editor nano to write our program.

$ nano helloworld.cpp

The code for our “Hello World” program is as follows:

#include <iostream>
 
using namespace std;
 
int main() {
   cout << "Hello, World" << endl;
 
   return 0;
}

Compile helloworld.cpp. The default name of the executable produced by g++ is a.out.

$ g++ helloworld.cpp 
$ ls -al
total 20
drwxrwxr-x  2 pisan pisan   53 Aug  7 19:50 .
drwx------ 32 pisan pisan 4096 Aug  7 19:48 ..
-rwxrwxr-x  1 pisan pisan 8776 Aug  7 19:50 a.out
-rw-rw-r--  1 pisan pisan  107 Aug  7 19:50 helloworld.cpp

Execute the program a.out. When you type a unix command, the unix shell searches through the directories listed in the $PATH variable to find the executable. To execute a program that is not in $PATH, we have to specify its pathname; ./ refers to the current directory. Afterwards, rm removes (deletes) the given file.

$ ./a.out
Hello, World
$ rm a.out

Let's choose a better file name for our program using the -o flag.

$ g++ helloworld.cpp -o hello
$ ls -al
drwxrwxr-x  2 pisan pisan   53 Aug  7 19:50 .
drwx------ 32 pisan pisan 4096 Aug  7 19:48 ..
-rwxrwxr-x  1 pisan pisan 8776 Aug  7 19:50 hello
-rw-rw-r--  1 pisan pisan  107 Aug  7 19:50 helloworld.cpp

We can now execute our program as hello.

$ ./hello
Hello, World
$ rm hello

The temporary object files that are created as part of the compilation process are deleted when the compilation is complete. If we want to keep and inspect an object file, we can use the -c flag, which stops after the compilation stage and does not link. Unless we have special tools, we cannot examine the contents of the binary object files. We can continue to the linking step by calling g++ with the object file to produce the executable.

$ g++ -c helloworld.cpp
$ ls -al
total 12
drwxrwxr-x  2 pisan pisan   60 Aug  7 19:50 .
drwx------ 32 pisan pisan 4096 Aug  7 19:48 ..
-rw-rw-r--  1 pisan pisan  107 Aug  7 19:50 helloworld.cpp
-rw-rw-r--  1 pisan pisan 2688 Aug  7 19:50 helloworld.o
$ g++ helloworld.o -o hello
$ ls -al
total 24
drwxrwxr-x  2 pisan pisan   77 Aug  7 19:51 .
drwx------ 32 pisan pisan 4096 Aug  7 19:48 ..
-rwxrwxr-x  1 pisan pisan 8776 Aug  7 19:51 hello
-rw-rw-r--  1 pisan pisan  107 Aug  7 19:50 helloworld.cpp
-rw-rw-r--  1 pisan pisan 2688 Aug  7 19:50 helloworld.o
$ ./hello
Hello, World

Once we are finished, it is time to exit the shell and log out of the machine.

$ exit 

There is an easier way to do all this, but it is important to do it the hard way at least once to understand the process from start to finish. Next, we will look at tools to streamline this process.

IDE

An IDE, an Integrated Development Environment, is a text editor that provides additional facilities to programmers for software development. A good IDE will highlight keywords with different colors, provide code completion suggestions, show function documentation and parameters while typing a function, indent code automatically, provide integrated commands to run and debug the program, have keyboard shortcuts to refactor code, jump to function definitions and help us catch programming mistakes as we type.

IDEs come in different shapes and sizes. Some IDEs come with their own compiler while others can be configured to use different compilers. Web based IDEs allow us to write and execute our programs through the browser interface. Some IDEs work with only a single programming language while others use plugins to work with many languages.

Some of the popular IDEs for C++ are: Visual Studio Code, Visual Studio, CLion, Eclipse, Code::Blocks, Sublime Text, Xcode (only for Mac), Emacs, Vim.

Your IDE must have at least the following features:

  • Reformat code to fix any indentation problems
  • Refactor function name and variable names
  • Compile and Run code easily
  • Step through code in debugging mode and show the value of local variables
  • Comment/Uncomment a line or a block of code easily
  • Jump to the definition of a function easily
  • Find all occurrences of a string and replace them easily

An alternative to having an IDE running on your local machine is to use your web browser to access a site that provides an online IDE. Some of the popular online programming environments are: Repl.it, CodeAnywhere, AWS Cloud9, Visual Studio Codespaces, Ideone, Codepad, Codechef, Jdoodle, Codiva, and many others.

We are going to use Visual Studio Code to edit and compile files directly on the CSS Linux Lab.

If you would like to work offline, you can use Visual Studio Code on your own local machine. However, you must always test your programs on the CSS Linux Lab before submitting them. It is not good enough for your program to work on your machine. Your program has to work on the CSS Linux Lab.

Repl.It

Before we get to more complex IDEs, let's explore repl.it, which allows you to write and execute programs in your browser. You can, but do not have to, create a repl.it account.

Click on “start coding”, choose “C++” as your language and create your first repl.it.

  repl.it creates the default “Hello World” program. You can compile and run it by clicking on the “play” button (green arrow).

When you compile and run it, repl.it shows you the unix commands it is using. The default for C++ programs is:

$ clang++-7 -pthread -std=c++17 -o main main.cpp
$ ./main  

The default compiler on repl.it is clang++-7. You can also type directly into the shell area to compile your program using a different compiler. Try compiling your program using g++ by typing

$ g++ -o main main.cpp

Other unix commands, such as ls, which, rm, etc, are also available in the shell provided by repl.it.

The repl.it interface is intuitive and sufficient for small programs. More advanced repl.it commands are available through a hidden menu. Use CTRL-Shift-P (Windows) or Command-Shift-P (Mac) to open the repl.it Command Palette.

You can add a link to your README.md file, something like https://repl.it/github/GitHubUSERName/projectname, which imports the GitHub repository into repl.it with one click and allows anyone to run it.

A repl.it hack: Currently (Jan 2020), the auto-format function on repl.it is broken, so you cannot re-indent code easily. You can however use clang-format from the unix shell to re-format the file that has bad indentation. This trick can be used on any system where you have clang-format installed.

clang-format -i main.cpp

Visual Studio Code

Visual Studio Code (VSC) is currently one of the most popular IDEs. Make sure you do not confuse VSC with Visual Studio or Visual Studio Codespaces, as each one offers a different set of features.

The power of VSC comes from the large number of plugins available in the VSC Marketplace. VSC can edit files on a remote machine and provide a terminal to execute commands on the remote machine. We are going to make use of this functionality to write our “Hello World” program on the CSS Linux Lab.

  1. Install an ssh client.
  2. Confirm that Husky OnNet is connected. You need the UW VPN service when connecting to UW machines.
  3. Check ssh. Open a Terminal (CMD for Windows, Terminal application with bash shell for Mac) and issue this command: ssh <YourNetId>@csslab7.uwb.edu
    • Windows: If ssh is not in your path, you will have to add it manually. In most cases, ssh.exe will be installed in C:\Windows\System32\OpenSSH. Follow the instructions at this web page to add this folder to your PATH.
  4. Follow the instructions on Remote Development using SSH to install the Visual Studio Code Remote Development Extension Pack. CSS Linux Labs already have the SSH client installed, so you can ignore the “Install an OpenSSH compatible SSH client if one is not already present.” step.

Here is a short video demonstrating how to connect to CSS Linux Lab.

If you try to connect to multiple csslab machines and you are getting an error message from VSC, then the files VSC has installed in csslab might have been corrupted. Log in to any csslab machine using Putty, MobaXterm or Terminal (see the instructions at Connecting to the Linux lab machines if necessary). Once logged in, execute rm -rf .vscode-server/ to delete all files that were installed by VSC.

This is also a good time to install a terminal program, other than VSC, to connect to CSS Linux Lab. See the instructions on Connecting to the Linux lab machines. Having a terminal program to access your files in the CSS Linux Lab and a file transfer program (ftp program) to transfer files easily from your local computer to CSS Linux Lab will come in handy. For Windows, MobaXTerm and Putty are popular clients for logging into unix machines. MobaXTerm also allows you to transfer files. For Mac, you can use the built-in Terminal with ssh to log in and scp to transfer files. Other programs for transferring files that work for both Windows and Mac are Filezilla client and Cyberduck.

Connect to csslab1.uwb.edu using the “Remote Explorer” in VSC. The first time you connect, ssh will ask you if you trust this machine and will then ask for your password. If you do not want to enter your password every time, you can create an ssh key pair. The Quick SSH setup for linux lab has some basic information on how you might do it.

 

Once the ssh connection has been set up and the configuration information added to /usr/netid/.ssh/config (on your local computer), you can establish a connection in VSC by clicking on the “Connect to Host in New Window” icon.

When you are connected to the CSS Lab, you should see the name of the machine you are connected to on the bottom left of VSC.

When you are connected to the remote machine, using “New File” or “Open” in VSC will open a file on the remote machine.

Choose Terminal > New Terminal from the VSC menu to create a terminal on the remote machine and follow the steps below to create your 342 directory.

$ mkdir 342
$ cd 342
$ ls -al
total 4
drwxrwxr-x  2 pisan pisan   10 Aug  9 13:04 .
drwx------ 33 pisan pisan 4096 Aug  9 13:04 ..
$ mkdir hello
$ cd hello
$ pwd
/home/NETID/pisan/342/hello
$ ls -al
total 0
drwxrwxr-x 2 pisan pisan 10 Aug  9 13:04 .
drwxrwxr-x 3 pisan pisan 27 Aug  9 13:04 ..
csslab1:hello$ 

Now that we have a directory for our project and a shell to compile it, we are ready to write our “Hello World” program. Create a New File in VSC, copy the content below and save it with the name “helloworld.cpp” into the 342/hello directory. 

#include <iostream>
 
using namespace std;
 
int main() {
  cout << "Hello, World" << endl;
 
  return 0;
}

Check in the shell that your file is in the correct location.

$ ls -al
total 0
drwxrwxr-x 2 pisan pisan 36 Aug  9 13:12 .
drwxrwxr-x 3 pisan pisan 27 Aug  9 13:04 ..
-rw-rw-r-- 1 pisan pisan  0 Aug  9 13:12 helloworld.cpp

You can now compile and execute your program.

$ g++ -o helloworld helloworld.cpp 
$ ls -al
total 16
drwxrwxr-x 2 pisan pisan   58 Aug  9 13:19 .
drwxrwxr-x 3 pisan pisan   27 Aug  9 13:04 ..
-rwxrwxr-x 1 pisan pisan 8776 Aug  9 13:19 helloworld
-rw-rw-r-- 1 pisan pisan  107 Aug  9 13:19 helloworld.cpp
$ ./helloworld 
Hello, World
$ 

Time for some experimentation to learn about compiler error messages:

  1. What is the error message you get, if you change “iostream” to “xxx”?
  2. What is the error message you get, if you delete a “;”?
  3. What is the error message you get, if you change “<<” to “<”?
  4. What is the error message you get, if you delete a quotation mark?
  5. What is the error message you get, if you comment out the “using namespace std;” line?

The compiler processes the file from top to bottom. If there is a missing semicolon on line 10, its processing of the lines after line 10 can go wonky, and it might generate tens or hundreds of error messages. Always start with the first error message. The messages after it are often just consequences of the first error, so fix the first one and recompile.

Let's examine the contents of the “Hello World” program.

1.  #include <iostream>
2.  
3.  using namespace std;
4. 
5.  int main() {
6.     cout << "Hello, World" << endl;
7. 
8.     return 0;
9.  }

 

Line-1: #include is a pre-processor command. It tells the pre-processor that the program will use the iostream library. The Standard Input / Output Streams Library defines the cout object used for printing. See cplusplus reference for additional details.

Line-3: C++ functions and objects reside in different namespaces, so that two libraries can use the same name without conflict. cout is in the namespace std (the standard namespace), so its full name is std::cout. Writing std:: everywhere can make our programs harder to read, so for our class programs, we are going to include the statement using namespace std; at the beginning of our programs. For large projects, where there are multiple connected systems, it is better not to have using statements and to reference each function by its full name.
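
As a small sketch of why namespaces matter, the example below defines two made-up namespaces, audio and video, that each contain a function called name. Neither namespace is part of any real library. The using statement is placed inside a function here only to keep its effect local to this example.

```cpp
#include <iostream>
#include <string>

// Two hypothetical namespaces, each with its own function called name().
namespace audio {
std::string name() { return "audio"; }
} // namespace audio

namespace video {
std::string name() { return "video"; }
} // namespace video

// Without a using statement, every name is written with its namespace.
void printBoth() {
  std::cout << audio::name() << " " << video::name() << std::endl;
}

// With a using statement, names from std can be written without std::.
std::string describe() {
  using namespace std;
  string result = audio::name() + "/" + video::name();
  return result;
}
```

The two name functions never conflict because each call says which namespace it means.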

Line-5: int main() All programs must have one, and only one, main function. The main function is where execution of the program starts. By convention, main returns 0 to indicate successful termination of the program. The main function can be defined to take no parameters, as is the case in our example, or it can be defined as int main(int argc, char* argv[]) when a program takes command line arguments.
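
To give a feel for argc and argv, here is a minimal sketch of a helper that a main function could call. The helper name collectArgs is an assumption made for this example; only the argc and argv parameters come from the language itself.

```cpp
#include <string>
#include <vector>

// Collect the command line arguments into a vector of strings.
// argv[0] is conventionally the program name, so we start at index 1.
std::vector<std::string> collectArgs(int argc, char *argv[]) {
  std::vector<std::string> args;
  for (int i = 1; i < argc; ++i) {
    args.push_back(argv[i]);
  }
  return args;
}

// A main function that takes arguments could forward them like this:
//
//   int main(int argc, char *argv[]) {
//     std::vector<std::string> args = collectArgs(argc, argv);
//     return 0;
//   }
```

If the program were started as ./hello one two, argc would be 3 and the vector would hold "one" and "two".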

Line-6: cout << "Hello, World" << endl;. cout is the standard output stream and << is the insertion operator. We are inserting some text into the standard output stream. When we have multiple items to insert, we can chain the << operator. The endl inserts a newline and flushes the stream. You might also see “\n” used as the newline character. Both produce the same line ending: when writing to cout, “\n” is translated to the platform's end-of-line, which is “\r\n” on Windows and “\n” on Unix. The difference is that endl also flushes the stream, which can be slower when printing many lines.
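
The sketch below shows chained insertions with endl and with “\n”. It uses a string stream in place of cout so that the characters written can be inspected afterwards; the function names are made up for illustration. Both functions insert the same newline character, and endl additionally flushes the stream.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Write a line using endl. A string stream stands in for cout here.
std::string lineWithEndl() {
  std::ostringstream out;
  out << "Hello, " << "World" << std::endl; // chained insertions
  return out.str();
}

// Write the same line using "\n" instead of endl.
std::string lineWithNewline() {
  std::ostringstream out;
  out << "Hello, " << "World" << "\n";
  return out.str();
}
```

Both functions return the same string, "Hello, World" followed by a single newline character.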

Line-8: The return statement exits the function, and in our case exits the program.

For a more detailed look at the transition from Java to C++, see Professor Zander's notes on what used to be the CSS 332 course.

Another popular IDE is CLion. CLion can also edit files on a remote host as well as let you compile and run programs on a remote host. See https://www.jetbrains.com/help/clion/editing-individual-files-on-remote-hosts.html for details.

Using VSC Menu to compile/run

Instead of using the command line, we can also use VSC to compile, run and debug programs. We need to tell VSC how to compile and run our program.

If your program consists of a single file, say main.cpp, you can use “Run > Run without Debugging” from the VSC Menu. VSC will create a .vscode/launch.json file once it has confirmed you want to use g++ Build and Debug active file as the default action. This lets you work with a single cpp file.

To work with multiple files, you also need to create a Task which will be stored in .vscode/tasks.json

  1. Open Command Palette (Windows: Ctrl-Shift-P, Mac: Command-Shift-P)
  2. Choose “Tasks: Configure Task”
  3. Choose “C/C++ g++ build active file”
  4. VSC will create a .vscode/tasks.json file.
  5. Modify .vscode/tasks.json to change “type” from “command” to “shell”

You also need to tell the VSC task which files it needs to compile. The easiest way I found was to replace

"${file}",
with
"${fileDirname}/*.cpp",
in .vscode/tasks.json. This tells VSC that you want to include all the files with a cpp extension in the compilation.

If you are editing .vscode/tasks.json anyway, you might also add additional flags to g++:

"args": [
          "-g",
          "-std=c++11",
          "-Wall",
          "-Wextra",
          "-Wno-sign-compare",
          // "${file}",
          "${fileDirname}/*.cpp",
          "-o",
          "${fileDirname}/${fileBasenameNoExtension}"
        ],

This video demonstrates the process of debugging with VSC: https://youtu.be/XHOeRF6_2fE

Troubleshooting VSC

See http://depts.washington.edu/cssuwb/wiki/vsc_and_remote_development#trouble_shooting if you are getting “VSC Server Error” or “Disk Quota Exceeded” when trying to connect to CSS Lab machines.

GitHub

GitHub is a code sharing and publishing service. Software developers use GitHub to share code and track changes to code. Understanding and using GitHub is critical when multiple people are working on the same codebase.

For this class, you will complete most of your work on your own, so we will only use a small subset of GitHub's features.

You will need to create a GitHub account, so you can store your software in a GitHub repository. Visit https://education.github.com/ to create a student account. The Student Developer Pack provides free access to many commercial products. Take advantage of your student account and learn to use some of these tools before you graduate.

GitHub repos can be public, such as https://github.com/pisan342/hello-factorial, or private.

GitHub uses the git program for interacting with the repositories. The usual interaction with a GitHub repo is as follows:

  1. Create a new repo via browser, add an optional README file. Alternatively, if you want to base your code on existing code from another repository, you can fork an existing repository.
  2. clone the repo to your computer (it might be empty)
  3. Create new files or edit existing files
  4. Add the new files to the list of files to be tracked by git
  5. commit the changes to your local computer
  6. push the changes to the GitHub repository
  7. Check that the new version of your files is on the GitHub repository using your browser
  8. Go back to step 3

git has many additional features, such as merging different versions of code, when multiple developers are working on the same code base.

It is best to treat the version on GitHub as the master copy, so you can always delete the local version on your computer and start from where you left off.

We will first explore how to work with git commands from the command line and then look at how we can use VSC to execute these commands.

git from Command Line

The basic interaction with GitHub is as follows:

    # Always start a new project by creating a repository for it on the GitHub web site with a README file
    # Clone the repository, even if it is empty
    $ git clone https://github.com/pisan342/hello-factorial
 
    # Change into the directory just created
    $ cd hello-factorial
 
    # Get the latest version from web repository
    $ git pull
 
    # Add files that have been modified and should be included in the next version
    $ git add main.cpp README.md
 
    # Commit changed files to the local database as next version
    $ git commit -m "Made some changes"
 
    # Push (upload) the latest version of files to the web repository
    $ git push
 
    # List modified files
    $ git status
 
    # List the differences between your working copy of a file and the version last staged or committed locally
    $ git diff some-changed-file
 
    # Checkout an older version of the repository. Get HASH_NUMBER from web page.
    $ git checkout HASH_NUMBER
 

A common problem, for teams or when you use multiple computers, is that the web based version and the local version of the repository have been independently modified. git pull fails since the code files cannot be merged. There are multiple solutions to this problem:

  1. Rename the local repository using mv oldname newname, clone the repository from the web version and then manually compare the two versions using diff -qr webversion newname. Not the most elegant approach, but effective.
  2. Use git mergetool, read StackOverflow, learn about git stash, create branches, etc. Much more of a surgical approach, can keep histories and multiple versions intact.

Using ssh keys for GitHub

If you do not want to enter your username/password each time with GitHub, you can use an ssh key.

  1. Add your ssh key to GitHub at https://github.com/settings/keys. On CSS Lab machines, your ssh public key will be in ~/.ssh/id_rsa.pub. If it does not exist, you can generate a new one using the ssh-keygen command on the command line. cat ~/.ssh/id_rsa.pub will print the key to the screen, so you can copy it to https://github.com/settings/keys
  2. When cloning a repository, use git clone git@github.com:uwbclass/repositoryName.git, which uses the ssh key so that you won't have to enter your password each time.

git from Visual Studio Code

The process from VSC is similar. The “git” button on the left menubar displays the list of files that have been modified or added. Clicking on '+' stages these files. The 'checkmark' commits the files to the local repository. And “push” from the “…” (More Options menu) uploads the latest version of the committed files to the web repository.

Desktop GUIs for Git

GitHub Desktop, GitKraken and other graphical clients are also available for Windows and Mac if you are developing on your local computer. For the CSS Linux Lab, you will have to rely on command line options or VSC acting on the remote files.

Code Style

Programmers are picky about their code style. Extended debates on whether to use K&R style braces or GNU style braces, or whether to use 2 or 4 spaces to indent, have persisted on the internet. Style is important. Having a consistent style makes it easier to read code. Each company, and sometimes each project, often defines its own style, leading to all sorts of inconsistencies in code. While there is no universally agreed style for C++, there are some widely accepted styles.

We are going to use clang-tidy and clang-format to make sure our code is consistent with modern programming practices. Both clang-tidy and clang-format are installed in the CSS Linux Lab, but you may need to “activate” the “llvm-toolset” by adding the following command to your ~/.bashrc file. You can also manually execute this command, but we want to automate the mundane parts of software development, so we can focus on the more interesting issues.

$ source scl_source enable llvm-toolset-7.0

If you are using a C-Shell, please use the following command:

$ scl enable llvm-toolset-7.0 # This command is for c-shell users only. If you are using bash, ignore this.

clang-tidy

clang-tidy is part of the LLVM project https://clang.llvm.org/extra/index.html and it performs hundreds of different checks on the code. It can detect common errors such as bugprone-infinite-loop, check for readability-else-after-return, warn about readability-function-size and many more common issues.
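
As a small illustration of one of these checks, the sketch below shows the pattern that readability-else-after-return complains about, next to a rewritten version. Both function names are made up for this example.

```cpp
// Flagged by readability-else-after-return: the else is redundant
// because the if branch has already returned.
int signWithElse(int x) {
  if (x < 0) {
    return -1;
  } else {
    return x > 0 ? 1 : 0;
  }
}

// Same behavior, written the way the check suggests.
int sign(int x) {
  if (x < 0) {
    return -1;
  }
  return x > 0 ? 1 : 0;
}
```

The two functions return the same value for every input; the second simply has less nesting.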

To produce high quality code, we need to examine each clang-tidy warning. In some cases, we might choose not to follow a warning because it requires advanced features or because it is intended for much larger projects, whereas our emphasis is on code that is easy to read and modify.

clang-tidy relies on a .clang-tidy configuration file to set its options. A sample .clang-tidy file is below.

# Configuration options for clang-tidy
# CSS Linux machines, Sep 2019: LLVM version 3.8.1
#
# usage: clang-tidy *.cpp -- -std=c++14
#
#
---
# See https://clang.llvm.org/extra/clang-tidy/#using-clang-tidy for all possible checks
Checks: '*,-fuchsia-*,-cppcoreguidelines-owning-memory,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-google-build-using-namespace,-hicpp-no-array-decay,-modernize-use-trailing-return-type,-llvm-header-guard,-cert-err60-cpp,-cppcoreguidelines-pro-bounds-constant-array-index,-google-global-names-in-headers'
 
WarningsAsErrors:  '*'
HeaderFilterRegex: '.*'
 
CheckOptions:
 - { key: readability-identifier-naming.ClassCase,           value: CamelCase  }
 - { key: readability-identifier-naming.StructCase,          value: CamelCase  }
 - { key: readability-identifier-naming.EnumCase,            value: CamelCase  }
 - { key: readability-identifier-naming.GlobalConstantCase,  value: UPPER_CASE }
 
 - { key: readability-identifier-naming.VariableCase,        value: camelBack  }
 - { key: readability-identifier-naming.ParameterCase,       value: camelBack  }
 - { key: readability-identifier-naming.PublicMemberCase,    value: camelBack  }
 
# No good consensus on function names problem.isFinished() and GetInputFromUser() are both good
# - { key: readability-identifier-naming.FunctionCase,        value: camelBack  }
# - { key: readability-identifier-naming.PublicMethodCase,    value: camelBack  }
# - { key: readability-identifier-naming.PrivateMethodCase,   value: camelBack  }
 
 
 
 
 
 
########################################################################
# Disabled checks
########################################################################
# -fuchsia-*,
#     Checks associated with fuchsia operating system
# -cppcoreguidelines-owning-memory,
#      Using and learning about raw pointers, not type-based semantics of gsl::owner<T*>
# -cppcoreguidelines-pro-bounds-array-to-pointer-decay,
#      Using pointers to arrays
# -cppcoreguidelines-pro-bounds-pointer-arithmetic,
#      Not using <span> which is in C++20
# -google-build-using-namespace,
#      Will use "using namespace std" to make code easier to read
# -hicpp-no-array-decay,
#      Using pointers to arrays
# -modernize-use-trailing-return-type,
#      Not using the modern return type indications such as "factorial(int n) -> int"
# -llvm-header-guard
#      Will use short header guards not full directory and file name
# -cert-err60-cpp
#      Want to be able to throw exception with string
# -cppcoreguidelines-pro-bounds-constant-array-index
#      Want to use array[index] without having to use gsl::at()
# -google-global-names-in-headers
#      Readability, want to have "using namespace std;" in headers as well

Checks - Each check can be enabled or disabled. We enable all checks using “*” and then disable some of them with a “-” before the check name. The reason for disabling each check is explained in comments at the bottom of the .clang-tidy file. We want to enable as many checks as possible.

WarningsAsErrors - Treats warnings as errors, so we can use the return code from clang-tidy to detect whether any warnings were issued.

HeaderFilterRegex - Displays errors from all non-system headers.

CheckOptions - Forces all class names to be CamelCase, variables to be camelBack, and global constants to be UPPER_CASE.
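As an illustration of the naming rules in CheckOptions, here is a small sketch whose identifiers follow them. The names (MAX_RETRIES, StudentRecord, addCredits) are made up for this example and are not part of the sample project. Function names are left to your own preference, since the configuration above does not enforce a style for them.

```cpp
#include <cassert>
#include <string>

// Global constant: UPPER_CASE (GlobalConstantCase)
const int MAX_RETRIES = 3;

// Struct name: CamelCase (StructCase)
struct StudentRecord {
  // Public members: camelBack (PublicMemberCase)
  std::string fullName;
  int creditCount = 0;
};

// Parameters and local variables: camelBack (ParameterCase, VariableCase)
int addCredits(const StudentRecord &record, int extraCredits) {
  int newTotal = record.creditCount + extraCredits;
  return newTotal;
}
```

Renaming extraCredits to ExtraCredits would produce an "invalid case style for parameter" message like the one shown in the clang-tidy output below.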

We can execute clang-tidy as below:

$ clang-tidy *.cpp -- -std=c++14
11211 warnings generated.
22429 warnings generated.
factorial.cpp:13:14: error: invalid case style for parameter 'N' [readability-identifier-naming,-warnings-as-errors]
int fact(int N) {
             ^~
             n
factorial.cpp:16:16: error: statement should be inside braces [google-readability-braces-around-statements,-warnings-as-errors]
    if (N <= 1)
               ^
Suppressed 22418 warnings (22418 in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
9 warnings treated as errors
The “suppressed warnings” are from system libraries and can be ignored.

clang-tidy also has a -fix option that should be used with care. It will attempt to fix the file in place. Making a backup before trying it is strongly recommended. After fixing the file, you can open the “source control” view in VSC to see what has changed before committing the changes to your repository.

$ clang-tidy -fix *.cpp -- -std=c++14

clang-format

clang-format can be used as a standalone program or integrated into VSC to format your document. See https://clang.llvm.org/docs/ClangFormat.html

clang-format relies on a .clang-format configuration file. This file can be generated using clang-format -style=llvm -dump-config > .clang-format

The .clang-format file specifies how braces are aligned, whether short single-line functions are allowed, whether you need a space before the parenthesis in an if statement, etc.

From the command line, you can run clang-format -i factorial.cpp which will modify the factorial.cpp file to conform to the formatting instructions. If you'd like to see how clang-format will modify your source files, you can use clang-format factorial.cpp | diff factorial.cpp - to see which lines will get modified. Unlike clang-tidy, letting clang-format modify program files is much safer.
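To give a feel for what the formatter changes, here is a sketch of a function before and after formatting. It assumes the LLVM style generated above; other styles or clang-format versions may lay the code out slightly differently. The function name factorial is a stand-in chosen for this example.

```cpp
#include <cassert>

// Before formatting, the function might be written on one crowded line:
//   int factorial(int n){if(n<=1){return 1;}return n*factorial(n-1);}

// After formatting with the LLVM style: two-space indentation, opening
// braces on the same line, and spaces around operators and after keywords.
int factorial(int n) {
  if (n <= 1) {
    return 1;
  }
  return n * factorial(n - 1);
}
```

Only whitespace and line breaks change, so the program behaves the same before and after.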

To make sure VSC uses clang-format to format the document as you write your program:

  1. View > Command Palette > Format Document With …
  2. Configure Default Formatter
  3. Choose “clang-format”

You can now use “Format Document” (Shift + Alt + F on Windows, Shift + Option + F on Mac, Ctrl + Shift + I on Linux) to format your document based on the defined style.

 

Code Coverage

Without test cases, we don't know if our program works as expected. We need to make sure that when we modify our program, we are not introducing new bugs. Creating test cases to test the different inputs that our program accepts is important.

Novice programmers often write test cases, confirm that the program works and then modify the test cases to check for other types of input. The problem with this approach is that all the test cases are in the programmer's head. When the program is modified, there is no good way to run all the previous test cases.

In test-driven development, the programmer starts with writing a test and then writes the necessary code to pass the test and regularly refactors the code.

While we will not adopt a fully test-driven development approach in this course, we will emphasize writing tests to check the correctness of our program.

There are multiple C++ testing frameworks. Most of these frameworks are appropriate for much larger projects. We will instead rely on the much simpler assert statements to test our code.

Below is the main.cpp file from https://github.com/pisan342/hello-factorial

/**
 * test functions for factorial
 **/
 
#include "factorial.h"
#include <cassert>
#include <iostream>
 
using namespace std;
 
void test01() {
  assert(fact(1) == 1);
  assert(fact(3) == 6);
}
 
void test02() {
  assert(fact(-1) == 1);
  assert(fact(5) == 120);
}
 
int main() {
  test01();
  test02();
  memoryLeakFunction();
 
  cout << "Done." << endl;
  return 0;
}

The assert statement takes one parameter. If it is true, the program continues. If it is false, the program exits with a message. Multiple conditions can be combined in an assert statement as necessary. A common trick is to add a descriptive string to assert statements such as assert(fact(1) == 1 && "Checking factorial of 1 failed"); A string literal always evaluates to true, so only the first condition determines the value passed to assert, and the string appears in the failure message because assert prints the expression that failed.
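The descriptive-string trick can be sketched as follows. The fact function here is a simplified stand-in written for this example; the one in factorial.h may behave differently for some inputs.

```cpp
#include <cassert>

// Simplified stand-in for fact() from factorial.h (an assumption for this sketch)
int fact(int n) {
  if (n <= 1) {
    return 1;
  }
  return n * fact(n - 1);
}

void testFactWithMessages() {
  // The string literal is a non-null pointer, so `condition && "text"`
  // is true exactly when the condition is true.
  assert(fact(1) == 1 && "Checking factorial of 1 failed");
  assert(fact(5) == 120 && "Checking factorial of 5 failed");
}
```

If one of these assertions failed, the message printed by assert would include the text of the whole expression, including the string, which makes the failing test easier to identify.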

assert statements are also useful for checking the expected value of parameters passed into a function. When compiling a release version of the software, the NDEBUG macro can be defined (for example with the -DNDEBUG compiler flag) to disable all the assert statements, so there is no performance impact on the final program.
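A minimal sketch of checking parameters with assert is shown below. The average function is a hypothetical helper invented for this example and is not part of the sample project.

```cpp
#include <cassert>

// Hypothetical helper: average of the first `count` elements of `values`.
double average(const int *values, int count) {
  // Preconditions: fail fast during development if the caller passes bad input.
  assert(values != nullptr && "values must not be null");
  assert(count > 0 && "count must be positive");

  long sum = 0;
  for (int i = 0; i < count; ++i) {
    sum += values[i];
  }
  return static_cast<double>(sum) / count;
}
```

When the program is compiled with NDEBUG defined, both checks disappear, so they should only guard against programming mistakes and not against input that can legitimately occur at run time.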

assert statements are a very basic form of unit testing. If the program runs to completion with every assert passing, we can conclude that the functions work as intended for the inputs we tested.

We might still have functions that are not getting called or lines of code that are never executed. To check the code coverage of our programs, we will use llvm-profdata and llvm-cov. The overall process is as follows:

(We will never do these steps by hand, but they are included here so you understand the overall process.)

  1. Compile the program: clang++ -g -fprofile-instr-generate -fcoverage-mapping *.cpp -o a.out-code-coverage
  2. Execute the program to create default.profraw file: ./a.out-code-coverage
  3. Combine profiling data from multiple runs of the program: llvm-profdata merge default.profraw -output=a.out-code-coverage.profdata
  4. Create a visual from the profiling data: llvm-cov show a.out-code-coverage -instr-profile=a.out-code-coverage.profdata
  5. Create a report from the profiling data: llvm-cov report a.out-code-coverage -instr-profile=a.out-code-coverage.profdata

The steps above would produce an output similar to what is seen below

    factorial.cpp:
    1|       |/*
    2|       | * factorial definition
    3|       | *
    ...
   18|     10|    return N * fact(N - 1);
   19|     10|  } else {
   20|      0|    cout << "Too large: " << N << endl;
   21|      0|    return -1;
   22|      0|  }
   23|     15|}
   24|       |
   ...
   37|       |
   38|      0|void unusedFunction() {
   39|      0|  cout << "A cout statement ";
   40|      0|  cout << "that is never called " << endl;
   41|      0|}
 
 
Filename                      Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
factorial.cpp                      12                 2    83.33%           3                 1    66.67%          27                 7    74.07%
main.cpp                            3                 0   100.00%           3                 0   100.00%          16                 0   100.00%
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                              15                 2    86.67%           6                 1    83.33%          43                 7    83.72%

Lines 20-22 as well as lines 38-41 in factorial.cpp are never executed. The report shows that factorial.cpp has 83.33% region coverage (74.07% line coverage) while main.cpp has 100% coverage.
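Coverage gaps like the unexecuted else branch are usually closed by adding a test that reaches them. The sketch below uses a stand-in function, factChecked, with a made-up limit MAX_INPUT; the real factorial.cpp may use a different limit and also prints a message in that branch.

```cpp
#include <cassert>

// Hypothetical limit for this sketch; 12! is the largest factorial that
// fits in a 32-bit int.
const int MAX_INPUT = 12;

// Stand-in for a factorial function with an error branch.
int factChecked(int n) {
  if (n <= 1) {
    return 1;
  } else if (n <= MAX_INPUT) {
    return n * factChecked(n - 1);
  } else {
    return -1; // error branch, only reached for inputs above MAX_INPUT
  }
}

void testCoversErrorBranch() {
  assert(factChecked(5) == 120);            // exercises the recursive branch
  assert(factChecked(MAX_INPUT + 1) == -1); // exercises the error branch
}
```

With both assertions in place, every line of factChecked is executed at least once. A function that is never called, such as unusedFunction in the report, needs either a test of its own or removal.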

Since we do not want to execute these commands by hand, we will use the https://github.com/pisan342/hello-factorial/blob/master/check-code-coverage.sh script to generate this output for us.

Memory Leaks

Memory can be allocated on the stack or on the heap. For memory allocated on the heap, the programmer is responsible for deleting it and returning this memory back to the operating system. If memory allocated on the heap is not returned back to the operating system, a program can consume a large amount of memory and impact the whole system.

When a program terminates, all the memory that has been allocated is released back to the operating system. For programs that only run for a short amount of time, a memory leak may not be serious, but for background tasks on servers or programs in embedded devices, getting rid of memory leaks is critical.

Smart pointers have improved memory management and reduced the number of memory leaks in programs. In order to understand how memory works, we will use raw pointers and handle memory allocation and deallocation explicitly through the new and delete operators.
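The difference between the two approaches can be sketched as follows. Both functions are written for this example only; the course code uses the raw pointer form.

```cpp
#include <cassert>
#include <memory>

// Raw pointer: every new[] needs a matching delete[] on every path.
int sumWithRawPointer(int count) {
  int *values = new int[count];
  int sum = 0;
  for (int i = 0; i < count; ++i) {
    values[i] = i;
    sum += values[i];
  }
  delete[] values; // forgetting this line would leak the whole array
  return sum;
}

// Smart pointer: std::unique_ptr<int[]> calls delete[] automatically
// when it goes out of scope.
int sumWithSmartPointer(int count) {
  std::unique_ptr<int[]> values(new int[count]);
  int sum = 0;
  for (int i = 0; i < count; ++i) {
    values[i] = i;
    sum += values[i];
  }
  return sum;
}
```

Both functions return the same result; the only difference is who is responsible for releasing the array.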

We will use two different tools for detecting memory leaks: valgrind and g++ flags.

valgrind

valgrind creates a wrapper around the executable and monitors its memory allocation. valgrind can be used with any executable. A typical usage is as follows:

$ g++ -g *.cpp
$ valgrind ./a.out 
==4425== Memcheck, a memory error detector
==4425== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4425== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==4425== Command: ./a.out
==4425== 
Hello World
Fact 5: 120
Done.
==4425== 
==4425== HEAP SUMMARY:
==4425==     in use at exit: 40 bytes in 1 blocks
==4425==   total heap usage: 2 allocs, 1 frees, 72,744 bytes allocated
==4425== 
==4425== LEAK SUMMARY:
==4425==    definitely lost: 40 bytes in 1 blocks
==4425==    indirectly lost: 0 bytes in 0 blocks
==4425==      possibly lost: 0 bytes in 0 blocks
==4425==    still reachable: 0 bytes in 0 blocks
==4425==         suppressed: 0 bytes in 0 blocks
==4425== Rerun with --leak-check=full to see details of leaked memory
==4425== 
==4425== For lists of detected and suppressed errors, rerun with: -s
==4425== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
$ 

The important part of the valgrind report is the “definitely lost” line, which indicates that our program lost 40 bytes of memory. A more detailed report is available using the --leak-check=full flag.

==4526==     in use at exit: 40 bytes in 1 blocks
==4526==   total heap usage: 2 allocs, 1 frees, 72,744 bytes allocated
==4526== 
==4526== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==4526==    at 0x4C2AC38: operator new[](unsigned long) (vg_replace_malloc.c:433)
==4526==    by 0x400916: memoryLeakFunction() (factorial.cpp:27)
==4526==    by 0x400AE7: main (main.cpp:24)

The additional information shows that our program did 2 allocations but only 1 deallocation and the memoryLeakFunction is most likely responsible for this leak.
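A report of this shape, 40 bytes from operator new[], is what you would expect from an array of ten int values (on a platform where int is 4 bytes) that is allocated and never deleted. The sketch below is a hypothetical reconstruction for illustration; the actual memoryLeakFunction in the repository may look different.

```cpp
#include <cassert>

// Hypothetical leaking function: the array is never released.
int leakyFill() {
  int *numbers = new int[10]; // 10 * sizeof(int) bytes on the heap
  numbers[0] = 42;
  return numbers[0];
  // missing: delete[] numbers;
}

// Fixed version: copy out what we need, then release the array.
int fixedFill() {
  int *numbers = new int[10];
  numbers[0] = 42;
  int first = numbers[0];
  delete[] numbers;
  return first;
}
```

Running a program that calls leakyFill under valgrind should report the array as definitely lost, while a program that only calls fixedFill should show matching allocation and free counts.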

g++ -fsanitize=address -fno-omit-frame-pointer 

The newer versions of g++ have additional flags that can be used to detect memory leaks. A typical usage is below:

$ g++ -fsanitize=address -fno-omit-frame-pointer -g *.cpp
$ ./a.out 
Hello World
Fact 5: 120
Done.

=================================================================
==4679==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7fdb43387c4f in operator new[](unsigned long) /tmp/gcc-9.2.0/libsanitizer/asan/asan_new_delete.cc:107
    #1 0x400d06 in memoryLeakFunction() factorial.cpp:27
    #2 0x400f66 in main main.cpp:24
    #3 0x7fdb425e0554 in __libc_start_main (/lib64/libc.so.6+0x22554)

SUMMARY: AddressSanitizer: 40 byte(s) leaked in 1 allocation(s).

The above report points out that line 27 in factorial.cpp may be responsible for the memory leak.

Static Analysis

You can sometimes detect memory leaks without even running the program, but this is much more limited.

Typical usage is:

$ clang++ -std=c++14 --analyze main.cpp
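As an example of the kind of problem static analysis can catch, the sketch below leaks memory only on an early-return path. The function parsePositive is made up for this example. The analyzer may report a potential leak for the early return, although the exact wording and whether it is reported depend on the clang version and the enabled checkers.

```cpp
#include <cassert>

// Hypothetical function with a leak on one path only.
int parsePositive(int input) {
  int *buffer = new int[4];
  if (input < 0) {
    return -1; // buffer is not released on this path
  }
  buffer[0] = input;
  int result = buffer[0];
  delete[] buffer; // released only on the normal path
  return result;
}
```

A test that only passes positive numbers would never trigger this leak at run time, so valgrind would not see it. That is the situation where static analysis is most useful.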

Unix

The toolchain we have covered is for developing programs in the CSS Linux Lab with

  • g++ compiler
  • Visual Studio Code running on our local machine but editing files on the CSS Linux Lab
  • GitHub for version control and to store our projects
  • clang-tidy and clang-format to enforce code style
  • llvm-profdata and llvm-cov to check code coverage
  • valgrind and g++ flags to detect memory leaks

We can automate a lot of these manual tasks. The sample project https://github.com/pisan342/hello-factorial includes create-output.sh and check-code-coverage.sh scripts that run all these tests for us.

The create-output.sh script is shown below. Executing this script is similar to typing out each of these commands by hand on the command line.

#!/bin/bash
 
# Run this script as `./create-output.sh > output.txt 2>&1`
 
# How we want to call our executable, 
# possibly with some command line parameters
EXEC_PROGRAM="./a.out "
 
# Timestamp for starting this script
date
 
MACHINE=""
# Display machine name if uname command is available
if hash uname 2>/dev/null; then
  uname -a
  MACHINE=`uname -a`
fi
 
# Display user name if id command is available
if hash id 2>/dev/null; then
  id
fi
 
# If we are running as a GitHub action, install programs
GITHUB_MACHINE='Linux fv-az'
 
if [[ $MACHINE == *"${GITHUB_MACHINE}"* ]]; then
  echo "====================================================="
  echo "Running as a GitHub action, attempting to install programs"
  echo "====================================================="
  sudo apt-get install llvm clang-tidy clang-format valgrind
fi
 
# If we are running on CSSLAB and 
# clang-tidy is not active, print a message
CSSLAB_MACHINE='Linux csslab'
 
CLANG_TIDY_EXE='/opt/rh/llvm-toolset-7.0/root/bin/clang-tidy'
 
if [[ $MACHINE == *"${CSSLAB_MACHINE}"* ]]; then
    if ! hash clang-tidy 2>/dev/null && [ -e "${CLANG_TIDY_EXE}" ] ; then
        echo "====================================================="
        echo "ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR ERROR "
        echo "clang-tidy NOT found in path (but is in $CLANG_TIDY_EXE )"
        echo "Add the following command to ~/.bashrc file"
        echo "     source scl_source enable llvm-toolset-7.0"
        echo "You can add the command by executing the following line"
        echo "     echo \"source scl_source enable llvm-toolset-7.0\" >> ~/.bashrc"
        echo "====================================================="
    fi
fi
 
# delete a.out, do not give any errors if it does not exist
rm ./a.out 2>/dev/null
 
echo "====================================================="
echo "1. Compiles without warnings with -Wall -Wextra flags"
echo "====================================================="
 
g++ -g -std=c++11 -Wall -Wextra -Wno-sign-compare *.cpp
 
echo "====================================================="
echo "2. Runs and produces correct output"
echo "====================================================="
 
# Execute program
$EXEC_PROGRAM
 
echo "====================================================="
echo "3. clang-tidy warnings are fixed"
echo "====================================================="
 
if hash clang-tidy 2>/dev/null; then
  clang-tidy *.cpp -- -std=c++11
else
  echo "WARNING: clang-tidy not available."
fi
 
echo "====================================================="
echo "4. clang-format does not find any formatting issues"
echo "====================================================="
 
if hash clang-format 2>/dev/null; then
  # different LLVMs have slightly different configurations which can break things, so regenerate
  echo "# generated using: clang-format -style=llvm -dump-config > .clang-format" > .clang-format
  clang-format -style=llvm -dump-config >> .clang-format
  for f in ./*.cpp; do
    echo "Running clang-format on $f"
    clang-format $f | diff $f -
  done
else
  echo "WARNING: clang-format not available"
fi
 
echo "====================================================="
echo "5. No memory leaks using g++"
echo "====================================================="
 
rm ./a.out 2>/dev/null
 
g++ -std=c++11 -fsanitize=address -fno-omit-frame-pointer -g *.cpp
# Execute program
$EXEC_PROGRAM > /dev/null
 
 
echo "====================================================="
echo "6. No memory leaks using valgrind, look for \"definitely lost\" "
echo "====================================================="
 
rm ./a.out 2>/dev/null
 
if hash valgrind 2>/dev/null; then
  g++ -g -std=c++11 *.cpp
  # redirect program output to /dev/null while running valgrind
  valgrind --log-file="valgrind-output.txt" $EXEC_PROGRAM > /dev/null
  cat valgrind-output.txt
  rm valgrind-output.txt 2>/dev/null
else
  echo "WARNING: valgrind not available"
fi
 
echo "====================================================="
echo "7. Tests have full code coverage"
echo "====================================================="
 
./check-code-coverage.sh
 
# Remove the executable
rm ./a.out* 2>/dev/null
 
date
 
echo "====================================================="
echo "To create an output.txt file with all the output from this script"
echo "Run the below command"
echo "      ./create-output.sh > output.txt 2>&1 "
echo "====================================================="

The create-output.sh script is intended to be run on the CSS Linux Lab, but you can modify it to run on your desktop machine as well. If you have a Mac and have already installed brew, getting this script to work should not be too hard. For Windows, there are many different configuration options depending on whether you are using mingw, WSL or some other combination.

We'd like to make sure that our programs run not just on our machine, not just on CSS Linux Lab, but successfully on any machine. GitHub Actions provide a way to test our programs on a virtual machine. The sample project https://github.com/pisan342/hello-factorial has a file named buildrun.yml under the .github/workflows/ directory. This file defines a set of instructions to execute every time we “push” our code to GitHub.

The buildrun.yml file is provided below. The only action it takes is to execute the create-output.sh script, just like we were doing manually on CSS Linux Lab.

name: Basic compile, code checks, and test run

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    
    - uses: actions/checkout@v2

    - name: Run create-output.sh file
      run: chmod 755 create-output.sh; ./create-output.sh

This script gets executed each time some new code is “push”ed to the repository. The results from the script can be found under “Actions” on the repository web page.

 

Unix Tutorials

Miscellaneous

You can write your own scripts to customize your unix environment. Here are some example scripts

top-cpu

#!/bin/sh

# top-cpu
# find out who is using all the cpu

ps aux | head -1
ps aux --cols 160 | sort -nr -k3 | head
ps aux --cols 160 | sort -nr -k3 | head | awk '{ print $1}' | uniq | xargs getent passwd

top-mem

#!/bin/sh

# top-mem
# find out who is using all the memory

ps aux | head -1
ps aux --cols 160 | sort -nr -k4 | head
ps aux --cols 160 | sort -nr -k4 | head | awk '{ print $1}' | uniq | xargs getent passwd

uptimes

#!/bin/sh

# Find out the load on all the csslab machines
ssh csslab1.uwb.edu 'uname -n; uptime';
ssh csslab2.uwb.edu 'uname -n; uptime';
ssh csslab3.uwb.edu 'uname -n; uptime';
ssh csslab4.uwb.edu 'uname -n; uptime';
ssh csslab5.uwb.edu 'uname -n; uptime';
ssh csslab6.uwb.edu 'uname -n; uptime';
ssh csslab7.uwb.edu 'uname -n; uptime';
ssh csslab8.uwb.edu 'uname -n; uptime';
ssh csslab9.uwb.edu 'uname -n; uptime';
ssh csslab10.uwb.edu 'uname -n; uptime';
ssh csslab11.uwb.edu 'uname -n; uptime';
ssh csslab12.uwb.edu 'uname -n; uptime'; 

To find out how much disk-space you are using, use du -sh ~

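To find out your largest directories, run du -s `(ls -A ~)` | sort -n from your home directory (ls prints names relative to ~, so du only finds them when ~ is the current directory)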

To use these scripts:

  1. Save them to ~/local/bin/
  2. Make them executable using chmod 755 filename
  3. Add the path to your .bashrc using export PATH="${PATH}:/home/NETID/my-net-id/local/bin" replacing my-net-id with your actual netid

This wiki page was mostly written by Yusuf Pisan (pisan@uw.edu). You can make changes directly or contact him with corrections.
