C++ Utility function for converting bytes to higher units


Sometimes you need a handy function to convert a number of bytes to a human-readable size. I wrote a C++ version, then found better logic on StackOverflow. That version was implemented in C, so I decided to implement a similar function in C++.

/**
 * Function to convert a number of bytes to higher units.
 * Inspired by the C version from:
 * http://stackoverflow.com/questions/3898840/converting-a-number-of-bytes-into-a-file-size-in-c
 */

#include <cstddef>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

#define LIST_SZ(x) (sizeof(x) / sizeof(*(x)))

// Units from largest to smallest; tebibytes is the multiplier for units[0].
static const char *units[] = { "TiB", "GiB", "MiB", "KiB", "B" };
// unsigned long long keeps 1024^4 from overflowing where unsigned long is 32 bits.
static const unsigned long long tebibytes = 1024ULL * 1024ULL * 1024ULL * 1024ULL;

std::string
HumanReadableSize(unsigned long long bytes)
{
    std::stringstream result;
    unsigned long long multiplier = tebibytes;

    for (std::size_t i = 0; i < LIST_SZ(units); i++, multiplier /= 1024)
    {
        // Skip units that are larger than the value.
        if (bytes < multiplier)
            continue;
        if (bytes % multiplier == 0)
            // Exact multiple: print as an integer.
            result << (bytes / multiplier) << " " << units[i];
        else
            // Otherwise print four significant digits.
            result << std::setprecision(4) << (static_cast<double>(bytes) / multiplier) << " " << units[i];
        return result.str();
    }
    // Only reached when bytes == 0.
    result << "0";
    return result.str();
}

// Unit test function
int main(void)
{
    unsigned long long list[] =
    {
        0, 1, 2, 34, 900, 1023, 1024, 1025, 2048, 1024 * 1024,
        1024 * 1024 * 1024 + 1024 * 1024 * 400
    };

    for (std::size_t i = 0; i < LIST_SZ(list); i++)
    {
        std::cout << HumanReadableSize(list[i]);
        std::cout << std::endl;
    }
    std::cout << HumanReadableSize(2966141478ULL) << std::endl;
    return 0;
}

Best practices for obtaining GCOV code coverage for C/C++ applications


Every now and then at work I need to instrument a new project for code coverage, and each time I end up struggling to recollect how I did it. Google does come to my rescue, but most posts cover how to instrument the code and process basic reports; none of them deal with best practices as such. So here is a TL;DR version of best practices, in my opinion, which I plan to refer to the next time I need to do the task. They are written for a situation like the one I’m facing and will need to be tweaked for other cases.

Rules of engagement:

  • Need to instrument C/C++ code
  • Work with the standard GCC 4.4 compiler and lcov-1.12 (these are the versions I’m using, but other versions may work)
  • The target system where the binary is executed is different from the system where the source is compiled.
  • There are multiple versions of target devices, and different code paths get executed on each
  • Need to generate a combined coverage report from all the target devices.
  • For this example, assume my work-space root is /home/jaideep/project and the source is contained in directories comp1 and comp2
  • For this example, the compiled objects are stored under the work-space root in directory obj/x86_64

Steps for instrumentation:

1] If you are using a GNU makefile (let’s assume it is target.mk), you will have lines like these:

#Targets
$(BLDDIR): $(BLDDIR)/executable.bin
$(BLDDIR): $(BLDDIR)/library.so
#options
$(BLDDIR)/%.lo $(BLDDIR)/%.o $(BLDDIR)/%.so: private OPT_CFLAGS = -O3
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private EXTRA_CFLAGS = -fno-strict-aliasing
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private ONLY_CXXFLAGS += -Woverloaded-virtual
$(BLDDIR)/%.lo $(BLDDIR)/%.o $(BLDDIR)/%.so: private DEBUG_CFLAGS =

To instrument, you can add extra items that optionally enable code coverage:

#Switch to turn coverage on (keep the value free of trailing whitespace)
_TARGET_CODE_COVERAGE = 1
#Targets
$(BLDDIR): $(BLDDIR)/executable.bin
$(BLDDIR): $(BLDDIR)/library.so
#options
ifeq ($(_TARGET_CODE_COVERAGE),1)
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private EXTRA_CFLAGS = --coverage
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private EXTRA_CXXFLAGS = --coverage
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private EXTRA_DEFINES = -DTARGET_CODE_COVERAGE
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private ONLY_CXXFLAGS = -fno-default-inline -fno-inline
$(BLDDIR)/%.lo $(BLDDIR)/%.o $(BLDDIR)/%.so: private OPT_CFLAGS = -O0
$(BLDDIR)/%.lo $(BLDDIR)/%.o $(BLDDIR)/%.so: private DEBUG_CFLAGS = -g
$(BLDDIR)/%.bin: private EXTRA_LDFLAGS += --coverage
$(BLDDIR)/%.so: private SO_FLAGS += --coverage
else
$(BLDDIR)/%.lo $(BLDDIR)/%.o $(BLDDIR)/%.so: private OPT_CFLAGS = -O3
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private EXTRA_CFLAGS = -fno-strict-aliasing
$(BLDDIR)/%.lo $(BLDDIR)/%.o: private ONLY_CXXFLAGS += -Woverloaded-virtual
$(BLDDIR)/%.lo $(BLDDIR)/%.o $(BLDDIR)/%.so: private DEBUG_CFLAGS =
endif

The extra define -DTARGET_CODE_COVERAGE enables the signal-handling code in the instrumented target, which you will add in the next step.

2] Then add proper signal handlers to the source code:

/* Make sure proper headers are included */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#ifdef TARGET_CODE_COVERAGE
/* Forward declaration of the flush API provided by libgcov.
   It has C linkage, so guard it when compiling as C++. */
#ifdef __cplusplus
extern "C"
#endif
void __gcov_flush(void);

/* Signal handler: SIGPROF flushes the profiling data and keeps running;
   SIGTERM calls exit(), which also writes the profiling data.
   SIGSTOP and SIGKILL cannot be caught, so SIGTERM is used for the
   graceful exit. */
void signal_handler(int signum)
{
  if (signum == SIGPROF) {
    printf("Received SIGPROF\n");
    __gcov_flush();
  } else if (signum == SIGTERM) {
    printf("Received SIGTERM\n");
    exit(0);
  }
}
#endif

//Main function of your program
int main ()
{
.
.
.
#ifdef TARGET_CODE_COVERAGE
//Handle the signals
if (signal(SIGPROF, signal_handler) == SIG_ERR)
    printf("\ncan't catch SIGPROF\n");
if (signal(SIGTERM, signal_handler) == SIG_ERR)
    printf("\ncan't catch SIGTERM\n");
#endif
.
.
.
}

3] Then compile your source code with code coverage enabled. Ensure that the coverage switch in target.mk is set to 1. After the compilation is done, for each instrumented source file you will find a *.gcno file with the same base name, stored next to its object file. In my case, the object files are collected in the ‘obj/x86_64’ directory in the work-space root with the source directory structure maintained. These *.gcno files are used while generating the coverage information. If they are not created, then something went wrong in your instrumentation; check the steps above.
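
A quick way to confirm that the instrumentation worked is to count the *.gcno files after the build. The following is a minimal sketch, assuming the obj/x86_64 object directory from this example; the function names count_gcno and check_instrumentation are hypothetical helpers, not part of gcov or lcov.

```shell
# Count the *.gcno notes files under an object directory.
# The directory defaults to obj/x86_64, the object directory used in this post.
count_gcno() {
  local obj_dir="${1:-obj/x86_64}"
  find "$obj_dir" -type f -name '*.gcno' | wc -l | tr -d ' '
}

# Report the count, and fail when the build produced no notes files at all.
check_instrumentation() {
  local obj_dir="${1:-obj/x86_64}"
  local n
  n=$(count_gcno "$obj_dir")
  if [ "$n" -eq 0 ]; then
    echo "No *.gcno files under ${obj_dir}: the build was not instrumented" >&2
    return 1
  fi
  echo "Found ${n} *.gcno files under ${obj_dir}"
}
```

Running check_instrumentation from the work-space root right after the build catches a missing --coverage flag before any time is spent on the target device.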

Steps for execution:

In my case, the target execution environment was a separate device, different from the device on which the code was built. These steps are written for that use case, but they also apply when the build and execution machines are the same.

1] When the code is compiled with code coverage enabled, the absolute path of each coverage data file is recorded in the binary. For example, if your work-space root directory is ‘/home/jaideep/project’, then the complete path below that root gets recorded. When the code is executed on the target machine, the *.gcda files containing the coverage information are written with that same directory and naming structure. If you want the coverage information to be generated in a specific folder instead, you need to strip the work-space prefix and specify an alternate prefix as follows:

export GCOV_PREFIX=/tmp/codecoverage/
#Strip the three leading components of /home/jaideep/project (home, jaideep, project)
export GCOV_PREFIX_STRIP=3

This ensures that the directory structure below the work-space root (obj/x86_64/...) gets generated in the directory specified by GCOV_PREFIX.
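
To see what a given strip count does before running anything on the target, the path rewriting can be approximated in a few lines of shell. This is only a sketch of my understanding of how libgcov combines GCOV_PREFIX and GCOV_PREFIX_STRIP; gcda_target_path is a hypothetical helper and foo.gcda is a made-up file name.

```shell
# Approximate where a .gcda file ends up for a given build-time path.
# Usage: gcda_target_path <GCOV_PREFIX> <GCOV_PREFIX_STRIP> <absolute build path>
gcda_target_path() {
  local prefix="${1%/}"   # drop a trailing slash so the result has no "//"
  local strip="$2"
  local path="${3#/}"     # drop the leading slash before splitting
  local i=0
  # Remove one leading directory component per strip level.
  while [ "$i" -lt "$strip" ] && [ "$path" != "${path#*/}" ]; do
    path="${path#*/}"
    i=$((i + 1))
  done
  printf '%s/%s\n' "$prefix" "$path"
}

gcda_target_path /tmp/codecoverage/ 2 /home/jaideep/project/obj/x86_64/comp1/foo.gcda
# prints /tmp/codecoverage/project/obj/x86_64/comp1/foo.gcda
gcda_target_path /tmp/codecoverage/ 3 /home/jaideep/project/obj/x86_64/comp1/foo.gcda
# prints /tmp/codecoverage/obj/x86_64/comp1/foo.gcda
```

Because /home/jaideep/project has three leading components, a strip count of 3 places obj directly under the prefix, while 2 leaves a project directory in between.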

2] Once the environment is set, execute the instrumented binary and run the required tests to generate the coverage information. When you are done, terminate the process with a signal that the handler from the instrumentation step can catch, so that the program exits gracefully: ‘kill -SIGTERM <pid>’. (SIGSTOP only suspends a process and can never be caught by a handler, so it cannot be used for this.) If you want to generate code coverage for each test case, you can call ‘kill -SIGPROF <pid>’ to make the process dump the coverage information without killing the process itself. Either signal results in the coverage information being dumped in the form of *.gcda files with the same directory structure from the project root onward.
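
For per-test-case coverage, each SIGPROF dump has to be saved before the next test runs. The sketch below archives the GCOV_PREFIX directory under a test name and then deletes the *.gcda files; as far as I understand, libgcov merges new counts into an existing *.gcda file, so removing the files keeps the next dump limited to the next test. snapshot_gcda, the snapshot directory and run_test are all hypothetical.

```shell
# Archive everything under the GCOV_PREFIX directory for one test case, then
# delete the *.gcda files so that the next dump does not merge into them.
# Usage: snapshot_gcda <gcov_prefix_dir> <output_dir> <test_name>
# output_dir must be outside gcov_prefix_dir.
snapshot_gcda() {
  local gcov_dir="$1" out_dir="$2" test_name="$3"
  mkdir -p "$out_dir" || return 1
  tar -czf "${out_dir}/${test_name}.tgz" -C "$gcov_dir" . || return 1
  find "$gcov_dir" -type f -name '*.gcda' -exec rm -f {} +
}

# Hypothetical per-test loop (PID and run_test are placeholders):
#   run_test test1
#   kill -SIGPROF "$PID"; sleep 1   # give the handler time to write the files
#   snapshot_gcda /tmp/codecoverage /tmp/coverage-snapshots test1
```

The one-second sleep is a guess; a large binary may need longer to finish writing its *.gcda files.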

3] Once the coverage information is generated, compress the folder containing the *.gcda files into a tarball and copy it to the root of your work-space on the build host.

WKSP=/home/jaideep/project/
TRGT=jaideep@build-host
cd /tmp/codecoverage/
tar -czf ${HOSTNAME}.tgz obj
scp ${HOSTNAME}.tgz ${TRGT}:${WKSP}
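
Before feeding a tarball to the report script, it is worth checking that its *.gcda files sit under obj/x86_64/, because that is where the script looks after extracting the tarball at the work-space root. Here is a minimal sketch of such a check; check_tarball_layout is a hypothetical helper, and the obj/x86_64 prefix is the one assumed throughout this example.

```shell
# Succeed only if the tarball contains *.gcda files and all of them sit under obj/x86_64/.
# Usage: check_tarball_layout <report.tgz>
check_tarball_layout() {
  local tarball="$1"
  local entries bad
  entries=$(tar -tzf "$tarball" | grep '\.gcda$') || {
    echo "${tarball}: no *.gcda files found" >&2
    return 1
  }
  # Allow an optional leading "./" in front of obj/x86_64/.
  bad=$(printf '%s\n' "$entries" | grep -Ev '^(\./)?obj/x86_64/' || true)
  if [ -n "$bad" ]; then
    echo "${tarball}: unexpected paths:" >&2
    printf '%s\n' "$bad" >&2
    return 1
  fi
  echo "${tarball}: layout OK"
}
```

A failure here usually points back to the GCOV_PREFIX_STRIP value used on the target.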

Generating HTML report:

To generate the HTML coverage report, ensure all the coverage information tarballs from your target devices are present in the work-space root. Currently my script handles information from two target hosts, but the logic could be extended to handle more. The script takes the following arguments:

#host1_report.tgz: First variable is coverage tarball copied from first target device.
#host2_report.tgz: Second variable is coverage tarball copied from second target device.
#test_name: This will show up as the report name in the html report
#pattern: The pattern you are interested in. For example: if I only want coverage for comp1, I'll give '*/comp1/*'
#output_location: Location where html report is expected. This should be document directory of webserver.
#Example: 
./generate_coverage_html.sh target1-lnx.tgz target2-lnx.tgz lnx-app-ut1 '*/comp1/subdir/* */comp2/subdir/*' /var/www/htdocs/

The annotated script to generate the report is as follows:

#!/bin/bash
#Run this script from the workspace root
set -o errexit
set -v
set -x

#Define temporary directory where coverage metadata will be stored.
WORK_DIR=${PWD}/obj/codecoverage

#Parse the user input. This part taken from:
#https://github.com/socialize/socialize-project-helpers-ios/blob/master/generate-coverage-report.sh

usage() {
  echo "Usage: $0 <host1_report.tgz> <host2_report.tgz> <test_name> <pattern> <output_location>"
  exit 1
}

[ -n "$1" ] || usage
HOST1_REPORT="$1"
HOST1="${HOST1_REPORT%.*}"
shift

[ -n "$1" ] || usage
HOST2_REPORT="$1"
HOST2="${HOST2_REPORT%.*}"
shift

[ -n "$1" ] || usage
TEST_NAME="$1"
shift

[ -n "$1" ] || usage
PATTERN="$1"
shift

[ -n "$1" ] || usage
OUTPUT_LOC="$1"
shift

#Create test title with timestamp
TITLE="${TEST_NAME}_$(date +"%m%d%y%H%M%S")"

echo "WORK_DIR: ${WORK_DIR}"
echo "HOST1: ${HOST1}"
echo "HOST2: ${HOST2}"
echo "TEST_NAME: ${TEST_NAME}"
echo "PATTERN: ${PATTERN}"
echo "OUTPUT_LOC: ${OUTPUT_LOC}"
echo "TITLE: ${TITLE}"

#Cleanup the coverage directory, download and extract the lcov tool,
# and setup the PATH to the lcov binary
rm -rf ${WORK_DIR}
mkdir -p ${WORK_DIR}
wget http://downloads.sourceforge.net/ltp/lcov-1.12.tar.gz -O ${WORK_DIR}/lcov-1.12.tar.gz
tar -xzf ${WORK_DIR}/lcov-*.tar.gz -C ${WORK_DIR}/ --strip-components=1
export PATH=${WORK_DIR}/bin:$PATH

#Setup lcov options
# --no-external : Tells lcov to ignore paths outside work directory prefix
# --rc lcov_branch_coverage=1 : Turns on config option to generate branch coverage.
LCOV_OPTS="--no-external --rc lcov_branch_coverage=1" #--ignore-errors source

#Create initial (zero) coverage from the *.gcno files
lcov -c -i -t ${TITLE} -d ${PWD}/obj/x86_64/ -o ${WORK_DIR}/coverage.initial

#Cleanup any previous *.gcda files and extract the host1 report
find ${PWD}/obj/x86_64/ -name "*.gcda" -exec rm -rf {} \;
tar -zxf ${HOST1_REPORT}
lcov -c -t ${TITLE} -b ${PWD} -d ${PWD}/obj/x86_64/ -o ${WORK_DIR}/coverage.info.${HOST1} ${LCOV_OPTS}

#Cleanup any previous *.gcda files and extract the host2 report
find ${PWD}/obj/x86_64/ -name "*.gcda" -exec rm -rf {} \;
tar -zxf ${HOST2_REPORT}
lcov -c -t ${TITLE} -b ${PWD} -d ${PWD}/obj/x86_64/ -o ${WORK_DIR}/coverage.info.${HOST2} ${LCOV_OPTS}

#Cleanup all the gcda files once reports are processed
find ${PWD}/obj/x86_64/ -name "*.gcda" -exec rm -rf {} \;

#Combine the host1 and host2 reports
lcov -t ${TITLE} -a ${WORK_DIR}/coverage.info.${HOST1} \
-a ${WORK_DIR}/coverage.info.${HOST2} -t ${TITLE} \
-a ${WORK_DIR}/coverage.initial -t ${TITLE} \
-o ${WORK_DIR}/coverage.info.combined-${HOST1}_${HOST2} \
${LCOV_OPTS}

#Stop globbing so that pattern is not expanded in next line
set -f

#Extract the interesting pattern supplied by user from the combined report
lcov -e ${WORK_DIR}/coverage.info.combined-${HOST1}_${HOST2} ${PATTERN} \
-o ${WORK_DIR}/coverage.info.${TITLE} ${LCOV_OPTS}

#Generate the HTML report with legend and branch coverage information
genhtml -t ${TITLE} -o ${OUTPUT_LOC}/${TITLE} -p ${PWD} \
${WORK_DIR}/coverage.info.${TITLE} --legend --branch-coverage --num-spaces 4

#errexit is set, so reaching this line means every step succeeded
exit 0
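
The script above is hard-wired to two hosts. One way to extend the merge step to any number of hosts is to build the list of -a arguments in a loop. This is a sketch only: build_merge_args is a hypothetical helper, and it assumes tracefile paths without spaces, in keeping with the unquoted variables in the script.

```shell
# Build the "-a <tracefile>" argument list that lcov expects when combining tracefiles.
# Usage: build_merge_args <tracefile>...
build_merge_args() {
  local args=""
  local tracefile
  for tracefile in "$@"; do
    args="${args} -a ${tracefile}"
  done
  # Trim the leading space before printing.
  printf '%s\n' "${args# }"
}

# Hypothetical use inside the script, with one coverage.info.<host> file per tarball:
#   MERGE_ARGS=$(build_merge_args ${WORK_DIR}/coverage.initial ${WORK_DIR}/coverage.info.*)
#   lcov ${MERGE_ARGS} -o ${WORK_DIR}/coverage.combined ${LCOV_OPTS}
```

The argument parsing and the per-host capture step would need the same loop treatment, with one tarball per host taken from the command line.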