type-correct is an automated refactoring tool, built on LLVM and LibTooling, designed to 'fix' types.
It rewrites inconsistent integer usage (commonly mixed int, long, and size_t) into a consistent state,
specifically targeting the elimination of truncation warnings and signed/unsigned mismatches in legacy
C and C++ codebases.
Unlike simple regex replacements, type-correct builds a dependency graph of your variables, functions, and expressions
to mathematically solve for the "widest" necessary type, ensuring global consistency across compilation units.
type-correct functions as more than a linter; it is a constraint solver. It parses the AST (Abstract Syntax Tree),
identifies usage patterns (assignments, comparisons, pointer arithmetic), and determines the optimal type width for
every connected component in your code.
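The "widest type per connected component" idea can be sketched with a union-find structure. This is a minimal illustration and not the tool's actual implementation: the Width ranking, the WidthSolver class, and its method names are all hypothetical, and the ranking ignores signedness for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical width ranking (assumption: later enumerators are "wider").
enum class Width { Short = 0, Int = 1, Long = 2, SizeT = 3 };

// Union-find over variable ids; each component remembers the widest
// requirement recorded for any of its members.
class WidthSolver {
public:
    explicit WidthSolver(std::size_t n) : parent_(n), width_(n, Width::Short) {
        std::iota(parent_.begin(), parent_.end(), std::size_t{0});
    }

    // Variable `v` must be at least `w` wide (e.g. it stores strlen()).
    void require(std::size_t v, Width w) {
        const std::size_t r = find(v);
        if (w > width_[r]) width_[r] = w;
    }

    // `a` and `b` flow into each other (assignment, comparison, ...).
    void connect(std::size_t a, std::size_t b) {
        const std::size_t ra = find(a);
        const std::size_t rb = find(b);
        if (ra == rb) return;
        parent_[ra] = rb;
        if (width_[ra] > width_[rb]) width_[rb] = width_[ra];
    }

    // Solved width for `v`: the widest requirement in its component.
    Width solve(std::size_t v) { return width_[find(v)]; }

private:
    std::size_t find(std::size_t v) {
        assert(v < parent_.size());
        while (parent_[v] != v) {
            parent_[v] = parent_[parent_[v]];  // path halving
            v = parent_[v];
        }
        return v;
    }

    std::vector<std::size_t> parent_;
    std::vector<Width> width_;
};
```

With three variables, calling require(1, Width::SizeT) and then connect(0, 1) makes solve(0) report Width::SizeT, while the unconnected variable 2 stays at Width::Short.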
%%{init: {'theme': 'base', 'themeVariables': { 'fontFamily': 'Roboto Mono Normal', 'fontSize': '16px'}}}%%
flowchart TD
%% Define Styles based on Project Design Codes
classDef blue fill:#4285f4,stroke:#20344b,stroke-width:2px,color:#ffffff,font-family:'Google Sans Medium';
classDef green fill:#34a853,stroke:#20344b,stroke-width:2px,color:#ffffff,font-family:'Google Sans Medium';
classDef yellow fill:#f9ab00,stroke:#20344b,stroke-width:2px,color:#ffffff,font-family:'Google Sans Medium';
classDef red fill:#ea4335,stroke:#20344b,stroke-width:2px,color:#ffffff,font-family:'Google Sans Medium';
classDef navy fill:#20344b,stroke:#57caff,stroke-width:2px,color:#ffffff,font-family:'Google Sans Medium';
%% Nodes
Source["Source Code<br>(.c / .cpp)"]:::navy
Clang["Clang AST<br>Frontend"]:::blue
Matcher["AST Matcher<br>(Identify Decls)"]:::blue
%% Safety Logic
Boundary["Safety Check<br>(Struct/CMake)"]:::red
%% Solver Logic
Graph["Dependency<br>Graph"]:::yellow
Solver["Type Solver<br>(Width Calc)"]:::yellow
Ptr["Pointer Logic<br>(ptrdiff_t)"]:::yellow
%% Output Logic
Facts["CTU Facts<br>(Map/Reduce)"]:::green
Writer["Rewriter<br>(Apply Changes)"]:::green
%% Flow
Source --> Clang
Clang --> Matcher
Matcher --> Boundary
Boundary -- "If Safe" --> Graph
Boundary -. "If System/Fixed" .-> Writer
Graph --> Solver
Solver --> Ptr
Ptr --> Facts
Facts -- "Global Type" --> Writer
- StructAnalyzer (The Safety Gate): Before modifying a variable, the tool determines whether it is safe to change. It
  heuristically detects "System Boundaries" by:
  - Analyzing inclusion graphs (viral fixedness).
  - Scanning neighboring CMakeLists.txt files to blindly identify FetchContent or ExternalProject directories
    (vendored code).
  - Checking for __attribute__((packed)) or bitfields, which lock memory layout.
- TypeSolver (The Brain): A graph-based solver that propagates type requirements. If a = b, and b must be size_t,
  then a must also be at least size_t. It handles:
  - Pointer Semantics: Variables used as array indices or in pointer arithmetic are forced to ptrdiff_t width
    (or size_t) to prevent 64-bit truncation.
  - Symbolic Constraints: Understanding relations like c = a + b.
- FactManager (CTU): Handles Cross-Translation Unit analysis via a Map-Reduce approach, allowing the tool to ensure
  a function definition in a.cpp matches its usage in b.cpp.
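The Map-Reduce idea behind cross-TU consistency can be sketched as follows. This is only an illustration under stated assumptions: the Facts alias, reduce_facts, apply_global, and converge are hypothetical names, and a "fact" is simplified to a symbol name mapped to a required width in bits. The real tool's fact format is not described here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical fact table: symbol name -> required width in bits, per TU.
using Facts = std::map<std::string, int>;

// Reduce step: for each symbol keep the widest width reported by any TU.
Facts reduce_facts(const std::vector<Facts>& per_tu) {
    Facts global;
    for (const Facts& tu : per_tu) {
        for (const auto& [symbol, bits] : tu) {
            assert(bits > 0 && "widths are positive bit counts");
            auto [it, inserted] = global.emplace(symbol, bits);
            if (!inserted && bits > it->second) it->second = bits;
        }
    }
    return global;
}

// Stand-in for re-running a TU: it adopts the global width for every
// symbol it already uses. Returns true if anything changed.
bool apply_global(Facts& tu, const Facts& global) {
    bool changed = false;
    for (auto& [symbol, bits] : tu) {
        const auto it = global.find(symbol);
        if (it != global.end() && it->second > bits) {
            bits = it->second;
            changed = true;
        }
    }
    return changed;
}

// Repeat reduce/apply until no TU changes; returns the number of rounds.
int converge(std::vector<Facts>& per_tu) {
    int rounds = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        const Facts global = reduce_facts(per_tu);
        for (Facts& tu : per_tu) {
            changed = apply_global(tu, global) || changed;
        }
        ++rounds;
    }
    return rounds;
}
```

If a.cpp records f as 32 bits and b.cpp records it as 64 bits, converge leaves both tables at 64. In a real analysis, re-running a TU could surface new facts, which is why more than one round may be needed.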
The primary operation involves widening narrow integer types (int, short) to architecture-appropriate width types
(size_t, ptrdiff_t) based on how they are actually used.
#include <string.h>
int main(void) {
/* FROM */
const int n = strlen("FOO");
/* TO: Correctly captures return type of strlen */
const size_t n = strlen("FOO");
/* FROM */
for(int i=0; i<strlen("BAR"); i++) {}
/* TO: Loop index matches comparison type */
for(size_t i=0; i<strlen("BAR"); i++) {}
/* FROM */
int f(long b) { return b; }
static const int c = f(5);
/* TO: Propagation of return types */
long f(long b) { return b; }
static const long c = f(5);
}
Often when building third-party libraries I get a bunch of warnings such as "comparison between signed and unsigned
integer expressions" (-Wsign-compare). Not every such occurrence has a trivial solution. But, in my experience, most do.
Usually switching just one variable from int to size_t also requires tracing every use of that variable and changing
all those types to size_t as well. This is the manual labor type-correct automates.
From:
unsigned long f() {return 0;}
const size_t v = f();
int main() {
std::vector<float> vec;
for(int i=0; i<vec.size(); i++) {}
}
To:
unsigned long f() {return 0;}
const unsigned long v = f();
int main() {
std::vector<float> vec;
for(size_t i=0; i<vec.size(); i++) {}
}
PS: I'm aware that size_type isn't necessarily size_t, and that decltype(vec)::size_type would be more correct, but
I'm using size_t here anyway. Just to reiterate: C++ is an afterthought; my main target is C.
#include <limits.h>
short sh=SHRT_MAX;
int i=INT_MAX;
long l=LONG_MAX;
long long ll=LLONG_MAX; /* C99 */
Technically, these are all defined and expected [on clang as an ImplicitCastExpr]:
ll = l;
l = i;
i = sh;
(but the other direction, 'narrowing', is implementation defined)
However, IMHO, people doing int sl = strlen(s); actually want size_t.
This opinionated view is the assumption made for type_correct.
But… attempts are made to be reasonably conservative. See
type_correct/tests/test_type_correct.cpp for false positive and true
positive checks.
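To make the int sl = strlen(s); concern concrete, here is a minimal sketch of what a lossless narrowing would have to check by hand. The helper names fits_in_int and checked_length are hypothetical and are not part of type-correct. Widening the variable to size_t, as the tool does, removes the need for such a check.

```cpp
#include <climits>
#include <cstddef>
#include <cstring>

// True when a size_t value can be stored in an int without loss.
bool fits_in_int(std::size_t value) {
    return value <= static_cast<std::size_t>(INT_MAX);
}

// Mirrors `int sl = strlen(s);` but reports whether the narrowing is lossless.
// On success, writes the length to `out` and returns true.
bool checked_length(const char* s, int& out) {
    const std::size_t len = std::strlen(s);
    if (!fits_in_int(len)) return false;
    out = static_cast<int>(len);
    return true;
}
```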
type-correct is built as a standalone CLI tool that consumes a compilation database or acts on single files.
Before applying changes, stick to read-only "Audit Mode". This generates a change list table without touching disk.
$ type_correct_cli --audit --project-root=$(pwd) src/main.cpp
To apply changes directly to your source files:
$ type_correct_cli --in-place src/main.cpp
For complex projects where a header change affects multiple Translation Units (TUs), a single pass is insufficient.
type-correct supports an iterative "Fixed-Point Convergence" mode using a Map-Reduce strategy.
# 1. Create a directory for intermediate facts
mkdir facts
# 2. Run iteratively until types stop changing
type_correct_cli \
--phase=iterative \
--facts-dir=facts \
--project-root=$(pwd) \
--in-place \
src/*.cpp
By default, the tool is conservative about changing struct layouts (ABI breaking). If you are recompiling the entire universe and want to optimize internal structs:
$ type_correct_cli --in-place --enable-abi-breaking-changes src/*.cpp
Install a C++20 compliant compiler suite, CMake, and LLVM 16+ (from brew, apt, or source).
Dependencies:
- CMake >= 3.20
- LLVM/Clang >= 16.0 (Required for modern LibTooling APIs)
$ mkdir build && cd build
# Point to your LLVM installation root
$ cmake .. \
-DCMAKE_BUILD_TYPE='Debug' \
-DCT_Clang_INSTALL_DIR='/usr/lib/llvm'
$ cmake --build .
(Replace /usr/lib/llvm with your actual LLVM install directory, found via llvm-config --prefix; on macOS with Homebrew it's -DCT_Clang_INSTALL_DIR=/opt/homebrew/opt/llvm.)
The project uses GoogleTest for unit logic and llvm-lit for integration testing.
$ cd build
$ ctest --output-on-failure
scripts/update_coverage_badges.py can compute documentation coverage from multiple open-source API doc generators and
update the README badges.
Supported doc sources:
- Doxygen XML (C/C++/Objective-C/C#/Java)
- JSDoc JSON (jsdoc -X) for JavaScript
- TypeDoc JSON (typedoc --json) for TypeScript
- OpenAPI specs (JSON/YAML) for REST APIs
- Custom JSON counts (documented + total)
$ python3 scripts/update_coverage_badges.py \
--doc-source doxygen=build/docs/xml \
--doc-source typedoc=docs/typedoc.json
To run coverage + doc coverage and refresh the shields in one step:
$ scripts/coverage_badges.sh [build]
Enable the pre-commit hook:
$ git config core.hooksPath .githooks
Boilerplate from https://github.com/banach-space/clang-tutor
The person who associated a work with this deed has dedicated the work to the public domain by waiving all of his or her rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information below.
- In no way are the patent or trademark rights of any person affected by CC0, nor are the rights that other persons may have in the work or in how the work is used, such as publicity or privacy rights.
- Unless expressly stated otherwise, the person who associated a work with this deed makes no warranties about the work, and disclaims liability for all uses of the work, to the fullest extent permitted by applicable law.
- When using or citing the work, you should not imply endorsement by the author or the affirmer.