Clang Tutorial Part II: LibTooling Example

LibTooling Example

I’ll start with a LibTooling example because I think it’s the most useful interface to Clang, as described in Part I of this tutorial. You can also use this code in the Plugin environment with just a few simple changes. Onwards to the example!

Note: this is a continuation of Clang Tutorial: Part I.

Let’s say that you want to analyze a simple C file, test.c, as shown below:

void do_math(int *x) {
    *x += 5;
}

int main(void) {
    int result = -1, val = 4;
    do_math(&val);
    return result;
}

We’d like to do some simple refactoring/fixes on test.c:

  • Change the function do_math to add5, a better name
  • Change all calls to do_math to call add5 instead to match the first change
  • Fix the return statement to return val instead of result

First, we need to set up some structure. Clang comes packaged with a good Plugin example, but not a LibTooling example. No problem, here’s how to create a LibTooling program.

Here is the full source code for this LibTooling example.

You can check out the repository into the proper directory like so:

$ cd llvm/tools/clang/tools/
$ git clone https://github.com/kevinaboos/LibToolingExample.git example

The following sections are a complete walkthrough of the code in Example.cpp.

Starting with a Main Function

Let’s follow along with the code’s execution flow, starting from the main() function at the bottom of the file:

int main(int argc, const char **argv) {
    // parse the command-line args passed to your code
    CommonOptionsParser op(argc, argv);
    // create a new Clang Tool instance (a LibTooling environment)
    ClangTool Tool(op.getCompilations(), op.getSourcePathList());

    // run the Clang Tool, creating a new FrontendAction (explained below)
    int result = Tool.run(newFrontendActionFactory<ExampleFrontendAction>());

    errs() << "\nFound " << numFunctions << " functions.\n\n";
    // print out the rewritten source code ("rewriter" is a global var.)
    rewriter.getEditBuffer(rewriter.getSourceMgr().getMainFileID()).write(errs());
    return result;
}

Everything in main() is fairly self-explanatory, except rewriter, which is explained later. You’re just setting up a Clang Tool, passing it the parsed command-line arguments (op.getCompilations()) and the list of source files (op.getSourcePathList()), and then running it with Tool.run(). The benefit of using LibTooling is that you can do things before and after the analysis, like printing out the modified code (the rewriter.getEditBuffer() line) or counting the number of functions across several source files (the errs() line that prints numFunctions). You cannot do this with a Clang Plugin.

We also need a few global variables (at the top of the file):

Rewriter rewriter;
int numFunctions = 0;

Creating a FrontendAction

Now, let’s create our own custom FrontendAction, which is merely a class that will perform some action in the Clang frontend. We’ll choose an ASTFrontendAction because we want to analyze the AST representation of the source code test.c.

class ExampleFrontendAction : public ASTFrontendAction {
public:
    virtual ASTConsumer *CreateASTConsumer(CompilerInstance &CI, StringRef file) {
        return new ExampleASTConsumer(&CI); // pass CI pointer to ASTConsumer
    }
};

Nothing too complex going on here. We just create a subclass of ASTFrontendAction and override the CreateASTConsumer function so that we can return our own ASTConsumer, as shown below. We also pass a pointer to the CompilerInstance because it contains a lot of contextual information that we’ll need later during our actual analysis.

Creating an ASTConsumer

The ASTConsumer “consumes” (reads) the AST produced by the Clang parser. You can override as many of its functions as you wish, so that your code will be called when a certain type of AST item has been parsed. First, let’s override HandleTopLevelDecl(), which will be called whenever Clang parses a new set of top-level declarations (such as global variables, function definitions, etc.).

class ExampleASTConsumer : public ASTConsumer {
private:
    ExampleVisitor *visitor; // doesn't have to be private

public:
    // define a constructor in order to pass CI
    explicit ExampleASTConsumer(CompilerInstance *CI)
        : visitor(new ExampleVisitor(CI)) // initialize the visitor
        { }

    // override this to call our ExampleVisitor on each top-level Decl
    virtual bool HandleTopLevelDecl(DeclGroupRef DG) {
        // a DeclGroupRef may have multiple Decls, so we iterate through each one
        for (DeclGroupRef::iterator i = DG.begin(), e = DG.end(); i != e; i++) {
            Decl *D = *i;
            visitor->TraverseDecl(D); // recursively visit each AST node in Decl "D"
        }
        return true;
    }
};

Basically, this code uses our ExampleVisitor (described below) to visit the AST nodes in each top-level declaration in the entire source file. For test.c, two Decls would be visited, the FunctionDecl for do_math() and the FunctionDecl for main().

A Better Implementation of ASTConsumer

However, overriding HandleTopLevelDecl() means that your code in that function will be immediately called each time a new Decl is parsed in the source, not after the entire source file has been parsed. This creates a problem because, from the parser’s point of view, when do_math() is being visited, it is completely unaware that main() exists. This means you can’t access or reason about functions defined after the function you’re currently analyzing.  
★ ★ hey, that’s important! ★ ★


Luckily, the ASTConsumer class has a better function to override, HandleTranslationUnit(), which is called only after the entire source file is parsed. In this case, a translation unit effectively represents an entire source file. An ASTContext class is used to represent the AST for that source file, and it has a ton of useful members (go read about it!).

So, instead of overriding HandleTopLevelDecl(), let’s go with HandleTranslationUnit() because it works more like we would expect.

    // this replaces "HandleTopLevelDecl"
    // override this to call our ExampleVisitor on the entire source file
    virtual void HandleTranslationUnit(ASTContext &Context) {
        /* we can use ASTContext to get the TranslationUnitDecl, which is
           a single Decl that collectively represents the entire source file */
        visitor->TraverseDecl(Context.getTranslationUnitDecl());
    }

For the most part, you should use HandleTranslationUnit(), especially when you pair it with a RecursiveASTVisitor, like we do below.
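
To make the difference between the two callbacks concrete, here is a toy model written with only the standard library. It is not Clang code: toyParse, DeclList, knows, and toyParseDemo are names I made up for this sketch, and a "declaration" is just a string. The per-declaration callback only sees what has been parsed so far, while the end-of-file callback sees everything.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// In this toy, a declaration is just its name.
using DeclList = std::vector<std::string>;

// Toy "parser": reports each declaration as soon as it is parsed (like
// HandleTopLevelDecl), then reports the whole list once at the end (like
// HandleTranslationUnit). Both callbacks are hypothetical stand-ins.
inline void toyParse(
    const DeclList &source,
    const std::function<void(const std::string &, const DeclList &)> &onTopLevelDecl,
    const std::function<void(const DeclList &)> &onTranslationUnit) {
    DeclList parsedSoFar;
    for (const std::string &decl : source) {
        parsedSoFar.push_back(decl);
        onTopLevelDecl(decl, parsedSoFar); // later declarations are unknown here
    }
    onTranslationUnit(parsedSoFar); // everything is known here
}

// True if `name` appears in `decls`.
inline bool knows(const DeclList &decls, const std::string &name) {
    return std::find(decls.begin(), decls.end(), name) != decls.end();
}

// Usage sketch: the asserts document what the toy is expected to do.
inline void toyParseDemo() {
    const DeclList source = {"do_math", "main"};
    bool mainKnownWhileInDoMath = true;
    bool mainKnownAtEnd = false;

    toyParse(
        source,
        [&](const std::string &decl, const DeclList &seen) {
            if (decl == "do_math")
                mainKnownWhileInDoMath = knows(seen, "main");
        },
        [&](const DeclList &all) { mainKnownAtEnd = knows(all, "main"); });

    assert(!mainKnownWhileInDoMath); // main() has not been parsed yet
    assert(mainKnownAtEnd);          // the whole file has been parsed
}
```

The real callbacks receive Decl objects and an ASTContext rather than strings, but the ordering problem is the same.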

Creating a RecursiveASTVisitor

At long last we’re ready to get some real work done. The previous two sections were just to set up infrastructure.

The RecursiveASTVisitor is a fascinating class with more to it than meets the eye. It allows you to visit any type of AST node, such as FunctionDecl and Stmt, simply by overriding a function with that name, e.g., VisitFunctionDecl and VisitStmt. This same format works with any AST class. Clang also offers an official tutorial on this, and while complete, it’s very brief.

For such Visit* functions, you must return true to continue traversing the AST (examining other nodes) and return false to halt the traversal entirely and essentially exit Clang. You shouldn’t ever call any of the Visit* functions directly; instead call TraverseDecl (like we did in our ExampleASTConsumer above), which will call the correct Visit* function behind the scenes.
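
If the return-value rule feels abstract, the toy below imitates it using only the standard library. To be clear, this is not RecursiveASTVisitor: Node, ToyVisitor, CountUntil, makeToyTree, and toyVisitorDemo are names invented for this sketch. It only mimics two ideas from the paragraph above: you call a Traverse function and it calls the Visit hook for you, and returning false from the hook halts the whole traversal.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AST node: a name plus child nodes.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Toy base class using the same "derived class as template argument" trick.
template <typename Derived>
class ToyVisitor {
public:
    // Visits `n`, then its children, depth-first.
    // Returns false as soon as any Visit call returns false.
    bool Traverse(const Node &n) {
        if (!static_cast<Derived *>(this)->Visit(n))
            return false;
        for (const Node &child : n.children)
            if (!Traverse(child))
                return false;
        return true;
    }

    // Default hook: do nothing and keep going.
    bool Visit(const Node &) { return true; }
};

// Counts visited nodes, but halts the traversal once it sees `stopName`.
class CountUntil : public ToyVisitor<CountUntil> {
public:
    explicit CountUntil(std::string name) : stopName(std::move(name)) {}

    bool Visit(const Node &n) {
        ++visited;
        return n.name != stopName; // false means "stop traversing"
    }

    int visited = 0;

private:
    std::string stopName;
};

// Builds a tiny tree loosely shaped like test.c.
inline Node makeToyTree() {
    return Node{"TranslationUnit",
                {Node{"do_math", {Node{"CompoundStmt", {}}}},
                 Node{"main", {Node{"CallExpr", {}}, Node{"ReturnStmt", {}}}}}};
}

// Usage sketch: the asserts document what the toy is expected to do.
inline void toyVisitorDemo() {
    const Node tree = makeToyTree();

    CountUntil visitAll("no-such-node");
    assert(visitAll.Traverse(tree)); // never told to stop
    assert(visitAll.visited == 6);

    CountUntil stopAtMain("main");
    assert(!stopAtMain.Traverse(tree)); // halted at "main"
    assert(stopAtMain.visited == 4);
}
```

Notice that the second visitor never reaches the CallExpr or ReturnStmt nodes, which is why returning false from a Visit* function is rarely what you want in an analysis.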

Based on our objective of rewriting function definitions and statements, we only need to override VisitFunctionDecl and VisitStmt. Here’s how to do so:

class ExampleVisitor : public RecursiveASTVisitor<ExampleVisitor> {
private:
    ASTContext *astContext; // used for getting additional AST info

public:
    explicit ExampleVisitor(CompilerInstance *CI)
        : astContext(&(CI->getASTContext())) // initialize private members
    {
        rewriter.setSourceMgr(astContext->getSourceManager(),
            astContext->getLangOpts());
    }

    virtual bool VisitFunctionDecl(FunctionDecl *func) {
        numFunctions++;
        string funcName = func->getNameInfo().getName().getAsString();
        if (funcName == "do_math") {
            rewriter.ReplaceText(func->getLocation(), funcName.length(), "add5");
            errs() << "** Rewrote function def: " << funcName << "\n";         
        }         
        return true;     
    }     
    
    virtual bool VisitStmt(Stmt *st) {
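        // dyn_cast returns NULL if "st" is not of the requested type
        if (ReturnStmt *ret = dyn_cast<ReturnStmt>(st)) {
            // replace the 6 characters of "result" with "val"
            rewriter.ReplaceText(ret->getRetValue()->getLocStart(), 6, "val");
            errs() << "** Rewrote ReturnStmt\n";
        }
        if (CallExpr *call = dyn_cast<CallExpr>(st)) {
            // replace the 7 characters of "do_math" with "add5"
            rewriter.ReplaceText(call->getLocStart(), 7, "add5");
            errs() << "** Rewrote function call\n";
        }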
        return true;
    }
};

The above code introduces the Rewriter class, which lets you make textual changes to the source code. It is commonly used for refactoring or making small code changes. We also used it at the end of our program’s main() function to print out the full modified source code.

Using Rewriter means that you need to find the correct SourceLocation to insert/replace text. Understanding which location to choose (getLocation(), getLocStart(), etc) can be difficult, so I’ve explained several common types of location getters in this post.
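
The magic numbers 6 and 7 in the ReplaceText calls are just the lengths of "result" and "do_math". Here is a small standard-library sketch of the same "start position, length, new text" idea. It is not Clang’s Rewriter: ToyEdit, applyEdits, and editFor are names I invented, positions are plain character offsets rather than SourceLocations, and applying edits back to front is simply this toy’s way of keeping earlier offsets valid.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One pending edit: replace `length` characters starting at `offset`
// (an offset into the ORIGINAL text) with `replacement`.
struct ToyEdit {
    std::size_t offset;
    std::size_t length;
    std::string replacement;
};

// Applies all edits to a copy of `original`. Edits are applied from the
// highest offset down to the lowest, so offsets that come earlier in the
// text stay valid even when a replacement changes the text length.
// Edits must not overlap.
inline std::string applyEdits(const std::string &original, std::vector<ToyEdit> edits) {
    std::sort(edits.begin(), edits.end(),
              [](const ToyEdit &a, const ToyEdit &b) { return a.offset > b.offset; });
    std::string result = original;
    for (const ToyEdit &e : edits) {
        assert(e.offset + e.length <= original.size());
        result.replace(e.offset, e.length, e.replacement);
    }
    return result;
}

// Helper for this sketch only: builds an edit that targets the first
// occurrence of `needle`, so the edit length is needle.size().
inline ToyEdit editFor(const std::string &text, const std::string &needle,
                       const std::string &replacement) {
    const std::size_t pos = text.find(needle);
    assert(pos != std::string::npos);
    return ToyEdit{pos, needle.size(), replacement};
}
```

For example, building edits with editFor(src, "do_math", "add5") and editFor(src, "result", "val") yields lengths 7 and 6, matching the numbers passed to ReplaceText above.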

Also, note the use of dyn_cast in VisitStmt to check whether the Stmt st is a ReturnStmt or a CallExpr. You can read more about dyn_cast in the LLVM Programmer’s Manual.
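
If you haven’t seen this style of cast before, the sketch below imitates the idea with a kind tag and a classof() function, which is roughly how LLVM avoids relying on C++ RTTI. Everything here is a toy I made up for illustration (ToyStmt, ToyReturnStmt, ToyCallExpr, toy_dyn_cast, describe); the real dyn_cast is more general than this.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical miniature statement hierarchy with a kind tag.
class ToyStmt {
public:
    enum Kind { ReturnKind, CallKind };
    explicit ToyStmt(Kind k) : kind(k) {}
    virtual ~ToyStmt() = default;
    Kind getKind() const { return kind; }

private:
    Kind kind;
};

class ToyReturnStmt : public ToyStmt {
public:
    ToyReturnStmt() : ToyStmt(ReturnKind) {}
    // Answers "is this statement really a ToyReturnStmt?"
    static bool classof(const ToyStmt *s) { return s->getKind() == ReturnKind; }
};

class ToyCallExpr : public ToyStmt {
public:
    explicit ToyCallExpr(std::string name) : ToyStmt(CallKind), callee(std::move(name)) {}
    static bool classof(const ToyStmt *s) { return s->getKind() == CallKind; }
    std::string callee;
};

// Returns `s` as a To* if it really is a To, otherwise nullptr.
template <typename To>
To *toy_dyn_cast(ToyStmt *s) {
    assert(s != nullptr);
    return To::classof(s) ? static_cast<To *>(s) : nullptr;
}

// Same shape as VisitStmt above: test each kind inside an if-condition.
inline std::string describe(ToyStmt *st) {
    if (toy_dyn_cast<ToyReturnStmt>(st))
        return "return statement";
    if (ToyCallExpr *call = toy_dyn_cast<ToyCallExpr>(st))
        return "call to " + call->callee;
    return "something else";
}
```

Because a failed cast yields a null pointer, the declaration inside the if-condition doubles as the type check.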

Finally, errs() is simply the stderr stream used throughout LLVM/Clang to print debugging info.

Getting More Specific with Visit* Functions

In this example, we only need to modify return statements (line 8 of test.c) and function calls (line 7 of test.c). Therefore, instead of overriding VisitStmt (more generic), we can override VisitReturnStmt and VisitCallExpr (more specific). Both ReturnStmt and CallExpr are subclasses of Stmt. This is the beauty of Clang’s AST and the RecursiveASTVisitor class: it can be as generic or as specific as you need. Here’s what that more specific code would look like:

    // this replaces the VisitStmt function above
    virtual bool VisitReturnStmt(ReturnStmt *ret) {
        // start at the returned expression ("result"), not at the "return" keyword
        rewriter.ReplaceText(ret->getRetValue()->getLocStart(), 6, "val");
        errs() << "** Rewrote ReturnStmt\n";
        return true;
    }
    virtual bool VisitCallExpr(CallExpr *call) {
        rewriter.ReplaceText(call->getLocStart(), 7, "add5");
        errs() << "** Rewrote function call\n";
        return true;
    }

Putting It All Together and Compiling the Analysis

Again, I have posted the full code for this example, including all #include directives and global variables, as well as a Makefile and instructions for compiling and running it. Clang makefiles can be complex, but I won’t bother describing them because there’s some fairly good documentation on Clang Makefiles. Here’s the full Makefile necessary for building this example:

CLANG_LEVEL := ../..

# the name of your tool's executable
TOOLNAME = example

# the Clang source files you want to compile
SOURCES := Example.cpp

include $(CLANG_LEVEL)/../../Makefile.config

LINK_COMPONENTS := $(TARGETS_TO_BUILD) asmparser bitreader support mc option

USEDLIBS = clangFrontend.a clangSerialization.a clangDriver.a \
           clangTooling.a clangParse.a clangSema.a \
           clangAnalysis.a clangRewriteFrontend.a clangRewriteCore.a \
           clangEdit.a clangAST.a clangLex.a clangBasic.a

include $(CLANG_LEVEL)/Makefile

The above Makefile only works if you keep your sources in the llvm/tools/clang/tools directory. For example, I created a directory at llvm/tools/clang/tools/example/ for placing the example C++ file and Makefile.

Running the Analysis

At the top of this page, I showed a simple C file, test.c, that we’ll use as the input to our analysis. It’s easiest to run our analysis with a quick shell script:

#!/bin/bash

LLVM_DIR=~/static_analysis/llvm/  #the location of your llvm dir
$LLVM_DIR/Debug+Asserts/bin/example test.c --

You can keep this script anywhere, as long as you run it from the directory that contains your test.c file.

Perhaps you’d like to run your LibTooling analysis on multiple source files, or use some additional CFLAGS (or GCC options), or even pass your own command-line arguments to your Clang code. The script below has examples of all three:

#!/bin/bash

LLVM_DIR=~/static_analysis/llvm/  #the location of your llvm dir
$LLVM_DIR/Debug+Asserts/bin/example \
    file1.c file2.c file3.c \
    -myCmdLineArg argument1 \
    -- \
    -Wall -Isome/include/dir

Note that anything before the double dash “--” is an input to your LibTooling program (parsed from argv in main()), while anything after the double dash is an input to Clang itself (you won’t need to handle those yourself).
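
To illustrate the convention, here is a standard-library sketch that divides a command line at the first “--”. It is only a toy: SplitArgs and splitAtDoubleDash are names I invented, and this is not how CommonOptionsParser is actually implemented.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Result of dividing a command line at the first "--".
struct SplitArgs {
    std::vector<std::string> toolArgs;     // before "--": for the tool itself
    std::vector<std::string> compilerArgs; // after "--": forwarded to the compiler
    bool sawSeparator = false;
};

// Splits argv[1..argc-1] at the first "--" (argv[0] is the program name).
inline SplitArgs splitAtDoubleDash(int argc, const char *const *argv) {
    assert(argc >= 1);
    SplitArgs out;
    for (int i = 1; i < argc; ++i) {
        std::string arg = argv[i];
        if (!out.sawSeparator && arg == "--") {
            out.sawSeparator = true; // the separator itself belongs to neither list
            continue;
        }
        (out.sawSeparator ? out.compilerArgs : out.toolArgs).push_back(std::move(arg));
    }
    return out;
}
```

Fed the arguments from the script above, toolArgs would hold the three file names plus -myCmdLineArg and argument1, and compilerArgs would hold -Wall and -Isome/include/dir.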

If you want to learn how to parse command-line arguments such as “-myCmdLineArg argument1” above, then check out the LibTooling section of my post Clang: Handling Command-Line Arguments (coming soon!).

Conclusion

Assuming you were able to run your analysis script, congratulations on your first foray into Clang! You should see the following output:

** Rewrote function def: do_math
** Rewrote function call
** Rewrote ReturnStmt

Found 2 functions.

void add5(int *x) {
    *x += 5;
}

int main(void) {
    int result = -1, val = 4;
    add5(&val);
    return val;
}

I hope you enjoyed working through this Clang LibTooling example. Feel free to post questions or follow-ups in the comment section; I’m always willing to add things to the tutorial.

Up next: how to use the Clang Plugin interface.

→ Clang Tutorial Part III: Plugin Example.

72 comments

  1. I did clone your example repository at llvm/tools/clang/tools. Not sure why, but when I run make it will complain: make: *** No rule to make target `/home/jcwu/repos/llvm-src/tools/clang/tools/example/Makefile', needed by `Makefile'. Stop.

    1. It turns out that if you compile/build your llvm in separate dirs as llvm/llvm-src, llvm/llvm-build, you should put two copies into both paths (still need to follow the path down to the correct subdir) to use the llvm build system. More specifically, the path under llvm-src should at least contain the source code and the path under llvm-build should contain the Makefile.

  2. Hm, that's unusual. I thought perhaps my Makefile was wrong, but I just tried following my own instructions from scratch on a clean install of Ubuntu, and everything worked well. Have you been able to build any other Clang examples? I assume you installed and compiled LLVM and Clang correctly, and that you're running "make" inside of the example directory. Aside from that, I honestly don't know what the problem is… any other information you can give me?

  3. Thanks for tutorial! How can I check if obj-c method was overridden in my class, is there any simple way? or should I parse his parent object and search for method with same name, and go to HIS parent object, on and on untill find needed method or root object?

  4. For other things in the tools/ folder, I got "nothing to be done for 'all'." I can't see any difference by comparing your Makefile with others in the tools/ folder, however I can't make your example. I use the following Makefile instead; it works, but compiling is very slow. I guess it's due to statically linking too many libraries…?

    CXX := clang++
    LLVMCOMPONENTS := cppbackend
    RTTIFLAG := -fno-rtti
    LLVMCONFIG := /home/jcwu/repos/llvm/Debug+Asserts/bin/llvm-config
    CXXFLAGS := -I$(shell $(LLVMCONFIG) --src-root)/tools/clang/include -I$(shell $(LLVMCONFIG) --obj-root)/tools/clang/include $(shell $(LLVMCONFIG) --cxxflags) $(RTTIFLAG)
    LLVMLDFLAGS := $(shell $(LLVMCONFIG) --ldflags --libs $(LLVMCOMPONENTS))
    SOURCES = Example.cpp
    OBJECTS = $(SOURCES:.cpp=.o)
    EXES = $(OBJECTS:.o=)
    CLANGLIBS = -lclangTooling \
                -lclangFrontendTool \
                -lclangFrontend \
                -lclangDriver \
                -lclangSerialization \
                -lclangCodeGen \
                -lclangParse \
                -lclangSema \
                -lclangStaticAnalyzerFrontend \
                -lclangStaticAnalyzerCheckers \
                -lclangStaticAnalyzerCore \
                -lclangAnalysis \
                -lclangARCMigrate \
                -lclangRewriteFrontend \
                -lclangRewriteCore \
                -lclangEdit \
                -lclangAST \
                -lclangLex \
                -lclangBasic \
                $(shell $(LLVMCONFIG) --libs)

    all: $(OBJECTS) $(EXES)

    %: %.o
    	$(CXX) -o $@ $< $(CLANGLIBS) $(LLVMLDFLAGS)

    clean:
    	-rm -f $(EXES) $(OBJECTS) *~

  5. Glad you got it to work! I'm not exactly sure why my Makefile wouldn't work for you, sorry. Note that my example uses Clang's existing build system within the Clang development directories. Hence the bit about "CLANG_LEVEL" … which effectively says "use the Makefile that came with the Clang sources" … this makes things simpler. Your Makefile is using the Clang++ executable to build the sources of Clang, which is fine if it works for you, but that won't be as simple as using the build system that comes with Clang's sources. The reason it's slower is that you have a lot more options/included libraries than my example. For reference, compiling my example takes about 60 seconds on my machine, so it's not the fastest process in the world. You can always try removing some of those libraries, I doubt you need all of them. However, if your code doesn't utilize a library, I can't imagine why it would need to be compiled into your final executable, so perhaps it doesn't matter after all. I'm not sure, perhaps you should ask the cfe-dev mailing list — they're always friendly & helpful.

  6. If I understand your question, you need to find out whether a given subclass has overridden a certain method from a given parent class. I would check out the ObjCMethodDecl Class in Clang, it has some functions for dealing with overriding, such as getOverriddenMethods() … good luck!

  7. wait….compiling your example takes 60 seconds? The makefile I took from other tutorial takes 52 seconds to finish. perhaps around 1 min is normal in libtooling programming. Anyway, thanks for your libtooling tutorial! It's really helpful.

  8. Thank you! I am trying to create Clang Tool with xcode project, and I need to link clangFrontend.a and other binaries in xproj file, but i cannot find where are they placed? I have builded your example before, and I think that binaries should be builded already. Could you tell my if you made xcode project for Clang Tool and how? Maybe I should run some make scripts in "Build Phases" section of xcode project? Thanks!

  9. I don't have any experience with Xcode, but I do know where the libraries are. Each library is prefaced with "lib", so the file you're looking for is called "libclangFrontend.a" … you can find it in llvm/Debug+Asserts/lib/. Incidentally, this is also the default location for libraries created as part of any clang Plugins you build.

  10. I found that by default llvm/clang is built in debug+asserts mode. I recompiled everything in release mode and it takes only 4 seconds to compile. Well… quite obvious from its path, but I didn't know the difference was that huge.

  11. Yes, that is a known characteristic of building C++ applications, and LLVM/Clang is no exception. I probably should've mentioned something about it in the first tutorial segment, but I wanted to keep it short and simple. As I'm sure you know, you can choose a release build with --enable-optimized. See the quick-start tutorial on this LLVM website.

  12. In the example Tool is initialized with newFrontendActionFactory() object, but I need to store pointer on the ExampleFrontendAction instance object in main function. How can I do that? How can I pass pointer on ExampleFrontendAction instance to the ClangTool instance? Thank you!

    1. A sloppy solution would be to store the pointer as a global variable in the C++ file. I’m not 100% clear on your question and why you need to do this. I don’t know if you can pass a pointer to the ClangTool or the FrontendAction itself, but you can include private/public pointer references in the ExampleFrontendAction and pass them to the ASTConsumer and ASTVisitor classes…

      1. Thank you! I’ve noticed that when I run tool on some source file, tool tries find included files, f.e #import “SomeClass.h” , or #import. And if it cannot find headers, it generates errors: fatal error: ‘Foundation/Foundation.h’ file not found. Could you tell me, if you know, how can I direct tool to the standard frameworks? And How can I direct it to the some header search path? Thank you, sorry for my English.

        1. Assuming you mean #include for a C/C++ header file, you’ll need to manually add these include directories to the set of source files that your Clang Tool will inspect/analyze. To do this, check out Clang’s “HeaderSearchOptions” and “HeaderSearch” classes.

          1. I’ve solved the issue. you can add path to framework with options

            -Iinclude -Ipath_for_foundation/Headers

            after --

            llvm/Debug+Asserts/bin/mytool /somePath/someSource.mm -- -Iinclude -Ipath_for_foundation/Headers
            BUT, standard frameworks usually included with name of framework as prefixes

            #import
            frameworks sources are placed in the folder called Headers, so clang cannot find them. So, i’am going to find solution for that issue.

            1. Ah, yes, that’s correct. I misunderstood you, I thought you wanted to add a header to the list of files to traverse with your ASTConsumer. Glad you figured it out.

  13. Hi Kevin, I’ve followed the tutorial and still i just dont get it to work and I feel like I’m missing something. I’ve done a clean setup for Ubuntu 13.10 and built and installed clang completely (version 3.5). I’ve done clone of the example on llvm/tools/clang/tools just as it says so on github.
    I’ve made cd example and make, but nothing happens except that it says there is no Makefile.config file. Since I don’t know much of it I tried on other tool, but keep getting the same response. I hope you can help somehow. Thanks

    1. Hi Kevin, I’ve managed to solve the problem. I’ve done it with these simple steps:
      – Cloned your repository at ../build/tools/clang/tools (the same build directory as in part I of the tutorial)
      – Added
      static cl::OptionCategory ClangExCategory("examplians");
      right before the variables declarations at Example.cpp
      – Updated the constructor of the CommonOptionsParser with
      CommonOptionsParser op(argc, argv,ClangExCategory);
      – Then cd example
      – And finally make
      I have to say that I did it out of intuition but it runs pefectly now.

      1. Glad you figured it out. It’s possible that Clang may have changed since I wrote this tutorial, I’ll take a look at the new version to see if your extra steps should be included in my tutorial. Thanks!

  14. Hello. Thank you for this very straightforward tutorial 🙂
    I compiled it as a Visual Studio 2012 project, but when I try to use the compiled binary :

    > LibToolingExampleVS.exe test.c

    …It gives me the following error :

    LLVM ERROR: Could not auto-detect compilation database for file “test.c”
    No compilation database found in D:\Dev\CLANG-Examples\LibToolingExampleVS\bin\Release or any parent directory
    json-compilation-database: Error while opening JSON database: File not found

    I did some research about json compilation database for LLVM, but it didn’t help.
    Actually I don’t need compilation, only parsing and using the AST, so I suppose I should not need that compilation database ?

    1. You shouldn’t need a compilation database for this, that’s for more advanced code setups. Also, I would not recommend setting this up in Visual Studio, because it’s probably using Microsoft’s C++ compiler, which doesn’t always match up with the C++ standard. Unfortunately, I can’t help with those errors because they seem to be caused by your project setup.

    2. Looks like two trailing dashes are missing. Should be:

      > LibToolingExampleVS.exe test.c --

      P.S. I understand it’s too late, just want to answer the question for future readers of the comment.

      1. Hm, interesting. I wonder if that actually makes things run correctly… did you try it out yourself? Seems like it shouldn’t change anything, but I haven’t actually used Visual Studio for this.

        1. I didn’t try running your example, but it’s a thing of CommonOptionsParser if I understand it correctly. If I don’t specify dashes, I get exactly the same error message as Charles. I use GNU/Linux, not Windows.

          Also, you pass double dashes in examples in the post, official Clang tutorials have them too. It should be it.

            1. Right, I did notice that the double dashes are required on my machine, but I wasn’t sure if that was Clang’s requirement or simply a behavior of Bash on Linux vs. something like Cygwin. Thanks for helping me clarify!

    1. AFAIK, there is no static list of all the Visit* functions, because they’re defined using templates and #define macros. Check out this portion of the source code for the RecursiveASTVisitor class. You can see where they define all the Traverse*(), WalkUpFrom*(), and Visit*() functions for each type of Stmt, such as:

      • Unary Operators (Line 299)
      • Binary Operators (Line 318)
      • Compound Assignment Operators (Line 340)

      The same thing is done for all kinds of Decls (starting Line 404). So effectively, the short answer is that Visit*() functions exist for pretty much every type of node in the AST. You should be able to declare and implement Visit*() functions for anything.

  15. Hello,

    thank You very much for the good example. It helped me a lot in getting started. During the analysis and modifications of the code I found the following problem:

    In the main function the ClangTool.run method is executed. After this the rewriter is used to get the reference to the SourceManager. The problem is that the destructor of the SourceManager was already called at this point in time. Therefore the reference to the FileId is only valid by chance. On the Visual Studio 12 debugger the reference is made invalid (it seems) and hence the following code fails.
    Well… I do not have a solution for this problem by now. And I also could not figure out how the API is really meant to be used. Do You have an idea?

  16. First, I don’t recommend using Visual Studio with Clang, it will almost surely give you trouble.
    Second, if you’re wondering how to maintain a persistent instance of a C++ object, that can be done in a variety of ways. You could make it a global variable, for one. Or you could use a type of weak reference. That’s not exactly a problem that’s specific to Clang…

    best of luck!

    1. Hi again,

      actually I have no other chance then getting the tool to run on a Windows machine because this is the target system. So Visual Studio seems to be the solution. Eclipse on MacOSX Mavericks does cause too much trouble and Xcode is a hell with respect to the usability – I did not even understand how to debug one of the many targets in the complete llvm/clang project.

      But the main point which I wanted to make with my post is that also in Your example an error is included. Also for other systems the call for the destructor is done before You use Your variable. This is not specific to Windows or Unix. The rewriter should (if I understand the documentation right) be use in the method “virtual void EndSourceFileAction ()” of the FrontendAction. It is not sufficient to make only the rewriter global.

      But also with this still a different problem comes up which I could not resolve yet. We will see…

      1. Oh, I missed to explain changes.

        It seems like to due to new version of Clang not CentOS.

        Main differences are:

        1) CommonOptionParser needs 3rd parameter cl::OptionCategory.

        cl::OptionCategory my_tool_category("my tool option");
        clang::tooling::CommonOptionsParser op(argc, argv, my_tool_category);

        2) ClangTool::run() needs newFrontEndActionFactory().get() rather newFrontEndActionFactory().

        int result = tool.run(clang::tooling::newFrontendActionFactory().get());

        3) last, I changed Makefile so that source files can be located anywhere.

  17. Thanks for this article. It’s very helpful. You have some formatting issues in the code block (the one that has the dyn_cast in it). On Safari here. Many of the lines got scrunched up into one.

  18. Can someone tell how to traverse comments using ASTContext as i wanted to categorise the comments as a single line comment or multiple line comment using clang apis

  19. Apart from the changes mentioned in https://kevinaboos.wordpress.com/2013/07/23/clang-tutorial-part-ii-libtooling-example/comment-page-1/#comment-86, I had to make a couple of more changes (mentioned below). I am running Clang 3.6

    1. In the Makefile, change clangRewriteCore.a to clangRewrite.a (the former lib does not exist in Clang 3.6 — not sure when they removed). A version-based condition would be ideal, but I don’t know how to do that, so I simply replaced it.

    2. Change the constructor of ExampleFrontendAction to the following. The return type of the virtual function CreateASTConsumer has changed, apparently. Hence the cast.

    virtual std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI, StringRef file) {
    return std::unique_ptr<ASTConsumer>(new ExampleASTConsumer(&CI));
    }
    };

    Thanks for the tutorial! 😀 It helped greatly in getting started.

  21. Thank you for your tutorial! I can not wait to accomplish my tool! If I have some problems, I think I will back again for asking you.

    1. Update : I made a few changes to run this example in Clang 3.7
      The code runs fine, shows the output as in tutorial but, doesn’t rewrite anything in the test.c file… what am I missing?

      1. Hi! I was able to resolve the issue… I used “rewriter.overwriteChangedFiles();” to update the test.c file

  22. Clang 3.7

    while building from llvm/tools/clang/tools/example directordy (executing ‘$make’) it gave me below error.

    In-source builds are not allowed. Please configure from a separate build directory!. Stop.

    do I have to build all (using ‘$make all’) from llvm folder or missing something else?

  23. thanks a lot for your post!
    But I found one mistake – when you replace text for function call, you need to replace it only if name of the called function match with do_math. In your source code example its not important, since you have only do_math function call, but still)

    We can use getDirectCallee method to obtain FunctionDecl from CallExpr and determine the name of the function.

    good luck!

  24. Hi, great tutorial to getting started 🙂
    I have however faced a problem, and cannot seem to find the solution. Maybe you can help.
    I use the same code as yours to replace the text or integers, but before replacing text I save the original buffer.

    StringRef original = rewriter.getSourceMgr().getBufferData(rewriter.getSourceMgr().getMainFileID());

    then i replace text like you suggested, save the buffer into a file, and reinitialize the buffer with original buffer “original” like this:

    rewriter.getEditBuffer(rewriter.getSourceMgr().getMainFileID()).Initialize(original);

    This works perfectly fine, but the text which is encountered afterwards gets all messy. Like the cursor gets disturbed.

    For example i have 2 stmnts :
    int a = 4 + b;
    int c = a – 2;

    The tool changes int a = 4 + b to int a = 51 + b, which is right.
    afterwards it messes int c = a – 2 to int c = a 32, as an example.

    This overwriting creates the problem in all following lines :/ any idea how it can be fixed?
    Thanks again btw for the great article 🙂

      1. It didn’t work out 😦 I think I am mixing the functionality of rewrite buffers somehow. Like as soon as I insert/remove/replace text, and then reinitialize the buffer with original, the rewriter loses the track of cursor. For example, if cursor was at loc row 1 col 10, and I replace the text at this point till col 15. So, after reinitialization the cursor will still be at col 15. That’s what I can understand from the behavior. I maybe wrong.

  25. Hi, I’m using LLVM 3.8 and I have built it in a separate ‘build’ folder.
    In my build folder I do not have a Makefile.config, so when I run ‘make’:

    ‘include $(CLANG_LEVEL)/../../Makefile.config’ fails.

    If I comment that line out and run ‘make’, the process compiles ‘Example.cpp’ but no ‘example’ executable is generated.

    Maybe someone can help me out here. Thanks.

  26. Hi Kevin, I am getting a segfault when running your tool (with the modifications suggested in previous comments). Any idea?

    1. Basically it’s because of this line: rewriter.ReplaceText(ret->getRetValue()->getLocStart(), 6, "val");

      Also, how do I restrict this code to parse only my source code and not the functions from glibc or other libraries?

    2. Hi, this might be too late an answer to the above question, but it might help future followers.
      I was also getting a segmentation fault; then I realized I had not added

      explicit ExampleASTConsumer(CompilerInstance *CI)
      : visitor(new ExampleVisitor(CI)) // initialize the visitor
      { }

      to the ASTConsumer class. Adding it fixed my segmentation fault.
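
On the two questions in reply 1 of comment 26, here is a hedged sketch of a guard that could run before the `ReplaceText` call. One plausible cause of the crash (an assumption, not a diagnosis) is that `getRetValue()` returns a null pointer for a bare `return;`, which is easy to hit once library headers are visited too. `shouldRewriteReturn` is a hypothetical helper, and the `Fake*` types are stand-ins so the snippet compiles without Clang, with an `int` playing the role of `clang::SourceLocation`.

```cpp
// Stand-in for clang::Expr (assumption: only the accessor used below).
struct FakeExpr {
    int loc;
    int getBeginLoc() const { return loc; }
};

// Stand-in for clang::ReturnStmt. A null value models a bare `return;`.
struct FakeReturnStmt {
    const FakeExpr *value;
    const FakeExpr *getRetValue() const { return value; }
};

// Stand-in for clang::SourceManager: locations below mainFileEnd count as
// belonging to the main file.
struct FakeSourceManager {
    int mainFileEnd;
    bool isInMainFile(int loc) const { return loc < mainFileEnd; }
};

// Rewrite only when there is a returned expression AND it was written in the
// main file, so code pulled in from library headers is skipped.
template <typename SourceManagerT, typename ReturnStmtT>
bool shouldRewriteReturn(const SourceManagerT &sm, const ReturnStmtT &ret) {
    const auto *value = ret.getRetValue();
    if (value == nullptr) return false;  // bare `return;` has no expression
    return sm.isInMainFile(value->getBeginLoc());
}
```

With the real classes this would read roughly `if (shouldRewriteReturn(rewriter.getSourceMgr(), *ret))`. Note that `getBeginLoc()` is the newer spelling; the Clang releases discussed in these comments call it `getLocStart()`.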

  27. Thanks so much for this post!

    Even though it’s been more than three years since you wrote this article, the API apparently hasn’t changed too much … I didn’t have to change much to get everything working using the LLVM/Clang 3.9.0 source code when compiling on Windows (Release/x64) using Visual Studio 2015 Update 3.

    For those compiling on Windows using Visual Studio, here are the extra settings that I used:

    Note: I used 7-Zip to extract the llvm-3.9.0.src.tar.xz, cfe-3.9.0.src.tar.xz, and clang-tools-extra-3.9.0.src.tar.xz downloads to C:\LLVM\3.9.0\llvm, C:\LLVM\3.9.0\llvm\tools\clang, and C:\LLVM\3.9.0\llvm\tools\clang\tools\extra, respectively … and then had cmake write the solution, projects, etc. into this build folder: C:\LLVM\3.9.0\build … and then built Release x64.

    Properties | Configuration Properties | C/C++ | General | Additional Include Directories:
    C:\LLVM\3.9.0\build\include
    C:\LLVM\3.9.0\build\tools\clang\include
    C:\LLVM\3.9.0\llvm\include
    C:\LLVM\3.9.0\llvm\tools\clang\include

    Properties | Configuration Properties | Linker | Additional Library Directories:
    C:\LLVM\3.9.0\build\Release\lib

    … and essentially the mincore.lib from the Windows 10 libraries I used to originally build LLVM/Clang and then all of the libraries under C:\LLVM\3.9.0\build\Release\lib … because I got tired of guessing which LLVM/Clang libraries to add. In other words, I used the following:

    Properties | Configuration Properties | Linker | Additional Dependencies:
    C:\Program Files (x86)\Windows Kits\10\Lib\10.0.14393.0\um\x64\mincore.lib
    clangAnalysis.lib
    clangApplyReplacements.lib
    clangARCMigrate.lib
    clangAST.lib
    clangASTMatchers.lib
    clangBasic.lib
    clangCodeGen.lib
    clangDriver.lib
    clangDynamicASTMatchers.lib
    clangEdit.lib
    clangFormat.lib
    clangFrontend.lib
    clangFrontendTool.lib
    clangIncludeFixer.lib
    clangIndex.lib
    clangLex.lib
    clangParse.lib
    clangQuery.lib
    clangRename.lib
    clangRewrite.lib
    clangRewriteFrontend.lib
    clangSema.lib
    clangSerialization.lib
    clangStaticAnalyzerCheckers.lib
    clangStaticAnalyzerCore.lib
    clangStaticAnalyzerFrontend.lib
    clangTidy.lib
    clangTidyBoostModule.lib
    clangTidyCERTModule.lib
    clangTidyCppCoreGuidelinesModule.lib
    clangTidyGoogleModule.lib
    clangTidyLLVMModule.lib
    clangTidyMiscModule.lib
    clangTidyModernizeModule.lib
    clangTidyPerformanceModule.lib
    clangTidyPlugin.lib
    clangTidyReadabilityModule.lib
    clangTidyUtils.lib
    clangTooling.lib
    clangToolingCore.lib
    findAllSymbols.lib
    gtest.lib
    gtest_main.lib
    libclang.lib
    LLVMAArch64AsmParser.lib
    LLVMAArch64AsmPrinter.lib
    LLVMAArch64CodeGen.lib
    LLVMAArch64Desc.lib
    LLVMAArch64Disassembler.lib
    LLVMAArch64Info.lib
    LLVMAArch64Utils.lib
    LLVMAMDGPUAsmParser.lib
    LLVMAMDGPUAsmPrinter.lib
    LLVMAMDGPUCodeGen.lib
    LLVMAMDGPUDesc.lib
    LLVMAMDGPUDisassembler.lib
    LLVMAMDGPUInfo.lib
    LLVMAMDGPUUtils.lib
    LLVMAnalysis.lib
    LLVMARMAsmParser.lib
    LLVMARMAsmPrinter.lib
    LLVMARMCodeGen.lib
    LLVMARMDesc.lib
    LLVMARMDisassembler.lib
    LLVMARMInfo.lib
    LLVMAsmParser.lib
    LLVMAsmPrinter.lib
    LLVMBitReader.lib
    LLVMBitWriter.lib
    LLVMBPFAsmPrinter.lib
    LLVMBPFCodeGen.lib
    LLVMBPFDesc.lib
    LLVMBPFInfo.lib
    LLVMCodeGen.lib
    LLVMCore.lib
    LLVMCoverage.lib
    LLVMDebugInfoCodeView.lib
    LLVMDebugInfoDWARF.lib
    LLVMDebugInfoPDB.lib
    LLVMExecutionEngine.lib
    LLVMGlobalISel.lib
    LLVMHexagonAsmParser.lib
    LLVMHexagonCodeGen.lib
    LLVMHexagonDesc.lib
    LLVMHexagonDisassembler.lib
    LLVMHexagonInfo.lib
    LLVMInstCombine.lib
    LLVMInstrumentation.lib
    LLVMInterpreter.lib
    LLVMipo.lib
    LLVMIRReader.lib
    LLVMLibDriver.lib
    LLVMLineEditor.lib
    LLVMLinker.lib
    LLVMLTO.lib
    LLVMMC.lib
    LLVMMCDisassembler.lib
    LLVMMCJIT.lib
    LLVMMCParser.lib
    LLVMMipsAsmParser.lib
    LLVMMipsAsmPrinter.lib
    LLVMMipsCodeGen.lib
    LLVMMipsDesc.lib
    LLVMMipsDisassembler.lib
    LLVMMipsInfo.lib
    LLVMMIRParser.lib
    LLVMMSP430AsmPrinter.lib
    LLVMMSP430CodeGen.lib
    LLVMMSP430Desc.lib
    LLVMMSP430Info.lib
    LLVMNVPTXAsmPrinter.lib
    LLVMNVPTXCodeGen.lib
    LLVMNVPTXDesc.lib
    LLVMNVPTXInfo.lib
    LLVMObjCARCOpts.lib
    LLVMObject.lib
    LLVMObjectYAML.lib
    LLVMOption.lib
    LLVMOrcJIT.lib
    LLVMPasses.lib
    LLVMPowerPCAsmParser.lib
    LLVMPowerPCAsmPrinter.lib
    LLVMPowerPCCodeGen.lib
    LLVMPowerPCDesc.lib
    LLVMPowerPCDisassembler.lib
    LLVMPowerPCInfo.lib
    LLVMProfileData.lib
    LLVMRuntimeDyld.lib
    LLVMScalarOpts.lib
    LLVMSelectionDAG.lib
    LLVMSparcAsmParser.lib
    LLVMSparcAsmPrinter.lib
    LLVMSparcCodeGen.lib
    LLVMSparcDesc.lib
    LLVMSparcDisassembler.lib
    LLVMSparcInfo.lib
    LLVMSupport.lib
    LLVMSymbolize.lib
    LLVMSystemZAsmParser.lib
    LLVMSystemZAsmPrinter.lib
    LLVMSystemZCodeGen.lib
    LLVMSystemZDesc.lib
    LLVMSystemZDisassembler.lib
    LLVMSystemZInfo.lib
    LLVMTableGen.lib
    LLVMTarget.lib
    LLVMTransformUtils.lib
    LLVMVectorize.lib
    LLVMX86AsmParser.lib
    LLVMX86AsmPrinter.lib
    LLVMX86CodeGen.lib
    LLVMX86Desc.lib
    LLVMX86Disassembler.lib
    LLVMX86Info.lib
    LLVMX86Utils.lib
    LLVMXCoreAsmPrinter.lib
    LLVMXCoreCodeGen.lib
    LLVMXCoreDesc.lib
    LLVMXCoreDisassembler.lib
    LLVMXCoreInfo.lib
    LTO.lib

    … obviously, not all of these libraries are needed … but I decided to let the linker figure out what was needed and what wasn’t, instead of wasting my time on this.

    For the code, I kept everything from your Example.cpp (on github) the same … except for the following changes:

    // I DIDN’T CHANGE CODE BEFORE THIS LINE

    class ExampleFrontendAction : public ASTFrontendAction {
    public:
    virtual std::unique_ptr CreateASTConsumer( CompilerInstance &CI, StringRef file ) {
    return std::make_unique( &CI ); // pass CI pointer to ASTConsumer
    }
    };

    int main( int argc, const char **argv ) {
    llvm::cl::OptionCategory optionCategory( “tool options” );
    // parse the command-line args passed to your code
    CommonOptionsParser op( argc, argv, optionCategory );
    // create a new Clang Tool instance (a LibTooling environment)
    ClangTool Tool( op.getCompilations(), op.getSourcePathList() );

    // run the Clang Tool, creating a new FrontendAction (explained below)
    int result = Tool.run( newFrontendActionFactory().get() );

    // I DIDN’T CHANGE CODE AFTER THIS LINE

    There were a ton of warnings (204 warnings … even on Warning Level 3); however, everything compiled and linked successfully … and ran just the way it was described in this post. Here was my command line:

    C:\Users\Joshua\Documents\Visual Studio 2015\Projects\Example\x64\Release>Example.exe test.c --
    ** Rewrote function def: do_math
    ** Rewrote function call
    ** Rewrote ReturnStmt

    Found 2 functions.

    void add5(int *x) {
    *x += 5;
    }

    int main(void) {
    int result = -1, val = 4;
    add5(&val);
    return val;
    }

    Hope This Helps,
    Joshua

    P.S. – Thanks again for this post!

    1. Unfortunately, my greater than ( > ) and less than ( < ) characters were interpreted as html … didn’t quite show up as I intended … so let me try this:

      class ExampleFrontendAction : public ASTFrontendAction {
      public:
      	virtual std::unique_ptr<ASTConsumer> CreateASTConsumer( CompilerInstance &CI, StringRef file ) {
      		return std::make_unique<ExampleASTConsumer>( &CI ); // pass CI pointer to ASTConsumer
      	}
      };
       
       
       
      int main( int argc, const char **argv ) {
      	llvm::cl::OptionCategory optionCategory( "tool options" );
      	// parse the command-line args passed to your code
      	CommonOptionsParser op( argc, argv, optionCategory );
      	// create a new Clang Tool instance (a LibTooling environment)
      	ClangTool Tool( op.getCompilations(), op.getSourcePathList() );
       
      	// run the Clang Tool, creating a new FrontendAction (explained below)
      	int result = Tool.run( newFrontendActionFactory<ExampleFrontendAction>().get() );
      
  28. Thank you for this post. I used it to build a little tool to parse some source files.
    But as the number of files parsed in one run increases, the memory consumption also increases a lot.
    Is there a way to destroy the ASTContext (the in-memory version of the AST) when it is no longer needed? Once I parse a file I extract the information that I need, but the ASTContext still remains in memory. I would like to avoid that. Any hints?
