LibTooling Example
I’ll start with a LibTooling example because I think it’s the most useful interface to Clang, as described in Part I of this tutorial. You can also use this code in the Plugin environment with just a few simple changes. Onwards to the example!
Note: this is a continuation of Clang Tutorial: Part I.
Let’s say that you want to analyze a simple C file, test.c, as shown below:
void do_math(int *x) {
    *x += 5;
}

int main(void) {
    int result = -1, val = 4;
    do_math(&val);
    return result;
}
We’d like to do some simple refactoring/fixes on test.c:
- Change the function do_math to add5, a better name
- Change all calls to do_math to call add5 instead, to match the first change
- Fix the return statement to return val instead of result
First, we need to set up some structure. Clang comes packaged with a good Plugin example, but not a LibTooling example. No problem, here’s how to create a LibTooling program.
Here is the full source code for this LibTooling example.
You can check out the repository into the proper directory like so:
$ cd llvm/tools/clang/tools/
$ git clone https://github.com/kevinaboos/LibToolingExample.git example
The following sections are a complete walkthrough of the code in Example.cpp.
Starting with a Main Function
Let’s follow along with the code’s execution flow, starting from the main() function at the bottom of the file:
int main(int argc, const char **argv) {
    // parse the command-line args passed to your code
    CommonOptionsParser op(argc, argv);
    // create a new Clang Tool instance (a LibTooling environment)
    ClangTool Tool(op.getCompilations(), op.getSourcePathList());

    // run the Clang Tool, creating a new FrontendAction (explained below)
    int result = Tool.run(newFrontendActionFactory<ExampleFrontendAction>());

    errs() << "\nFound " << numFunctions << " functions.\n\n";
    // print out the rewritten source code ("rewriter" is a global var.)
    rewriter.getEditBuffer(rewriter.getSourceMgr().getMainFileID()).write(errs());
    return result;
}
Everything in main() is fairly self-explanatory, except rewriter, which is explained later. You’re just setting up a Clang Tool, passing it the command-line arguments (op.getCompilations()) and the list of source files (op.getSourcePathList()), and then running it with Tool.run(). The benefit of using LibTooling is that you can do things before and after the analysis, like printing out the modified code through rewriter.getEditBuffer() or counting the number of functions in several source files with numFunctions. You cannot do this with a Clang Plugin.
We also need a few global variables (at the top of the file):
Rewriter rewriter;
int numFunctions = 0;
Creating a FrontendAction
Now, let’s create our own custom FrontendAction, which is merely a class that will perform some action in the Clang frontend. We’ll choose an ASTFrontendAction because we want to analyze the AST representation of the source code test.c.
class ExampleFrontendAction : public ASTFrontendAction {
public:
    virtual ASTConsumer *CreateASTConsumer(CompilerInstance &CI, StringRef file) {
        return new ExampleASTConsumer(&CI); // pass CI pointer to ASTConsumer
    }
};
Nothing too complex going on here. We just create a subclass of ASTFrontendAction and override the CreateASTConsumer function so that we can return our own ASTConsumer, as shown below. We also pass a pointer to the CompilerInstance because it contains a lot of contextual information that we’ll need later during our actual analysis.
Creating an ASTConsumer
The ASTConsumer “consumes” (reads) the AST produced by the Clang parser. You can override however many functions you wish, so that your code will be called when a certain type of AST item has been parsed. First, let’s override HandleTopLevelDecl(), which will be called whenever Clang parses a new set of top-level declarations (such as global variables, function definitions, etc.).
class ExampleASTConsumer : public ASTConsumer {
private:
    ExampleVisitor *visitor; // doesn't have to be private

public:
    // override the constructor in order to pass CI
    explicit ExampleASTConsumer(CompilerInstance *CI)
        : visitor(new ExampleVisitor(CI)) // initialize the visitor
    { }

    // override this to call our ExampleVisitor on each top-level Decl
    virtual bool HandleTopLevelDecl(DeclGroupRef DG) {
        // a DeclGroupRef may have multiple Decls, so we iterate through each one
        for (DeclGroupRef::iterator i = DG.begin(), e = DG.end(); i != e; i++) {
            Decl *D = *i;
            visitor->TraverseDecl(D); // recursively visit each AST node in Decl "D"
        }
        return true;
    }
};
Basically, this code uses our ExampleVisitor (described below) to visit the AST nodes in each top-level declaration in the entire source file. For test.c, two Decls would be visited: the FunctionDecl for do_math() and the FunctionDecl for main().
A Better Implementation of ASTConsumer
However, overriding HandleTopLevelDecl() means that your code in that function will be called immediately each time a new Decl is parsed in the source, not after the entire source file has been parsed. This creates a problem because, when do_math() is being visited, the parser is completely unaware that main() exists. This means you can’t access or reason about functions defined after the function you’re currently analyzing.
★ ★ hey, that’s important! ★ ★
Luckily, the ASTConsumer class has a better function to override, HandleTranslationUnit(), which is called only after the entire source file is parsed. In this case, a translation unit effectively represents an entire source file. An ASTContext class is used to represent the AST for that source file, and it has a ton of useful members (go read about it!).
So, instead of overriding HandleTopLevelDecl(), let’s go with HandleTranslationUnit() because it works more like we would expect.
// this replaces "HandleTopLevelDecl"
// override this to call our ExampleVisitor on the entire source file
virtual void HandleTranslationUnit(ASTContext &Context) {
    /* we can use ASTContext to get the TranslationUnitDecl, which is
       a single Decl that collectively represents the entire source file */
    visitor->TraverseDecl(Context.getTranslationUnitDecl());
}
For the most part, you should use HandleTranslationUnit(), especially when you pair it with a RecursiveASTVisitor, like we do below.
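To make the difference between the two callbacks concrete, here is a minimal sketch that does not use Clang at all. ToyConsumer, handleTopLevelDecl, handleTranslationUnit and parseAll are hypothetical names invented for this illustration; they only imitate the order in which the real callbacks fire, under the assumption that declarations arrive one at a time in source order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an AST consumer. Nothing here is a Clang API.
struct ToyConsumer {
    std::vector<std::string> seenSoFar;         // declarations parsed so far
    std::vector<std::size_t> visiblePerDecl;    // how many were known at each per-decl callback
    std::size_t visibleAtEnd;                   // how many were known at the final callback

    ToyConsumer() : visibleAtEnd(0) {}

    // Fires right after each declaration is parsed (imitates HandleTopLevelDecl).
    void handleTopLevelDecl(const std::string &name) {
        assert(!name.empty());
        seenSoFar.push_back(name);
        visiblePerDecl.push_back(seenSoFar.size());
    }

    // Fires once after the whole file is parsed (imitates HandleTranslationUnit).
    void handleTranslationUnit() { visibleAtEnd = seenSoFar.size(); }
};

// Feeds declaration names to the consumer in source order.
inline ToyConsumer parseAll(const std::vector<std::string> &decls) {
    ToyConsumer consumer;
    for (const std::string &d : decls) consumer.handleTopLevelDecl(d);
    consumer.handleTranslationUnit();
    return consumer;
}
```

Calling parseAll({"do_math", "main"}) records that only one declaration was known while do_math was being handled, whereas both were known by the time handleTranslationUnit ran.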
Creating a RecursiveASTVisitor
At long last we’re ready to get some real work done. The previous two sections were just to set up infrastructure. The RecursiveASTVisitor is a fascinating class with more to it than meets the eye. It allows you to Visit any type of AST node, such as FunctionDecl and Stmt, simply by overriding a function with that name, e.g., VisitFunctionDecl and VisitStmt. This same format works with any AST class. Clang also offers an official tutorial on this, and while complete, it’s very brief.
For such Visit* functions, you must return true to continue traversing the AST (examining other nodes) and return false to halt the traversal entirely and essentially exit Clang. You shouldn’t ever call any of the Visit* functions directly; instead call TraverseDecl (like we did in our ExampleASTConsumer above), which will call the correct Visit* function behind the scenes.
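The true/false contract is easier to see on a toy tree. The sketch below is not Clang code: ToyNode, ToyVisitor, StopAtReturn and makeSampleTree are hypothetical names, and the traversal is a simplified imitation of the idea that a Traverse function drives the walk and calls your visit function for you.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A toy tree node; "kind" stands in for Clang's node classes.
struct ToyNode {
    std::string kind;
    std::vector<ToyNode> children;
};

// CRTP base: traverse() walks the tree and calls Derived::visit() on each node.
template <typename Derived>
class ToyVisitor {
public:
    // Returns false as soon as any visit() call returns false.
    bool traverse(const ToyNode &node) {
        if (!static_cast<Derived *>(this)->visit(node)) return false;
        for (const ToyNode &child : node.children)
            if (!traverse(child)) return false;
        return true;
    }
    // Default behavior: do nothing and keep going.
    bool visit(const ToyNode &) { return true; }
};

// Counts visited nodes and halts the whole traversal at the first "ReturnStmt".
class StopAtReturn : public ToyVisitor<StopAtReturn> {
public:
    int visited;
    StopAtReturn() : visited(0) {}
    bool visit(const ToyNode &node) {
        ++visited;
        return node.kind != "ReturnStmt";
    }
};

// Builds FunctionDecl -> { CallExpr, ReturnStmt, NullStmt }.
inline ToyNode makeSampleTree() {
    ToyNode call{"CallExpr", {}};
    ToyNode ret{"ReturnStmt", {}};
    ToyNode last{"NullStmt", {}};
    ToyNode func{"FunctionDecl", {call, ret, last}};
    assert(func.children.size() == 3);
    return func;
}
```

Running StopAtReturn over the sample tree visits the FunctionDecl, the CallExpr and the ReturnStmt, then stops: the trailing NullStmt is never reached because visit returned false.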
Based on our objective of rewriting function definitions and statements, we only need to override VisitFunctionDecl and VisitStmt. Here’s how to do so:
class ExampleVisitor : public RecursiveASTVisitor<ExampleVisitor> {
private:
    ASTContext *astContext; // used for getting additional AST info

public:
    explicit ExampleVisitor(CompilerInstance *CI)
        : astContext(&(CI->getASTContext())) // initialize private members
    {
        rewriter.setSourceMgr(astContext->getSourceManager(), astContext->getLangOpts());
    }

    virtual bool VisitFunctionDecl(FunctionDecl *func) {
        numFunctions++;
        string funcName = func->getNameInfo().getName().getAsString();
        if (funcName == "do_math") {
            rewriter.ReplaceText(func->getLocation(), funcName.length(), "add5");
            errs() << "** Rewrote function def: " << funcName << "\n";
        }
        return true;
    }

    virtual bool VisitStmt(Stmt *st) {
        if (ReturnStmt *ret = dyn_cast<ReturnStmt>(st)) {
            // 6 is the length of "result", the text being replaced
            rewriter.ReplaceText(ret->getRetValue()->getLocStart(), 6, "val");
            errs() << "** Rewrote ReturnStmt\n";
        }
        if (CallExpr *call = dyn_cast<CallExpr>(st)) {
            // 7 is the length of "do_math", the text being replaced
            rewriter.ReplaceText(call->getLocStart(), 7, "add5");
            errs() << "** Rewrote function call\n";
        }
        return true;
    }
};
The above code introduces the Rewriter class, which lets you make textual changes to the source code. It is commonly used for refactoring or making small code changes. We also used it at the end of our program’s main() function to print out the full modified source code.
Using Rewriter means that you need to find the correct SourceLocation to insert/replace text. Understanding which location to choose (getLocation(), getLocStart(), etc.) can be difficult, so I’ve explained several common types of location getters in this post.
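To see why the start location and the length both matter, here is a rough, Clang-free sketch of offset-based rewriting. ToyEdit and ToyRewriter are hypothetical names, plain character offsets stand in for SourceLocation, and the sketch assumes that edits never overlap. It is not how Clang's Rewriter is implemented; it only shows that each edit is expressed against the original text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One pending edit: replace `length` characters starting at `offset`
// (an offset into the ORIGINAL text) with `replacement`.
struct ToyEdit {
    std::size_t offset;
    std::size_t length;
    std::string replacement;
};

// Collects edits against the original text and applies them on demand.
class ToyRewriter {
public:
    explicit ToyRewriter(std::string original) : original_(std::move(original)) {}

    void replaceText(std::size_t offset, std::size_t length, const std::string &text) {
        assert(offset + length <= original_.size());
        edits_.push_back(ToyEdit{offset, length, text});
    }

    // Applies edits from the highest offset to the lowest, so earlier
    // offsets stay valid while the text grows or shrinks.
    std::string rewritten() const {
        std::vector<ToyEdit> sorted = edits_;
        std::sort(sorted.begin(), sorted.end(),
                  [](const ToyEdit &a, const ToyEdit &b) { return a.offset > b.offset; });
        std::string out = original_;
        for (const ToyEdit &e : sorted) out.replace(e.offset, e.length, e.replacement);
        return out;
    }

private:
    std::string original_;
    std::vector<ToyEdit> edits_;
};
```

With the text "do_math(&val); return result;", replacing 7 characters at the offset of do_math and 6 characters at the offset of result yields "add5(&val); return val;". Pointing the second edit at the start of return instead would overwrite the keyword, which is the kind of mistake a wrong location getter produces.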
Also, note the use of dyn_cast in VisitStmt to check whether the Stmt st is a ReturnStmt or a CallExpr. The LLVM Programmer’s Manual has more about dyn_cast.
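For readers who have not met this style of cast before, the following is a simplified sketch of the idea, assuming a kind tag plus a static classof function on each subclass. ToyStmt, ToyReturnStmt, ToyCallExpr and toy_dyn_cast are hypothetical names; LLVM's real dyn_cast is more general than this.

```cpp
#include <cassert>

// Toy class hierarchy with a kind tag, in the spirit of LLVM-style RTTI.
struct ToyStmt {
    enum Kind { ReturnKind, CallKind };
    explicit ToyStmt(Kind k) : kind(k) {}
    Kind kind;
};

struct ToyReturnStmt : ToyStmt {
    ToyReturnStmt() : ToyStmt(ReturnKind) {}
    static bool classof(const ToyStmt *s) { return s->kind == ReturnKind; }
};

struct ToyCallExpr : ToyStmt {
    ToyCallExpr() : ToyStmt(CallKind) {}
    static bool classof(const ToyStmt *s) { return s->kind == CallKind; }
};

// Returns the pointer as a To* if the kind tag matches, otherwise nullptr.
template <typename To>
To *toy_dyn_cast(ToyStmt *s) {
    assert(s != nullptr);
    return To::classof(s) ? static_cast<To *>(s) : nullptr;
}
```

Because a failed cast yields a null pointer, the result can be tested directly inside an if condition, which is the pattern VisitStmt relies on.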
Finally, errs() is simply the stderr stream used throughout LLVM/Clang to print debugging info.
Getting More Specific with Visit* Functions
In this example, we only need to modify return statements (return result; in test.c) and function calls (do_math(&val); in test.c). Therefore, instead of overriding VisitStmt (more generic), we can override VisitReturnStmt and VisitCallExpr (more specific). Both ReturnStmt and CallExpr are subclasses of Stmt. This is the beauty of Clang’s AST and the RecursiveASTVisitor class: it can be as generic or as specific as you need. Here’s what that more specific code would look like:
// this replaces the VisitStmt function above
virtual bool VisitReturnStmt(ReturnStmt *ret) {
    // start at the returned expression, not at the "return" keyword
    rewriter.ReplaceText(ret->getRetValue()->getLocStart(), 6, "val");
    errs() << "** Rewrote ReturnStmt\n";
    return true;
}

virtual bool VisitCallExpr(CallExpr *call) {
    rewriter.ReplaceText(call->getLocStart(), 7, "add5");
    errs() << "** Rewrote function call\n";
    return true;
}
Putting It All Together and Compiling the Analysis
Again, I have posted the full code for this example, including all #include directives and global variables, as well as a Makefile and instructions for compiling and running it. Clang makefiles can be complex, but I won’t bother describing them because there’s some fairly good documentation on Clang Makefiles. Here’s the full Makefile necessary for building this example:
CLANG_LEVEL := ../..

TOOLNAME = example #the name of your tool's executable

SOURCES := Example.cpp #the Clang source files you want to compile

include $(CLANG_LEVEL)/../../Makefile.config

LINK_COMPONENTS := $(TARGETS_TO_BUILD) asmparser bitreader support mc option

USEDLIBS = clangFrontend.a clangSerialization.a clangDriver.a \
           clangTooling.a clangParse.a clangSema.a \
           clangAnalysis.a clangRewriteFrontend.a clangRewriteCore.a \
           clangEdit.a clangAST.a clangLex.a clangBasic.a

include $(CLANG_LEVEL)/Makefile
The above Makefile only works if you keep your sources in the llvm/tools/clang/tools directory. For example, I created a directory at llvm/tools/clang/tools/example/ for placing the example C++ file and Makefile.
Running the Analysis
#!/bin/bash
LLVM_DIR=~/static_analysis/llvm/ #the location of your llvm dir
$LLVM_DIR/Debug+Asserts/bin/example test.c --
You can keep this script in any directory, as long as test.c is in that same directory and you run the script from there.
Perhaps you’d like to run your LibTooling analysis on multiple source files, or use some additional CFLAGS (or GCC options), or even pass your own command-line arguments to your Clang code. The script below has examples of all three:
#!/bin/bash
LLVM_DIR=~/static_analysis/llvm/ #the location of your llvm dir
$LLVM_DIR/Debug+Asserts/bin/example \
    file1.c file2.c file3.c \
    -myCmdLineArg argument1 \
    -- \
    -Wall -Isome/include/dir
Note that anything before the double dash “--” is an input to your LibTooling program (argv in main()), while anything after the double dash is an input to Clang itself (you won’t concern yourself with those).
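The split at the double dash can be pictured with a small helper. This is only an illustration of the convention, not how CommonOptionsParser is implemented; ArgList and splitAtDoubleDash are hypothetical names, and the sketch assumes that only the first “--” acts as the separator.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::string> ArgList;

// Splits a command line at the first "--": everything before it belongs to
// the tool, everything after it is treated as compiler flags.
// Returns {toolArgs, compilerArgs}; without a "--", compilerArgs stays empty.
inline std::pair<ArgList, ArgList> splitAtDoubleDash(const ArgList &args) {
    ArgList toolArgs, compilerArgs;
    bool seenDash = false;
    for (const std::string &a : args) {
        if (!seenDash && a == "--") {
            seenDash = true;  // drop the separator itself
            continue;
        }
        (seenDash ? compilerArgs : toolArgs).push_back(a);
    }
    return std::make_pair(toolArgs, compilerArgs);
}
```

For the second script above, the source files and -myCmdLineArg argument1 would land in the first list, and -Wall -Isome/include/dir in the second.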
If you want to learn how to parse command-line arguments such as “-myCmdLineArg argument1” above, then check out the LibTooling section of my post Clang: Handling Command-Line Arguments (coming soon!).
Conclusion
Assuming you were able to run your analysis script, congratulations on your first foray into Clang! You should see the following output:
** Rewrote function def: do_math
** Rewrote function call
** Rewrote ReturnStmt

Found 2 functions.

void add5(int *x) {
    *x += 5;
}

int main(void) {
    int result = -1, val = 4;
    add5(&val);
    return val;
}
Up next: how to use the Clang Plugin interface.
I did clone your example repository at llvm/tools/clang/tools. Not sure why, but when I run make it complains:
make: *** No rule to make target `/home/jcwu/repos/llvm-src/tools/clang/tools/example/Makefile', needed by `Makefile'. Stop.
It turns out that if you build LLVM with separate source and build directories (llvm/llvm-src and llvm/llvm-build), you need a copy of the example under both paths (still following the path down to the correct subdirectory) to use the LLVM build system. More specifically, the path under llvm-src should at least contain the source code, and the path under llvm-build should contain the Makefile.
Hm, that's unusual. I thought perhaps my Makefile was wrong, but I just tried following my own instructions from scratch on a clean install of Ubuntu, and everything worked well. Have you been able to build any other Clang examples? I assume you installed and compiled LLVM and Clang correctly, and that you're running "make" inside of the example directory. Aside from that, I honestly don't know what the problem is… any other information you can give me?
Thanks for the tutorial! How can I check whether an Obj-C method was overridden in my class? Is there any simple way, or should I look at its parent class and search for a method with the same name, then go to that class’s parent, and so on until I find the needed method or reach the root object?
For other things in the tools/ folder, I got "nothing to be done for 'all'." I can't see any difference when comparing your Makefile with the others in the tools/ folder, yet I can't build your example. I use the following Makefile instead. It works, but compiling is very slow; I guess that's due to statically linking too many libraries?
CXX := clang++
LLVMCOMPONENTS := cppbackend
RTTIFLAG := -fno-rtti
LLVMCONFIG := /home/jcwu/repos/llvm/Debug+Asserts/bin/llvm-config
CXXFLAGS := -I$(shell $(LLVMCONFIG) --src-root)/tools/clang/include -I$(shell $(LLVMCONFIG) --obj-root)/tools/clang/include $(shell $(LLVMCONFIG) --cxxflags) $(RTTIFLAG)
LLVMLDFLAGS := $(shell $(LLVMCONFIG) --ldflags --libs $(LLVMCOMPONENTS))
SOURCES = Example.cpp
OBJECTS = $(SOURCES:.cpp=.o)
EXES = $(OBJECTS:.o=)
CLANGLIBS = -lclangTooling \
    -lclangFrontendTool \
    -lclangFrontend \
    -lclangDriver \
    -lclangSerialization \
    -lclangCodeGen \
    -lclangParse \
    -lclangSema \
    -lclangStaticAnalyzerFrontend \
    -lclangStaticAnalyzerCheckers \
    -lclangStaticAnalyzerCore \
    -lclangAnalysis \
    -lclangARCMigrate \
    -lclangRewriteFrontend \
    -lclangRewriteCore \
    -lclangEdit \
    -lclangAST \
    -lclangLex \
    -lclangBasic \
    $(shell $(LLVMCONFIG) --libs)

all: $(OBJECTS) $(EXES)

%: %.o
	$(CXX) -o $@ $< $(CLANGLIBS) $(LLVMLDFLAGS)

clean:
	-rm -f $(EXES) $(OBJECTS) *~
Glad you got it to work! I'm not exactly sure why my Makefile wouldn't work for you, sorry. Note that my example uses Clang's existing build system within the Clang development directories. Hence the bit about "CLANG_LEVEL" … which effectively says "use the Makefile that came with the Clang sources" … this makes things simpler. Your Makefile is using the Clang++ executable to build the sources of Clang, which is fine if it works for you, but that won't be as simple as using the build system that comes with Clang's sources. The reason it's slower is that you have a lot more options/included libraries than my example. For reference, compiling my example takes about 60 seconds on my machine, so it's not the fastest process in the world. You can always try removing some of those libraries, I doubt you need all of them. However, if your code doesn't utilize a library, I can't imagine why it would need to be compiled into your final executable, so perhaps it doesn't matter after all. I'm not sure, perhaps you should ask the cfe-dev mailing list — they're always friendly & helpful.
If I understand your question, you need to find out whether a given subclass has overridden a certain method from a given parent class. I would check out the ObjCMethodDecl Class in Clang, it has some functions for dealing with overriding, such as getOverriddenMethods() … good luck!
wait….compiling your example takes 60 seconds? The makefile I took from other tutorial takes 52 seconds to finish. perhaps around 1 min is normal in libtooling programming. Anyway, thanks for your libtooling tutorial! It's really helpful.
haha, yes, I'm running Linux in a VM with limited resources, so it's a bit slow. Glad I could help, good luck!
Thank you! I am trying to create a Clang Tool with an Xcode project, and I need to link clangFrontend.a and other binaries in the xproj file, but I cannot find where they are placed. I have built your example before, so I think the binaries should already be built. Could you tell me if you made an Xcode project for a Clang Tool, and how? Maybe I should run some make scripts in the "Build Phases" section of the Xcode project? Thanks!
I don't have any experience with Xcode, but I do know where the libraries are. Each library is prefaced with "lib", so the file you're looking for is called "libclangFrontend.a" … you can find it in llvm/Debug+Asserts/lib/. Incidentally, this is also the default location for libraries created as part of any clang Plugins you build.
I found that by default llvm/clang is built in Debug+Asserts mode. I recompiled everything in release mode and it takes only 4 seconds to compile. Well… quite obvious from its path, but I didn't know the difference was that huge.
Yes, that is a known characteristic of building C++ applications, and LLVM/Clang is no exception. I probably should've mentioned something about it in the first tutorial segment, but I wanted to keep it short and simple. As I'm sure you know, you can choose a release build with --enable-optimized. See the quick-start tutorial on this LLVM website.
Hi Kevin, I posted a question on Clang's forum. It's about specifying files to parse and finding stddef.h. Not sure if you have encountered that before. http://clang-developers.42468.n3.nabble.com/How-to-use-libtooling-to-parse-multiple-files-at-once-and-succesfully-find-stddef-h-td4035389.html
Ah, yes that question goes beyond my experience. Hope someone on the mailing list can help you.
In the example Tool is initialized with newFrontendActionFactory() object, but I need to store pointer on the ExampleFrontendAction instance object in main function. How can I do that? How can I pass pointer on ExampleFrontendAction instance to the ClangTool instance? Thank you!
A sloppy solution would be to store the pointer as a global variable in the C++ file. I’m not 100% clear on your question and why you need to do this. I don’t know if you can pass a pointer to the ClangTool or the FrontendAction itself, but you can include private/public pointer references in the ExampleFrontendAction and pass them to the ASTConsumer and ASTVisitor classes…
Thank you! I’ve noticed that when I run tool on some source file, tool tries find included files, f.e #import “SomeClass.h” , or #import. And if it cannot find headers, it generates errors: fatal error: ‘Foundation/Foundation.h’ file not found. Could you tell me, if you know, how can I direct tool to the standard frameworks? And How can I direct it to the some header search path? Thank you, sorry for my English.
Assuming you mean #include for a C/C++ header file, you’ll need to manually add these include directories to the set of source files that your Clang Tool will inspect/analyze. To do this, check out Clang’s “HeaderSearchOptions” and “HeaderSearch” classes.
I’ve solved the issue. You can add the path to a framework with the options
-Iinclude -Ipath_for_foundation/Headers
after --
llvm/Debug+Asserts/bin/mytool /somePath/someSource.mm -- -Iinclude -Ipath_for_foundation/Headers
BUT, standard frameworks are usually included with the name of the framework as a prefix
#import
The framework sources are placed in a folder called Headers, so clang cannot find them. So, I am going to find a solution for that issue.
Ah, yes, that’s correct. I misunderstood you, I thought you wanted to add a header to the list of files to traverse with your ASTConsumer. Glad you figured it out.
Hi Kevin, I’ve followed the tutorial and I still just can’t get it to work, and I feel like I’m missing something. I’ve done a clean setup of Ubuntu 13.10 and built and installed clang completely (version 3.5). I’ve cloned the example into llvm/tools/clang/tools, just as it says on GitHub.
I ran cd example and make, but nothing happens except that it says there is no Makefile.config file. Since I don’t know much about it, I tried another tool, but I keep getting the same response. I hope you can help somehow. Thanks
Hi Kevin, I’ve managed to solve the problem. I’ve done it with these simple steps:
– Cloned your repository at ../build/tools/clang/tools (the same build directory as in part I of the tutorial)
– Added
static cl::OptionCategory ClangExCategory("examplians");
right before the variable declarations in Example.cpp
– Updated the constructor of the CommonOptionsParser with
CommonOptionsParser op(argc, argv, ClangExCategory);
– Then cd example
– And finally make
I have to say that I did it out of intuition, but it runs perfectly now.
Glad you figured it out. It’s possible that Clang may have changed since I wrote this tutorial, I’ll take a look at the new version to see if your extra steps should be included in my tutorial. Thanks!
Hello. Thank you for this very straightforward tutorial 🙂
I compiled it as a Visual Studio 2012 project, but when I try to use the compiled binary :
> LibToolingExampleVS.exe test.c
…It gives me the following error :
LLVM ERROR: Could not auto-detect compilation database for file “test.c”
No compilation database found in D:\Dev\CLANG-Examples\LibToolingExampleVS\bin\Release or any parent directory
json-compilation-database: Error while opening JSON database: File not found
I did some research about json compilation database for LLVM, but it didn’t help.
Actually I don’t need compilation, only parsing and using the AST, so I suppose I should not need that compilation database ?
You shouldn’t need a compilation database for this, that’s for more advanced code setups. Also, I would not recommend setting this up in Visual Studio, because it’s probably using Microsoft’s C++ compiler, which doesn’t always match up with the C++ standard. Unfortunately, I can’t help with those errors because they seem to be caused by your project setup.
Looks like two trailing dashes are missing. Should be:
> LibToolingExampleVS.exe test.c --
P.S. I understand it’s too late, just want to answer the question for future readers of the comment.
Hm, interesting. I wonder if that actually makes things run correctly… did you try it out yourself? Seems like it shouldn’t change anything, but I haven’t actually used Visual Studio for this.
I didn’t try running your example, but it’s a thing of CommonOptionsParser if I understand it correctly. If I don’t specify dashes, I get exactly the same error message as Charles. I use GNU/Linux, not Windows.
Also, you pass double dashes in examples in the post, official Clang tutorials have them too. It should be it.
Oops, a contradiction. I mean that I do specify the dashes when running my own tools, which use libtooling.
Right, I did notice that the double dashes are required on my machine, but I wasn’t sure if that was Clang’s requirement or simply a behavior of Bash on Linux vs. something like Cygwin. Thanks for helping me clarify!
Where do I get the list of all possible Visit* from RecursiveASTVisitor?
AFAIK, there is no static list of all the Visit* functions, because they’re defined using templates and #define macros. Check out this portion of the source code for the RecursiveASTVisitor class. You can see where they define all the Traverse*(), WalkUpFrom*(), and Visit*() functions for each type of Stmt, such as:
The same thing is done for all kinds of Decls (starting Line 404). So effectively, the short answer is that Visit*() functions exist for pretty much every type of node in the AST. You should be able to declare and implement Visit*() functions for anything.
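As a rough picture of how one macro list can stamp out a whole family of Visit*() functions, here is a tiny X-macro sketch. TOY_NODE_KINDS, TOY_DECLARE_VISIT and ToyNamedVisitor are hypothetical names, and the three-entry list is made up for the illustration; the real RecursiveASTVisitor source is considerably more involved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical list of node kinds; each X(...) entry names one kind.
#define TOY_NODE_KINDS(X) \
    X(FunctionDecl)       \
    X(ReturnStmt)         \
    X(CallExpr)

class ToyNamedVisitor {
public:
    std::vector<std::string> log;  // records which Visit functions were called

    // Expands to one Visit<Kind>() member function per entry in TOY_NODE_KINDS.
#define TOY_DECLARE_VISIT(Kind) \
    bool Visit##Kind() { log.push_back("Visit" #Kind); return true; }
    TOY_NODE_KINDS(TOY_DECLARE_VISIT)
#undef TOY_DECLARE_VISIT
};
```

After preprocessing, ToyNamedVisitor has VisitFunctionDecl(), VisitReturnStmt() and VisitCallExpr() even though none of them is spelled out by hand, which is why no static list of the functions appears in the source.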
Ok thx! I saw that previously, but was not sure. Now I can recognize the pattern.
Hello,
thank You very much for the good example. It helped me a lot in getting started. During the analysis and modifications of the code I found the following problem:
In the main function the ClangTool.run method is executed. After this the rewriter is used to get the reference to the SourceManager. The problem is that the destructor of the SourceManager was already called at this point in time. Therefore the reference to the FileId is only valid by chance. On the Visual Studio 12 debugger the reference is made invalid (it seems) and hence the following code fails.
Well… I do not have a solution for this problem by now. And I also could not figure out how the API is really meant to be used. Do You have an idea?
First, I don’t recommend using Visual Studio with Clang, it will almost surely give you trouble.
Second, if you’re wondering how to maintain a persistent instance of a C++ object, that can be done in a variety of ways. You could make it a global variable, for one. Or you could use a type of weak reference. That’s not exactly a problem that’s specific to Clang…
best of luck!
Hi again,
Actually, I have no other choice than getting the tool to run on a Windows machine, because this is the target system. So Visual Studio seems to be the solution. Eclipse on MacOSX Mavericks causes too much trouble, and Xcode is hell with respect to usability. I did not even understand how to debug one of the many targets in the complete llvm/clang project.
But the main point I wanted to make with my post is that Your example also contains an error. On other systems, too, the destructor is called before You use Your variable. This is not specific to Windows or Unix. The rewriter should (if I understand the documentation right) be used in the method “virtual void EndSourceFileAction ()” of the FrontendAction. It is not sufficient to only make the rewriter global.
But also with this still a different problem comes up which I could not resolve yet. We will see…
Thank you for the great tutorials on Clang. This post helped me a lot.
But when I tested your example code with Clang 3.5 & g++ 4.8.2 on CentOS 6.3, I got a compile error.
So I made some changes, and finally it worked.
Anyone who is interested in this example code can get it from https://github.com/mysqlguru/clang-libtooling-example
Thanks. (for reading my poor English)
Interesting, what specific changes did you make? Were those changes due to the new version of Clang or due to using CentOS?
Oh, I forgot to explain the changes.
They seem to be due to the new version of Clang, not CentOS.
The main differences are:
1) CommonOptionsParser needs a 3rd parameter, cl::OptionCategory.
cl::OptionCategory my_tool_category("my tool option");
clang::tooling::CommonOptionsParser op(argc, argv, my_tool_category);
2) ClangTool::run() needs newFrontendActionFactory().get() rather than newFrontendActionFactory().
int result = tool.run(clang::tooling::newFrontendActionFactory().get());
3) Last, I changed the Makefile so that source files can be located anywhere.
Thanks for this article. It’s very helpful. You have some formatting issues in the code block (the one that has the dyn_cast in it). On Safari here. Many of the lines got scrunched up into one.
Thanks for bringing that to my attention, it must’ve happened when I transferred the blog from Google Blogger to wordpress. It should be sorted out now.
Can someone tell how to traverse comments using ASTContext as i wanted to categorise the comments as a single line comment or multiple line comment using clang apis
Just wanted to say thank you for this tutorial. It got me going in the right direction.
Apart from the changes mentioned in https://kevinaboos.wordpress.com/2013/07/23/clang-tutorial-part-ii-libtooling-example/comment-page-1/#comment-86, I had to make a couple of more changes (mentioned below). I am running Clang 3.6
1. In the Makefile, change clangRewriteCore.a to clangRewrite.a (the former lib does not exist in Clang 3.6 — not sure when they removed). A version-based condition would be ideal, but I don’t know how to do that, so I simply replaced it.
2. Change the CreateASTConsumer method of ExampleFrontendAction to the following. The return type of the virtual function CreateASTConsumer has changed, apparently. Hence the cast.
virtual std::unique_ptr CreateASTConsumer(CompilerInstance &CI, StringRef file) {
return std::unique_ptr(new ExampleASTConsumer(&CI));
}
};
Thanks for the tutorial! 😀 It helped greatly in getting started.
Oh well, the std::unique_ptr type is templated. The template type is ASTConsumer but the HTML formatting removed that tag in the above comment. Find the proper code here: http://pastebin.com/bCDPstbF
Thank you for your tutorial! I can not wait to accomplish my tool! If I have some problems, I think I will back again for asking you.
I want to run my FrontendAction after running the preprocessor, how can I do that?
Is this tutorial updated for Clang 3.7?
Update : I made a few changes to run this example in Clang 3.7
The code runs fine, shows the output as in tutorial but, doesn’t rewrite anything in the test.c file… what am I missing?
Hi! I was able to resolve the issue… I used “rewriter.overwriteChangedFiles();” to update the test.c file
Clang 3.7
While building from the llvm/tools/clang/tools/example directory (executing ‘$make’), it gave me the error below.
In-source builds are not allowed. Please configure from a separate build directory!. Stop.
Do I have to build everything (using ‘$make all’) from the llvm folder, or am I missing something else?
The error message says that you should run configure from a separate directory.
So the following commands are enough to configure the system.
cd ..
./llvm/configure
thanks a lot for your post!
But I found one mistake: when you replace the text of a function call, you need to replace it only if the name of the called function matches do_math. In your example source code it’s not important, since the only function call is to do_math, but still.
We can use the getDirectCallee method to obtain the FunctionDecl from a CallExpr and determine the name of the function.
good luck!
Hi, great tutorial to getting started 🙂
I have however faced a problem, and cannot seem to find the solution. Maybe you can help.
I use the same code as yours to replace the text or integers, but before replacing text I save the original buffer.
StringRef original = rewriter.getSourceMgr().getBufferData(rewriter.getSourceMgr().getMainFileID());
then i replace text like you suggested, save the buffer into a file, and reinitialize the buffer with original buffer “original” like this:
rewriter.getEditBuffer(rewriter.getSourceMgr().getMainFileID()).Initialize(original);
This works perfectly fine, but the text which is encountered afterwards gets all messy. Like the cursor gets disturbed.
For example i have 2 stmnts :
int a = 4 + b;
int c = a - 2;
The tool changes int a = 4 + b to int a = 51 + b, which is right.
afterwards it messes int c = a - 2 to int c = a 32, as an example.
This overwriting creates the problem in all following lines any idea how it can be fixed?
Thanks again btw for the great article 🙂
Try using std::string instead of StringRef. StringRef doesn’t own the string, which probably gets reallocated before you use it again.
It didn’t work out 😦 I think I am mixing up the functionality of the rewrite buffers somehow. As soon as I insert/remove/replace text and then reinitialize the buffer with the original, the rewriter loses track of the cursor. For example, if the cursor was at row 1, col 10, and I replace the text from that point to col 15, then after reinitialization the cursor will still be at col 15. That’s what I can understand from the behavior. I may be wrong.
Hi, I’m using ‘llvm 3.8’ and I have built it in a separate ‘build’ folder.
In my build folder I do not have a Makefile.config, so when I run ‘make’:
‘include $(CLANG_LEVEL)/../../Makefile.config’ fails.
If I comment that line out and run ‘make’, the build compiles ‘Example.cpp’ but no ‘example’ executable is generated.
Maybe someone can help me out here. Thanks.
These tutorials use the make build system. The latest version of LLVM removed it (apparently it’s “not modern enough” for them; yeah, they think that this is a real reason to do something…). You have to use CMake now. Their documentation (http://llvm.org/docs/CMake.html) can help you with that.
Hi Kevin, I am getting a seg fault when running your tool (with the modifications suggested in previous comments). Any idea?
Basically it’s because of this line: rewriter.ReplaceText(ret->getRetValue()->getLocStart(), 6, "val");
Also, how do I restrict this code to parse only my source code and not the functions from glibc or other libraries?
Hi, this might be too late an answer to the question above, but it might help future readers.
I was also getting a segmentation fault, then I realized I had not added
explicit ExampleASTConsumer(CompilerInstance *CI)
: visitor(new ExampleVisitor(CI)) // initialize the visitor
{ }
to the ASTConsumer class. Adding it fixed my segmentation fault.
Thanks so much for this post!
Even though it’s been more than three years since you wrote this article, the API apparently hasn’t changed too much … I didn’t have to change much to get everything working using the LLVM/Clang 3.9.0 source code when compiling on Windows (Release/x64) using Visual Studio 2015 Update 3.
For those compiling on Windows using Visual Studio, here are the extra settings that I used:
Note: I used 7-Zip to extract the llvm-3.9.0.src.tar.xz, cfe-3.9.0.src.tar.xz, and clang-tools-extra-3.9.0.src.tar.xz downloads to C:\LLVM\3.9.0\llvm, C:\LLVM\3.9.0\llvm\tools\clang, and C:\LLVM\3.9.0\llvm\tools\clang\tools\extra, respectively … and then had cmake write the solution, projects, etc. into this build folder: C:\LLVM\3.9.0\build … and then built Release x64.
Properties | Configuration Properties | C/C++ | General | Additional Include Directories:
C:\LLVM\3.9.0\build\include
C:\LLVM\3.9.0\build\tools\clang\include
C:\LLVM\3.9.0\llvm\include
C:\LLVM\3.9.0\llvm\tools\clang\include
Properties | Configuration Properties | Linker | Additional Library Directories:
C:\LLVM\3.9.0\build\Release\lib
… and essentially the mincore.lib from the Windows 10 libraries I used to originally build LLVM/Clang and then all of the libraries under C:\LLVM\3.9.0\build\Release\lib … because I got tired of guessing which LLVM/Clang libraries to add. In other words, I used the following:
Properties | Configuration Properties | Linker | Additional Dependencies:
C:\Program Files (x86)\Windows Kits\10\Lib\10.0.14393.0\um\x64\mincore.lib
clangAnalysis.lib
clangApplyReplacements.lib
clangARCMigrate.lib
clangAST.lib
clangASTMatchers.lib
clangBasic.lib
clangCodeGen.lib
clangDriver.lib
clangDynamicASTMatchers.lib
clangEdit.lib
clangFormat.lib
clangFrontend.lib
clangFrontendTool.lib
clangIncludeFixer.lib
clangIndex.lib
clangLex.lib
clangParse.lib
clangQuery.lib
clangRename.lib
clangRewrite.lib
clangRewriteFrontend.lib
clangSema.lib
clangSerialization.lib
clangStaticAnalyzerCheckers.lib
clangStaticAnalyzerCore.lib
clangStaticAnalyzerFrontend.lib
clangTidy.lib
clangTidyBoostModule.lib
clangTidyCERTModule.lib
clangTidyCppCoreGuidelinesModule.lib
clangTidyGoogleModule.lib
clangTidyLLVMModule.lib
clangTidyMiscModule.lib
clangTidyModernizeModule.lib
clangTidyPerformanceModule.lib
clangTidyPlugin.lib
clangTidyReadabilityModule.lib
clangTidyUtils.lib
clangTooling.lib
clangToolingCore.lib
findAllSymbols.lib
gtest.lib
gtest_main.lib
libclang.lib
LLVMAArch64AsmParser.lib
LLVMAArch64AsmPrinter.lib
LLVMAArch64CodeGen.lib
LLVMAArch64Desc.lib
LLVMAArch64Disassembler.lib
LLVMAArch64Info.lib
LLVMAArch64Utils.lib
LLVMAMDGPUAsmParser.lib
LLVMAMDGPUAsmPrinter.lib
LLVMAMDGPUCodeGen.lib
LLVMAMDGPUDesc.lib
LLVMAMDGPUDisassembler.lib
LLVMAMDGPUInfo.lib
LLVMAMDGPUUtils.lib
LLVMAnalysis.lib
LLVMARMAsmParser.lib
LLVMARMAsmPrinter.lib
LLVMARMCodeGen.lib
LLVMARMDesc.lib
LLVMARMDisassembler.lib
LLVMARMInfo.lib
LLVMAsmParser.lib
LLVMAsmPrinter.lib
LLVMBitReader.lib
LLVMBitWriter.lib
LLVMBPFAsmPrinter.lib
LLVMBPFCodeGen.lib
LLVMBPFDesc.lib
LLVMBPFInfo.lib
LLVMCodeGen.lib
LLVMCore.lib
LLVMCoverage.lib
LLVMDebugInfoCodeView.lib
LLVMDebugInfoDWARF.lib
LLVMDebugInfoPDB.lib
LLVMExecutionEngine.lib
LLVMGlobalISel.lib
LLVMHexagonAsmParser.lib
LLVMHexagonCodeGen.lib
LLVMHexagonDesc.lib
LLVMHexagonDisassembler.lib
LLVMHexagonInfo.lib
LLVMInstCombine.lib
LLVMInstrumentation.lib
LLVMInterpreter.lib
LLVMipo.lib
LLVMIRReader.lib
LLVMLibDriver.lib
LLVMLineEditor.lib
LLVMLinker.lib
LLVMLTO.lib
LLVMMC.lib
LLVMMCDisassembler.lib
LLVMMCJIT.lib
LLVMMCParser.lib
LLVMMipsAsmParser.lib
LLVMMipsAsmPrinter.lib
LLVMMipsCodeGen.lib
LLVMMipsDesc.lib
LLVMMipsDisassembler.lib
LLVMMipsInfo.lib
LLVMMIRParser.lib
LLVMMSP430AsmPrinter.lib
LLVMMSP430CodeGen.lib
LLVMMSP430Desc.lib
LLVMMSP430Info.lib
LLVMNVPTXAsmPrinter.lib
LLVMNVPTXCodeGen.lib
LLVMNVPTXDesc.lib
LLVMNVPTXInfo.lib
LLVMObjCARCOpts.lib
LLVMObject.lib
LLVMObjectYAML.lib
LLVMOption.lib
LLVMOrcJIT.lib
LLVMPasses.lib
LLVMPowerPCAsmParser.lib
LLVMPowerPCAsmPrinter.lib
LLVMPowerPCCodeGen.lib
LLVMPowerPCDesc.lib
LLVMPowerPCDisassembler.lib
LLVMPowerPCInfo.lib
LLVMProfileData.lib
LLVMRuntimeDyld.lib
LLVMScalarOpts.lib
LLVMSelectionDAG.lib
LLVMSparcAsmParser.lib
LLVMSparcAsmPrinter.lib
LLVMSparcCodeGen.lib
LLVMSparcDesc.lib
LLVMSparcDisassembler.lib
LLVMSparcInfo.lib
LLVMSupport.lib
LLVMSymbolize.lib
LLVMSystemZAsmParser.lib
LLVMSystemZAsmPrinter.lib
LLVMSystemZCodeGen.lib
LLVMSystemZDesc.lib
LLVMSystemZDisassembler.lib
LLVMSystemZInfo.lib
LLVMTableGen.lib
LLVMTarget.lib
LLVMTransformUtils.lib
LLVMVectorize.lib
LLVMX86AsmParser.lib
LLVMX86AsmPrinter.lib
LLVMX86CodeGen.lib
LLVMX86Desc.lib
LLVMX86Disassembler.lib
LLVMX86Info.lib
LLVMX86Utils.lib
LLVMXCoreAsmPrinter.lib
LLVMXCoreCodeGen.lib
LLVMXCoreDesc.lib
LLVMXCoreDisassembler.lib
LLVMXCoreInfo.lib
LTO.lib
… obviously, not all of these libraries are needed … but I decided to let the linker figure out what was needed and what wasn’t … instead of wasting my time on this.
For the code, I kept everything from your Example.cpp (on github) the same … except for the following changes:
// I DIDN’T CHANGE CODE BEFORE THIS LINE
class ExampleFrontendAction : public ASTFrontendAction {
public:
virtual std::unique_ptr<ASTConsumer> CreateASTConsumer( CompilerInstance &CI, StringRef file ) {
return std::make_unique<ExampleASTConsumer>( &CI ); // pass CI pointer to ASTConsumer
}
};
int main( int argc, const char **argv ) {
llvm::cl::OptionCategory optionCategory( "tool options" );
// parse the command-line args passed to your code
CommonOptionsParser op( argc, argv, optionCategory );
// create a new Clang Tool instance (a LibTooling environment)
ClangTool Tool( op.getCompilations(), op.getSourcePathList() );
// run the Clang Tool, creating a new FrontendAction (explained below)
int result = Tool.run( newFrontendActionFactory<ExampleFrontendAction>().get() );
// I DIDN’T CHANGE CODE AFTER THIS LINE
There were a ton of warnings (204 warnings … even on Warning Level 3); however, everything compiled and linked successfully … and ran just the way that was described in this post. Here was my command line:
C:\Users\Joshua\Documents\Visual Studio 2015\Projects\Example\x64\Release>Example.exe test.c --
** Rewrote function def: do_math
** Rewrote function call
** Rewrote ReturnStmt
Found 2 functions.
void add5(int *x) {
*x += 5;
}
int main(void) {
int result = -1, val = 4;
add5(&val);
return val;
}
Hope This Helps,
Joshua
P.S. – Thanks again for this post!
Unfortunately, my greater than ( > ) and less than ( < ) characters were interpreted as html … didn’t quite show up as I intended … so let me try this:
That’s better! The code above is what I made a few minor changes to … I kept everything else the same.
Thank you for this post. I used it to build a little tool to parse some source files.
But as the number of files to parse in one run increases, the memory consumption also increases a lot.
Is there a way to destroy the ASTContext (the in-memory version of the AST) when it is no longer needed? Once I parse a file I extract the information that I need, but the ASTContext still remains in memory. I would like to avoid that. Any hints?