Strings in Java
By Andrius Miasnikovas
A few days ago I needed to extract all strings from .java files and also thought it would be a good idea to keep count of how many times each string is used. So I came up with this simple Python script. It’s a quick and dirty solution, but it met my needs for the particular task.
    import os
    import re
    import sys
    from operator import itemgetter

    files = []      # paths of all .java files found
    strings = {}    # string literal -> number of occurrences

    # A double-quoted literal; allows empty strings and escaped quotes.
    exp = re.compile(r'"(?:\\.|[^"\\])*"')


    def klist(bdir):
        """Recursively collect the .java files under bdir."""
        for fname in os.listdir(bdir):
            path = os.path.join(bdir, fname)
            if os.path.isdir(path):
                klist(path)
            elif fname.endswith(".java"):
                files.append(path)


    def get_strings(fname):
        """Print every string literal in fname and update the global counts."""
        with open(fname, errors="replace") as fp:
            data = fp.readlines()
        print(os.path.basename(fname) + ":")
        for line in data:
            for m in exp.finditer(line):
                fstr = m.group(0)
                print(" " + fstr)
                strings[fstr] = strings.get(fstr, 0) + 1


    if __name__ == "__main__":
        if len(sys.argv) < 2:
            print("Usage: get_strings.py base_directory")
            sys.exit(1)
        klist(sys.argv[1])
        for fname in files:
            get_strings(fname)
        print("-" * 70)
        # Least-used strings first, most-used last.
        for k, v in sorted(strings.items(), key=itemgetter(1)):
            print(v, ":", k)
So what this basically does is gather the strings, print them for each file, and then, after a separator line, print how many times each string was used, from least to most used. This might contain bugs, because I was in a hurry to write it, so if you use it, do so at your own risk ;)
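For what it's worth, the counting step on its own can be sketched with `collections.Counter`. This is only a minimal sketch, assuming Python 3: the `count_literals` helper, the `LITERAL` pattern, and the sample Java line are all made up for illustration. Like the script, it will also pick up quoted text inside comments.

```python
import re
from collections import Counter

# Hypothetical pattern (an assumption, not a full Java lexer):
# a double-quoted literal, allowing empty strings and escaped quotes.
LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def count_literals(source):
    """Return a Counter mapping each double-quoted literal to its count."""
    return Counter(LITERAL.findall(source))


# Made-up Java snippet used only to exercise the helper.
sample = 'log("start"); log("done"); String s = "start" + "";'
counts = count_literals(sample)

# Print counts from least to most used.
for literal, n in sorted(counts.items(), key=lambda item: item[1]):
    print(n, ":", literal)
```

On the sample line this reports `"start"` twice and `"done"` and `""` once each. `Counter` also offers `most_common()` if you would rather see the most-used strings first.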