Friday, January 9, 2009
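Key areas still to improve
The next feature is to wrap all Soot-related work in jobs, then use a rule to ensure that no two Soot jobs run at the same time. That should solve the Soot problems nicely.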
Saturday, January 3, 2009
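Eclipse tips
Alt+Left arrow, Alt+Right arrow: switch between tabs in the editor window
Alt+Up arrow, Alt+Down arrow: select the line the cursor is on and move it up or down
Ctrl+F6: pops up a menu listing the editors you can switch to, so you can switch without the mouse
Ctrl+F7: switch between views, such as the editor, the output view, and the project view
Ctrl+F8: switch between perspectives, such as the Java perspective and the Debug perspective
Ctrl+M: toggle between maximizing and restoring the current window
Ctrl+E: pops up an input box where you type the name of the editor you want. It serves the same purpose as Ctrl+F6, except that one switches by selecting and the other by typing
Ctrl+T: shows the type hierarchy of the element under the cursor; you can type to filter and jump to the match
Hold Ctrl and click a variable, method, or class name: jump quickly within the source code
Ctrl+F11: run the program quickly
Ctrl+Shift+F: format the code automatically
Ctrl+Shift+O: add imports automatically. Suppose nothing has been imported yet and we type ResourceAttribute ra = new ResourceAttribute(); Eclipse reports that the class is not imported, and pressing Ctrl+Shift+O imports it for us. Very convenient
Ctrl+/: comment out the selected block; handy when debugging
Alt+/: content assist, probably the shortcut everyone uses most
Ctrl+H: search; opens the Search dialog
Ctrl+Shift+Space: parameter hints. Inside a method body, it shows the method's parameters, with the parameter at the cursor position in bold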
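Scope         Function                      Shortcut
Global        Find and Replace              Ctrl+F
Text editor   Find Previous                 Ctrl+Shift+K
Text editor   Find Next                     Ctrl+K
Global        Undo                          Ctrl+Z
Global        Copy                          Ctrl+C
Global        Restore Last Selection        Alt+Shift+↓
Global        Cut                           Ctrl+X
Global        Quick Fix                     Ctrl+1
Global        Content Assist                Alt+/
Global        Select All                    Ctrl+A
Global        Delete                        Delete
Global        Context Information           Alt+?, Alt+Shift+?, Ctrl+Shift+Space
Java editor   Show Tooltip Description      F2
Java editor   Select Enclosing Element      Alt+Shift+↑
Java editor   Select Previous Element       Alt+Shift+←
Java editor   Select Next Element           Alt+Shift+→
Text editor   Incremental Find              Ctrl+J
Text editor   Incremental Find Reverse      Ctrl+Shift+J
Global        Paste                         Ctrl+V
Global        Redo                          Ctrl+Y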
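View
Scope         Function                      Shortcut
Global        Zoom In                       Ctrl+=
Global        Zoom Out                      Ctrl+-
Window
Scope         Function                      Shortcut
Global        Activate Editor               F12
Global        Switch to Editor              Ctrl+Shift+W
Global        Previous Editor               Ctrl+Shift+F6
Global        Previous View                 Ctrl+Shift+F7
Global        Previous Perspective          Ctrl+Shift+F8
Global        Next Editor                   Ctrl+F6
Global        Next View                     Ctrl+F7
Global        Next Perspective              Ctrl+F8
Text editor   Show Ruler Context Menu       Ctrl+W
Global        Show View Menu                Ctrl+F10
Global        Show System Menu              Alt+-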
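Navigate
Scope         Function                      Shortcut
Java editor   Open Structure                Ctrl+F3
Global        Open Type                     Ctrl+Shift+T
Global        Open Type Hierarchy           F4
Global        Open Declaration              F3
Global        Open External Javadoc         Shift+F2
Global        Open Resource                 Ctrl+Shift+R
Global        Back in History               Alt+←
Global        Forward in History            Alt+→
Global        Previous                      Ctrl+,
Global        Next                          Ctrl+.
Java editor   Show Outline                  Ctrl+O
Global        Open Type in Hierarchy        Ctrl+Shift+H
Global        Go to Matching Bracket        Ctrl+Shift+P
Global        Go to Last Edit Location      Ctrl+Q
Java editor   Go to Previous Member         Ctrl+Shift+↑
Java editor   Go to Next Member             Ctrl+Shift+↓
Text editor   Go to Line                    Ctrl+L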
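Search
Scope         Function                      Shortcut
Global        Occurrences in File           Ctrl+Shift+U
Global        Open Search Dialog            Ctrl+H
Global        Declarations in Workspace     Ctrl+G
Global        References in Workspace       Ctrl+Shift+G
Text editing
Scope         Function                      Shortcut
Text editor   Toggle Overwrite              Insert
Text editor   Scroll Line Up                Ctrl+↑
Text editor   Scroll Line Down              Ctrl+↓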
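File
Scope         Function                      Shortcut
Global        Save                          Ctrl+S (Emacs key scheme: Ctrl+X Ctrl+S)
Global        Print                         Ctrl+P
Global        Close                         Ctrl+F4
Global        Save All                      Ctrl+Shift+S
Global        Close All                     Ctrl+Shift+F4
Global        Properties                    Alt+Enter
Global        New                           Ctrl+N
Project
Scope         Function                      Shortcut
Global        Build All                     Ctrl+B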
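Source
Scope         Function                      Shortcut
Java editor   Format                        Ctrl+Shift+F
Java editor   Uncomment                     Ctrl+\
Java editor   Comment                       Ctrl+/
Java editor   Add Import                    Ctrl+Shift+M
Java editor   Organize Imports              Ctrl+Shift+O
Java editor   Surround with try/catch       Not bound by default. It is used so often that it is listed here; consider assigning a key yourself. You can also reach it through the Ctrl+1 quick fix.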
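Run
Scope         Function                      Shortcut
Global        Step Return                   F7
Global        Step Over                     F6
Global        Step Into                     F5
Global        Step Into Selection           Ctrl+F5
Global        Debug Last Launched           F11
Global        Resume                        F8
Global        Step With Filters             Shift+F5
Global        Add/Remove Breakpoint         Ctrl+Shift+B
Global        Display                       Ctrl+D
Global        Run Last Launched             Ctrl+F11
Global        Run to Line                   Ctrl+R
Global        Execute                       Ctrl+U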
Refactor
Scope         Function                      Shortcut
Global        Undo Refactoring              Alt+Shift+Z
Global        Extract Method                Alt+Shift+M
Global        Extract Local Variable        Alt+Shift+L
Global        Inline                        Alt+Shift+I
Global        Move                          Alt+Shift+V
Global        Rename                        Alt+Shift+R
Global        Redo                          Alt+Shift+Y
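Hotkeys
Template: Alt+/
Change it at: Window -> Preferences -> Workbench -> Keys -> Edit -> Content Assist.
My habit: Shift+Space.
Brief explanation: while editing code, type sysout and press the template hotkey, and System.out.println(); appears automatically.
To set template formats: Window -> Preferences -> Java -> Editor -> Templates.
Format code automatically: Ctrl+Shift+F
Change it at: Window -> Preferences -> Workbench -> Keys -> Source -> Format.
My habit: Alt+Z.
Formatter settings: Window -> Preferences -> Java -> Code Formatter. On the Style page, uncheck "Insert tabs for indentation, not spaces" and enter 4 for the number of spaces below it, so that automatic formatting indents with 4 spaces.
Run the program quickly: Ctrl+F11
My habit: Alt+X.
Change it at: Window -> Preferences -> Workbench -> Keys -> Run -> Run Last Launched.
Brief explanation: the first time you run, Eclipse asks for the launch mode. Once that is set, pressing this hotkey runs the program right away.
Alt+Z (format) followed by Alt+X (run): I find this combination very comfortable.
Import required classes automatically: Ctrl+Shift+O
Brief explanation: suppose no classes have been imported yet and we type:
BufferedReader buf =
    new BufferedReader(new InputStreamReader(System.in));
Eclipse warns that the classes have not been imported. Pressing Ctrl+Shift+O imports them for us.
View the source of a class you use: Ctrl+left mouse click
Brief explanation: shows the source code of the class you are using.
Comment out the selected text: Ctrl+/
Brief explanation: handy when debugging.
Change it at: Window -> Preferences -> Workbench -> Keys -> Source -> Comment.
Switch perspective: Ctrl+F8
My habit: Alt+S.
Change it at: Window -> Preferences -> Workbench -> Keys -> Window -> Next Perspective.
Brief explanation: makes it easy to switch quickly between the editing, debugging, and other perspectives.
Secret tricks
One Eclipse installation can switch between English, Traditional Chinese, and Simplified Chinese displays:
1. First install the Chinese language pack.
2. Append a parameter to the desktop shortcut: English -> -nl "en_US", Traditional Chinese -> -nl "zh_TW", Simplified Chinese -> -nl "zh_CN" (other locales follow the same pattern).
For example, after installing the Chinese pack for 2.1.2, I added the parameter -nl "en_US" to my desktop Eclipse shortcut:
"C:\Program Files\eclipse\eclipse.exe" -nl "en_US"
and the interface went back to English.
With Eclipse, code pasted into Word needs no reformatting: copy the code from the Eclipse editor (Ctrl+C) and paste it (Ctrl+V) directly into Word or WordPad. The code in Word looks exactly as it was set up in Eclipse, including font, indentation, and keyword colors. I have tried JBuilder, GEL, and NetBeans; with copy and paste from those, only the indentation is kept, and the font and colors are lost.
Plug-ins
Installing a plug-in: download the plug-in package and unzip it. You will find two folders, features and plugins. Copy or move their contents into Eclipse's features and plugins folders, then restart Eclipse.
Plug-ins that let Eclipse build GUIs by drag and drop, as JBuilderX does: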
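1. Jigloo SWT/Swing GUI Builder: http://cloudgarden.com/jigloo/index.html
Download this version: Jigloo plugin for Eclipse (using Java 1.4 or 1.5). After installation, choose the GUI type to build from File -> New -> Other -> GUI Form.
2. Eclipse Visual Editor Project: http://www.eclipse.org/vep/
Click Download Page at the bottom, then Latest Release 0.5.0 to reach the download. Besides VE-runtime-0.5.0.zip, these two are also required:
EMF build 1.1.1: (build page) (download zip)
GEF Build 2.1.2: (build page) (download zip)
For the 3.0 M8 version, download:
EMF build I200403250631
GEF Build I20040330
VE-runtime-1.0M1
After a successful installation, start UI design from File -> New -> Visual Class. With the 0.5.0 release, choose the GUI type to build from New -> Java -> AWT & Swing. VE must be paired with the matching versions to work properly; otherwise, even if installation succeeds, problems will show up in use.
Developing JSP programs with Eclipse:
Plug-in name: lomboz (download page) http://forge.objectweb.org/project/showfiles.php?group_id=97
Choose the lomboz download that matches your version: lomboz.212.p1.zip is for 2.1.2, lomboz.3m7.zip is for M7, and so on.
lomboz installation and setup tutorial: Developing JSP with Eclipse (tutorial document)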
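Java to exe
Approach: Eclipse together with JSmooth (free).
1. First use Eclipse to build a JAR that includes a Manifest. (build tutorial)
2. Use JSmooth to wrap the finished JAR as an EXE.
JSmooth download page: http://jsmooth.sourceforge.net/index.php
3. The resulting exe file runs on any Windows machine that has a JRE installed.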
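Step 1 depends on the JAR's manifest naming the entry point. As a rough sketch of what that manifest has to contain, the snippet below builds one with the standard java.util.jar API. The class name com.example.Main and the helper manifestFor are placeholders chosen for illustration, not part of the original setup.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ManifestSketch {
    // Builds a manifest whose Main-Class entry points at the given class.
    static Manifest manifestFor(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version has to be present, or the attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        // "com.example.Main" is a hypothetical entry point.
        Manifest manifest = manifestFor("com.example.Main");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        // Prints the text that would end up in META-INF/MANIFEST.MF.
        System.out.print(out.toString("UTF-8"));
    }
}
```

Eclipse's JAR export wizard produces the same two entries when a main class is selected there.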
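Best settings for the Eclipse Java editor
Editor font: Workbench -> Fonts -> Java Editor Text Font (suggested: Courier New, regular, 10).
Editor settings: Window -> Preferences -> Java -> Editor.
Appearance: check Show line numbers, Highlight matching brackets, Highlight current line, and Show print margin; set the Tab width to 4 and the print margin column to 80.
Code Assist: the defaults are fine.
Syntax: sets the display colors for keywords, strings, and so on.
Annotations: the defaults are fine.
Typing: check every option.
Hovers: the defaults are fine.
Navigation: the defaults are fine.
Formatter settings that bring automatic formatting closest to Java coding conventions: Window -> Preferences -> Java -> Code Formatter.
New Lines: uncheck everything.
Line Splitting: set the maximum line length to 80.
Style: check only Insert a space after a cast.
Number of spaces for indentation: 4.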
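1. Control-Shift-T: Open type. Unless you are deliberately dawdling, forget about opening types through the source tree.
2. Control-Shift-R: Open resource (not only for finding Java files). Tip: use the yellow double-arrow button in the Navigator view to link your editor window with the navigator. The file you open is then shown in the navigator's hierarchy, which helps you organize information. If this slows things down, turn it off.
3. F3: Open declaration. Alternatively, use the Declaration tab (in the Java perspective, choose Window --> Show View --> Declaration). When you select a method in the code and press this key, the whole method is shown in the declaration pane.
4. Alt-left arrow: Go back in the Navigation History. Like a web browser's Back button, it is especially useful after jumping with F3 (to return to where you were editing).
5. Alt-right arrow: Go forward in the navigation history.
6. Control-Q: Go back to the last edit location. This shortcut is also useful after jumping around the code, especially when you have dug too deep and forgotten what you were doing in the first place.
7. Control-Shift-G: Search for references in the workspace. This is a prerequisite for refactoring. For methods, this hotkey does the opposite of F3: it lets you move up the call stack and find all callers of a method. A related feature is occurrence marking. Choose Window -> Preferences -> Java -> Editor -> Mark Occurrences and check the options. Then, when you click an element, every occurrence of that element in the code is highlighted. I personally use only Mark Local Variables. Note: too much highlighting slows Eclipse down.
8. Control-Shift-F: Reformat code according to the code style settings. Our team has a uniform code format, which we keep on our wiki. To do this, we open Eclipse, choose Window -> Preferences -> Java -> Code Style, and then set up Code Formatter, Code Style, and Organize Imports. Use the Export function to generate the configuration files. We put these configuration files on the wiki, and everyone on the team imports them into their own Eclipse.
9. Control-O: Quick outline. With this shortcut you can jump quickly to a method or field by typing just the first few letters of its name.
10. Control-/: Comment or uncomment a line. It works for multiple lines as well.
11. Control-Alt-down arrow: Duplicate the highlighted line or lines.
12. Alt-down arrow: Move one or more lines down. Alt-up arrow moves them up.
The other hotkeys are listed in the menus. Press Control-Shift-L (available since version 3.1) to see a list of all shortcuts. Press Control-Shift-L twice to open the Keys Preferences dialog, where you can set your own hotkeys. I welcome your Eclipse tips in the Talkback section.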
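Ctrl+1 Quick fix (the classic shortcut; nothing more needs saying)
Ctrl+D Delete the current line
Ctrl+Alt+↓ Copy the current line to the line below (duplicate)
Ctrl+Alt+↑ Copy the current line to the line above (duplicate)
Alt+↓ Swap the current line with the line below (very handy; saves cutting and pasting)
Alt+↑ Swap the current line with the line above (likewise)
Alt+← Previous edited page
Alt+→ Next edited page (relative to the entry above, of course)
Alt+Enter Show the properties of the selected resource (project or file)
Shift+Enter Insert a blank line below the current line (the cursor can be anywhere on the current line, not necessarily at the end)
Shift+Ctrl+Enter Insert a blank line at the current line (same principle as the entry above)
Ctrl+Q Go to the last edit location
Ctrl+L Go to a given line (good news for anyone whose files run past 100 lines)
Ctrl+M Maximize the current editor or view (press again to restore)
Ctrl+/ Comment the current line; press again to uncomment
Ctrl+O Show the quick outline
Ctrl+T Show the inheritance hierarchy of the current class
Ctrl+W Close the current editor
Ctrl+K Jump to the next occurrence of the selected word
Ctrl+E Show the drop-down list of open editors (editors not currently visible are shown in bold)
Ctrl+/ (numeric keypad) Collapse all code in the current class
Ctrl+* (numeric keypad) Expand all code in the current class
Ctrl+Space Content assist, which inserts code for you (it usually conflicts with the input method; change the input method's hotkey, or use Alt+/ instead for now)
Ctrl+Shift+E Show the manager for all currently open views (where you can close them, activate them, and so on)
Ctrl+J Forward incremental find (after you press Ctrl+J, the editor matches every letter you type and jumps to the matching word; if there is no match, the status line says so. Very handy when looking for a word; IDEA had this feature two years earlier)
Ctrl+Shift+J Reverse incremental find (same as above, but searching backward)
Ctrl+Shift+F4 Close all open editors
Ctrl+Shift+X Change the selected text to upper case
Ctrl+Shift+Y Change the selected text to lower case
Ctrl+Shift+F Format the current code
Ctrl+Shift+P Go to the matching bracket, such as {} (when jumping from the opening bracket to the closing one, the cursor must be inside the bracket; from closing to opening, the reverse)
The shortcuts below are the ones commonly used in refactoring; I have collected the ones I like and use often (note: refactoring shortcuts generally start with Alt+Shift)
Alt+Shift+R Rename (my personal favorite, especially for renaming variables and classes; it saves a lot of work compared with doing it by hand)
Alt+Shift+M Extract method (one of the most used refactorings, especially useful for a big lump of tangled code)
Alt+Shift+C Change method signature (quite practical: if N functions call this method, one change fixes them all)
Alt+Shift+L Extract local variable (pulls magic numbers and strings straight into a variable, especially when they are used in several places)
Alt+Shift+F Convert a local variable in the class to a field (a fairly practical feature)
Alt+Shift+I Inline a variable
Alt+Shift+V Move functions and variables (not used much)
Alt+Shift+Z Undo for refactoring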
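Frequently used Eclipse shortcuts
Save Ctrl+S (be sure to remember this one)
Comment code Ctrl+/
Uncomment Ctrl+\ (merged into Ctrl+/ in Eclipse 3)
Content assist Alt+/
Quick fix Ctrl+1
Format code Ctrl+Shift+F
Organize imports Ctrl+Shift+O
Switch editor Ctrl+F6 (can be remapped to Ctrl+Tab for convenience)
Ctrl+Shift+M Import a class that has not been imported yet
Ctrl+W Close a single window
F3 Jump to the declaration of a class or variable
F11 Debug the last launched program
Ctrl+F11 Run the last launched program
Alt + Go back to the next edit location
Ctrl+Shift+T Find a class in the project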
Measuring how much memory Java objects occupy
By Vladimir Roubtsov, JavaWorld.com, 08/16/02
Recently, I helped design a Java server application that resembled an in-memory database. That is, we biased the design toward caching tons of data in memory to provide super-fast query performance.
Once we got the prototype running, we naturally decided to profile the data memory footprint after it had been parsed and loaded from disk. The unsatisfactory initial results, however, prompted me to search for explanations.
Since Java purposefully hides many aspects of memory management, discovering how much memory your objects consume takes some work. You could use the Runtime.freeMemory() method to measure heap size differences before and after several objects have been allocated. Several articles, such as Ramchander Varadarajan's "Question of the Week No. 107" (Sun Microsystems, September 2000) and Tony Sintes's "Memory Matters" (JavaWorld, December 2001), detail that idea. Unfortunately, the former article's solution fails because the implementation employs a wrong Runtime method, while the latter article's solution has its own imperfections:
- A single call to Runtime.freeMemory() proves insufficient because a JVM may decide to increase its current heap size at any time (especially when it runs garbage collection). Unless the total heap size is already at the -Xmx maximum size, we should use Runtime.totalMemory()-Runtime.freeMemory() as the used heap size.
- Executing a single Runtime.gc() call may not prove sufficiently aggressive for requesting garbage collection. We could, for example, request object finalizers to run as well. And since Runtime.gc() is not documented to block until collection completes, it is a good idea to wait until the perceived heap size stabilizes.
- If the profiled class creates any static data as part of its per-class class initialization (including static class and field initializers), the heap memory used for the first class instance may include that data. We should ignore heap space consumed by the first class instance.
Considering those problems, I present Sizeof, a tool with which I snoop at various Java core and application classes:
public class Sizeof
{
    public static void main (String [] args) throws Exception
    {
        // Warm up all classes/methods we will use
        runGC ();
        usedMemory ();
        // Array to keep strong references to allocated objects
        final int count = 100000;
        Object [] objects = new Object [count];

        long heap1 = 0;
        // Allocate count+1 objects, discard the first one
        for (int i = -1; i < count; ++ i)
        {
            Object object = null;

            // Instantiate your data here and assign it to object

            object = new Object ();
            //object = new Integer (i);
            //object = new Long (i);
            //object = new String ();
            //object = new byte [128][1]

            if (i >= 0)
                objects [i] = object;
            else
            {
                object = null; // Discard the warm up object
                runGC ();
                heap1 = usedMemory (); // Take a before heap snapshot
            }
        }
        runGC ();
        long heap2 = usedMemory (); // Take an after heap snapshot:

        final int size = Math.round (((float)(heap2 - heap1))/count);
        System.out.println ("'before' heap: " + heap1 +
                            ", 'after' heap: " + heap2);
        System.out.println ("heap delta: " + (heap2 - heap1) +
            ", {" + objects [0].getClass () + "} size = " + size + " bytes");
        for (int i = 0; i < count; ++ i) objects [i] = null;
        objects = null;
    }
    private static void runGC () throws Exception
    {
        // It helps to call Runtime.gc()
        // using several method calls:
        for (int r = 0; r < 4; ++ r) _runGC ();
    }
    private static void _runGC () throws Exception
    {
        long usedMem1 = usedMemory (), usedMem2 = Long.MAX_VALUE;
        // Repeat until the used heap stops shrinking (or 500 attempts)
        for (int i = 0; (usedMem1 < usedMem2) && (i < 500); ++ i)
        {
            s_runtime.runFinalization ();
            s_runtime.gc ();
            Thread.yield ();

            usedMem2 = usedMem1;
            usedMem1 = usedMemory ();
        }
    }
    private static long usedMemory ()
    {
        return s_runtime.totalMemory () - s_runtime.freeMemory ();
    }

    private static final Runtime s_runtime = Runtime.getRuntime ();
} // End of class
Sizeof's key methods are runGC() and usedMemory(). I use a runGC() wrapper method to call _runGC() several times because it appears to make the method more aggressive. (I am not sure why, but it's possible creating and destroying a method call-stack frame causes a change in the reachability root set and prompts the garbage collector to work harder. Moreover, consuming a large fraction of the heap space to create enough work for the garbage collector to kick in also helps. In general, it is hard to ensure everything is collected. The exact details depend on the JVM and garbage collection algorithm.)
Note carefully the places where I invoke runGC(). You can edit the code between the heap1 and heap2 declarations to instantiate anything of interest.
Also note how Sizeof prints the object size: the transitive closure of data required by all count class instances, divided by count. For most classes, the result will be memory consumed by a single class instance, including all of its owned fields. That memory footprint value differs from data provided by many commercial profilers that report shallow memory footprints (for example, if an object has an int[] field, its memory consumption will appear separately).
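Editing main() for every type gets tedious, so here is a minimal sketch of the same measurement with the instantiation passed in as a factory. It assumes Java 8 or later for the lambda; the names SizeofSketch, estimate, and settle are invented for this illustration and are not part of the article's tool. Like Sizeof itself, it only yields an estimate that depends on the JVM and garbage collector.

```java
import java.util.function.IntFunction;

final class SizeofSketch {
    private static final Runtime RUNTIME = Runtime.getRuntime();

    // Estimates the average heap bytes per object produced by the factory.
    static long estimate(IntFunction<Object> factory, int count) {
        Object[] keep = new Object[count];   // strong references to the instances
        factory.apply(-1);                   // warm-up instance, discarded
        settle();
        long before = used();                // "before" heap snapshot
        for (int i = 0; i < count; i++) {
            keep[i] = factory.apply(i);
        }
        settle();
        long after = used();                 // "after" heap snapshot
        // Read the array after the snapshot so it stays reachable until here.
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return Math.round((after - before) / (double) count);
    }

    // Requests collection until the used heap stops shrinking (at most 50 rounds).
    private static void settle() {
        long previous = Long.MAX_VALUE;
        long current = used();
        for (int i = 0; i < 50 && current < previous; i++) {
            RUNTIME.gc();
            Thread.yield();
            previous = current;
            current = used();
        }
    }

    private static long used() {
        return RUNTIME.totalMemory() - RUNTIME.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("byte[16]: ~" + estimate(i -> new byte[16], 100000) + " bytes");
        System.out.println("Object:   ~" + estimate(i -> new Object(), 100000) + " bytes");
    }
}
```

The array of references is allocated before the first snapshot, so its own footprint does not count toward the per-object figure.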
The results
Let's apply this simple tool to a few classes, then see if the results match our expectations.
Note: The following results are based on Sun's JDK 1.3.1 for Windows. Due to what is and is not guaranteed by the Java language and JVM specifications, you cannot apply these specific results to other platforms or other Java implementations.
java.lang.Object
Well, the root of all objects just had to be my first case. For java.lang.Object, I get:
So, a plain Object takes 8 bytes; of course, no one should expect the size to be 0, as every instance must carry around fields that support base operations like equals(), hashCode(), wait()/notify(), and so on.
java.lang.Integer
My colleagues and I frequently wrap native ints into Integer instances so we can store them in Java collections. How much does it cost us in memory?
The 16-byte result is a little worse than I expected because an int value can fit into just 4 extra bytes. Using an Integer costs me a 300 percent memory overhead compared to when I can store the value as a primitive type.
java.lang.Long
Long should take more memory than Integer, but it does not:
Clearly, actual object size on the heap is subject to low-level memory alignment done by a particular JVM implementation for a particular CPU type. It looks like a Long is 8 bytes of Object overhead, plus 8 bytes more for the actual long value. In contrast, Integer had an unused 4-byte hole, most likely because the JVM I use forces object alignment on an 8-byte word boundary.
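The alignment argument can be checked with a little arithmetic. The sketch below assumes the 8-byte object overhead and 8-byte alignment inferred above for this particular JVM; alignTo8 is a helper name made up for the example.

```java
final class AlignmentSketch {
    // Rounds a raw byte count up to the next multiple of 8.
    static int alignTo8(int rawBytes) {
        return (rawBytes + 7) & ~7;
    }

    public static void main(String[] args) {
        int header = 8; // per-object overhead measured for java.lang.Object
        System.out.println(alignTo8(header));     // plain Object: prints 8
        System.out.println(alignTo8(header + 4)); // Integer (header + int): prints 16
        System.out.println(alignTo8(header + 8)); // Long (header + long): prints 16
    }
}
```

Under that model the Integer's 12 raw bytes are padded to 16, which is the unused 4-byte hole described above, while the Long's 16 bytes need no padding.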
Arrays
Playing with primitive type arrays proves instructive, partly to discover any hidden overhead and partly to justify another popular trick: wrapping primitive values in a size-1 array to use them as objects. By modifying Sizeof.main() to have a loop that increments the created array length on every iteration, I get for int arrays:
length: 0, {class [I} size = 16 bytes
length: 1, {class [I} size = 16 bytes
length: 2, {class [I} size = 24 bytes
length: 3, {class [I} size = 24 bytes
length: 4, {class [I} size = 32 bytes
length: 5, {class [I} size = 32 bytes
length: 6, {class [I} size = 40 bytes
length: 7, {class [I} size = 40 bytes
length: 8, {class [I} size = 48 bytes
length: 9, {class [I} size = 48 bytes
length: 10, {class [I} size = 56 bytes
and for char arrays:
length: 0, {class [C} size = 16 bytes
length: 1, {class [C} size = 16 bytes
length: 2, {class [C} size = 16 bytes
length: 3, {class [C} size = 24 bytes
length: 4, {class [C} size = 24 bytes
length: 5, {class [C} size = 24 bytes
length: 6, {class [C} size = 24 bytes
length: 7, {class [C} size = 32 bytes
length: 8, {class [C} size = 32 bytes
length: 9, {class [C} size = 32 bytes
length: 10, {class [C} size = 32 bytes
Above, the evidence of 8-byte alignment pops up again. Also, in addition to the inevitable Object 8-byte overhead, a primitive array adds another 8 bytes (out of which at least 4 bytes support the length field). And using int[1] appears to not offer any memory advantages over an Integer instance, except maybe as a mutable version of the same data.
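One simple model reproduces both listings: 12 bytes of array header (the 8-byte object overhead plus a 4-byte length) followed by the elements, with the total rounded up to a multiple of 8. That 12-byte split is an assumption fitted to the figures above, not something the measurements prove, and arrayBytes is a name invented for this sketch.

```java
final class ArraySizeSketch {
    // Assumed model: 12 header bytes (8 object + 4 length), padded to a multiple of 8.
    static int arrayBytes(int elementBytes, int length) {
        int raw = 12 + elementBytes * length;
        return (raw + 7) & ~7;
    }

    public static void main(String[] args) {
        for (int length = 0; length <= 10; length++) {
            System.out.println("length: " + length
                + ", int[] = " + arrayBytes(4, length) + " bytes"
                + ", char[] = " + arrayBytes(2, length) + " bytes");
        }
    }
}
```

For example, an int[1] comes to 12 + 4 = 16 bytes with no padding, and a char[3] comes to 12 + 6 = 18 bytes, padded to 24.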
Multidimensional arrays
Multidimensional arrays offer another surprise. Developers commonly employ constructs like int[dim1][dim2] in numerical and scientific computing. In an int[dim1][dim2] array instance, every nested int[dim2] array is an Object in its own right. Each adds the usual 16-byte array overhead. When I don't need a triangular or ragged array, that represents pure overhead. The impact grows when array dimensions greatly differ. For example, an int[128][2] instance takes 3,600 bytes. Compared to the 1,040 bytes an int[256] instance uses (which has the same capacity), 3,600 bytes represent a 246 percent overhead. In the extreme case of byte[256][1], the overhead factor is almost 19! Compare that to the C/C++ situation in which the same syntax does not add any storage overhead.
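These totals can be worked out by hand. The sketch below assumes 4-byte references (a 32-bit JVM) and a fitted array model of 12 header bytes padded to a multiple of 8; both are assumptions for illustration, and the method names are invented here.

```java
final class NestedArraySketch {
    // Assumed model: 12 header bytes plus elements, padded to a multiple of 8.
    static int arrayBytes(int elementBytes, int length) {
        return (12 + elementBytes * length + 7) & ~7;
    }

    // One outer array of dim1 references (assumed 4 bytes each) plus dim1 inner arrays.
    static int nestedBytes(int elementBytes, int dim1, int dim2) {
        return arrayBytes(4, dim1) + dim1 * arrayBytes(elementBytes, dim2);
    }

    public static void main(String[] args) {
        System.out.println(nestedBytes(4, 128, 2)); // int[128][2]: prints 3600
        System.out.println(arrayBytes(4, 256));     // int[256]: prints 1040
        System.out.println(nestedBytes(1, 256, 1)); // byte[256][1]: prints 5136
        System.out.println(arrayBytes(1, 256));     // byte[256]: prints 272
    }
}
```

The first pair gives (3600 - 1040) / 1040, about 246 percent, and the second pair gives 5136 / 272, just under 19, matching the figures quoted above.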
java.lang.String
Let's try an empty String, first constructed as new String():
'before' heap: 510696, 'after' heap: 4510696
heap delta: 4000000, {class java.lang.String} size = 40 bytes
The result proves quite depressing. An empty String takes 40 bytes, enough memory to fit 20 Java characters.
Before I try Strings with content, I need a helper method to create Strings guaranteed not to get interned. Merely using literals as in:
    object = "string with 20 chars";
will not work because all such object handles will end up pointing to the same String instance. The language specification dictates such behavior (see also the java.lang.String.intern() method). Therefore, to continue our memory snooping, try:
public static String createString (final int length)
{
    char [] result = new char [length];
    for (int i = 0; i < length; ++ i) result [i] = (char) i;

    return new String (result);
}
After arming myself with this String creator method, I get the following results:
length: 0, {class java.lang.String} size = 40 bytes
length: 1, {class java.lang.String} size = 40 bytes
length: 2, {class java.lang.String} size = 40 bytes
length: 3, {class java.lang.String} size = 48 bytes
length: 4, {class java.lang.String} size = 48 bytes
length: 5, {class java.lang.String} size = 48 bytes
length: 6, {class java.lang.String} size = 48 bytes
length: 7, {class java.lang.String} size = 56 bytes
length: 8, {class java.lang.String} size = 56 bytes
length: 9, {class java.lang.String} size = 56 bytes
length: 10, {class java.lang.String} size = 56 bytes
The results clearly show that a String's memory growth tracks its internal char array's growth. However, the String class adds another 24 bytes of overhead. For a nonempty String of size 10 characters or less, the added overhead cost relative to useful payload (2 bytes for each char plus 4 bytes for the length), ranges from 100 to 400 percent.
Of course, the penalty depends on your application's data distribution. Somehow I suspected that 10 characters represents the typical String length for a variety of applications. To get a concrete data point, I instrumented the SwingSet2 demo (by modifying the String class implementation directly) that came with JDK 1.3.x to track the lengths of the Strings it creates. After a few minutes playing with the demo, a data dump showed that about 180,000 Strings were instantiated. Sorting them into size buckets confirmed my expectations:
[0-10]: 96481
[10-20]: 27279
[20-30]: 31949
[30-40]: 7917
[40-50]: 7344
[50-60]: 3545
[60-70]: 1581
[70-80]: 1247
[80-90]: 874
...
That's right, more than 50 percent of all String lengths fell into the 0-10 bucket, the very hot spot of String class inefficiency!
In reality, Strings can consume even more memory than their lengths suggest: Strings generated out of StringBuffers (either explicitly or via the '+' concatenation operator) likely have char arrays with lengths larger than the reported String lengths because StringBuffers typically start with a capacity of 16, then double it on append() operations. So, for example, createString(1) + ' ' ends up with a char array of size 16, not 2.
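The capacity gap is easy to observe on the buffer itself. This small sketch relies only on the documented default capacity of 16 for new StringBuffer(); the helper name slack is made up. It shows the buffer's own spare room; whether a String built from the buffer keeps that spare room depends on the JDK implementation.

```java
final class BufferCapacitySketch {
    // Number of allocated but unused chars in the buffer.
    static int slack(StringBuffer buffer) {
        return buffer.capacity() - buffer.length();
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer();
        System.out.println(buffer.capacity()); // prints 16, the documented default
        buffer.append("a").append(' ');
        // Two chars are in use, yet the backing array still holds 16.
        System.out.println(buffer.length() + " chars used, " + slack(buffer) + " spare");
    }
}
```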
What do we do?
"This is all very well, but we don't have any choice but to use Strings and other types provided by Java, do we?" I hear you ask. Let's find out.
Wrapper classes
Wrapper classes like java.lang.Integer seem a bad choice for storing large data amounts in memory. If you strive to be memory-economic, avoid them altogether. Rolling your own vector class for primitive ints isn't difficult. Of course, it would be great if the Java core API already contained such libraries. Perhaps the situation will improve when Java has generic types.
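As an illustration of how little code such a vector needs, here is a minimal sketch of a growable int container. The class name IntVector, its starting capacity of 16, and its doubling policy are choices made for this example only.

```java
import java.util.Arrays;

final class IntVector {
    private int[] data = new int[16];
    private int size;

    // Appends a value, doubling the backing array when it is full.
    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, Math.max(16, size * 2));
        }
        data[size++] = value;
    }

    // Returns the value at the given position, checking against the logical size.
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    // Drops unused capacity once the vector has stopped growing.
    void trimToSize() {
        data = Arrays.copyOf(data, size);
    }

    public static void main(String[] args) {
        IntVector squares = new IntVector();
        for (int i = 0; i < 100; i++) {
            squares.add(i * i);
        }
        squares.trimToSize();
        System.out.println(squares.get(99)); // prints 9801
    }
}
```

Each element costs 4 bytes in the backing int[] instead of a reference plus a separate Integer object.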
Multidimensional arrays
For large data structures built with multidimensional arrays, you can oftentimes reduce the extra dimension overhead by an easy indexing change: convert every int[dim1][dim2] instance to an int[dim1*dim2] instance and change all expressions like a[i][j] to a[i*dim2 + j]. Of course, you pay a price from the lack of index-range checking on the dim2 dimension (which also boosts performance).
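A minimal sketch of that conversion, using row-major layout so that element (i, j) lives at index i*dim2 + j, where dim2 is the length of one row. The class name FlatMatrix is hypothetical.

```java
final class FlatMatrix {
    private final int[] cells;
    private final int dim2; // row length, needed to compute offsets

    FlatMatrix(int dim1, int dim2) {
        this.cells = new int[dim1 * dim2]; // one array object instead of dim1 + 1
        this.dim2 = dim2;
    }

    int get(int i, int j) {
        return cells[i * dim2 + j];
    }

    void set(int i, int j, int value) {
        cells[i * dim2 + j] = value;
    }

    public static void main(String[] args) {
        FlatMatrix m = new FlatMatrix(128, 2);
        m.set(127, 1, 42);
        System.out.println(m.get(127, 1)); // prints 42
    }
}
```

Note that an out-of-range j is not caught here: get(0, 2) on the matrix above silently reads element (1, 0), which is the missing range check mentioned in the text.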
java.lang.String
You can try a few simple tricks to reduce your application's String static memory size.
First, you can try one common technique when an application loads and caches many Strings from a data file or a network connection, and the String value range proves limited. For example, suppose you parse an XML file in which you frequently encounter a certain attribute, but the attribute is limited to just two possible values. Your goal: filter all Strings through a hash map and reduce all equal but distinct Strings to identical object references:
public String internString (String s)
{
    if (s == null) return null;

    String is = (String) m_strings.get (s);
    if (is != null)
        return is;
    else
    {
        m_strings.put (s, s);
        return s;
    }
}

private Map m_strings = new HashMap ();
When applicable, that trick can decrease your static memory requirements by hundreds of percent. An experienced reader may observe that the trick duplicates java.lang.String.intern()'s functionality. Numerous reasons exist to avoid the String.intern() method. One is that few modern JVMs can intern large amounts of data.
What if your Strings are all different? For the second trick, recollect that for small Strings the underlying char array takes half the memory occupied by the String that wraps it. Thus, when my application caches many distinct String values, I can just keep the arrays in memory and convert them to Strings as needed. That works well if each such String then serves as a transient, quickly discarded object. A simple experiment with caching 90,000 words taken from a sample dictionary file shows that this data takes about 5.6 MB in String form and only 3.4 MB in char[] form, a reduction of roughly 39 percent.
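A minimal sketch of the second trick: the cache holds only char arrays and materializes a short-lived String on demand. The class name CharArrayCache and its methods are invented for this illustration.

```java
final class CharArrayCache {
    private final char[][] entries;

    // Copies each word into its own char[]; the source Strings can then be dropped.
    CharArrayCache(String[] words) {
        entries = new char[words.length][];
        for (int i = 0; i < words.length; i++) {
            entries[i] = words[i].toCharArray();
        }
    }

    // Builds a transient String for the caller; the constructor copies the array.
    String get(int index) {
        return new String(entries[index]);
    }

    int size() {
        return entries.length;
    }

    public static void main(String[] args) {
        CharArrayCache cache = new CharArrayCache(new String[] {"alpha", "beta"});
        System.out.println(cache.get(1)); // prints beta
    }
}
```

Every get() pays for one copy, so the approach suits data that is read occasionally rather than in a tight loop.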
The second trick contains one obvious disadvantage: you cannot convert a char[] back to a String through a constructor that would take ownership of the array without cloning it. Why? Because the entire public String API ensures that every String is immutable, so every String constructor defensively clones input data passed through its parameters.
Still, you can try a third trick when the cost of converting from char arrays to Strings proves too high. The trick exploits java.lang.String.substring()'s ability to avoid data copying: the method implementation exploits String immutability and creates a shallow String object that shares the char content array with the original String but has its internal start and end indices adjusted correspondingly. To make an example, new String("smiles").substring(1,5) is a String configured to start at index 1 and end at index 4 within a char buffer "smiles" shared by reference with the originally constructed String. You can exploit that fact as follows: given a large String set, you can merge its char content into one large char array, create a String out of it, and recreate the original Strings as substrings of this master String, as the following method illustrates:
public static String [] compactStrings (String [] strings)
{
    String [] result = new String [strings.length];
    int offset = 0;

    // First pass: total number of chars across all inputs
    for (int i = 0; i < strings.length; ++ i)
        offset += strings [i].length ();

    // Can't use StringBuffer due to how it manages capacity
    char [] allchars = new char [offset];

    // Second pass: copy every String's chars into the shared array
    offset = 0;
    for (int i = 0; i < strings.length; ++ i)
    {
        strings [i].getChars (0, strings [i].length (), allchars, offset);
        offset += strings [i].length ();
    }

    String allstrings = new String (allchars);

    // Third pass: recreate each input as a substring of the master String
    offset = 0;
    for (int i = 0; i < strings.length; ++ i)
        result [i] = allstrings.substring (offset,
            offset += strings [i].length ());

    return result;
}
The above method returns a new set of Strings equivalent to the input set but more compact in memory. Recollect from earlier measurements that every char[] adds 16 bytes of overhead; this method effectively removes that overhead. The savings could be significant when cached data comprises mostly short Strings. When you apply this trick to the same 90,000-word dictionary mentioned above, the memory size drops from 5.6 MB to 4.2 MB, a 25 percent reduction. (An astute reader will observe that in that particular example the Strings tend to share many prefixes and the compactStrings() method could be further optimized to reduce the merged char array's size.)
As a side effect, compactStrings() also removes the StringBuffer-related inefficiencies mentioned earlier.
Is it worth the effort?
To many, the techniques I presented may seem like micro-optimizations not worth the time it takes to implement them. However, remember the applications I had in mind: server-side applications that cache massive amounts of data in memory to achieve performance impossible when data comes from a disk or database. Several hundred megabytes of cached data represents a noticeable fraction of maximum heap sizes of today's 32-bit JVMs. Shaving 30 percent or more off is nothing to scoff at; it could push an application's scalability limits quite noticeably. Of course, these tricks cannot substitute for beginning with well-designed data structures and profiling your application to determine its actual hot spots. In any case, you're now more aware of how much memory your objects consume.
Author Bio
Vladimir Roubtsov has programmed in a variety of languages for more than 12 years, including Java since 1995. Currently, he develops enterprise software as a senior developer for Trilogy in Austin, Texas. When coding for fun, Vladimir develops software tools based on Java byte code or source code instrumentation.