Erlang非业余研究

~水手~!! 2011-04-25

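The Taobao item database is one of the most critical databases at Taobao. It runs on a MySQL primary/standby cluster and is characterized by a large, fast-growing data set, a read-heavy workload, strict safety requirements, and high request concurrency.

The talk covers hardware selection for the Taobao item database and the trade-off between safety and performance, in particular the introduction of PCI-E Flash cards and Flashcache as a cache to improve IO performance. Without compromising safety, we tuned the caches along the entire IO path: MySQL, the InnoDB engine, the file system, the system page cache, the IO scheduler, the DM layer (Flashcache), the RAID card, and the device driver. This unlocked further IO headroom. The talk focuses on the lessons learned during the tuning, along with the measurement methods and tools used.

Taobao Item Database MySQL Optimization in Practice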

View more presentations from Feng Yu.

Have fun!

Post Footer automatically generated by wp-posturl plugin for wordpress.

Categories: Linux, Tuning Tags: mysql, tuning

EEP 36: Line numbers in exceptions

April 2nd, 2011 Yu Feng No comments

Original article. If you repost it, please credit: reposted from Erlang非业余研究

Permalink: EEP 36: Line numbers in exceptions

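A proposal to include line numbers when an Erlang program prints the stack trace of an exception has recently come up for discussion. See:
EEP 36: Line numbers in exceptions: http://www./eeps/eep-0036.html

Most Erlang beginners run into this frustration: an exception does print a stack trace, but the trace contains only function names. In a long module it is hard to find the exact point where the exception was raised, so people usually fall back on adding log statements, which is slow and tedious. Some joke that this is Erlang's way of encouraging short functions and modules. I once came up with a workaround for this problem (see here), but it is not a perfect solution.

EEP 36 fixes the problem directly in the compiler, which is far more pleasant. Let's look at the result: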
Read more…

Categories: Erlang Exploration Tags: EEP, line numbers

oprofile captures no sample data: the problem and the fix

April 1st, 2011 Yu Feng 1 comment

Permalink: oprofile captures no sample data: the problem and the fix

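Some colleagues recently reported that when doing performance tuning on certain new machines, oprofile sometimes captures no data. I had hit the same thing before and found it baffling, so today I set out to verify it.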

# Our operating system and machine configuration look roughly like this:
$ sudo aspersa/summary
# Aspersa System Summary Report ##############################
Date | 2011-03-31 16:26:05 UTC (local TZ: CST +0800)
Hostname | my031226.sqa.cm4
Uptime | 10:00, 4 users, load average: 0.00, 0.78, 5.29
System | Huawei Technologies Co., Ltd.; Tecal RH2285; vV100R001 (Main Server Chassis)
Service Tag | 2102317716N0AA000062
Release | Red Hat Enterprise Linux Server release 5.4 (Tikanga)
Kernel | 2.6.18-164.el5
Architecture | CPU = 64-bit, OS = 64-bit
Threading | NPTL 2.5
Compiler | GNU CC version 4.1.2 20080704 (Red Hat 4.1.2-44).
SELinux | Disabled
# Processor ##################################################
Processors | physical = 2, cores = 12, virtual = 24, hyperthreading = yes
Speeds | 24x2400.151
Models | 24xIntel(R) Xeon(R) CPU X5670 @ 2.93GHz
Caches | 24x12288 KB
..
$ sudo rm -f /root/.oprofile/daemonrc
$ sudo opcontrol --setup --no-vmlinux
$ sudo opcontrol --init
$ sudo opcontrol --reset
$ sudo opcontrol --start
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.
$ sudo opcontrol --status
Daemon running: pid 9253
Separate options: none
vmlinux file: none
Image filter: none
Call-graph depth: 0
# Have a cup of tea here and let the profiler run for a while
$ sudo opcontrol --shutdown
Stopping profiling.
Killing daemon.
$ opreport
opreport error: No sample file found: try running opcontrol --dump
or specify a session containing sample files
$ tree /var/lib/oprofile/samples/current/
/var/lib/oprofile/samples/current/
0 directories, 0 files

Sure enough, no sample files were captured!

After a great deal of analysis and guesswork, plus help from the almighty Google, I found the root cause:
Read more…

Categories: Linux, Tools, Tuning Tags: oprofile, timer=1

socktop: a handy socket read/write viewer for Linux

March 31st, 2011 Yu Feng No comments

Permalink: socktop: a handy socket read/write viewer for Linux

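This evening 雕梁 said he was looking for a tool to inspect sending and receiving on Unix domain sockets, for example whether program A actually sent the data and whether program B received it. He tried tcpdump, wireshark, and the like, but none of them seemed to support this.

This is where the mighty systemtap comes to the rescue. All socket communication goes through the socket interface, and that holds for every address family, Unix domain sockets included. So intercepting the handful of syscalls that read and write sockets is enough.

The systemtap distribution ships a tool called socktop, located at /usr/share/doc/systemtap/examples/network/socktop. It is very convenient and a perfect fit for this job.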
Read more…

Categories: Linux, Tools, Network Programming Tags: socktop, systemtap

A diagram of Linux pagecache behavior

March 30th, 2011 Yu Feng No comments

Permalink: A diagram of Linux pagecache behavior

The picture speaks for itself:

Categories: Linux, Tuning Tags: linux, pagecache

latencytop: a deep look at latency in your Linux system

March 29th, 2011 Yu Feng 6 comments

Permalink: latencytop: a deep look at latency in your Linux system

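When tuning a system or tracking down a problem, we often find that a multithreaded program runs inefficiently without knowing where the problem lies. We know there are a lot of context switches, but not why they happen or who causes them. Context switches can be observed with a tool such as dstat, for example: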

$ dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
  9   2  87   2   0   1|7398k   31M|   0     0 |9.8k   11k| 16k   64k
 20   4  69   3   0   4|  26M   56M|  34M  172M|   0     0 | 61k  200k
 21   5  64   6   0   3|  26M  225M|  35M  175M|   0     0 | 75k  216k
 21   5  66   4   0   4|  25M  119M|  34M  173M|   0     0 | 66k  207k
 19   4  68   5   0   3|  23M   56M|  33M  166M|   0     0 | 60k  197k
# Or use a systemtap script:
$ sudo stap -e 'global cnt; probe scheduler.cpu_on {cnt<<<1;} probe timer.s(1){printf("%d\n", @count(cnt)); delete cnt;}'
217779
234141
234759
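
A per-second figure like the ones above can also be derived without extra tools. The following is only a minimal sketch, assuming a Linux system where /proc/stat exposes the cumulative context-switch count on its `ctxt` line. The helper names `ctxt_total` and `ctxt_rate` are made up for illustration and are not part of dstat or systemtap.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of dstat or systemtap.

# Read /proc/stat-formatted text on stdin and print the cumulative
# context-switch count (the value on the "ctxt" line).
ctxt_total() {
    awk '$1 == "ctxt" { print $2; exit }'
}

# Print the average number of context switches per second between two samples.
# $1 = earlier count, $2 = later count, $3 = interval in seconds
ctxt_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Sample twice, one second apart, when /proc/stat is readable.
if [ -r /proc/stat ]; then
    before=$(ctxt_total < /proc/stat)
    sleep 1
    after=$(ctxt_total < /proc/stat)
    # Only report when both samples actually contained a ctxt line.
    if [ -n "$before" ] && [ -n "$after" ]; then
        echo "csw/s: $(ctxt_rate "$before" "$after" 1)"
    fi
fi
```

This gives only a system-wide total, just like the `csw` column and the systemtap counter. It says nothing about which task switched or why.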

Roughly 200k context switches per second. Who can tell me what is going on? Well, latencytop comes to the rescue!

Its official site: http://www./

Skipping audio, slower servers, everyone knows the symptoms of latency. But to know what’s going on in the system, what’s causing the latency, how to fix it… that’s a hard question without good answers right now.

LatencyTOP is a Linux* tool for software developers (both kernel and userspace), aimed at identifying where in the system latency is happening, and what kind of operation/action is causing the latency to happen so that the code can be changed to avoid the worst latency hiccups.

It is another performance viewer contributed by Intel, the other being powertop. Both are very good tools.
Read more…

Categories: Linux, Tools, Tuning Tags: dstat, latencytop, systemtap