问题:在Linux 文件系统I/O写数据操作,内核代码开销到底多少?
目标:分析基于SCSI设备Linux文件系统写数据时内核代码开销。
方法:
(1)在文件系统写入口函数vfs_write()统计每次同步写耗时。
(2)在SCSI设备IO完成操作函数blk_account_io_done()中统计SCSI设备层处理I/O写请求的时间。
(3)写I/O操作Linux内核代码耗时为:vfs_write()总耗时 – SCSI设备处理时间。
(4)因为Linux I/O内核代码执行路径不变,在各分支情况对整体影响不大,便于准确统计,我们以每10万次I/O写为统计单位,计算平均值。
(5)以4K同步写为基础,计算Linux内核I/O代码开销。
(6)为了时间准确性,我们以CPU Cycles为计算单位。
(7)基于ext4文件系统,测试工具为fio,单并发、IOdepth为1。
配置:
(1)内核版本:3.10.0-327.alios7.x86_64
(2)机器型号:Inspur SA5212M4
(3)硬件配置:2xE5-2684 v2, 384GB mem, Avago 3008 SAS HBA, 10x800GB Intel SSD DC 3510
测试结果:
在进行5100000次4K同步写操作后,vfs_write()平均耗时为110971 CPU Cycles,blk_account_io_done()统计到的SCSI设备4K同步写平均耗时102207 CPU Cycles。
即我们得出的Linux同步写I/O整个内核协议栈代码开销为110971- 102207= 8764 CPU Cycles。即整个Linux内核I/O栈开销约为8700 CPU Cycles。
那么我们接下来转换为时间:以我们E5-2682 v4 2.5GHz为例。
Linux内核代码开销为8764/2500000000 = 0.0000035056,即3.5us。
当我们的CPU打开Turbo后,CPU频率为2.9GHz及以上,那么Linux内核I/O开销为8764/2900000000 =0.000003022068966,即3us。
#dmesg
blk_io_done_account() ===>> WRITE Disk:sdf, Sector:1410199904, data_len:0:, Duration:100173 cycles, average:<105595> cycles, Total Writes:99999
==>> vfs_write() Duration:124727, filename:sdf, pos:409600000, diskname:(null), Aver TSC:114489, Total Writes:100000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1410999888, data_len:0:, Duration:99800 cycles, average:<105428> cycles, Total Writes:199999
==>> vfs_write() Duration:108812, filename:sdf, pos:819200000, diskname:(null), Aver TSC:114447, Total Writes:200000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1411799872, data_len:0:, Duration:99276 cycles, average:<105337> cycles, Total Writes:299999
==>> vfs_write() Duration:109385, filename:sdf, pos:1228800000, diskname:(null), Aver TSC:114351, Total Writes:300000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1412599856, data_len:0:, Duration:101400 cycles, average:<105274> cycles, Total Writes:399999
==>> vfs_write() Duration:110121, filename:sdf, pos:1638400000, diskname:(null), Aver TSC:114353, Total Writes:400000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1413399840, data_len:0:, Duration:100321 cycles, average:<105294> cycles, Total Writes:499999
==>> vfs_write() Duration:160739, filename:sdf, pos:2048000000, diskname:(null), Aver TSC:114501, Total Writes:500000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1414461968, data_len:0:, Duration:99646 cycles, average:<105303> cycles, Total Writes:599999
==>> vfs_write() Duration:110406, filename:sdf, pos:2457600000, diskname:(null), Aver TSC:114511, Total Writes:600000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1415261952, data_len:0:, Duration:100285 cycles, average:<105344> cycles, Total Writes:699999
==>> vfs_write() Duration:111148, filename:sdf, pos:2867200000, diskname:(null), Aver TSC:114654, Total Writes:700000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1416061936, data_len:0:, Duration:270797 cycles, average:<105344> cycles, Total Writes:799999
==>> vfs_write() Duration:109676, filename:sdf, pos:3276800000, diskname:(null), Aver TSC:114710, Total Writes:800000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1416861912, data_len:0:, Duration:100157 cycles, average:<105351> cycles, Total Writes:899999
==>> vfs_write() Duration:108173, filename:sdf, pos:3686400000, diskname:(null), Aver TSC:114755, Total Writes:900000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1417661896, data_len:0:, Duration:100921 cycles, average:<105359> cycles, Total Writes:999999
==>> vfs_write() Duration:111842, filename:sdf, pos:4096000000, diskname:(null), Aver TSC:114789, Total Writes:1000000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1418724024, data_len:0:, Duration:125397 cycles, average:<105373> cycles, Total Writes:1099999
==>> vfs_write() Duration:110818, filename:sdf, pos:4505600000, diskname:(null), Aver TSC:114823, Total Writes:1100000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1419524008, data_len:0:, Duration:197564 cycles, average:<105326> cycles, Total Writes:1199999
==>> vfs_write() Duration:111442, filename:sdf, pos:4915200000, diskname:(null), Aver TSC:114756, Total Writes:1200000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1420323992, data_len:0:, Duration:105658 cycles, average:<105307> cycles, Total Writes:1299999
==>> vfs_write() Duration:109888, filename:sdf, pos:5324800000, diskname:(null), Aver TSC:114746, Total Writes:1300000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1421123976, data_len:0:, Duration:100270 cycles, average:<105309> cycles, Total Writes:1399999
==>> vfs_write() Duration:110433, filename:sdf, pos:5734400000, diskname:(null), Aver TSC:114761, Total Writes:1400000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1422186104, data_len:0:, Duration:114721 cycles, average:<105296> cycles, Total Writes:1499999
==>> vfs_write() Duration:108334, filename:sdf, pos:6144000000, diskname:(null), Aver TSC:114745, Total Writes:1500000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1422986104, data_len:0:, Duration:132436 cycles, average:<105288> cycles, Total Writes:1599999
==>> vfs_write() Duration:108827, filename:sdf, pos:6553600000, diskname:(null), Aver TSC:114743, Total Writes:1600000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1423786088, data_len:0:, Duration:109746 cycles, average:<105269> cycles, Total Writes:1699999
==>> vfs_write() Duration:109118, filename:sdf, pos:6963200000, diskname:(null), Aver TSC:114734, Total Writes:1700000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1424586072, data_len:0:, Duration:100885 cycles, average:<105258> cycles, Total Writes:1799999
==>> vfs_write() Duration:110109, filename:sdf, pos:7372800000, diskname:(null), Aver TSC:114735, Total Writes:1800000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1425386056, data_len:0:, Duration:93700 cycles, average:<104948> cycles, Total Writes:1899999
==>> vfs_write() Duration:100321, filename:sdf, pos:7782400000, diskname:(null), Aver TSC:114352, Total Writes:1900000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1426448184, data_len:0:, Duration:95203 cycles, average:<104614> cycles, Total Writes:1999999
==>> vfs_write() Duration:100100, filename:sdf, pos:8192000000, diskname:(null), Aver TSC:113940, Total Writes:2000000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1427248168, data_len:0:, Duration:94327 cycles, average:<104307> cycles, Total Writes:2099999
==>> vfs_write() Duration:100470, filename:sdf, pos:8601600000, diskname:(null), Aver TSC:113557, Total Writes:2100000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1428048152, data_len:0:, Duration:93461 cycles, average:<104026> cycles, Total Writes:2199999
==>> vfs_write() Duration:101376, filename:sdf, pos:9011200000, diskname:(null), Aver TSC:113216, Total Writes:2200000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1428848136, data_len:0:, Duration:94997 cycles, average:<103768> cycles, Total Writes:2299999
==>> vfs_write() Duration:101206, filename:sdf, pos:9420800000, diskname:(null), Aver TSC:112894, Total Writes:2300000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1429648120, data_len:0:, Duration:92782 cycles, average:<103530> cycles, Total Writes:2399999
==>> vfs_write() Duration:103882, filename:sdf, pos:9830400000, diskname:(null), Aver TSC:112602, Total Writes:2400000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1430710264, data_len:0:, Duration:92476 cycles, average:<103309> cycles, Total Writes:2499999
==>> vfs_write() Duration:100160, filename:sdf, pos:10240000000, diskname:(null), Aver TSC:112331, Total Writes:2500000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1431510248, data_len:0:, Duration:92615 cycles, average:<103105> cycles, Total Writes:2599999
==>> vfs_write() Duration:99706, filename:sdf, pos:10649600000, diskname:(null), Aver TSC:112082, Total Writes:2600000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1410027984, data_len:0:, Duration:92269 cycles, average:<102923> cycles, Total Writes:2699999
==>> vfs_write() Duration:99827, filename:sdf, pos:321781760, diskname:(null), Aver TSC:111856, Total Writes:2700000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1410827968, data_len:0:, Duration:92291 cycles, average:<102752> cycles, Total Writes:2799999
==>> vfs_write() Duration:99915, filename:sdf, pos:731381760, diskname:(null), Aver TSC:111652, Total Writes:2800000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1411627952, data_len:0:, Duration:92297 cycles, average:<102601> cycles, Total Writes:2899999
==>> vfs_write() Duration:100955, filename:sdf, pos:1140981760, diskname:(null), Aver TSC:111454, Total Writes:2900000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1412427936, data_len:0:, Duration:92288 cycles, average:<102449> cycles, Total Writes:2999999
==>> vfs_write() Duration:100909, filename:sdf, pos:1550581760, diskname:(null), Aver TSC:111271, Total Writes:3000000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1413227920, data_len:0:, Duration:92128 cycles, average:<102313> cycles, Total Writes:3099999
==>> vfs_write() Duration:100427, filename:sdf, pos:1960181760, diskname:(null), Aver TSC:111097, Total Writes:3100000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1414290064, data_len:0:, Duration:92554 cycles, average:<102182> cycles, Total Writes:3199999
==>> vfs_write() Duration:101303, filename:sdf, pos:2369781760, diskname:(null), Aver TSC:110931, Total Writes:3200000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1415090048, data_len:0:, Duration:92094 cycles, average:<102059> cycles, Total Writes:3299999
==>> vfs_write() Duration:161842, filename:sdf, pos:2779381760, diskname:(null), Aver TSC:110779, Total Writes:3300000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1415890032, data_len:0:, Duration:92052 cycles, average:<101945> cycles, Total Writes:3399999
==>> vfs_write() Duration:101515, filename:sdf, pos:3188981760, diskname:(null), Aver TSC:110632, Total Writes:3400000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1416690016, data_len:0:, Duration:91861 cycles, average:<101835> cycles, Total Writes:3499999
==>> vfs_write() Duration:103003, filename:sdf, pos:3598581760, diskname:(null), Aver TSC:110493, Total Writes:3500000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1417490000, data_len:0:, Duration:92200 cycles, average:<101730> cycles, Total Writes:3599999
==>> vfs_write() Duration:101388, filename:sdf, pos:4008181760, diskname:(null), Aver TSC:110360, Total Writes:3600000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1418552128, data_len:0:, Duration:92342 cycles, average:<101633> cycles, Total Writes:3699999
==>> vfs_write() Duration:100043, filename:sdf, pos:4417781760, diskname:(null), Aver TSC:110239, Total Writes:3700000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1419352112, data_len:0:, Duration:92418 cycles, average:<101539> cycles, Total Writes:3799999
==>> vfs_write() Duration:99682, filename:sdf, pos:4827381760, diskname:(null), Aver TSC:110122, Total Writes:3800000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1420152096, data_len:0:, Duration:92594 cycles, average:<101452> cycles, Total Writes:3899999
==>> vfs_write() Duration:100812, filename:sdf, pos:5236981760, diskname:(null), Aver TSC:110014, Total Writes:3900000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1420952096, data_len:0:, Duration:94809 cycles, average:<101371> cycles, Total Writes:3999999
==>> vfs_write() Duration:110330, filename:sdf, pos:5646581760, diskname:(null), Aver TSC:109918, Total Writes:4000000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1421752080, data_len:0:, Duration:100379 cycles, average:<101480> cycles, Total Writes:4099999
==>> vfs_write() Duration:111273, filename:sdf, pos:6056181760, diskname:(null), Aver TSC:110062, Total Writes:4100000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1422814208, data_len:0:, Duration:99573 cycles, average:<101579> cycles, Total Writes:4199999
==>> vfs_write() Duration:109303, filename:sdf, pos:6465781760, diskname:(null), Aver TSC:110194, Total Writes:4200000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1423614192, data_len:0:, Duration:100379 cycles, average:<101669> cycles, Total Writes:4299999
==>> vfs_write() Duration:109915, filename:sdf, pos:6875381760, diskname:(null), Aver TSC:110309, Total Writes:4300000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1424414176, data_len:0:, Duration:281558 cycles, average:<101751> cycles, Total Writes:4399999
==>> vfs_write() Duration:109206, filename:sdf, pos:7284981760, diskname:(null), Aver TSC:110407, Total Writes:4400000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1425214160, data_len:0:, Duration:100870 cycles, average:<101828> cycles, Total Writes:4499999
==>> vfs_write() Duration:108867, filename:sdf, pos:7694581760, diskname:(null), Aver TSC:110503, Total Writes:4500000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1426014144, data_len:0:, Duration:99734 cycles, average:<101903> cycles, Total Writes:4599999
==>> vfs_write() Duration:110357, filename:sdf, pos:8104181760, diskname:(null), Aver TSC:110596, Total Writes:4600000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1427076272, data_len:0:, Duration:100318 cycles, average:<101970> cycles, Total Writes:4699999
==>> vfs_write() Duration:110667, filename:sdf, pos:8513781760, diskname:(null), Aver TSC:110677, Total Writes:4700000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1427876256, data_len:0:, Duration:212303 cycles, average:<102030> cycles, Total Writes:4799999
==>> vfs_write() Duration:108276, filename:sdf, pos:8923381760, diskname:(null), Aver TSC:110748, Total Writes:4800000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1428676240, data_len:0:, Duration:104764 cycles, average:<102087> cycles, Total Writes:4899999
==>> vfs_write() Duration:107722, filename:sdf, pos:9332981760, diskname:(null), Aver TSC:110818, Total Writes:4900000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1429476224, data_len:0:, Duration:99606 cycles, average:<102148> cycles, Total Writes:4999999
==>> vfs_write() Duration:109582, filename:sdf, pos:9742581760, diskname:(null), Aver TSC:110897, Total Writes:5000000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1430538352, data_len:0:, Duration:102515 cycles, average:<102207> cycles, Total Writes:5099999
==>> vfs_write() Duration:110294, filename:sdf, pos:10152181760, diskname:(null), Aver TSC:110971, Total Writes:5100000
blk_io_done_account() ===>> WRITE
Disk:sdf, Sector:1431338336, data_len:0:, Duration:102285 cycles, average:<102261> cycles, Total Writes:5199999
#fio -bs=4k -direct=1 -thread -rw=write -size=10G -filename=sdf -name=”sdx” -iodepth=1 -runtime=180
sdx: (g=0): rw=write, bs=4096B-4096B,4096B-4096B,4096B-4096B, ioengine=psync, iodepth=1
fio-2.18
Starting 1 thread
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=83.8MiB/s][r=0,w=21.5k IOPS][eta 00m:00s]
sdx: (groupid=0, jobs=1): err= 0: pid=30495: Mon Apr 17 09:08:26 2017
write: IOPS=22.3k, BW=87.3MiB/s (91.3MB/s)(10.0GiB/117665msec)
clat (usec): min=38, max=4782, avg=44.45, stdev=18.05
lat (usec): min=38, max=4782, avg=44.53, stdev=18.05
clat percentiles (usec):
| 1.00th=[ 40], 5.00th=[ 40], 10.00th=[ 40], 20.00th=[ 40],
| 30.00th=[ 41], 40.00th=[ 41], 50.00th=[ 43], 60.00th=[ 44],
| 70.00th=[ 44], 80.00th=[ 45], 90.00th=[ 46], 95.00th=[ 50],
| 99.00th=[ 89], 99.50th=[ 113], 99.90th=[ 124], 99.95th=[ 147],
| 99.99th=[ 159]
lat (usec) : 50=94.82%, 100=4.41%, 250=0.77%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.01%
lat (msec) : 2=0.01%, 10=0.01%
cpu : usr=4.48%, sys=6.69%, ctx=2621472, majf=0, minf=10
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=0,2621440,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=87.3MiB/s (91.3MB/s), 87.3MiB/s-87.3MiB/s (91.3MB/s-91.3MB/s), io=10.0GiB (10.8GB), run=117665-117665msec
Disk stats (read/write):
sdf: ios=0/2621323, merge=0/23, ticks=0/18446744073499937050, in_queue=105437, util=89.55%
int iss_sdx_device(struct request *req) { if(!req || !req->rq_disk) return 0; if(strstr(req->rq_disk->disk_name, "sdf")) return 1; if(req->rq_disk->major==8 && req->rq_disk->first_minor >=80 && req->rq_disk->first_minor<90) return 1; return 0; } void blk_account_io_done(struct request *req) { static unsigned long io_done_count=0; static unsigned long long io_done_total_tsc=0; /* * Account IO completion. flush_rq isn't accounted as a * normal IO on queueing nor completion. Accounting the * containing request is enough. */ if (blk_do_io_stat(req) && !(req->cmd_flags & REQ_FLUSH_SEQ)) { unsigned long duration = get_cycles() - req->start_time; const int rw = rq_data_dir(req); struct hd_struct *part; int cpu; cpu = part_stat_lock(); part = req->part; part_stat_inc(cpu, part, ios[rw]); part_stat_add(cpu, part, ticks[rw], duration); part_round_stats(cpu, part); part_dec_in_flight(part, rw); if(iss_sdx_device(req)) { if(req->bio && req->bio->bi_rw){ //printk("\n <<=== READ "); } else { io_done_total_tsc = io_done_total_tsc+duration; if(++io_done_count%100000==0){ printk("\n blk_io_done_account() ===>> WRITE "); if(req->rq_disk){ printk(" Disk:%s, Sector:%lu, data_len:%u:, Duration:%lu cycles, average:<%llu> cycles, Total Writes:%lu \n" , req->rq_disk->disk_name, req->__sector, req->__data_len, duration, io_done_total_tsc/(io_done_count-1), io_done_count-1); } } } } hd_struct_put(part); part_stat_unlock(); } }
ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos) { ssize_t ret; /*Debug Kernel Code Excution Time*/ u64 new_tsc = 0, prev_tsc = 0, tsc=0; static unsigned long long total_tsc=0;// w_count=0; static unsigned long long w_count=0; struct gendisk *disk=NULL; if(file->f_inode && file->f_inode->i_bdev){ disk = file->f_inode->i_bdev->bd_disk; } rdtsc_barrier(); prev_tsc = get_cycles(); rdtsc_barrier(); if (!(file->f_mode & FMODE_WRITE)) return -EBADF; if (!file->f_op || (!file->f_op->write && !file->f_op->aio_write)) return -EINVAL; if (unlikely(!access_ok(VERIFY_READ, buf, count))) return -EFAULT; ret = rw_verify_area(WRITE, file, pos, count); if (ret >= 0) { count = ret; file_start_write(file); if (file->f_op->write) ret = file->f_op->write(file, buf, count, pos); else ret = do_sync_write(file, buf, count, pos); if (ret > 0) { fsnotify_modify(file); add_wchar(current, ret); } inc_syscw(current); file_end_write(file); } /*Final TSC*/ rdtsc_barrier(); new_tsc = get_cycles(); rdtsc_barrier(); if((file->f_path.dentry && strstr(file->f_path.dentry->d_iname, "sdf"))|| (disk && strstr(disk->disk_name, "sdf"))) { tsc = new_tsc-prev_tsc; total_tsc = total_tsc + tsc; if(++w_count%100000==0) printk("\n==>> vfs_write() Duration:%llu, filename:%s, pos:%llu, diskname:%s, Aver TSC:%llu, Total Writes:%llu ", tsc, file->f_path.dentry->d_iname, *pos, disk->disk_name, total_tsc/(w_count-1),w_count); tsc=0; } prev_tsc = new_tsc = 0; return ret; } EXPORT_SYMBOL(vfs_write);
Leave a Reply