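The LLVM build used in this article is the llvm 7.1.0 release package with some extra logging added to make the code easier to follow; no functionality was changed. It has been uploaded to GitHub: https://github.com/tedcy/llvm7_test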
Inside the conanio/gcc5:2.91 image, the project's build.sh is enough to build it.
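Class diagram
RuntimeDyldImpl is the PIMPL implementation behind RuntimeDyld, so RuntimeDyld itself is omitted.
The diagram passes MemMgr and Resolver around in a dizzying number of places; in practice they all point to the same RTDyldMemoryManager.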
classDiagram
direction TB
class EngineBuilder {
+setEngineKind(kind)
+setMCJITMemoryManager(RTDyldMemoryManager* mcjmm) MemMgr=mcjmm,Resolver=mcjmm
+ExecutionEngine* create()
}
class ExecutionEngine {
<<abstract>>
+virtual addModule(...)
+virtual finalizeObject()
+virtual uint64 getFunctionAddress(name)
}
class MCJIT {
+addModule(...) override
+finalizeObject() override
+uint64 getFunctionAddress(name) override
}
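class RuntimeDyldImpl {
<<merged with RuntimeDyld>>
+RuntimeDyldImpl(MemoryManager &MemMgr, \nJITSymbolResolver &Resolver)
+loadObjectImpl() Loads the object and records its relocations, first collecting the functions and variables of the global symbol table, then calling processRelocationRef for each relocation\nSections are handled in findOrEmitSection throughout, which calls MemoryManager's allocateCodeSection and allocateDataSection to allocate them
+virtual processRelocationRef()
+resolveRelocations() Applies the relocations, first calling resolveExternalSymbols for external symbols, then resolveRelocationList for internal symbols
-resolveRelocationList() calls resolveRelocation
-resolveExternalSymbols() calls Resolver.lookup()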
+virtual resolveRelocation()
}
class RuntimeDyldELF {
<<encapsulates per-platform relocation details>>
+resolveRelocation() override
+processRelocationRef() override
}
class JITSymbolResolver {
<<abstract>>
+virtual lookup()
}
class LegacyJITSymbolResolver {
<<abstract>>
+virtual findSymbolInLogicalDylib()
+virtual findSymbol()
+lookup() calls findSymbolInLogicalDylib and findSymbol
}
class LinkingSymbolResolver {
<<proxy pattern, queries MCJIT first, then the LegacyJITSymbolResolver it holds>>
+findSymbolInLogicalDylib() override
+findSymbol() override
+LinkingSymbolResolver(MCJIT &Parent,\n LegacyJITSymbolResolver* Resolver)
-MCJIT &ParentEngine
-LegacyJITSymbolResolver* ClientResolver
}
class MemoryManager {
<<abstract>>
+virtual allocateCodeSection()
+virtual allocateDataSection()
}
EngineBuilder *-- MCJITMemoryManager: MemMgr
EngineBuilder *-- LegacyJITSymbolResolver: Resolver
EngineBuilder ..> ExecutionEngine : create(), which calls MCJIT#58;#58;\ncreateJIT(this->MemMgr, this->Resolver)
MCJIT *-- MCJITMemoryManager : MemMgr(MemMgr)
MCJIT *-- RuntimeDyldImpl : Dyld = RuntimeDyldELF#58;#58;create\n(this->MemMgr, this->Resolver)
MCJIT *-- LinkingSymbolResolver : Resolver(*this, Resolver)
LegacyJITSymbolResolver <|-- LinkingSymbolResolver
JITSymbolResolver <|-- LegacyJITSymbolResolver
RuntimeDyldImpl *-- MemoryManager : MemMgr
RuntimeDyldImpl *-- JITSymbolResolver : Resolver
RuntimeDyldImpl <|-- RuntimeDyldELF
ExecutionEngine <|-- MCJIT
RTDyldMemoryManager <|-- SectionMemoryManager
MCJITMemoryManager <|-- RTDyldMemoryManager
LegacyJITSymbolResolver <|-- RTDyldMemoryManager
MemoryManager <|-- MCJITMemoryManager
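The claim that MemMgr and Resolver end up pointing at the same object can be illustrated with a minimal sketch. The types below (`MemoryManager`, `SymbolResolver`, `CombinedManager`, `Builder`) are simplified, hypothetical stand-ins and not the LLVM classes; only the inheritance shape follows the class diagram, with `CombinedManager` playing the role of RTDyldMemoryManager and `Builder` the role of EngineBuilder.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the memory manager interface.
struct MemoryManager {
  virtual ~MemoryManager() = default;
  virtual std::uint8_t *allocateCodeSection(std::size_t Size) = 0;
};

// Hypothetical stand-in for the symbol resolver interface.
struct SymbolResolver {
  virtual ~SymbolResolver() = default;
  virtual std::uint64_t findSymbol(const std::string &Name) = 0;
};

// Plays the role of RTDyldMemoryManager: one class implements both interfaces.
struct CombinedManager : MemoryManager, SymbolResolver {
  std::vector<std::unique_ptr<std::uint8_t[]>> Blocks;

  std::uint8_t *allocateCodeSection(std::size_t Size) override {
    Blocks.emplace_back(new std::uint8_t[Size]());  // zero-initialised block
    return Blocks.back().get();
  }
  // A real resolver would search the process here; this sketch finds nothing.
  std::uint64_t findSymbol(const std::string &) override { return 0; }
};

// Plays the role of EngineBuilder: one shared object is stored behind
// both interface pointers (MemMgr=mcjmm, Resolver=mcjmm in the diagram).
struct Builder {
  std::shared_ptr<MemoryManager> MemMgr;
  std::shared_ptr<SymbolResolver> Resolver;

  void setMemoryManager(std::shared_ptr<CombinedManager> Shared) {
    MemMgr = Shared;
    Resolver = Shared;
  }
};

// Returns true when both interface pointers refer to the same object.
bool sharesOneObject(const Builder &B) {
  auto *A = dynamic_cast<const CombinedManager *>(B.MemMgr.get());
  auto *R = dynamic_cast<const CombinedManager *>(B.Resolver.get());
  return A != nullptr && A == R;
}
```

Calling `setMemoryManager` once is enough for every later `MemMgr` and `Resolver` hand-off to reach the same instance, which is why the many arrows in the diagram collapse to a single object at runtime.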
Demo code
The code behind the sequence diagrams is at https://github.com/tedcy/llvm7_test/blob/master/demo/engine/main.cpp
The core logic is:
extern "C" {
// Host function; the JIT-loaded object reaches it through addGlobalMapping.
int pow2(int val) { return val * val; }
}

int main() {
    // context, errMsg, fileStr and LLVM_ObjectCache come from the parts of
    // main.cpp that this excerpt omits.
    std::unique_ptr<llvm::Module> module =
        llvm::make_unique<llvm::Module>("MyModule", context);
    llvm::ExecutionEngine* ee =
        llvm::EngineBuilder(std::move(module))
            .setEngineKind(llvm::EngineKind::JIT)
            .setMCJITMemoryManager(std::unique_ptr<llvm::RTDyldMemoryManager>(
                new llvm::SectionMemoryManager))
            .setErrorStr(&errMsg)
            .setVerifyModules(true)
            .create();
    // Register pow2 so that external-symbol resolution can find it.
    ee->addGlobalMapping("pow2", (uint64_t)&pow2);

    // The object cache supplies the object, so nothing is JIT-compiled.
    LLVM_ObjectCache objCache(fileStr);
    ee->setObjectCache(&objCache);
    ee->finalizeObject();  // load the object and apply relocations
    ee->setObjectCache(nullptr);

    uint64_t addr = ee->getFunctionAddress("pow4");
    typedef int (*pow4_t)(int);
    pow4_t fn = (pow4_t)addr;
    auto result = fn(2);
    std::cout << result << std::endl;
    return 0;
}
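Demo sequence diagrams
Computing relocations
Key points of this part:
Find the symbols of global functions and variables and store them in GlobalSymbolTable
Compute the relocations
  If the target is a symbol
    If it is found in GlobalSymbolTable, it is an internal symbol: build a section-keyed inverted index, Relocations[target symbol's SectionID].push_back(SectionID of the relocation site, Offset, Addend, etc.)
    If it is not found, it is an external symbol: build a name-keyed inverted index, ExternalSymbolRelocations[target symbol name].push_back(SectionID of the relocation site, Offset, Addend, etc.)
  If the target is a section, build the section-keyed inverted index directly: Relocations[target SectionID].push_back(SectionID of the relocation site, Offset, Addend, etc.)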
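The bookkeeping described above can be sketched as follows. This is a simplified model under stated assumptions, not the RuntimeDyld source: `RelocationIndex`, `addForSymbol` and `addForSection` are hypothetical names, and the entry types keep only the fields the text mentions.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Where a defined symbol lives: section plus offset inside it.
struct SymbolTableEntry {
  unsigned SectionID;
  std::uint64_t Offset;
};

// One relocation site: which section to patch, where, and the addend.
struct RelocationEntry {
  unsigned SectionID;
  std::uint64_t Offset;
  std::int64_t Addend;
};

// Hypothetical container for the three tables built while loading.
struct RelocationIndex {
  std::map<std::string, SymbolTableEntry> GlobalSymbolTable;
  // Inverted index keyed by target SectionID.
  std::map<unsigned, std::vector<RelocationEntry>> Relocations;
  // Inverted index keyed by external symbol name.
  std::map<std::string, std::vector<RelocationEntry>> ExternalSymbolRelocations;

  // Target is a named symbol.
  void addForSymbol(const std::string &Name, RelocationEntry RE) {
    auto It = GlobalSymbolTable.find(Name);
    if (It == GlobalSymbolTable.end()) {
      // Unknown here, so it is external and resolved later by name.
      ExternalSymbolRelocations[Name].push_back(RE);
      return;
    }
    // Internal symbol: fold its offset into the addend and index by section.
    RE.Addend += static_cast<std::int64_t>(It->second.Offset);
    Relocations[It->second.SectionID].push_back(RE);
  }

  // Target is a section.
  void addForSection(unsigned TargetSectionID, RelocationEntry RE) {
    Relocations[TargetSectionID].push_back(RE);
  }
};
```

Keying by target rather than by relocation site means that once a section's load address or a symbol's address is known, every place that refers to it can be patched in one pass over a single list.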
sequenceDiagram
main ->>+ MCJIT : addGlobalMapping(Name="pow2", Addr=&pow2)
MCJIT -> MCJIT : EEState.getGlobalAddressMap()[Name] = Addr
MCJIT ->>- main : return
main ->>+ MCJIT : setObjectCache(&objCache)
MCJIT -> MCJIT : ObjCache = NewCache
MCJIT ->>- main : return
main ->>+ MCJIT : finalizeObject()
loop each M in MCJIT.OwnedModules
MCJIT ->>+ MCJIT : generateCodeForModule(M)
MCJIT ->>+ MCJIT : If the cache has the object, use it:<br>if(ObjCache) ObjectToLoad=ObjCache->getObject(M)<br>Otherwise JIT-compile:<br>if (!ObjectToLoad) ObjectToLoad=emitObject(M)<br>Create an ObjectFile from the object buffer:<br>LoadedObject=createObjectFile(ObjectToLoad->getMemBufferRef())
MCJIT ->>+ RuntimeDyld : Dyld = RuntimeDyldELF::create(MemMgr, Resolver)
RuntimeDyld ->>+ RuntimeDyldImpl : Dyld.loadObject(LoadedObject)
RuntimeDyldImpl ->>+ RuntimeDyldImpl : loadObjectImpl(LoadedObject)
loop for I in [LoadedObject.symbol_begin(),LoadedObject.symbol_end()]
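RuntimeDyldImpl ->> RuntimeDyldImpl : Handle Weak and Common symbols (for Weak the weak flag is simply dropped, Common symbols are not used by C++)
alt Record the functions and variables of the global symbol table<br>if (SymType == ST_Function or ST_Data)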
RuntimeDyldImpl ->>+ RuntimeDyldImpl : findOrEmitSection(I->getSection())
RuntimeDyldImpl ->>+ SectionMemoryManager : MemMgr.allocateDataSection()<br> or MemMgr.allocateCodeSection()
SectionMemoryManager ->>- RuntimeDyldImpl : return
RuntimeDyldImpl ->>- RuntimeDyldImpl : return
RuntimeDyldImpl ->> RuntimeDyldImpl : GlobalSymbolTable[Name]=SymbolTableEntry(I->getSection(), SectOffset)
end
loop each section_iterator Si in [Obj.section_begin(), Obj.section_end()]<br>compute the relocations
RuntimeDyldImpl ->>+ RuntimeDyldImpl : findOrEmitSection(*Si)
RuntimeDyldImpl ->>+ SectionMemoryManager : MemMgr.allocateDataSection()<br> or MemMgr.allocateCodeSection()
SectionMemoryManager ->>- RuntimeDyldImpl : return
RuntimeDyldImpl ->>- RuntimeDyldImpl : return
loop each relocation_iterator I in [Si.relocation_begin(), Si.relocation_end()]
RuntimeDyldImpl ->>+ RuntimeDyldELF : processRelocationRef()
alt if found in GlobalSymbolTable
RuntimeDyldELF ->> RuntimeDyldELF : Value.SectionID = SymInfo.getSectionID()<br>Value.Offset = SymInfo.getOffset()<br>Value.Addend = SymInfo.getOffset() + Addend
else if SymType is Section
RuntimeDyldELF ->> RuntimeDyldELF : Value.SectionID = Symbol->getSection()<br>Value.Addend = Addend
end
alt if Arch is Triple::x86_64 and RelType is one of the simplest types (e.g. R_X86_64_64)
RuntimeDyldELF ->>+ RuntimeDyldELF : processSimpleRelocation(...)
alt if Value.SymbolName is not empty
RuntimeDyldELF ->>+ RuntimeDyldELF : addRelocationForSymbol(...)
alt if the symbol is not found in GlobalSymbolTable
RuntimeDyldELF ->> RuntimeDyldELF : Build the inverted index (keyed by target symbol name)<br>ExternalSymbolRelocations[Value.SymbolName].push_back(RE);
else found, adjust the Addend
RuntimeDyldELF ->> RuntimeDyldELF : Build the inverted index (keyed by target SectionID)<br>Relocations[Value.SectionID].push_back(RECopy);
end
RuntimeDyldELF ->>- RuntimeDyldELF : return
else if Value.SymbolName is empty
RuntimeDyldELF ->> RuntimeDyldELF : Build the inverted index (keyed by target SectionID)<br>Relocations[Value.SectionID].push_back(RE);
end
RuntimeDyldELF ->>- RuntimeDyldELF : return
end
RuntimeDyldELF ->>- RuntimeDyldImpl : return
end
end
end
RuntimeDyldImpl ->>- RuntimeDyldImpl : return
RuntimeDyldImpl ->>- RuntimeDyld : return
RuntimeDyld ->>- MCJIT : return
MCJIT ->>- MCJIT : return
MCJIT ->>- MCJIT : return
end
MCJIT ->>+ MCJIT : finalizeLoadedModules() omitted, shown in the next section
MCJIT ->>- MCJIT : return
MCJIT ->>- main : return
main ->>+ MCJIT : getFunctionAddress("pow4")
MCJIT ->>- main : return
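Writing relocations to memory
The previous section leaves its results in four data structures:
MCJIT's EEState.getGlobalAddressMap()
  The symbol table populated explicitly through the addGlobalMapping() interface
RuntimeDyldELF's GlobalSymbolTable
  The symbol table of global functions and variables
RuntimeDyldELF's ExternalSymbolRelocations
  The inverted index keyed by external symbol. In this section each symbol is looked up through EEState.getGlobalAddressMap() and dlsym, and its relocations are written to memory
RuntimeDyldELF's Relocations
  The inverted index keyed by section. In this section its relocations are written to memory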
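The lookup order for an external symbol can be modelled with a minimal sketch. `SymbolLookup` and its members are hypothetical names; the real chain runs through LinkingSymbolResolver, MCJIT and RTDyldMemoryManager, and the process-wide search (dlsym) is replaced here by a caller-supplied callback so the example stays self-contained.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical model of the three places an external symbol is searched.
struct SymbolLookup {
  // Filled by addGlobalMapping() in the real engine.
  std::map<std::string, std::uint64_t> GlobalAddressMap;
  // Addresses of symbols defined by already-loaded objects.
  std::map<std::string, std::uint64_t> LoadedSymbols;
  // Stand-in for the process-wide search (dlsym in the real code).
  std::function<std::uint64_t(const std::string &)> ProcessLookup;

  // Returns the address of Name, or 0 when no source knows it.
  std::uint64_t find(const std::string &Name) const {
    auto It = GlobalAddressMap.find(Name);
    if (It != GlobalAddressMap.end())
      return It->second;
    It = LoadedSymbols.find(Name);
    if (It != LoadedSymbols.end())
      return It->second;
    return ProcessLookup ? ProcessLookup(Name) : 0;
  }
};
```

With the demo's values, `pow2` would be answered by the first map because it was registered explicitly, while `dyFunc` would fall through to the process-wide search.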
sequenceDiagram
main ->>+ MCJIT : finalizeObject()
loop each M in MCJIT.OwnedModules
MCJIT ->>+ MCJIT : generateCodeForModule(M) omitted, shown in the previous section
MCJIT ->>- MCJIT : return
end
MCJIT ->>+ MCJIT : finalizeLoadedModules()
MCJIT ->>+ RuntimeDyldImpl : Apply relocations:<br>resolveRelocations()
RuntimeDyldImpl ->>+ RuntimeDyldImpl : Apply external-symbol relocations: resolveExternalSymbols()
RuntimeDyldImpl ->> RuntimeDyldImpl: NewSymbols = ExternalSymbolRelocations
RuntimeDyldImpl ->>+ LegacyJITSymbolResolver : Resolver.lookup<br>(NewSymbols)
loop SymName in NewSymbols
LegacyJITSymbolResolver ->> LinkingSymbolResolver : Search the local dynamic library<br>(not supported by MCJIT,<br>always returns nullptr)<br>findSymbolInLogicalDylib<br>(SymName)
LegacyJITSymbolResolver ->>+ LinkingSymbolResolver : findSymbol(SymName)
LinkingSymbolResolver ->>+ MCJIT : ParentEngine.findSymbol(SymName)
MCJIT ->>+ MCJIT : findExistingSymbol(SymName)
MCJIT ->>+ MCJIT : getPointerToGlobalIfAvailable(SymName)
MCJIT ->>+ MCJIT : getAddressToGlobalIfAvailable(SymName)
MCJIT ->>+ MCJIT : look up in EEState.getGlobalAddressMap()
MCJIT ->>- MCJIT : return
MCJIT ->>- MCJIT : return
MCJIT ->>- MCJIT : return
alt if not found in EEState.getGlobalAddressMap()
MCJIT ->>+ RuntimeDyldImpl : Dyld.getSymbol(SymName)
RuntimeDyldImpl ->> RuntimeDyldImpl : look up in the global symbol table of the already-loaded objects<br>GlobalSymbolTable.find(SymName)
RuntimeDyldImpl ->>- MCJIT : return
MCJIT ->>- MCJIT : return
MCJIT ->>- LinkingSymbolResolver : return
end
alt if ParentEngine's findSymbol did not find it
LinkingSymbolResolver ->>+ RTDyldMemoryManager : ClientResolver-><br>findSymbol(SymName)
RTDyldMemoryManager ->>+ RTDyldMemoryManager : getSymbolAddress(SymName)
RTDyldMemoryManager ->>+ RTDyldMemoryManager : getSymbolAddressInProcess(SymName)
RTDyldMemoryManager ->>+ sys#58;#58;DynamicLibrary : SearchForAddressOfSymbol<br>(SymName)
sys#58;#58;DynamicLibrary ->> sys#58;#58;DynamicLibrary : Symbols added through sys::DynamicLibrary::AddSymbol()<br>are kept in ExplicitSymbols:<br>ExplicitSymbols->find(SymbolName)<br>Then search all loaded dynamic libraries with dlsym:<br>::dlsym(SymbolName)
sys#58;#58;DynamicLibrary ->>- RTDyldMemoryManager : return
RTDyldMemoryManager ->>- RTDyldMemoryManager : return
RTDyldMemoryManager ->>- RTDyldMemoryManager : return
RTDyldMemoryManager ->>- LinkingSymbolResolver : return
end
LinkingSymbolResolver ->>- LegacyJITSymbolResolver : return
end
LegacyJITSymbolResolver ->>- RuntimeDyldImpl : return
RuntimeDyldImpl ->>- RuntimeDyldImpl : return
loop Iterate over the inverted index from target SectionID to the RelocationList to apply<br>each [int target SectionID, RelocationList Relocs] in [Relocations.begin(), Relocations.end()]
RuntimeDyldImpl ->>+ RuntimeDyldImpl : Apply internal section relocations:<br>target Addr is the section address:<br>target Addr=Sections[target SectionID].getLoadAddress()<br>resolveRelocationList(RelocationList, target Addr)
loop each [RelocationEntry RE] in RelocationList
RuntimeDyldImpl ->>+ RuntimeDyldELF : resolveRelocation(RE, target Addr)
RuntimeDyldELF ->>+ RuntimeDyldELF : if Arch is Triple::x86_64 then<br>resolveX86_64Relocation<br>(SectionEntry patched Section=Sections[RE.SectionID], <br>RE.Offset, target Addr, RE.RelType, RE.Addend, RE.SymOffset)
RuntimeDyldELF ->> RuntimeDyldELF : patched Section.getAddressWithOffset(RE.Offset)=<br>target Addr + RE.Addend
RuntimeDyldELF ->>- RuntimeDyldELF : return
RuntimeDyldELF ->>- RuntimeDyldImpl : return
end
RuntimeDyldImpl ->>- RuntimeDyldImpl : return
end
RuntimeDyldImpl ->>- MCJIT : return
MCJIT ->>- MCJIT : return
MCJIT ->>- main : return
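The final write step is plain arithmetic, sketched below. The helper names are hypothetical and this is not the LLVM implementation: R_X86_64_64 stores the target address plus the addend, and R_X86_64_PC64 (which appears for .eh_frame in the demo log later in this article) additionally subtracts the address of the patch site. `write64` assumes a little-endian host, which holds on x86-64.

```cpp
#include <cstdint>
#include <cstring>

// R_X86_64_64: the stored value is the target address plus the addend.
inline std::uint64_t abs64Value(std::uint64_t TargetAddr, std::int64_t Addend) {
  return TargetAddr + static_cast<std::uint64_t>(Addend);
}

// R_X86_64_PC64: same, but relative to the address being patched.
inline std::uint64_t pc64Value(std::uint64_t TargetAddr, std::int64_t Addend,
                               std::uint64_t PatchAddr) {
  return TargetAddr + static_cast<std::uint64_t>(Addend) - PatchAddr;
}

// Stores Value at Section + Offset in host byte order (little-endian on
// x86-64). memcpy avoids an unaligned 64-bit store.
inline void write64(std::uint8_t *Section, std::uint64_t Offset,
                    std::uint64_t Value) {
  std::memcpy(Section + Offset, &Value, sizeof(Value));
}
```

For instance, with the .data section loaded at 0x7f0866640004 and an addend of 4, `abs64Value` yields 0x7f0866640008, the value the demo log reports being written at 0x7f086663f026.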
Relocation details as seen in the demo log
Absolute address relocation (R_X86_64_64)
The source of a.o, the object with absolute address relocations that the demo loads, is here: https://github.com/tedcy/llvm7_test/blob/master/demo/engine/cpp_file/a.cc
extern "C" {
// Both are undefined in a.o and resolved as external symbols at load time.
int pow2(int val);
void dyFunc();

// The asm labels pin the symbol names so they are easy to spot in readelf.
static int globalV0 asm("globalV0") = 0;                   // .bss
static const int globalConstV0 asm("globalConstV0") = 0;   // .rodata
static int globalV1 asm("globalV1") = 1;                   // .data
int __attribute__((weak)) globalWeakV0 asm("globalWeakV0") = 1;  // .data, weak

// Touches every global so that each one produces a relocation in .text.
void pow3() {
    globalV0 = 2;
    *(int*)&globalConstV0 = 2;
    globalV1 = 2;
    globalWeakV0 = 2;
}

int pow4(int val) {
    dyFunc();
    return pow2(val) * pow2(val);
}
}
Build it with the build.sh in the same directory to get a.o:
~ cat build.sh
clang++ -m64 -g -std=c++0x -fno-use-cxa-atexit -fnon-call-exceptions -c -emit-llvm a.cc
llvm-dis a.bc

# GOT mode (PIC must be enabled; otherwise globals are accessed through R_X86_64_32S and the load crashes because the absolute address does not fit in 32 bits)
# llc -filetype=obj -relocation-model=pic a.bc -o a.o

# Absolute address mode
llc -filetype=obj -code-model=large a.bc -o a.o
~ build.sh
The symbol table is:
~ readelf -s a.o

Symbol table '.symtab' contains 18 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS a.cc
     2: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    6 globalConstV0
     3: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    5 globalV0
     4: 0000000000000004     4 OBJECT  LOCAL  DEFAULT    4 globalV1
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     9: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
    11: 0000000000000000     0 SECTION LOCAL  DEFAULT    9
    12: 0000000000000000     0 SECTION LOCAL  DEFAULT   20
    13: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND dyFunc
    14: 0000000000000000     4 OBJECT  WEAK   DEFAULT    4 globalWeakV0
    15: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND pow2
    16: 0000000000000000    70 FUNC    GLOBAL DEFAULT    2 pow3
    17: 0000000000000050    65 FUNC    GLOBAL DEFAULT    2 pow4
Part of the relocation table:
~ readelf -r a.o

Relocation section '.rela.text' at offset 0x698 contains 7 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000700000001 R_X86_64_64       0000000000000000 .bss + 0
000000000016  000800000001 R_X86_64_64       0000000000000000 .rodata + 0
000000000026  000600000001 R_X86_64_64       0000000000000000 .data + 4
000000000036  000e00000001 R_X86_64_64       0000000000000000 globalWeakV0 + 0
00000000005b  000d00000001 R_X86_64_64       0000000000000000 dyFunc + 0
00000000006a  000f00000001 R_X86_64_64       0000000000000000 pow2 + 0
00000000007b  000f00000001 R_X86_64_64       0000000000000000 pow2 + 0

Relocation section '.rela.debug_info' at offset 0x740 contains 20 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000a0000000a R_X86_64_32       0000000000000000 .debug_abbrev + 0
00000000000c  00090000000a R_X86_64_32       0000000000000000 .debug_str + 0
000000000012  00090000000a R_X86_64_32       0000000000000000 .debug_str + 7c
000000000016  000c0000000a R_X86_64_32       0000000000000000 .debug_line + 0
00000000001a  00090000000a R_X86_64_32       0000000000000000 .debug_str + 81
00000000001e  000500000001 R_X86_64_64       0000000000000000 .text + 0
00000000002b  00090000000a R_X86_64_32       0000000000000000 .debug_str + a7
000000000037  000e00000001 R_X86_64_64       0000000000000000 globalWeakV0 + 0
000000000040  00090000000a R_X86_64_32       0000000000000000 .debug_str + b4
000000000047  00090000000a R_X86_64_32       0000000000000000 .debug_str + b8
000000000053  000700000001 R_X86_64_64       0000000000000000 .bss + 0
00000000005c  00090000000a R_X86_64_32       0000000000000000 .debug_str + c1
000000000068  000800000001 R_X86_64_64       0000000000000000 .rodata + 0
000000000076  00090000000a R_X86_64_32       0000000000000000 .debug_str + cf
000000000082  000600000001 R_X86_64_64       0000000000000000 .data + 4
000000000090  000500000001 R_X86_64_64       0000000000000000 .text + 0
00000000009e  00090000000a R_X86_64_32       0000000000000000 .debug_str + d8
0000000000a5  000500000001 R_X86_64_64       0000000000000000 .text + 50
0000000000b3  00090000000a R_X86_64_32       0000000000000000 .debug_str + dd
0000000000c1  00090000000a R_X86_64_32       0000000000000000 .debug_str + e2
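The Info column can be decoded by hand, which makes it easier to match relocation entries against the symbol table. In ELF64 the upper 32 bits of r_info hold the symbol table index and the lower 32 bits hold the relocation type. The helper names below are illustrative and not taken from the demo.

```cpp
#include <cstdint>

// Upper 32 bits of an ELF64 r_info value: index into the symbol table.
constexpr std::uint32_t relocSymbolIndex(std::uint64_t Info) {
  return static_cast<std::uint32_t>(Info >> 32);
}

// Lower 32 bits of an ELF64 r_info value: relocation type
// (1 is R_X86_64_64 and 10 is R_X86_64_32 on x86-64).
constexpr std::uint32_t relocType(std::uint64_t Info) {
  return static_cast<std::uint32_t>(Info & 0xffffffffULL);
}
```

For the first .rela.text entry, 0x000700000001 decodes to symbol index 7 and type 1: symbol 7 is the SECTION symbol with Ndx 5, which readelf prints as .bss, and type 1 is R_X86_64_64.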
Running the demo against a.o produces this log:
llvm_register_debuger_printer set ok!
dbg|JIT: Map 'pow2' to [0x40e499]
dbg|ev_start_finalizeObject|cost:0 us
dbg|ev_start_generateCodeForModule|cost:13 us
dbg|ev_start_createObjectFile|cost:7 us
dbg|ev_end_createObjectFile|cost:38 us
dbg|ev_start_loadObject|cost:3 us
dbg|start Dyld->loadObject|cost:18 us
dbg|start RuntimeDyldImpl::loadObjectImpl|cost:6 us
dbg|start Parse symbols|cost:34 us
dbg| start emitSection: .rodata|cost:33 us
dbg| emitSection SectionID: 0 Name: .rodata obj addr: 0x3d0117c new addr: 0x7f0866641000 DataSize: 4 StubBufSize: 0 Allocate: 4
dbg| end emitSection: .rodata|cost:35 us
dbg| Type: 1 Name: globalConstV0 SID: 0 Offset: (nil) flags: 0
dbg| start emitSection: .bss|cost:21 us
dbg| emitSection SectionID: 1 Name: .bss obj addr: (nil) new addr: 0x7f0866640000 DataSize: 4 StubBufSize: 0 Allocate: 4
dbg| end emitSection: .bss|cost:21 us
dbg| Type: 1 Name: globalV0 SID: 1 Offset: (nil) flags: 0
dbg| start emitSection: .data|cost:18 us
dbg| emitSection SectionID: 2 Name: .data obj addr: 0x3d01174 new addr: 0x7f0866640004 DataSize: 8 StubBufSize: 0 Allocate: 8
dbg| end emitSection: .data|cost:16 us
dbg| Type: 1 Name: globalV1 SID: 2 Offset: 0x4 flags: 0
dbg| Type: 1 Name: globalWeakV0 SID: 2 Offset: (nil) flags: 70
dbg| start emitSection: .text|cost:57 us
dbg| emitSection SectionID: 3 Name: .text obj addr: 0x3d010e0 new addr: 0x7f086663f000 DataSize: 145 StubBufSize: 0 Allocate: 145
dbg| end emitSection: .text|cost:29 us
dbg| Type: 4 Name: pow3 SID: 3 Offset: (nil) flags: 66
dbg| Type: 4 Name: pow4 SID: 3 Offset: 0x50 flags: 66
dbg|end Parse symbols|cost:33 us
dbg|start Parse relocations|cost:3 us
dbg|start Parse relocations for SectionID:3|cost:15 us
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 6 Target SectionID: 1 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 1 RelocationEntry SectionID: 3 Offset: 6 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 22 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 0 RelocationEntry SectionID: 3 Offset: 22 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 4 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 38 Target SectionID: 2 Offset: 0 Addend: 4
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 3 Offset: 38 Addend: 4
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: globalWeakV0
dbg| SectionID: 3 Offset: 54 Target SectionID: 2 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 3 Offset: 54 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: dyFunc
dbg| SectionID: 3 Offset: 91 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:dyFunc RelocationEntry SectionID: 3 Offset: 91 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: pow2
dbg| SectionID: 3 Offset: 106 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:pow2 RelocationEntry SectionID: 3 Offset: 106 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: pow2
dbg| SectionID: 3 Offset: 123 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:pow2 RelocationEntry SectionID: 3 Offset: 123 Addend: 0
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:3|cost:185 us
dbg| start emitSection: .debug_info|cost:8 us
dbg| emitSection SectionID: 4 Name: .debug_info obj addr: 0x3d012ed new addr: 0 DataSize: 205 StubBufSize: 78 Allocate: 0
dbg| end emitSection: .debug_info|cost:30 us
dbg|start Parse relocations for SectionID:4|cost:5 us
dbg| start processRelocationRef for RelType: 10 Addend: 0 TargetName:
dbg| This is section symbol
dbg| start emitSection: .debug_abbrev|cost:17 us
dbg| emitSection SectionID: 5 Name: .debug_abbrev obj addr: 0x3d01266 new addr: 0 DataSize: 135 StubBufSize: 0 Allocate: 0
dbg| end emitSection: .debug_abbrev|cost:15 us
dbg| SectionID: 4 Offset: 6 Target SectionID: 5 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 5 RelocationEntry SectionID: 4 Offset: 6 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 0 TargetName:
dbg| This is section symbol
dbg| start emitSection: .debug_str|cost:44 us
dbg| emitSection SectionID: 6 Name: .debug_str obj addr: 0x3d01180 new addr: 0 DataSize: 230 StubBufSize: 0 Allocate: 0
dbg| end emitSection: .debug_str|cost:15 us
dbg| SectionID: 4 Offset: 12 Target SectionID: 6 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 12 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 124 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 18 Target SectionID: 6 Offset: 0 Addend: 124
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 18 Addend: 124
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 0 TargetName:
dbg| This is section symbol
dbg| start emitSection: .debug_line|cost:54 us
dbg| emitSection SectionID: 7 Name: .debug_line obj addr: 0x3d01520 new addr: 0 DataSize: 99 StubBufSize: 0 Allocate: 0
dbg| end emitSection: .debug_line|cost:17 us
dbg| SectionID: 4 Offset: 22 Target SectionID: 7 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 7 RelocationEntry SectionID: 4 Offset: 22 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 129 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 26 Target SectionID: 6 Offset: 0 Addend: 129
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 26 Addend: 129
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 30 Target SectionID: 3 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 3 RelocationEntry SectionID: 4 Offset: 30 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 167 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 43 Target SectionID: 6 Offset: 0 Addend: 167
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 43 Addend: 167
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: globalWeakV0
dbg| SectionID: 4 Offset: 55 Target SectionID: 2 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 4 Offset: 55 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 180 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 64 Target SectionID: 6 Offset: 0 Addend: 180
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 64 Addend: 180
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 184 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 71 Target SectionID: 6 Offset: 0 Addend: 184
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 71 Addend: 184
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 83 Target SectionID: 1 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 1 RelocationEntry SectionID: 4 Offset: 83 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 193 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 92 Target SectionID: 6 Offset: 0 Addend: 193
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 92 Addend: 193
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 104 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 0 RelocationEntry SectionID: 4 Offset: 104 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 207 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 118 Target SectionID: 6 Offset: 0 Addend: 207
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 118 Addend: 207
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 4 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 130 Target SectionID: 2 Offset: 0 Addend: 4
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 4 Offset: 130 Addend: 4
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 144 Target SectionID: 3 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 3 RelocationEntry SectionID: 4 Offset: 144 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 216 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 158 Target SectionID: 6 Offset: 0 Addend: 216
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 158 Addend: 216
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 80 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 165 Target SectionID: 3 Offset: 0 Addend: 80
dbg| addRelocationForSection add Relocations, Target SectionID: 3 RelocationEntry SectionID: 4 Offset: 165 Addend: 80
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 221 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 179 Target SectionID: 6 Offset: 0 Addend: 221
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 179 Addend: 221
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 10 Addend: 226 TargetName:
dbg| This is section symbol
dbg| SectionID: 4 Offset: 193 Target SectionID: 6 Offset: 0 Addend: 226
dbg| addRelocationForSection add Relocations, Target SectionID: 6 RelocationEntry SectionID: 4 Offset: 193 Addend: 226
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:4|cost:367 us
dbg| start emitSection: .debug_pubnames|cost:7 us
dbg| emitSection SectionID: 8 Name: .debug_pubnames obj addr: 0x3d013bb new addr: 0 DataSize: 97 StubBufSize: 6 Allocate: 0
dbg| end emitSection: .debug_pubnames|cost:17 us
dbg|start Parse relocations for SectionID:8|cost:4 us
dbg| start processRelocationRef for RelType: 10 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 8 Offset: 6 Target SectionID: 4 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 4 RelocationEntry SectionID: 8 Offset: 6 Addend: 0
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:8|cost:25 us
dbg| start emitSection: .debug_pubtypes|cost:6 us
dbg| emitSection SectionID: 9 Name: .debug_pubtypes obj addr: 0x3d0141c new addr: 0 DataSize: 26 StubBufSize: 6 Allocate: 0
dbg| end emitSection: .debug_pubtypes|cost:17 us
dbg|start Parse relocations for SectionID:9|cost:7 us
dbg| start processRelocationRef for RelType: 10 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 9 Offset: 6 Target SectionID: 4 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 4 RelocationEntry SectionID: 9 Offset: 6 Addend: 0
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:9|cost:25 us
dbg| start emitSection: .eh_frame|cost:6 us
dbg| emitSection SectionID: 10 Name: .eh_frame obj addr: 0x3d014b8 new addr: 0x7f0866641008 DataSize: 108 StubBufSize: 0 Allocate: 108
dbg| end emitSection: .eh_frame|cost:19 us
dbg|start Parse relocations for SectionID:10|cost:4 us
dbg| start processRelocationRef for RelType: 24 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 10 Offset: 32 Target SectionID: 3 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 3 RelocationEntry SectionID: 10 Offset: 32 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 24 Addend: 80 TargetName:
dbg| This is section symbol
dbg| SectionID: 10 Offset: 72 Target SectionID: 3 Offset: 0 Addend: 80
dbg| addRelocationForSection add Relocations, Target SectionID: 3 RelocationEntry SectionID: 10 Offset: 72 Addend: 80
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:10|cost:47 us
dbg|start Parse relocations for SectionID:7|cost:5 us
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 7 Offset: 41 Target SectionID: 3 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 3 RelocationEntry SectionID: 7 Offset: 41 Addend: 0
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:7|cost:23 us
dbg|end Parse relocations|cost:3 us
dbg|end RuntimeDyldImpl::loadObjectImpl|cost:17 us
dbg|end Dyld->loadObject|cost:8 us
dbg|ev_end_loadObject|cost:3 us
dbg|ev_notifyObjectEmitted|cost:3 us
dbg|ev_start_storageBufferAndObjects|cost:43 us
dbg|ev_end_storageBufferAndObjects|cost:6 us
dbg|ev_end_generateCodeForModule|cost:5 us
dbg|ev_start_finalizeLoadedModules|cost:2 us
dbg|ev_start_relocations|cost:2 us
dbg|----- Contents of section .rodata before relocations -----
dbg|0x00007f0866641000: 00 00 00 00
dbg|----- Contents of section .bss before relocations -----
dbg|0x00007f0866640000: 00 00 00 00
dbg|----- Contents of section .data before relocations -----
dbg|0x00007f0866640000: 01 00 00 00 01 00 00 00
dbg|----- Contents of section .text before relocations -----
dbg|0x00007f086663f000: 55 48 89 e5 48 b8 00 00 00 00 00 00 00 00 c7 00
dbg|0x00007f086663f010: 02 00 00 00 48 b8 00 00 00 00 00 00 00 00 c7 00
dbg|0x00007f086663f020: 02 00 00 00 48 b8 00 00 00 00 00 00 00 00 c7 00
dbg|0x00007f086663f030: 02 00 00 00 48 b8 00 00 00 00 00 00 00 00 c7 00
dbg|0x00007f086663f040: 02 00 00 00 5d c3 90 90 90 90 90 90 90 90 90 90
dbg|0x00007f086663f050: 55 48 89 e5 53 50 89 7d f4 48 b8 00 00 00 00 00
dbg|0x00007f086663f060: 00 00 00 ff d0 8b 7d f4 48 b8 00 00 00 00 00 00
dbg|0x00007f086663f070: 00 00 ff d0 89 c3 8b 7d f4 48 b8 00 00 00 00 00
dbg|0x00007f086663f080: 00 00 00 ff d0 0f af d8 89 d8 48 83 c4 08 5b 5d
dbg|0x00007f086663f090: c3
dbg|----- Contents of section .debug_info before relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_abbrev before relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_str before relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_line before relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_pubnames before relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_pubtypes before relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .eh_frame before relocations -----
dbg|0x00007f0866641000: 14 00 00 00 00 00 00 00
dbg|0x00007f0866641010: 01 7a 52 00 01 78 10 01 1c 0c 07 08 90 01 00 00
dbg|0x00007f0866641020: 24 00 00 00 1c 00 00 00 00 00 00 00 00 00 00 00
dbg|0x00007f0866641030: 46 00 00 00 00 00 00 00 00 41 0e 10 86 02 43 0d
dbg|0x00007f0866641040: 06 02 41 0c 07 08 00 00 24 00 00 00 44 00 00 00
dbg|0x00007f0866641050: 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00
dbg|0x00007f0866641060: 00 41 0e 10 86 02 43 0d 06 42 83 03 7a 0c 07 08
dbg|0x00007f0866641070: 00 00 00 00
dbg|start resolveExternalSymbols
dbg| ------------ ExternalSymbolRelocations.size() = 2 ------------
dbg| ExternalSymbolRelocations[0] = pow2
dbg| ExternalSymbolRelocations[1] = dyFunc
dbg| ------------------------------------------
dbg|resolveExternalSymbols while start
dbg| NewSymbols.insert(pow2)
dbg| NewSymbols.insert(dyFunc)
dbg|LegacyJITSymbolResolver::lookup
dbg| find SymName=dyFunc
dbg| not found and no Sym.takeError
dbg| dyFunc found in OpenedHandles
dbg| findSymbol success
dbg| has address
dbg| find SymName=pow2
dbg| not found and no Sym.takeError
dbg| pow2 found in findExistingSymbol
dbg| findSymbol success
dbg| has address
dbg|resolveExternalSymbols while end, has new symbols, try resolve more
dbg|resolveExternalSymbols while start
dbg|resolveExternalSymbols while end, no new symbols resolved, break
dbg| ------------ ExternalSymbolMap.size() = 2 ------------
dbg| ExternalSymbolMap[0] = pow2
dbg| ExternalSymbolMap[1] = dyFunc
dbg| ------------------------------------------
dbg|Resolving relocations Name: pow2 0x40e499
dbg|Type R_X86_64_64 Writing 0x40e499 at 0x7f086663f06a
dbg|Type R_X86_64_64 Writing 0x40e499 at 0x7f086663f07b
dbg|Resolving relocations Name: dyFunc 0x7f0864f666c0
dbg|Type R_X86_64_64 Writing 0x7f0864f666c0 at 0x7f086663f05b
dbg|end resolveExternalSymbols
dbg|Resolving relocations Section #4 (nil)
dbg|Resolving relocations Section #3 0x7f086663f000
dbg|Type R_X86_64_PC64 Writing 0xffffffffffffdfd8 at 0x7f0866641028
dbg|Type R_X86_64_PC64 Writing 0xffffffffffffe000 at 0x7f0866641050
dbg|Resolving relocations Section #7 (nil)
dbg|Resolving relocations Section #6 (nil)
dbg|Resolving relocations Section #1 0x7f0866640000
dbg|Type R_X86_64_64 Writing 0x7f0866640000 at 0x7f086663f006
dbg|Resolving relocations Section #0 0x7f0866641000
dbg|Type R_X86_64_64 Writing 0x7f0866641000 at 0x7f086663f016
dbg|Resolving relocations Section #5 (nil)
dbg|Resolving relocations Section #2 0x7f0866640004
dbg|Type R_X86_64_64 Writing 0x7f0866640008 at 0x7f086663f026
dbg|Type R_X86_64_64 Writing 0x7f0866640004 at 0x7f086663f036
dbg|----- Contents of section .rodata after relocations -----
dbg|0x00007f0866641000: 00 00 00 00
dbg|----- Contents of section .bss after relocations -----
dbg|0x00007f0866640000: 00 00 00 00
dbg|----- Contents of section .data after relocations -----
dbg|0x00007f0866640000: 01 00 00 00 01 00 00 00
dbg|----- Contents of section .text after relocations -----
dbg|0x00007f086663f000: 55 48 89 e5 48 b8 00 00 64 66 08 7f 00 00 c7 00
dbg|0x00007f086663f010: 02 00 00 00 48 b8 00 10 64 66 08 7f 00 00 c7 00
dbg|0x00007f086663f020: 02 00 00 00 48 b8 08 00 64 66 08 7f 00 00 c7 00
dbg|0x00007f086663f030: 02 00 00 00 48 b8 04 00 64 66 08 7f 00 00 c7 00
dbg|0x00007f086663f040: 02 00 00 00 5d c3 90 90 90 90 90 90 90 90 90 90
dbg|0x00007f086663f050: 55 48 89 e5 53 50 89 7d f4 48 b8 c0 66 f6 64 08
dbg|0x00007f086663f060: 7f 00 00 ff d0 8b 7d f4 48 b8 99 e4 40 00 00 00
dbg|0x00007f086663f070: 00 00 ff d0 89 c3 8b 7d f4 48 b8 99 e4 40 00 00
dbg|0x00007f086663f080: 00 00 00 ff d0 0f af d8 89 d8 48 83 c4 08 5b 5d
dbg|0x00007f086663f090: c3
dbg|----- Contents of section .debug_info after relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_abbrev after relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_str after relocations -----
dbg| <section not emitted>
dbg|----- Contents of section .debug_line after relocations -----
dbg| <section not emitted>
dbg|----- Contents of section 
.debug_pubnames after relocations ----- dbg| <section not emitted> dbg|----- Contents of section .debug_pubtypes after relocations ----- dbg| <section not emitted> dbg|----- Contents of section .eh_frame after relocations ----- dbg|0x00007f0866641000: 14 00 00 00 00 00 00 00 dbg|0x00007f0866641010: 01 7a 52 00 01 78 10 01 1c 0c 07 08 90 01 00 00 dbg|0x00007f0866641020: 24 00 00 00 1c 00 00 00 d8 df ff ff ff ff ff ff dbg|0x00007f0866641030: 46 00 00 00 00 00 00 00 00 41 0e 10 86 02 43 0d dbg|0x00007f0866641040: 06 02 41 0c 07 08 00 00 24 00 00 00 44 00 00 00 dbg|0x00007f0866641050: 00 e0 ff ff ff ff ff ff 41 00 00 00 00 00 00 00 dbg|0x00007f0866641060: 00 41 0e 10 86 02 43 0d 06 42 83 03 7a 0c 07 08 dbg|0x00007f0866641070: 00 00 00 00 dbg|ev_end_relocations|cost:730 us dbg|ev_start_registerEHFrames|cost:3 us dbg|ev_end_registerEHFrames|cost:10 us dbg|ev_end_finalizeLoadedModules|cost:14 us dbg|ev_end_finalizeObject|cost:6 us ============================== engine loaded engine get getFunctionAddress start dbg| pow4 found in findExistingSymbol ...省略一部分dump日志 engine get getFunctionAddress end dyFunc 16
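That is a lot of log output, so let's walk through it symbol by symbol, starting with variables and then functions.
Internal symbols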
globalConstV0
```
static const int globalConstV0 asm("globalConstV0") = 0;

void pow3() {
    *(int *)&globalConstV0 = 2;
}
```
Computing relocations
globalConstV0 is a global const variable. A non-const variable initialized to 0 would go into .bss, but because this one is const it is placed in .rodata even though its initializer is 0.
The log below comes from the Parse symbols phase, which scans the global symbols. The .rodata section, located at address 0x3d0117c in the object file, is assigned the virtual memory address 0x7f0866641000:
```
dbg| start emitSection: .rodata|cost:33 us
dbg| emitSection SectionID: 0 Name: .rodata obj addr: 0x3d0117c new addr: 0x7f0866641000 DataSize: 4 StubBufSize: 0 Allocate: 4
dbg| end emitSection: .rodata|cost:35 us
dbg| Type: 1 Name: globalConstV0 SID: 0 Offset: (nil) flags: 0
```
The Parse relocations phase then computes the relocations. The symbol is used in SectionID 3 at Offset 22 (0x16):
```
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 22 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 0 RelocationEntry SectionID: 3 Offset: 22 Addend: 0
dbg| end processRelocationRef
```
What is SectionID 3? The Parse symbols log contains this entry:
```
dbg| start emitSection: .text|cost:57 us
dbg| emitSection SectionID: 3 Name: .text obj addr: 0x3d010e0 new addr: 0x7f086663f000 DataSize: 145 StubBufSize: 0 Allocate: 145
dbg| end emitSection: .text|cost:29 us
```
In other words, offset 0x16 of .text must be relocated to Target SectionID: 0 Offset: 0, that is .rodata+0, which is globalConstV0.
The relocation table shows the same information:
```
~ readelf -r a.o

Relocation section '.rela.text' at offset 0x698 contains 7 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000016  000800000001 R_X86_64_64       0000000000000000 .rodata + 0
```
Writing relocations to memory
The key information above has been stored in an inverted index: Relocations[target SectionID=0, i.e. .rodata+0].push_back(relocated SectionID=3, i.e. .text, relocated Offset: 22 = 0x16)
resolveRelocationList then writes the relocation to memory: the memory address of SectionID 0 (.rodata + 0), 0x7f0866641000, is written to the relocation site in SectionID 3 (.text) at Offset 22 = 0x16, whose memory address is 0x7f086663f000 + 0x16 = 0x7f086663f016.
The log is:
```
dbg|Resolving relocations Section #0 0x7f0866641000
dbg|Type R_X86_64_64 Writing 0x7f0866641000 at 0x7f086663f016
```
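The inverted index and the address arithmetic described above can be sketched as follows. This is a minimal illustration, not the RuntimeDyld implementation: the struct layout, `PendingWrite`, `resolveAll` and `demoRodata` are hypothetical names, and only the numeric values come from the log.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical, simplified stand-in for one recorded relocation.
struct RelocationEntry {
    unsigned SectionID;  // section that contains the relocation site
    uint64_t Offset;     // offset of the site inside that section
    int64_t Addend;
};

// What has to be written, and where.
struct PendingWrite {
    uint64_t Where;  // address of the relocation site
    uint64_t Value;  // absolute address to store there (R_X86_64_64 style)
};

// Walk the inverted index: for every target section, visit all sites that refer to it.
std::vector<PendingWrite> resolveAll(
    const std::map<unsigned, std::vector<RelocationEntry>> &Relocations,
    const std::map<unsigned, uint64_t> &SectionLoadAddr) {
    std::vector<PendingWrite> Writes;
    for (const auto &KV : Relocations) {
        assert(SectionLoadAddr.count(KV.first) && "target section has no load address");
        const uint64_t TargetAddr = SectionLoadAddr.at(KV.first);
        for (const RelocationEntry &RE : KV.second) {
            const uint64_t Where = SectionLoadAddr.at(RE.SectionID) + RE.Offset;
            Writes.push_back({Where, TargetAddr + static_cast<uint64_t>(RE.Addend)});
        }
    }
    return Writes;
}

// Replays the globalConstV0 case with the addresses from the log.
std::vector<PendingWrite> demoRodata() {
    const std::map<unsigned, uint64_t> Addr = {{0, 0x7f0866641000ULL},   // .rodata
                                               {3, 0x7f086663f000ULL}};  // .text
    std::map<unsigned, std::vector<RelocationEntry>> Relocations;
    Relocations[0].push_back({3, 22, 0});  // site: .text + 0x16, target: .rodata + 0
    return resolveAll(Relocations, Addr);
}
```

Keying the index by target section means that once a section's final address is known, every site that points into it can be patched in one pass.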
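Memory before the write (0x7f086663f016 is the 7th byte of this row):
```
dbg|0x00007f086663f010: 02 00 00 00 48 b8 (00 00 00 00 00 00 00 00) c7 00
```
Memory after the write (0x7f086663f016 is the 7th byte of this row):
```
dbg|0x00007f086663f010: 02 00 00 00 48 b8 (00 10 64 66 08 7f 00 00) c7 00
```
Eight bytes were written in little-endian order (00 10 64 66 08 7f 00 00), which is 0x00007f0866641000.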
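To make the byte order concrete, here is a small sketch that rebuilds the dumped row and patches it. `writeAbs64LE` and `patchedRow` are hypothetical helper names introduced for illustration; the row contents and the value are taken from the dump above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Store a 64-bit value at Buf[Offset .. Offset+7], least significant byte first.
void writeAbs64LE(uint8_t *Buf, std::size_t Offset, uint64_t Value) {
    assert(Buf != nullptr);
    for (int I = 0; I < 8; ++I)
        Buf[Offset + I] = static_cast<uint8_t>(Value >> (8 * I));
}

// Rebuild the row that starts at 0x7f086663f010 and apply the relocation to it.
std::array<uint8_t, 16> patchedRow() {
    std::array<uint8_t, 16> Row = {0x02, 0x00, 0x00, 0x00, 0x48, 0xb8, 0x00, 0x00,
                                   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc7, 0x00};
    // The site 0x...f016 is 6 bytes into the row that starts at 0x...f010.
    writeAbs64LE(Row.data(), 0x16 - 0x10, 0x00007f0866641000ULL);
    return Row;
}
```

The two bytes 48 b8 in front of the patched field are left untouched; only the 8-byte immediate that follows them changes.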
globalV0, globalV1, globalWeakV0
```
static int globalV0 asm("globalV0") = 0;
static const int globalConstV0 asm("globalConstV0") = 0;
static int globalV1 asm("globalV1") = 1;
int __attribute__((weak)) globalWeakV0 asm("globalWeakV0") = 1;

void pow3() {
    globalV0 = 2;
    *(int *)&globalConstV0 = 2;
    globalV1 = 2;
    globalWeakV0 = 2;
}
```
globalV0 is initialized to 0, so it lives in .bss.
globalV1 has a non-zero initializer, so it lives in .data.
globalWeakV0 is declared as a weak symbol, but because JIT loading is effectively static linking, being weak makes no difference here.
The symbol table and relocation table look like this:
```
~ readelf -s a.o

Symbol table '.symtab' contains 18 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     3: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    5 globalV0
     4: 0000000000000004     4 OBJECT  LOCAL  DEFAULT    4 globalV1
    14: 0000000000000000     4 OBJECT  WEAK   DEFAULT    4 globalWeakV0

~ readelf -r a.o

Relocation section '.rela.text' at offset 0x698 contains 7 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000700000001 R_X86_64_64       0000000000000000 .bss + 0
000000000026  000600000001 R_X86_64_64       0000000000000000 .data + 4
000000000036  000e00000001 R_X86_64_64       0000000000000000 globalWeakV0 + 0
```
The logs are similar to the previous case. Only the relocation for globalWeakV0 differs slightly: it refers to a named global symbol instead of a section symbol.
In the relocation-computing log, this global symbol is still converted into an offset within its target section and stored in Relocations, so the rest of the flow is identical:
```
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: globalWeakV0
dbg| SectionID: 3 Offset: 54 Target SectionID: 2 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 3 Offset: 54 Addend: 0
dbg| end processRelocationRef
```
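The conversion from a named symbol to a section-based entry can be sketched like this. It is a simplified illustration under stated assumptions: `RelocationTables`, `addRelocation` and `demoClassify` are hypothetical names, and folding the symbol's offset into the addend is how this sketch models the step. A name that is missing from the symbol table is queued by name instead, which is the ExternalSymbolRelocations case.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct SymbolEntry {
    unsigned SectionID;  // section the symbol is defined in
    uint64_t Offset;     // offset of the symbol inside that section
};

struct RelocationEntry {
    unsigned SectionID;  // section containing the relocation site
    uint64_t Offset;     // offset of the site
    int64_t Addend;
};

struct RelocationTables {
    // Keyed by target section: everything that can be resolved internally.
    std::map<unsigned, std::vector<RelocationEntry>> Relocations;
    // Keyed by symbol name: everything that needs an external lookup later.
    std::map<std::string, std::vector<RelocationEntry>> ExternalSymbolRelocations;
};

void addRelocation(RelocationTables &T,
                   const std::map<std::string, SymbolEntry> &GlobalSymbolTable,
                   const std::string &TargetName, unsigned SiteSection,
                   uint64_t SiteOffset, int64_t Addend) {
    auto It = GlobalSymbolTable.find(TargetName);
    if (It != GlobalSymbolTable.end()) {
        // Known symbol: store it as "target section + (symbol offset + addend)".
        const int64_t Folded = Addend + static_cast<int64_t>(It->second.Offset);
        T.Relocations[It->second.SectionID].push_back({SiteSection, SiteOffset, Folded});
    } else {
        T.ExternalSymbolRelocations[TargetName].push_back({SiteSection, SiteOffset, Addend});
    }
}

// globalWeakV0 is defined in section 2 (.data); dyFunc is not defined in the object.
RelocationTables demoClassify() {
    const std::map<std::string, SymbolEntry> Syms = {{"globalV1", {2, 4}},
                                                     {"globalWeakV0", {2, 0}}};
    RelocationTables T;
    addRelocation(T, Syms, "globalWeakV0", 3, 54, 0);
    addRelocation(T, Syms, "dyFunc", 3, 91, 0);
    assert(T.Relocations.count(2) == 1);
    return T;
}
```

After this step the resolver no longer needs the name globalWeakV0; the entry is handled exactly like a section-symbol relocation against .data.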
Debug sections such as .debug_info
The debug sections get their own discussion because their relocations are computed but never written.
Debug sections exist for debuggers such as gdb and are not needed during normal execution, so no memory has to be allocated for them.
Computing relocations
Take the first relocation of .debug_info as an example. The relocation table shows:
```
~ readelf -r a.o

Relocation section '.rela.debug_info' at offset 0x740 contains 20 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000a0000000a R_X86_64_32       0000000000000000 .debug_abbrev + 0
```
This entry sits at offset 6 of the .debug_info section and points to the .debug_abbrev section, so emitSection is also called for .debug_abbrev while the relocation is being computed:
```
dbg| start emitSection: .debug_info|cost:8 us
dbg| emitSection SectionID: 4 Name: .debug_info obj addr: 0x3d012ed new addr: 0 DataSize: 205 StubBufSize: 78 Allocate: 0
dbg| end emitSection: .debug_info|cost:30 us
dbg|start Parse relocations for SectionID:4|cost:5 us
dbg| start processRelocationRef for RelType: 10 Addend: 0 TargetName:
dbg| This is section symbol
dbg| start emitSection: .debug_abbrev|cost:17 us
dbg| emitSection SectionID: 5 Name: .debug_abbrev obj addr: 0x3d01266 new addr: 0 DataSize: 135 StubBufSize: 0 Allocate: 0
dbg| end emitSection: .debug_abbrev|cost:15 us
dbg| SectionID: 4 Offset: 6 Target SectionID: 5 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 5 RelocationEntry SectionID: 4 Offset: 6 Addend: 0
dbg| end processRelocationRef
```
The emitSection log differs from the earlier one, though. Earlier it was:
```
dbg| emitSection SectionID: 0 Name: .rodata obj addr: 0x3d0117c new addr: 0x7f0866641000 DataSize: 4 StubBufSize: 0 Allocate: 4
```
Here it is:
```
dbg| emitSection SectionID: 4 Name: .debug_info obj addr: 0x3d012ed new addr: 0 DataSize: 205 StubBufSize: 78 Allocate: 0
dbg| emitSection SectionID: 5 Name: .debug_abbrev obj addr: 0x3d01266 new addr: 0 DataSize: 135 StubBufSize: 0 Allocate: 0
```
Later, when the write phase walks the Relocations inverted index, the log shows that emitSection assigned no virtual memory address to these sections, so the address is nil and nothing is written to memory:
```
dbg|Resolving relocations Section #4 (nil)
dbg|Resolving relocations Section #5 (nil)
```
The reason is that .debug_info and the other debug sections do not carry the SHF_ALLOC flag: no memory is allocated for them, so there is nowhere to write their relocations.
Which sections are SHF_ALLOC? Look at the section header table.
Sections whose Flags contain A are SHF_ALLOC. Here only .text, .data, .bss, .rodata and .eh_frame need memory.
```
~ readelf -S a.o
There are 23 section headers, starting at offset 0xab0:

Section Headers:
  [Nr] Name              Type          Address          Offset   Size             EntSize          Flags Link Info Align
  [ 0]                   NULL          0000000000000000 00000000 0000000000000000 0000000000000000          0    0     0
  [ 1] .strtab           STRTAB        0000000000000000 00000998 0000000000000113 0000000000000000          0    0     1
  [ 2] .text             PROGBITS      0000000000000000 00000040 0000000000000091 0000000000000000 AX       0    0    16
  [ 3] .rela.text        RELA          0000000000000000 00000698 00000000000000a8 0000000000000018         22    2     8
  [ 4] .data             PROGBITS      0000000000000000 000000d4 0000000000000008 0000000000000000 WA       0    0     4
  [ 5] .bss              NOBITS        0000000000000000 000000dc 0000000000000004 0000000000000000 WA       0    0     4
  [ 6] .rodata           PROGBITS      0000000000000000 000000dc 0000000000000004 0000000000000000 A        0    0     4
  [ 7] .debug_str        PROGBITS      0000000000000000 000000e0 00000000000000e6 0000000000000001 MS       0    0     1
  [ 8] .debug_abbrev     PROGBITS      0000000000000000 000001c6 0000000000000087 0000000000000000          0    0     1
  [ 9] .debug_info       PROGBITS      0000000000000000 0000024d 00000000000000cd 0000000000000000          0    0     1
  [10] .rela.debug_info  RELA          0000000000000000 00000740 00000000000001e0 0000000000000018         22    9     8
  [11] .debug_macinfo    PROGBITS      0000000000000000 0000031a 0000000000000001 0000000000000000          0    0     1
  [12] .debug_pubnames   PROGBITS      0000000000000000 0000031b 0000000000000061 0000000000000000          0    0     1
  [13] .rela.debug_pubna RELA          0000000000000000 00000920 0000000000000018 0000000000000018         22   12     8
  [14] .debug_pubtypes   PROGBITS      0000000000000000 0000037c 000000000000001a 0000000000000000          0    0     1
  [15] .rela.debug_pubty RELA          0000000000000000 00000938 0000000000000018 0000000000000018         22   14     8
  [16] .comment          PROGBITS      0000000000000000 00000396 000000000000007d 0000000000000001 MS       0    0     1
  [17] .note.GNU-stack   PROGBITS      0000000000000000 00000413 0000000000000000 0000000000000000          0    0     1
  [18] .eh_frame         X86_64_UNWIND 0000000000000000 00000418 0000000000000068 0000000000000000 A        0    0     8
  [19] .rela.eh_frame    RELA          0000000000000000 00000950 0000000000000030 0000000000000018         22   18     8
  [20] .debug_line       PROGBITS      0000000000000000 00000480 0000000000000063 0000000000000000          0    0     1
  [21] .rela.debug_line  RELA          0000000000000000 00000980 0000000000000018 0000000000000018         22   20     8
  [22] .symtab           SYMTAB        0000000000000000 000004e8 00000000000001b0 0000000000000018          1   13     8
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
```
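The "has the A flag" test boils down to one bit check on the section flags. The sketch below maps the readelf flag letters that occur in this table to the corresponding ELF bit values (SHF_WRITE 0x1, SHF_ALLOC 0x2, SHF_EXECINSTR 0x4, SHF_MERGE 0x10, SHF_STRINGS 0x20). `parseFlags` and `needsMemory` are hypothetical helper names, and the constants are renamed so they cannot clash with the macros in a system `elf.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// ELF section flag bits (values from the ELF specification).
constexpr uint64_t ShfWrite = 0x1;
constexpr uint64_t ShfAlloc = 0x2;
constexpr uint64_t ShfExecInstr = 0x4;
constexpr uint64_t ShfMerge = 0x10;
constexpr uint64_t ShfStrings = 0x20;

// Convert the readelf flag letters used in the table above into a bitmask.
uint64_t parseFlags(const std::string &Letters) {
    uint64_t Flags = 0;
    for (char C : Letters) {
        switch (C) {
        case 'W': Flags |= ShfWrite; break;
        case 'A': Flags |= ShfAlloc; break;
        case 'X': Flags |= ShfExecInstr; break;
        case 'M': Flags |= ShfMerge; break;
        case 'S': Flags |= ShfStrings; break;
        default: assert(false && "flag letter not handled in this sketch");
        }
    }
    return Flags;
}

// A section gets memory at load time only if SHF_ALLOC is set.
bool needsMemory(const std::string &Letters) {
    return (parseFlags(Letters) & ShfAlloc) != 0;
}
```

With the table above, `needsMemory("AX")` (.text) and `needsMemory("WA")` (.data, .bss) are true, while `needsMemory("MS")` (.debug_str) and `needsMemory("")` (.debug_info) are false, which matches the sections that showed a nil address in the log.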
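External symbols
For external symbols, computing relocations and writing them to memory work the same way as for internal symbols.
The only difference is that the address of an external symbol is unknown and has to be looked up.
Looking up the address
As a reminder, the a.cc source file has only two external symbols, dyFunc and pow2:
```
extern "C" {
int pow2(int val);

void dyFunc();

// ... omitted

int pow4(int val) {
    dyFunc();
    return pow2(val) * pow2(val);
}

}
```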
External symbol from a dynamic library: dyFunc
This symbol is loaded with ::dlopen in the demo code https://github.com/tedcy/llvm7_test/blob/master/demo/engine/main.cpp, so it can only be found through ::dlsym.
```
void loadDylib() {
    // The handle returned by dlopen is only tested for null here.
    bool ok = ::dlopen("dylib/libdylib.so", RTLD_NOW | RTLD_GLOBAL);
    if (!ok) {
        cout << "dlopen error. please run ./build.sh in dylib" << endl;
        exit(1);
    }
}

int main(int argc, char** argv) {
    ...
    loadDylib();
    ...
}
```
Let's check whether the log confirms that the symbol is found through ::dlsym.
In the relocation-computing log, dyFunc cannot be found in the global symbol table GlobalSymbolTable, so it is added to ExternalSymbolRelocations:
```
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: dyFunc
dbg| SectionID: 3 Offset: 91 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:dyFunc RelocationEntry SectionID: 3 Offset: 91 Addend: 0
dbg| end processRelocationRef
```
Later, during the write phase, resolveExternalSymbols calls LegacyJITSymbolResolver::lookup:
```
dbg|resolveExternalSymbols while start
dbg| NewSymbols.insert(dyFunc)
dbg|LegacyJITSymbolResolver::lookup
dbg| find SymName=dyFunc
dbg| not found and no Sym.takeError
dbg| dyFunc found in OpenedHandles
dbg| findSymbol success
dbg| has address
```
The OpenedHandles message in the log means the symbol was in fact found through ::dlsym.
The full call stack:
```
#0  llvm::sys::DynamicLibrary::SearchForAddressOfSymbol (SymbolName=0x7ffefdf65ad0 "dyFunc") at /root/llvm7_test/lib/Support/DynamicLibrary.cpp:181
#1  0x0000000000c9fe1d in llvm::RTDyldMemoryManager::getSymbolAddressInProcess (Name="dyFunc") at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp:287
#2  0x0000000000c98026 in llvm::RTDyldMemoryManager::getSymbolAddress (this=0x3aafd00, Name="dyFunc") at /root/llvm7_test/include/llvm/ExecutionEngine/RTDyldMemoryManager.h:87
#3  0x0000000000c9807a in llvm::RTDyldMemoryManager::findSymbol (this=0x3aafd00, Name="dyFunc") at /root/llvm7_test/include/llvm/ExecutionEngine/RTDyldMemoryManager.h:102
#4  0x0000000000c4d95e in llvm::LinkingSymbolResolver::findSymbol (this=0x3ab32c0, Name="dyFunc") at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:722
#5  0x0000000000c9a46c in llvm::LegacyJITSymbolResolver::lookup (this=0x3ab32c0, Symbols=std::set with 2 elements = {...}) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/JITSymbol.cpp:87
#6  0x0000000000ca6b05 in llvm::RuntimeDyldImpl::resolveExternalSymbols (this=0x3ab3520) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1043
#7  0x0000000000ca12aa in llvm::RuntimeDyldImpl::resolveRelocations (this=0x3ab3520) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:121
#8  0x0000000000ca7c9a in llvm::RuntimeDyld::resolveRelocations (this=0x3ab32e0) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1276
#9  0x0000000000c4a606 in llvm::MCJIT::finalizeLoadedModules (this=0x3ab3030) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:277
#10 0x0000000000c4ab7e in llvm::MCJIT::finalizeObject (this=0x3ab3030) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:309
#11 0x000000000040e7ba in main (argc=2, argv=0x7ffefdf665b8) at main.cpp:126
```
External symbol with an explicitly supplied address: pow2
In the demo code https://github.com/tedcy/llvm7_test/blob/master/demo/engine/main.cpp this symbol is registered through addGlobalMapping, which adds it to MCJIT's EEState.getGlobalAddressMap():
```
extern "C" {
int pow2(int val) {
    return val * val;
}
}
int main(int argc, char** argv) {
    ...
    ee->addGlobalMapping("pow2", (uint64_t)&pow2);
    ...
}
```
Let's check whether the log confirms that the symbol is found in EEState.getGlobalAddressMap().
In the relocation-computing log, pow2 cannot be found in the global symbol table GlobalSymbolTable, so it is added to ExternalSymbolRelocations.
It appears twice because pow4 calls pow2 twice, so two sites need relocating:
```
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: pow2
dbg| SectionID: 3 Offset: 106 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:pow2 RelocationEntry SectionID: 3 Offset: 106 Addend: 0
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName: pow2
dbg| SectionID: 3 Offset: 123 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:pow2 RelocationEntry SectionID: 3 Offset: 123 Addend: 0
dbg| end processRelocationRef
```
Later, during the write phase, resolveExternalSymbols calls LegacyJITSymbolResolver::lookup:
```
dbg|resolveExternalSymbols while start
dbg| NewSymbols.insert(pow2)
dbg|LegacyJITSymbolResolver::lookup
dbg| find SymName=pow2
dbg| not found and no Sym.takeError
dbg| pow2 found in findExistingSymbol
dbg| findSymbol success
dbg| has address
```
The findExistingSymbol message in the log means the symbol was in fact found in EEState.getGlobalAddressMap().
The full call stack:
```
#0  llvm::ExecutionEngine::getAddressToGlobalIfAvailable (this=0x362c030, S=...) at /root/llvm7_test/lib/ExecutionEngine/ExecutionEngine.cpp:278
#1  0x0000000000c82a0a in llvm::ExecutionEngine::getPointerToGlobalIfAvailable (this=0x362c030, S=...) at /root/llvm7_test/lib/ExecutionEngine/ExecutionEngine.cpp:291
#2  0x0000000000c4af0a in llvm::MCJIT::findExistingSymbol (this=0x362c030, Name="pow2") at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:329
#3  0x0000000000c4b54b in llvm::MCJIT::findSymbol (this=0x362c030, Name="pow2", CheckFunctionsOnly=false) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:385
#4  0x0000000000c4d8eb in llvm::LinkingSymbolResolver::findSymbol (this=0x362c2c0, Name="pow2") at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:717
#5  0x0000000000c9a46c in llvm::LegacyJITSymbolResolver::lookup (this=0x362c2c0, Symbols=std::set with 2 elements = {...}) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/JITSymbol.cpp:87
#6  0x0000000000ca6b05 in llvm::RuntimeDyldImpl::resolveExternalSymbols (this=0x362c520) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1043
#7  0x0000000000ca12aa in llvm::RuntimeDyldImpl::resolveRelocations (this=0x362c520) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:121
#8  0x0000000000ca7c9a in llvm::RuntimeDyld::resolveRelocations (this=0x362c2e0) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1276
#9  0x0000000000c4a606 in llvm::MCJIT::finalizeLoadedModules (this=0x362c030) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:277
#10 0x0000000000c4ab7e in llvm::MCJIT::finalizeObject (this=0x362c030) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:309
#11 0x000000000040e7ba in main (argc=2, argv=0x7ffee28e6d38) at main.cpp:126
```
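The two stacks differ only at frame #4 and below: LinkingSymbolResolver::findSymbol first asks the engine (where the addGlobalMapping entries live) and only then falls back to the client resolver, which ends up searching the process's loaded libraries. A minimal sketch of that "first hit wins" chain, with plain maps standing in for both sources; `Lookup`, `findSymbol` and `demoChain` are hypothetical names and the addresses are the ones printed in the log:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// One lookup source; returns 0 when the name is unknown (assumption of this sketch).
using Lookup = std::function<uint64_t(const std::string &)>;

// Ask each source in order and stop at the first one that knows the symbol.
uint64_t findSymbol(const std::vector<Lookup> &Chain, const std::string &Name) {
    for (const Lookup &L : Chain)
        if (uint64_t Addr = L(Name))
            return Addr;
    return 0;
}

std::vector<Lookup> demoChain() {
    // Stand-in for the addresses registered with addGlobalMapping.
    const std::map<std::string, uint64_t> GlobalAddressMap = {{"pow2", 0x40e499}};
    // Stand-in for symbols reachable through dlsym in the opened libraries.
    const std::map<std::string, uint64_t> ProcessSymbols = {{"dyFunc", 0x7f0864f666c0ULL}};

    auto FromMap = [](std::map<std::string, uint64_t> M) {
        return [M](const std::string &Name) -> uint64_t {
            auto It = M.find(Name);
            return It == M.end() ? 0 : It->second;
        };
    };
    // Order matters: engine-level mappings are consulted before the process.
    return {FromMap(GlobalAddressMap), FromMap(ProcessSymbols)};
}
```

Because the engine is asked first, a name registered with addGlobalMapping would shadow a symbol of the same name exported by a loaded library in this model.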
Global symbols that need no relocation
Not every symbol needs relocating: a symbol that no code in the object refers to needs none, for example pow3 and pow4.
The log is:
```
dbg| start emitSection: .text|cost:57 us
dbg| emitSection SectionID: 3 Name: .text obj addr: 0x3d010e0 new addr: 0x7f086663f000 DataSize: 145 StubBufSize: 0 Allocate: 145
dbg| end emitSection: .text|cost:29 us
dbg| Type: 4 Name: pow3 SID: 3 Offset: (nil) flags: 66
dbg| Type: 4 Name: pow4 SID: 3 Offset: 0x50 flags: 66
```
After that, these two symbols never show up in the log again until the code explicitly looks up pow4:
```
uint64_t addr = ee->getFunctionAddress("pow4");
typedef int (*pow4_t)(int);
pow4_t fn = (pow4_t)addr;

auto result = fn(2);
cout << result << endl;
```
The corresponding call stack is:
```
#0  llvm::RuntimeDyldImpl::getSymbol (this=0x4d40520, Name=...) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h:509
#1  0x0000000000ca7be7 in llvm::RuntimeDyld::getSymbol (this=0x4d402e0, Name=...) at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1264
#2  0x0000000000c4af89 in llvm::MCJIT::findExistingSymbol (this=0x4d40030, Name="pow4") at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:334
#3  0x0000000000c4b54b in llvm::MCJIT::findSymbol (this=0x4d40030, Name="pow4", CheckFunctionsOnly=true) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:385
#4  0x0000000000c4b375 in llvm::MCJIT::getSymbolAddress (this=0x4d40030, Name="pow4", CheckFunctionsOnly=true) at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:371
#5  0x0000000000c4bc76 in llvm::MCJIT::getFunctionAddress (this=0x4d40030, Name="pow4") at /root/llvm7_test/lib/ExecutionEngine/MCJIT/MCJIT.cpp:449
#6  0x000000000040e869 in main (argc=2, argv=0x7ffe3e970cb8) at main.cpp:131
```
The code is:
```
class RuntimeDyldImpl {
  JITEvaluatedSymbol getSymbol(StringRef Name) const {
    // Look the name up in the global symbol table.
    RTDyldSymbolTable::const_iterator pos = GlobalSymbolTable.find(Name);
    if (pos == GlobalSymbolTable.end())
      return nullptr;
    const auto &SymEntry = pos->second;

    // Absolute symbols have no section base; everything else starts from
    // the load address of the section the symbol lives in.
    uint64_t SectionAddr = 0;
    if (SymEntry.getSectionID() != AbsoluteSymbolSection)
      SectionAddr = getSectionLoadAddress(SymEntry.getSectionID());
    uint64_t TargetAddr = SectionAddr + SymEntry.getOffset();

    TargetAddr = modifyAddressBasedOnFlags(TargetAddr, SymEntry.getFlags());
    return JITEvaluatedSymbol(TargetAddr, SymEntry.getFlags());
  }
};
```
It simply finds the symbol in the global symbol table GlobalSymbolTable and derives its address:
```
508	    const auto &SymEntry = pos->second;
(gdb) n
509	    uint64_t SectionAddr = 0;
(gdb) p SymEntry
$2 = (const llvm::SymbolTableEntry &) @0x4d42258: {Offset = 80, SectionID = 3, Flags = {Flags = 16 '\020', TargetFlags = 0}}
```
What it retrieves is exactly the entry printed in the log (Offset = 80 = 0x50):
Type: 4 Name: pow4 SID: 3 Offset: 0x50 flags: 66
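The address computation in getSymbol reduces to "section load address plus symbol offset". The sketch below replays it with the values from the absolute-relocation log (.text loaded at 0x7f086663f000, pow4 at offset 0x50). `lookupAddress` and `demoLookup` are hypothetical names, returning 0 for an unknown name is an assumption of the sketch, and the flag-based address adjustment is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

struct SymbolEntry {
    unsigned SectionID;  // section the symbol is defined in
    uint64_t Offset;     // offset inside that section
};

// Returns the loaded address of Name, or 0 if the name is not in the table.
uint64_t lookupAddress(const std::map<std::string, SymbolEntry> &GlobalSymbolTable,
                       const std::map<unsigned, uint64_t> &SectionLoadAddr,
                       const std::string &Name) {
    auto It = GlobalSymbolTable.find(Name);
    if (It == GlobalSymbolTable.end())
        return 0;
    auto Sec = SectionLoadAddr.find(It->second.SectionID);
    assert(Sec != SectionLoadAddr.end() && "symbol refers to a section that was never emitted");
    return Sec->second + It->second.Offset;
}

// Symbol offsets and the .text load address are taken from the log.
uint64_t demoLookup(const std::string &Name) {
    const std::map<std::string, SymbolEntry> Syms = {{"pow3", {3, 0x0}}, {"pow4", {3, 0x50}}};
    const std::map<unsigned, uint64_t> Addr = {{3, 0x7f086663f000ULL}};
    return lookupAddress(Syms, Addr, Name);
}
```

For pow4 this gives 0x7f086663f050, and the .text dump row at that address starts with 55 48 89 e5, the push rbp / mov rbp, rsp prologue of a function.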
PIC relocations (R_X86_64_PC32 + R_X86_64_PLT32)
In practice, PIC relocation is the mainstream approach today. The source file is the same; the build script is at https://github.com/tedcy/llvm7_test/blob/master/demo/engine/cpp_file/pic/build.sh
```
clang++ -m64 -std=c++0x -fno-use-cxa-atexit -fnon-call-exceptions -c -emit-llvm ../a.cc
llvm-dis a.bc

# GOT mode (PIC must be enabled, otherwise global variables use R_X86_64_32S and crash because the absolute address does not fit in 32 bits)
llc -filetype=obj -relocation-model=pic a.bc -o a.o

# Remove the unwind information
objcopy --remove-section=.eh_frame a.o
```
Debug info and stack-unwinding info are deliberately left out here to keep things simple.
The symbol table is unchanged compared with the absolute-address case.
The relocation table does change. With absolute relocation, variables and functions both used R_X86_64_64. Now variables use R_X86_64_PC32 (not shown below, but the same holds even for extern int a!) and functions use R_X86_64_PLT32.
The relocation table is:
```
~ readelf -r a.o

Relocation section '.rela.text' at offset 0x2f8 contains 7 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000006  000700000002 R_X86_64_PC32     0000000000000000 .bss - 8
000000000010  000800000002 R_X86_64_PC32     0000000000000000 .rodata - 8
00000000001a  000600000002 R_X86_64_PC32     0000000000000000 .data - 4
000000000024  000c00000002 R_X86_64_PC32     0000000000000000 globalWeakV0 - 8
00000000003a  000b00000004 R_X86_64_PLT32    0000000000000000 dyFunc - 4
000000000042  000d00000004 R_X86_64_PLT32    0000000000000000 pow2 - 4
00000000004c  000d00000004 R_X86_64_PLT32    0000000000000000 pow2 - 4
```
The log of the demo run is:
```
llvm_register_debuger_printer set ok!
dbg|JIT: Map 'pow2' to [0x40e499]
dbg|ev_start_finalizeObject|cost:0 us
dbg|ev_start_generateCodeForModule|cost:14 us
dbg|ev_start_createObjectFile|cost:6 us
dbg|ev_end_createObjectFile|cost:44 us
dbg|ev_start_loadObject|cost:3 us
dbg|start Dyld->loadObject|cost:21 us
dbg|start RuntimeDyldImpl::loadObjectImpl|cost:6 us
dbg|start Parse symbols|cost:37 us
dbg| start emitSection: .rodata|cost:31 us
dbg| emitSection SectionID: 0 Name: .rodata obj addr: 0x5114ab4 new addr: 0x7f28e7a7c000 DataSize: 4 StubBufSize: 0 Allocate: 4
dbg| end emitSection: .rodata|cost:30 us
dbg| Type: 1 Name: globalConstV0 SID: 0 Offset: (nil) flags: 0
dbg| start emitSection: .bss|cost:21 us
dbg| emitSection SectionID: 1 Name: .bss obj addr: (nil) new addr: 0x7f28e7a7b000 DataSize: 4 StubBufSize: 0 Allocate: 4
dbg| end emitSection: .bss|cost:23 us
dbg| Type: 1 Name: globalV0 SID: 1 Offset: (nil) flags: 0
dbg| start emitSection: .data|cost:18 us
dbg| emitSection SectionID: 2 Name: .data obj addr: 0x5114aac new addr: 0x7f28e7a7b004 DataSize: 8 StubBufSize: 0 Allocate: 8
dbg| end emitSection: .data|cost:12 us
dbg| Type: 1 Name: globalV1 SID: 2 Offset: 0x4 flags: 0
dbg| Type: 1 Name: globalWeakV0 SID: 2 Offset: (nil) flags: 70
dbg| start emitSection: .text|cost:50 us
dbg| emitSection SectionID: 3 Name: .text obj addr: 0x5114a50 new addr: 0x7f28e7a7a000 DataSize: 92 StubBufSize: 18 Allocate: 110
dbg| end emitSection: .text|cost:25 us
dbg| Type: 4 Name: pow3 SID: 3 Offset: (nil) flags: 66
dbg| Type: 4 Name: pow4 SID: 3 Offset: 0x30 flags: 66
dbg|end Parse symbols|cost:21 us
dbg|start Parse relocations|cost:4 us
dbg|start Parse relocations for SectionID:3|cost:18 us
dbg| start processRelocationRef for RelType: 2 Addend: -8 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 6 Target SectionID: 1 Offset: 0 Addend: -8
dbg| addRelocationForSection add Relocations, Target SectionID: 1 RelocationEntry SectionID: 3 Offset: 6 Addend: -8
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 2 Addend: -8 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 16 Target SectionID: 0 Offset: 0 Addend: -8
dbg| addRelocationForSection add Relocations, Target SectionID: 0 RelocationEntry SectionID: 3 Offset: 16 Addend: -8
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 2 Addend: -4 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 26 Target SectionID: 2 Offset: 0 Addend: -4
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 3 Offset: 26 Addend: -4
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 2 Addend: -8 TargetName: globalWeakV0
dbg| SectionID: 3 Offset: 36 Target SectionID: 2 Offset: 0 Addend: -8
dbg| addRelocationForSection add Relocations, Target SectionID: 2 RelocationEntry SectionID: 3 Offset: 36 Addend: -8
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 4 Addend: -4 TargetName: dyFunc
dbg| SectionID: 3 Offset: 58 Target SectionID: 0 Offset: 0 Addend: -4
dbg| Create a new stub function
dbg| createStubFunction Writing 0xFF25 at 0x7f28e7a7a05c
dbg| Section reserved 6 bytes at 0x7f28e7a7a05c
dbg| allocateGOTEntries GOTSectionID: 4 StartOffset: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 4 RelocationEntry SectionID: 3 Offset: 94 Addend: -4
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:dyFunc RelocationEntry SectionID: 4 Offset: 0 Addend: 0
dbg| Type R_X86_64_PC32 Writing 0x1e at 0x7f28e7a7a03a
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 4 Addend: -4 TargetName: pow2
dbg| SectionID: 3 Offset: 66 Target SectionID: 0 Offset: 0 Addend: -4
dbg| Create a new stub function
dbg| createStubFunction Writing 0xFF25 at 0x7f28e7a7a062
dbg| Section reserved 6 bytes at 0x7f28e7a7a062
dbg| allocateGOTEntries GOTSectionID: 4 StartOffset: 8
dbg| addRelocationForSection add Relocations, Target SectionID: 4 RelocationEntry SectionID: 3 Offset: 100 Addend: 4
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:pow2 RelocationEntry SectionID: 4 Offset: 8 Addend: 0
dbg| Type R_X86_64_PC32 Writing 0x1c at 0x7f28e7a7a042
dbg| end processRelocationRef
dbg| start processRelocationRef for RelType: 4 Addend: -4 TargetName: pow2
dbg| SectionID: 3 Offset: 76 Target SectionID: 0 Offset: 0 Addend: -4
dbg| Stub function found
dbg| Type R_X86_64_PC32 Writing 0x12 at 0x7f28e7a7a04c
dbg| end processRelocationRef
dbg|end Parse relocations for SectionID:3|cost:272 us
dbg|end Parse relocations|cost:6 us
dbg|finalizeLoad GOTSectionID: 4 Addr: 0x7f28e7a7b010 TotalSize: 16
dbg|end RuntimeDyldImpl::loadObjectImpl|cost:19 us
dbg|end Dyld->loadObject|cost:7 us
dbg|ev_end_loadObject|cost:3 us
dbg|ev_notifyObjectEmitted|cost:3 us
dbg|ev_start_storageBufferAndObjects|cost:44 us
dbg|ev_end_storageBufferAndObjects|cost:5 us
dbg|ev_end_generateCodeForModule|cost:4 us
dbg|ev_start_finalizeLoadedModules|cost:2 us
dbg|ev_start_relocations|cost:2 us
dbg|----- Contents of section .rodata before relocations -----
dbg|0x00007f28e7a7c000: 00 00 00 00
dbg|----- Contents of section .bss before relocations -----
dbg|0x00007f28e7a7b000: 00 00 00 00
dbg|----- Contents of section .data before relocations -----
dbg|0x00007f28e7a7b000: 01 00 00 00 01 00 00 00
dbg|----- Contents of section .text before relocations -----
dbg|0x00007f28e7a7a000: 55 48 89 e5 c7 05 00 00 00 00 02 00 00 00 c7 05
dbg|0x00007f28e7a7a010: 00 00 00 00 02 00 00 00 c7 05 00 00 00 00 02 00
dbg|0x00007f28e7a7a020: 00 00 c7 05 00 00 00 00 02 00 00 00 5d c3 90 90
dbg|0x00007f28e7a7a030: 55 48 89 e5 53 50 89 7d f4 e8 1e 00 00 00 8b 7d
dbg|0x00007f28e7a7a040: f4 e8 1c 00 00 00 89 c3 8b 7d f4 e8 12 00 00 00
dbg|0x00007f28e7a7a050: 0f af d8 89 d8 48 83 c4 08 5b 5d c3
dbg|----- Contents of section .got before relocations -----
dbg|0x00007f28e7a7b010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
dbg|start resolveExternalSymbols
dbg| ------------ ExternalSymbolRelocations.size() = 2 ------------
dbg| ExternalSymbolRelocations[0] = pow2
dbg| ExternalSymbolRelocations[1] = dyFunc
dbg| ------------------------------------------
dbg|resolveExternalSymbols while start
dbg| NewSymbols.insert(pow2)
dbg| NewSymbols.insert(dyFunc)
dbg|LegacyJITSymbolResolver::lookup
dbg| find SymName=dyFunc
dbg| not found and no Sym.takeError
dbg| dyFunc found in OpenedHandles
dbg| findSymbol success
dbg| has address
dbg| find SymName=pow2
dbg| not found and no Sym.takeError
dbg| pow2 found in findExistingSymbol
dbg| findSymbol success
dbg| has address
dbg|resolveExternalSymbols while end, has new symbols, try resolve more
dbg|resolveExternalSymbols while start
dbg|resolveExternalSymbols while end, no new symbols resolved, break
dbg| ------------ ExternalSymbolMap.size() = 2 ------------
dbg| ExternalSymbolMap[0] = pow2
dbg| ExternalSymbolMap[1] = dyFunc
dbg| ------------------------------------------
dbg|Resolving relocations Name: pow2 0x40e499
dbg|Type R_X86_64_64 Writing 0x40e499 at 0x7f28e7a7b018
dbg|Resolving relocations Name: dyFunc 0x7f28e63a16c0
dbg|Type R_X86_64_64 Writing 0x7f28e63a16c0 at 0x7f28e7a7b010
dbg|end resolveExternalSymbols
dbg|Resolving relocations Section #4 0x7f28e7a7b010
dbg|Type R_X86_64_PC32 Writing 0xfae at 0x7f28e7a7a05e
dbg|Type R_X86_64_PC32 Writing 0xfb0 at 0x7f28e7a7a064
dbg|Resolving relocations Section #2 0x7f28e7a7b004
dbg|Type R_X86_64_PC32 Writing 0xfe6 at 0x7f28e7a7a01a
dbg|Type R_X86_64_PC32 Writing 0xfd8 at 0x7f28e7a7a024
dbg|Resolving relocations Section #0 0x7f28e7a7c000
dbg|Type R_X86_64_PC32 Writing 0x1fe8 at 0x7f28e7a7a010
dbg|Resolving relocations Section #1 0x7f28e7a7b000
dbg|Type R_X86_64_PC32 Writing 0xff2 at 0x7f28e7a7a006
dbg|----- Contents of section .rodata after relocations -----
dbg|0x00007f28e7a7c000: 00 00 00 00
dbg|----- Contents of section .bss after relocations -----
dbg|0x00007f28e7a7b000: 00 00 00 00
dbg|----- Contents of section .data after relocations -----
dbg|0x00007f28e7a7b000: 01 00 00 00 01 00 00 00
dbg|----- Contents of section .text after relocations -----
dbg|0x00007f28e7a7a000: 55 48 89 e5 c7 05 f2 0f 00 00 02 00 00 00 c7 05
dbg|0x00007f28e7a7a010: e8 1f 00 00 02 00 00 00 c7 05 e6 0f 00 00 02 00
dbg|0x00007f28e7a7a020: 00 00 c7 05 d8 0f 00 00 02 00 00 00 5d c3 90 90
dbg|0x00007f28e7a7a030: 55 48 89 e5 53 50 89 7d f4 e8 1e 00 00 00 8b 7d
dbg|0x00007f28e7a7a040: f4 e8 1c 00 00 00 89 c3 8b 7d f4 e8 12 00 00 00
dbg|0x00007f28e7a7a050: 0f af d8 89 d8 48 83 c4 08 5b 5d c3
dbg|----- Contents of section .got after relocations -----
dbg|0x00007f28e7a7b010: c0 16 3a e6 28 7f 00 00 99 e4 40 00 00 00 00 00
dbg|ev_end_relocations|cost:397 us
dbg|ev_start_registerEHFrames|cost:3 us
dbg|ev_end_registerEHFrames|cost:2 us
dbg|ev_end_finalizeLoadedModules|cost:14 us
dbg|ev_end_finalizeObject|cost:6 us
==============================
engine loaded
```
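R_X86_64_PC32 (variables)
All variable relocations of this type work the same way, so globalConstV0 serves as the example.
static int globalV0 asm("globalV0") = 0;
void pow3() {
    globalV0 = 2;
}
Computing the relocation
This step is similar to the absolute-address relocation.
globalConstV0 is a global const variable. Although it is initialized to 0 (a non-const variable initialized to 0 would live in .bss), it is still placed in .rodata.
The log below comes from the global symbol scan (Parse symbols): the .rodata section, located at address 0x3d0117c in the object file, is allocated the virtual memory address 0x7f0866641000.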
dbg| start emitSection: .rodata|cost:33 us
dbg| emitSection SectionID: 0 Name: .rodata obj addr: 0x3d0117c new addr: 0x7f0866641000 DataSize: 4 StubBufSize: 0 Allocate: 4
dbg| end emitSection: .rodata|cost:35 us
dbg| Type: 1 Name: globalConstV0 SID: 0 Offset: (nil) flags: 0
Next comes the relocation computation (Parse relocations): the symbol is used at Offset 22 (0x16) of SectionID 3.
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 22 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations
dbg| end processRelocationRef
What is SectionID 3? The Parse symbols log from the global symbol scan contains this:
dbg| start emitSection: .text|cost:57 us
dbg| emitSection SectionID: 3 Name: .text obj addr: 0x3d010e0 new addr: 0x7f086663f000 DataSize: 145 StubBufSize: 0 Allocate: 145
dbg| end emitSection: .text|cost:29 us
In other words, offset 0x16 of .text must be relocated to Target SectionID: 0 Offset: 0, that is .rodata+0, which is where globalConstV0 lives.
The relocation table shows the same information:
~ readelf -r a.o
Relocation section '.rela.text' at offset 0x698 contains 7 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000016 000800000001 R_X86_64_64 0000000000000000 .rodata + 0
The difference is that the new addr of SectionID: 0 is now 0x7f28e7a7c000, and the new addr of SectionID: 3 is now 0x7f28e7a7a000.
The relocation entry itself also differs: the Offset is now 16 (0x10), and the Addend is now -8 instead of 0.
dbg| start processRelocationRef for RelType: 2 Addend: -8 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 16 Target SectionID: 0 Offset: 0 Addend: -8
dbg| addRelocationForSection add Relocations, Target SectionID: 0 RelocationEntry SectionID: 3 Offset: 16 Addend: -8
dbg| end processRelocationRef
The relocation table shows the same information:
~ readelf -r a.o
Relocation section '.rela.text' at offset 0x698 contains 7 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000010 000800000002 R_X86_64_PC32 0000000000000000 .rodata - 8
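The Info column in the readelf output packs two fields into one number. Here is a minimal sketch of splitting it, assuming the standard ELF64 layout (symbol index in the upper 32 bits, relocation type in the lower 32 bits). The helper names relSym, relType and relTypeName are made up for this post and are not LLVM APIs.

```cpp
#include <cstdint>

// ELF64 r_info layout: upper 32 bits = symbol table index, lower 32 bits = type.
constexpr uint32_t relSym(uint64_t info) { return static_cast<uint32_t>(info >> 32); }
constexpr uint32_t relType(uint64_t info) { return static_cast<uint32_t>(info & 0xffffffffu); }

// Only the x86-64 relocation types that appear in this post.
constexpr const char *relTypeName(uint32_t type) {
  return type == 1 ? "R_X86_64_64"
       : type == 2 ? "R_X86_64_PC32"
       : type == 4 ? "R_X86_64_PLT32"
                   : "other";
}

// Info values copied from the readelf output above.
static_assert(relSym(0x000800000002ULL) == 8, "symbol index 8 (.rodata)");
static_assert(relType(0x000800000002ULL) == 2, "type 2 = R_X86_64_PC32");
static_assert(relType(0x000800000001ULL) == 1, "type 1 = R_X86_64_64");
```

The lower half is the same number that the processRelocationRef log prints as RelType (1 in absolute mode, 2 here).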
Writing the relocation to memory
The key information above has been stored in the inverted index: Relocations[target SectionID=0, i.e. .rodata+0].push_back(relocated SectionID=3, i.e. .text; Offset: 16 = 0x10; Addend: -8)
resolveRelocationList triggers the write to memory, but for R_X86_64_PC32 the calculation is slightly different:
void RuntimeDyldELF::resolveX86_64Relocation(const SectionEntry &Section,
                                             uint64_t Offset, uint64_t Value,
                                             uint32_t Type, int64_t Addend,
                                             uint64_t SymOffset) {
  switch (Type) {
  case ELF::R_X86_64_PC32: {
    // Load address of the 4-byte field being patched.
    uint64_t FinalAddress = Section.getLoadAddressWithOffset(Offset);
    // PC-relative value: distance from the patched field to the target.
    int64_t RealOffset = Value + Addend - FinalAddress;
    // Store the low 32 bits in little-endian order.
    support::ulittle32_t::ref(Section.getAddressWithOffset(Offset)) =
        (RealOffset & 0xFFFFFFFF);
    break;
  }
  }
}
The log:
dbg|Resolving relocations Section #0 0x7f28e7a7c000
dbg|Type R_X86_64_PC32 Writing 0x1fe8 at 0x7f28e7a7a010
Memory before the write (the first byte shown is at 0x7f28e7a7a010):
dbg|0x00007f28e7a7a010: (00 00 00 00) 02 00 00 00 c7 05 00 00 00 00 02 00
Memory after the write (the first byte shown is at 0x7f28e7a7a010):
dbg|0x00007f28e7a7a010: (e8 1f 00 00) 02 00 00 00 c7 05 e6 0f 00 00 02 00
Four bytes were written (e8 1f 00 00), which is 0x00001fe8 in little-endian order.
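To double-check the number in the log, here is a small sketch that replays the Value + Addend - FinalAddress computation with the addresses from this run. pc32Value and byteAt are hypothetical helpers written for this post, not part of RuntimeDyld.

```cpp
#include <cstdint>

// Value stored for R_X86_64_PC32: target + addend - address of the patched
// 4-byte field, truncated to 32 bits (mirrors Value + Addend - FinalAddress).
constexpr uint32_t pc32Value(uint64_t target, int64_t addend, uint64_t fieldAddr) {
  return static_cast<uint32_t>(target + static_cast<uint64_t>(addend) - fieldAddr);
}

// i-th byte of the value as it is laid out in little-endian memory.
constexpr uint8_t byteAt(uint32_t value, unsigned i) {
  return static_cast<uint8_t>(value >> (8 * i));
}

constexpr uint64_t kRodata = 0x7f28e7a7c000ULL;         // Section #0 (.rodata) load address
constexpr uint64_t kField = 0x7f28e7a7a000ULL + 0x10;   // .text load address + Offset 16
constexpr uint32_t kValue = pc32Value(kRodata, -8, kField);

static_assert(kValue == 0x1fe8, "matches 'Writing 0x1fe8 at 0x7f28e7a7a010'");
static_assert(byteAt(kValue, 0) == 0xe8 && byteAt(kValue, 1) == 0x1f,
              "memory shows e8 1f 00 00");
```

The same helper also reproduces the stub relocation that appears later in the log (target 0x7f28e7a7b010, addend -4, field at 0x7f28e7a7a05e gives 0xfae).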
Reading the relocation calculation against the assembly
The log now adds up, but it is still not obvious why R_X86_64_PC32 is computed this way. The assembly explains the principle.
objdump -d a.o
a.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <pow3>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: c7 05 00 00 00 00 02 movl $0x2,0x0(%rip) # e <pow3+0xe>
b: 00 00 00
e: c7 05 00 00 00 00 02 movl $0x2,0x0(%rip) # 18 <pow3+0x18>
15: 00 00 00
18: c7 05 00 00 00 00 02 movl $0x2,0x0(%rip) # 22 <pow3+0x22>
1f: 00 00 00
22: c7 05 00 00 00 00 02 movl $0x2,0x0(%rip) # 2c <pow3+0x2c>
29: 00 00 00
The second movl $0x2,0x0(%rip) in pow3 corresponds to globalV0 = 2; (see the appendix for the movl instruction).
According to .rela.text, the field to relocate is at offset 0x10, marked with parentheses here:
e: c7 05 (00 00 00 00) 02 movl $0x2,0x0(%rip) # 18 <pow3+0x18>
15: 00 00 00
With a RIP-relative operand, movl finds its memory operand by adding the displacement to the address of the next instruction.
The next instruction starts 8 bytes after the displacement field (4 bytes of displacement plus 4 bytes of immediate). Suppose offset 0x10 maps to the virtual address section.text base addr + 0x10.
Then for R_X86_64_PC32: section.text base addr + 0x10 + 0x8 + offset = section.rodata base addr
So offset = section.rodata base addr - 0x8 - (section.text base addr + 0x10)
Plugging in the values from the log above: 0x7f28e7a7c000 - 0x8 - 0x7f28e7a7a010 = 0x1fe8.
R_X86_64_PLT32 (functions)
PLT32 is also known as GOT-based relocation.
My earlier blog post hook 的妙用#延迟绑定plt桩代码 describes how gcc and its dynamic linker use the .got.plt section to implement lazy binding.
When LLVM's MCJIT acts as the dynamic linker, the details differ. In short, only a .got section is used, and there is no lazy binding.
All function relocations of this type work the same way, so dyFunc serves as the example.
Computing the relocation
Recall the relocation table:
~ readelf -r a.o
Relocation section '.rela.text' at offset 0x2f8 contains 7 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000003a 000b00000004 R_X86_64_PLT32 0000000000000000 dyFunc - 4
Recall the absolute-address relocation computation from earlier:
dbg| start processRelocationRef for RelType: 1 Addend: 0 TargetName:
dbg| This is section symbol
dbg| SectionID: 3 Offset: 22 Target SectionID: 0 Offset: 0 Addend: 0
dbg| addRelocationForSection add Relocations
dbg| end processRelocationRef
R_X86_64_PLT32 produces considerably more output:
dbg| start processRelocationRef for RelType: 4 Addend: -4 TargetName: dyFunc
dbg| SectionID: 3 Offset: 58 Target SectionID: 0 Offset: 0 Addend: -4
dbg| Create a new stub function
dbg| createStubFunction Writing 0xFF25 at 0x7f28e7a7a05c
dbg| Section reserved 6 bytes at 0x7f28e7a7a05c
dbg| allocateGOTEntries GOTSectionID: 4 StartOffset: 0
dbg| addRelocationForSection add Relocations, Target SectionID: 4 RelocationEntry SectionID: 3 Offset: 94 Addend: -4
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:dyFunc RelocationEntry SectionID: 4 Offset: 0 Addend: 0
dbg| Type R_X86_64_PC32 Writing 0x1e at 0x7f28e7a7a03a
dbg| end processRelocationRef
...
dbg|finalizeLoad GOTSectionID: 4 Addr: 0x7f28e7a7b010 TotalSize: 16
The corresponding code:
Expected<relocation_iterator>
RuntimeDyldELF::processRelocationRef()
  // ... omitted
  if (Arch == Triple::x86_64) {
    if (RelType == ELF::R_X86_64_PLT32) {
      if (Value.SymbolName) {
        SectionEntry &Section = Sections[SectionID];
        // Reuse the stub if one was already created for this target.
        StubMap::const_iterator i = Stubs.find(Value);
        uintptr_t StubAddress;
        if (i != Stubs.end()) {
          StubAddress = uintptr_t(Section.getAddress()) + i->second;
          LLVM_DEBUG(dbgs() << "\t\tStub function found\n");
        } else {
          LLVM_DEBUG(dbgs() << "\t\tCreate a new stub function\n");
          uintptr_t BaseAddress = uintptr_t(Section.getAddress());
          uintptr_t StubAlignment = getStubAlignment();
          // Round the current stub offset up to the stub alignment.
          StubAddress =
              (BaseAddress + Section.getStubOffset() + StubAlignment - 1) &
              -StubAlignment;
          unsigned StubOffset = StubAddress - BaseAddress;
          Stubs[Value] = StubOffset;
          createStubFunction((uint8_t *)StubAddress);
          Section.advanceStubOffset(getMaxStubSize());
          MY_DEBUG(my_dbgs() << "\t\tSection reserved " << getMaxStubSize()
                             << " bytes at " << format("%p\n", StubAddress));
          uint64_t GOTOffset = allocateGOTEntries(1);
          // Stub displacement field (stub + 2) -> GOT entry, PC-relative.
          resolveGOTOffsetRelocation(SectionID, StubOffset + 2, GOTOffset - 4,
                                     ELF::R_X86_64_PC32);
          // GOT entry -> external symbol, absolute 64-bit address.
          addRelocationForSymbol(
              computeGOTOffsetRE(GOTOffset, 0, ELF::R_X86_64_64),
              Value.SymbolName);
        }
        // The call site itself is patched immediately to point at the stub.
        resolveRelocation(Section, Offset, StubAddress, ELF::R_X86_64_PC32,
                          Addend);
      } else {
        RelocationEntry RE(SectionID, Offset, ELF::R_X86_64_PC32, Value.Addend,
                           Value.Offset);
        addRelocationForSection(RE, Value.SectionID);
      }
    }
  }
  // ... omitted
}
Now walk through the log (gdb decodes the stub instruction 0xFF25 as jmpq; see the appendix for details):
// Create a stub function
dbg| Create a new stub function
// The stub starts with 0xFF25
dbg| createStubFunction Writing 0xFF25 at 0x7f28e7a7a05c
// Reserve 6 bytes for the stub (ff 25 followed by a 4-byte displacement)
dbg| Section reserved 6 bytes at 0x7f28e7a7a05c
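In the code excerpt, the stub address is the current stub offset rounded up to the stub alignment. A minimal sketch of that rounding follows, using a hypothetical alignUp helper; the alignment values below are purely illustrative and are not a claim about what getStubAlignment() returns.

```cpp
#include <cstdint>

// Round addr up to the next multiple of align (align must be a power of two).
// For powers of two, ~(align - 1) is the same mask as -StubAlignment in the excerpt.
constexpr uint64_t alignUp(uint64_t addr, uint64_t align) {
  return (addr + align - 1) & ~(align - 1);
}

static_assert(alignUp(0x5c, 1) == 0x5c, "alignment 1 leaves the offset unchanged");
static_assert(alignUp(0x5d, 8) == 0x60, "rounded up to the next 8-byte boundary");
static_assert(alignUp(0x60, 8) == 0x60, "already aligned values stay put");
```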
Next, a GOT section is allocated (once all relocations have been processed, this section is assigned the address 0x7f28e7a7b010):
dbg| allocateGOTEntries GOTSectionID: 4 StartOffset: 0
...
dbg|finalizeLoad GOTSectionID: 4 Addr: 0x7f28e7a7b010 TotalSize: 16
Two relocations are then recorded. The first one targets the GOT section.
It goes into the inverted index: Relocations[target SectionID=4, i.e. .got+0].push_back(relocated SectionID=3, i.e. .text; Offset: 94 = 0x5E; Addend: -4)
dbg| addRelocationForSection add Relocations, Target SectionID: 4 RelocationEntry SectionID: 3 Offset: 94 Addend: -4
The second one targets dyFunc.
It goes into the inverted index: ExternalSymbolRelocations[target symbol dyFunc].push_back(relocated SectionID=4, i.e. .got; Offset: 0 = 0x0; Addend: 0)
dbg| addRelocationForSymbol add ExternalSymbolRelocations, Target SymbolName:dyFunc RelocationEntry SectionID: 4 Offset: 0 Addend: 0
Then a relocation is written to memory. It is the one being processed in this call, at Offset: 58 = 0x3a, so this relocation is resolved on the spot.
dbg| start processRelocationRef for RelType: 4 Addend: -4 TargetName: dyFunc
dbg| SectionID: 3 Offset: 58 Target SectionID: 0 Offset: 0 Addend: -4
...
dbg| Type R_X86_64_PC32 Writing 0x1e at 0x7f28e7a7a03a
Where does this relocation point? Look at the assembly:
objdump -d a.o
a.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000030 <pow4>:
30: 55 push %rbp
31: 48 89 e5 mov %rsp,%rbp
34: 53 push %rbx
35: 50 push %rax
36: 89 7d f4 mov %edi,-0xc(%rbp)
39: e8 00 00 00 00 callq 3e <pow4+0xe>
3e:
After relocation this instruction becomes e8 1e 00 00 00. The decoding of e8 is covered in the appendix; it is a call with a 32-bit relative displacement.
So it points to the .text base addr 0x7f28e7a7a000 + the next instruction at 0x3e + 0x1e = 0x7f28e7a7a05c. Does that address look familiar? It is exactly the address of the stub function created above.
As a diagram:
graph TD
subgraph TextSection[.text Section at 0x7f28e7a7a000]
CallInstruction[Address 0x7f28e7a7a039<br>callq : e8 relocated 1e 00 00 00]
StubFunction[Address 0x7f28e7a7a05c<br>Stub Function jmpq target .got+0 : ff 25 pending 00 00 00 00]
end
subgraph GOTSection[.got Section at 0x7f28e7a7b010]
GOTEntry1[Address 0x7f28e7a7b010<br>Entry 1 target dyFunc address : pending 00 00 00 00 00 00 00 00]
GOTEntry2[Entry 2 target pow2 address]
end
subgraph ExternalSymbol[External symbols]
dyFunc[dyFunc unresolved]
end
CallInstruction -- Relocation 1: PC Relative<br>.text address 0x7f28e7a7a000 + next instruction 0x3e + 0x1e = 0x7f28e7a7a05c --> StubFunction
StubFunction -- Relocation 2: PC Relative --> GOTEntry1
GOTEntry1 -- Relocation 3: Absolute --> dyFunc
Writing the relocations to memory
The log:
dbg|Resolving relocations Name: dyFunc 0x7f28e63a16c0
dbg|Type R_X86_64_64 Writing 0x7f28e63a16c0 at 0x7f28e7a7b010
dbg|Resolving relocations Section #4 0x7f28e7a7b010
dbg|Type R_X86_64_PC32 Writing 0xfae at 0x7f28e7a7a05e
As a diagram:
graph TD
subgraph TextSection[.text Section at 0x7f28e7a7a000]
CallInstruction[Address 0x7f28e7a7a039<br>callq : e8 relocated 1e 00 00 00]
StubFunction[Address 0x7f28e7a7a05c<br>Stub Function jmpq target .got+0 : ff 25 relocated ae 0f 00 00]
end
subgraph GOTSection[.got Section at 0x7f28e7a7b010]
GOTEntry1[Address 0x7f28e7a7b010<br>Entry 1 target dyFunc address : relocated c0 16 3a e6 28 7f 00 00]
GOTEntry2[Entry 2 target pow2 address]
end
subgraph ExternalSymbol[External symbols]
dyFunc[dyFunc resolved 0x7f28e63a16c0]
end
CallInstruction -- Relocation 1: PC Relative<br>.text address 0x7f28e7a7a000 + next instruction 0x3e + 0x1e = 0x7f28e7a7a05c --> StubFunction
StubFunction -- Relocation 2: PC Relative<br>.text address 0x7f28e7a7a000 + next instruction 0x62 + 0xfae = 0x7f28e7a7b010 --> GOTEntry1
GOTEntry1 -- Relocation 3: Absolute<br>address 0x7f28e63a16c0 --> dyFunc
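The two PC-relative hops in the diagram can be replayed with a few lines of arithmetic. This is only a sketch with a hypothetical relTarget helper; the instruction lengths (5 bytes for e8 + rel32, 6 bytes for ff 25 + disp32) are read off the byte dumps above.

```cpp
#include <cstdint>

// Target of a rel32 operand: address of the next instruction plus the
// sign-extended 32-bit displacement.
constexpr uint64_t relTarget(uint64_t nextInsn, int32_t disp) {
  return nextInsn + static_cast<uint64_t>(static_cast<int64_t>(disp));
}

constexpr uint64_t kText = 0x7f28e7a7a000ULL;  // .text load address from the log

// callq at .text+0x39 is 5 bytes long, so the next instruction is at +0x3e.
constexpr uint64_t kStub = relTarget(kText + 0x3e, 0x1e);
// The stub at .text+0x5c is 6 bytes long, so the next instruction is at +0x62.
constexpr uint64_t kGotEntry = relTarget(kText + 0x62, 0xfae);

static_assert(kStub == 0x7f28e7a7a05cULL, "the call lands on the stub");
static_assert(kGotEntry == 0x7f28e7a7b010ULL, "the stub reads the first .got entry");
```

The last hop is not PC-relative: the 8 bytes stored in that .got entry are the absolute address of dyFunc.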
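Summary
After the object file has been set up through setObjectCache, MCJIT's finalizeObject() processes one object file at a time (it can be called multiple times). Most of its work is address relocation.
First, the relocations are computed:
  The symbols of global functions and variables are collected into GlobalSymbolTable.
  The relocation entries are computed:
    If the target is a symbol:
      If it is found in GlobalSymbolTable, it is an internal symbol, and an entry is added to the section-based inverted index: Relocations[target symbol's SectionID].push_back(relocated SectionID, Offset, Addend, etc.)
      If it is not found, it is an external symbol, and an entry is added to the inverted index keyed by external symbol name: ExternalSymbolRelocations[target symbol name].push_back(relocated SectionID, Offset, Addend, etc.)
    If the target is a section, an entry is added directly to the section-based inverted index: Relocations[target SectionID].push_back(relocated SectionID, Offset, Addend, etc.)
Then the relocations are written to memory:
  The results of the previous step live in four data structures:
    MCJIT's EEState.getGlobalAddressMap(): the symbol table imported explicitly through the addGlobalMapping() interface
    RuntimeDyldELF's GlobalSymbolTable: the symbol table of global functions and variables
    RuntimeDyldELF's ExternalSymbolRelocations: the inverted index keyed by external symbol. In this step its targets are looked up through EEState.getGlobalAddressMap() and dlsym, and its entries are written to memory
    RuntimeDyldELF's Relocations: the section-based inverted index. In this step its entries are written to memory
If relocation succeeds, the desired symbols can be fetched through the MCJIT interface and the functions called.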
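The two inverted indexes are easier to picture as plain containers. Below is a deliberately simplified sketch: RelocEntry, RelocIndex and buildExample are hypothetical names for this post, and the real RuntimeDyld types carry more fields than shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a relocation entry: where the bytes to patch live.
struct RelocEntry {
  unsigned sectionID;  // section containing the field to patch
  uint64_t offset;     // offset of that field inside the section
  int64_t addend;
};

struct RelocIndex {
  std::map<std::string, unsigned> globalSymbols;                  // name -> SectionID
  std::map<unsigned, std::vector<RelocEntry>> bySection;          // plays the role of Relocations
  std::map<std::string, std::vector<RelocEntry>> byExternalName;  // plays the role of ExternalSymbolRelocations

  void addForSection(unsigned targetSection, const RelocEntry &re) {
    bySection[targetSection].push_back(re);
  }
  // Internal symbols are indexed by their section, unknown ones by name.
  void addForSymbol(const std::string &name, const RelocEntry &re) {
    auto it = globalSymbols.find(name);
    if (it != globalSymbols.end())
      bySection[it->second].push_back(re);
    else
      byExternalName[name].push_back(re);
  }
};

// Replays two entries from the logs in this post.
inline RelocIndex buildExample() {
  RelocIndex idx;
  idx.addForSection(0, RelocEntry{3, 16, -8});      // .text+0x10 -> .rodata
  idx.addForSymbol("dyFunc", RelocEntry{4, 0, 0});  // .got+0 -> dyFunc
  assert(idx.bySection[0].size() == 1);
  assert(idx.byExternalName.count("dyFunc") == 1);
  return idx;
}
```

Keying by target rather than by patched location means that once a target's final address is known, every field that depends on it can be patched in one pass over a single list.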
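Appendix
Assembly instructions
C7 05 (movl)
objdump labels this instruction movl. That is the AT&T-syntax spelling of MOV with a size suffix, so searching the x86 instruction reference at https://shell-storm.org/x86doc/ for movl finds nothing.
The only instructions whose encoding starts with C7 are:
MOV r/m16, imm16 | C7 /0 iw | Move imm16 to r/m16.
MOV r/m32, imm32 | C7 /0 id | Move imm32 to r/m32.
MOV r/m64, imm32 | REX.W + C7 /0 io | Move imm32 sign extended to 64-bits to r/m64.
XBEGIN rel16 | C7 F8 | rtm
XBEGIN rel32 | C7 F8 | rtm
It is clearly not C7 F8, so what is the /0 in C7 /0? According to the references I found, it is the opcode extension carried in the ModR/M byte. So what is that?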
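Instruction format
The reference is the Intel® 64 and IA-32 Architectures Software Developer's Manuals at https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
Volume 2, CHAPTER 2 INSTRUCTION FORMAT, contains a diagram of the instruction layout.
An x86 instruction consists of:
Instruction Prefixes
  Four groups:
    Lock and repeat prefixes
    Segment override prefixes
    Operand-size override prefix
    Address-size override prefix
  In x64 mode there is additionally the REX prefix
Opcode (the only mandatory field)
ModR/M: three fields, Mod, Reg/Opcode and R/M
SIB: supplements the addressing mode defined by ModR/M with scaled-index addressing
Displacement
Immediate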
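Instruction Prefixes
The most important ones are the operand-size and address-size override prefixes. Section 3.6.1 of Intel manual Volume 1, Operand Size and Address Size in 64-Bit Mode, says:
In 64-bit mode, the default address size is 64 bits and the default operand size is 32 bits. Defaults can be overridden using prefixes. Address-size and operand-size prefixes allow mixing of 32/64-bit data and 32/64-bit addresses on an instruction-by-instruction basis. Table 3-4 shows valid combinations of the 66H instruction prefix and the REX.W prefix that may be used to specify operand-size overrides in 64-bit mode. Note that 16-bit addresses are not supported in 64-bit mode.
REX prefixes consist of 4-bit fields that form 16 different values. The W-bit field in the REX prefixes is referred to as REX.W. If the REX.W field is properly set, the prefix specifies an operand size override to 64 bits. Note that software can still use the operand-size 66H prefix to toggle to a 16-bit operand size. However, setting REX.W takes precedence over the operand-size prefix (66H) when both are used.
REX.W | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1
Operand-size prefix 66H | N | N | Y | Y | N | N | Y | Y
Address-size prefix 67H | N | Y | N | Y | N | Y | N | Y
Effective operand size | 32 | 32 | 16 | 16 | 64 | 64 | 64 | 64
Effective address size | 64 | 32 | 64 | 32 | 64 | 32 | 64 | 32
Notes:
REX.W is shown as 0/1; for 66H/67H, N/Y indicates whether the prefix is present.
When REX.W=1 and 66H=Y are both present, REX.W takes precedence for the operand size (the effective operand size stays 64 bits).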
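The table collapses into two tiny functions. This is a sketch for 64-bit mode only, with hypothetical helper names; it encodes exactly the eight columns above and nothing more.

```cpp
// Effective operand size in 64-bit mode: REX.W wins over 66H,
// 66H alone selects 16 bits, otherwise the default is 32 bits.
constexpr int effectiveOperandSize(bool rexW, bool has66) {
  return rexW ? 64 : (has66 ? 16 : 32);
}

// Effective address size in 64-bit mode: 67H switches from 64 to 32 bits.
constexpr int effectiveAddressSize(bool has67) {
  return has67 ? 32 : 64;
}

static_assert(effectiveOperandSize(false, false) == 32, "default operand size");
static_assert(effectiveOperandSize(false, true) == 16, "66H prefix");
static_assert(effectiveOperandSize(true, true) == 64, "REX.W takes precedence");
static_assert(effectiveAddressSize(true) == 32, "67H prefix");
```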
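ModR/M
ModR/M can be decoded by table lookup (the table is in Intel manual Volume 2, section 2.1.5, Addressing-Mode Encoding of ModR/M and SIB Bytes).
Notes on the table:
3.1) [--][--]: a SIB byte is used.
3.2) disp32: a 32-bit displacement.
3.3) [--][--]+disp8: a SIB byte is used, followed by an 8-bit displacement.
3.4) [--][--]+disp32: a SIB byte is used, followed by a 32-bit displacement.
Mod
According to the table, this field is 2 bits wide:
00 | [base] | memory addressing of the form [base]
01 | [base + disp8] | memory addressing of the form [base + disp8]
10 | [base + disp32] | memory addressing of the form [base + disp32]
11 | register | register addressing
Reg/Opcode
According to the table, this field is 3 bits wide and carries the opcode extension (for example, FF /0 through FF /6 are seven instructions that share the opcode FF and are told apart by /0, ..., /6).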
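R/M
According to the table, this field is 3 bits wide and selects which register is used.
Back to the C7 /0 instruction:
C7 is the Opcode.
/0 is the Reg/Opcode field of ModR/M, which selects the first column of the table (opcode extension 0).
05 is binary 00000101: Mod = 00, Reg/Opcode = 000 (the first column), R/M = 101.
Since there is no prefix, C7 05 00 00 00 00 is followed by a 32-bit (4-byte) immediate by default, and the whole instruction is 10 bytes long:
C7 : MOV
05 : 32-bit displacement relative to the next instruction
00 00 00 00 : the 32-bit displacement relative to the next instruction
00 00 00 00 : the 32-bit immediate
So the full meaning is: store bytes 7 to 10 (the immediate) into the memory located at the address of the next instruction plus the displacement held in bytes 3 to 6.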
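Splitting a ModR/M byte by hand is error-prone, so here is a minimal sketch that does it with shifts and masks, assuming the usual 2/3/3 bit layout described above. ModRM and splitModRM are hypothetical names for this post.

```cpp
#include <cstdint>

// The three ModR/M fields, from most significant to least significant bits.
struct ModRM {
  unsigned mod;  // bits 7..6
  unsigned reg;  // bits 5..3 (register or opcode extension, the /digit)
  unsigned rm;   // bits 2..0
};

constexpr ModRM splitModRM(uint8_t b) {
  return ModRM{(b >> 6) & 3u, (b >> 3) & 7u, b & 7u};
}

// Bytes that appear in this post.
static_assert(splitModRM(0x05).mod == 0 && splitModRM(0x05).reg == 0 &&
              splitModRM(0x05).rm == 5, "C7 05: /0 with disp32");
static_assert(splitModRM(0x25).mod == 0 && splitModRM(0x25).reg == 4 &&
              splitModRM(0x25).rm == 5, "FF 25: /4 with disp32");
static_assert(splitModRM(0x45).mod == 1 && splitModRM(0x45).rm == 5,
              "C7 45: [rbp]+disp8");
```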
Adding prefixes changes the meaning. Take this source, for example:
#include <stdio.h>
#include <stdint.h>

int main() {
    uint16_t val16;
    uint32_t val32;
    uint64_t val64;

    /* 16-bit store: emits the 66 operand-size prefix */
    __asm__ volatile (
        "movw $0x1234, %0"
        : "=m" (val16)
    );

    /* 32-bit store: default operand size, no prefix */
    __asm__ volatile (
        "movl $0x12345678, %0"
        : "=m" (val32)
    );

    /* 64-bit store through rax: emits the REX.W prefix */
    __asm__ volatile (
        "movq $-1, %%rax\n\t"
        "movq %%rax, %0"
        : "=m" (val64)
        :
        : "rax"
    );

    printf("val16 = 0x%04x\n", val16);
    printf("val32 = 0x%08x\n", val32);
    printf("val64 = 0x%016llx\n", (unsigned long long)val64);
    return 0;
}
Output:
~ gcc -m64 -O0 asm_c7.cpp -o asm_c7
~ ./asm_c7
val16 = 0x1234
val32 = 0x12345678
val64 = 0xffffffffffffffff
Disassembling with objdump -d asm_c7 gives:
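66 c7 45 ea 34 12 movw $0x1234,-0x16(%rbp)
The 66 prefix makes the immediate 0x1234 16 bits wide
There is no 67 prefix, so the address size is the default 64 bits and rbp is used rather than ebp
45 means [ebp(rbp)]+disp8
ea is -0x16
So 0x1234 is stored to the stack variable at rbp - 0x16 (val16)
c7 45 ec 78 56 34 12 movl $0x12345678,-0x14(%rbp)
There is no 66 prefix, so the immediate 0x12345678 has the default width of 32 bits
There is no 67 prefix, so the address size is the default 64 bits and rbp is used rather than ebp
45 means [ebp(rbp)]+disp8
ec is -0x14
So 0x12345678 is stored to the stack variable at rbp - 0x14 (val32)
48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
48 89 45 f0 mov %rax,-0x10(%rbp)
The 48 prefix is a REX prefix, encoded as 01001000: 0100 is fixed, and the leading 1 of 1000 is the W bit, which selects a 64-bit operand size
Per the documentation for REX.W + C7 /0 (Move imm32 sign extended to 64-bits to r/m64), the 32-bit value ffffffff is sign-extended and used as the 64-bit value 0xffffffffffffffff
There is no 67 prefix, so the address size is the default 64 bits, and the second instruction addresses memory through rbp rather than ebp
c0 selects the rax register
So 0xffffffffffffffff is stored into rax, and rax is then stored to the stack variable at rbp - 0x10 (val64)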
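The "imm32 sign extended to 64-bits" rule is what turns the four bytes ff ff ff ff into 0xffffffffffffffff. A minimal sketch of that widening, with a hypothetical signExtendImm32 helper (it relies on the usual two's-complement conversion from uint32_t to int32_t):

```cpp
#include <cstdint>

// Widen a 32-bit immediate to 64 bits the way REX.W + C7 /0 does:
// reinterpret as signed, then sign-extend.
constexpr uint64_t signExtendImm32(uint32_t imm) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(imm)));
}

static_assert(signExtendImm32(0xffffffffu) == 0xffffffffffffffffULL,
              "ff ff ff ff becomes -1 as a 64-bit value");
static_assert(signExtendImm32(0x12345678u) == 0x12345678ULL,
              "positive immediates are unchanged");
```

This is also why this encoding cannot express an arbitrary 64-bit constant: only values whose upper 33 bits are all equal fit.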
References
The article Encoding x86-64 instructions: some worked examples walks through many examples of the REX prefix and ModR/M
It provides a shell script for checking instruction encodings:
#!/bin/sh
objdump -D -b binary -mi386:x86-64 -M intel "$@"
For example:
$ printf '66 c7 05 02 00 00 00 08 00 00 00' | xxd -r -ps > foo.bin
$ ./disasm.sh foo.bin
foo.bin: file format binary
Disassembly of section .data:
0000000000000000 <.data>:
0: 66 c7 05 02 00 00 00 mov WORD PTR [rip+0x2],0x8 # 0xb
7: 08 00
...
The 66 prefix makes the immediate 0x0008 16 bits wide
05 means the value is written to memory at a 32-bit displacement of 0x2 from the next instruction
The whole instruction is 9 bytes long; the 2 extra bytes after 08 00 are shown as ... and are not decoded
Adding a 67 prefix switches to 32-bit addressing, so eip is used instead of rip:
~ printf '67 66 c7 05 02 00 00 00 08 00 00 00' | xxd -r -ps > foo.bin
~ ./disasm.sh foo.bin
foo.bin: file format binary
Disassembly of section .data:
0000000000000000 <.data>:
0: 67 66 c7 05 02 00 00 mov WORD PTR [eip+0x2],0x8 # 0xc
7: 00 08 00
...
This Q&A discusses the 64-bit default address size and the 67 prefix: https://stackoverflow.com/questions/57840400/x86-64-encoding-for-mov-instruction-weird-case
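FF 25 (jmpq)
This instruction is similar. Searching https://shell-storm.org/x86doc/ for instructions starting with FF shows that, unlike C7, seven instructions take the form FF /digit, from FF /0 through FF /6.
Since this is also a ModR/M encoding, look it up in the table directly.
25 is binary 00100101 (Mod = 00, Reg/Opcode = 100, R/M = 101), so the table gives FF /4 in the fifth column, with the addressing form disp32.
Searching https://shell-storm.org/x86doc/ gives:
JMP r/m16 | FF /4 | Jump near, absolute indirect, address = zero-extended r/m16. Not supported in 64-bit mode.
JMP r/m32 | FF /4 | Jump near, absolute indirect, address given in r/m32. Not supported in 64-bit mode.
JMP r/m64 | FF /4 | Jump near, absolute indirect, RIP = 64-Bit offset from register or memory
FF = JMP, 25 = 32-bit displacement relative to the next instruction
In 64-bit mode this instruction supports neither r/m16 nor r/m32, only r/m64, so once the memory location or register is determined, 8 bytes are read from it and used as the jump target
So the full meaning is: take the address at a 32-bit displacement from the next instruction, read 8 bytes from it, and jump to that value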
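To see what the jump actually reads, the 16 bytes of the .got dump after relocation can be decoded as two little-endian 64-bit values. readLE64 and kGot are hypothetical names for this sketch; the bytes are copied from the "Contents of section .got after relocations" log line.

```cpp
#include <cstdint>

// Assemble 8 bytes stored in little-endian order into a 64-bit value.
constexpr uint64_t readLE64(const uint8_t *p) {
  return (static_cast<uint64_t>(p[0])) | (static_cast<uint64_t>(p[1]) << 8) |
         (static_cast<uint64_t>(p[2]) << 16) | (static_cast<uint64_t>(p[3]) << 24) |
         (static_cast<uint64_t>(p[4]) << 32) | (static_cast<uint64_t>(p[5]) << 40) |
         (static_cast<uint64_t>(p[6]) << 48) | (static_cast<uint64_t>(p[7]) << 56);
}

// .got contents after relocation, starting at 0x7f28e7a7b010.
constexpr uint8_t kGot[16] = {0xc0, 0x16, 0x3a, 0xe6, 0x28, 0x7f, 0x00, 0x00,
                              0x99, 0xe4, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00};

static_assert(readLE64(kGot) == 0x7f28e63a16c0ULL, "entry 1 holds the address of dyFunc");
static_assert(readLE64(kGot + 8) == 0x40e499ULL, "entry 2 holds the address of pow2");
```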
E8 (callq)
Searching https://shell-storm.org/x86doc/ gives:
CALL rel16 | E8 cw | Call near, relative, displacement relative to next instruction.
CALL rel32 | E8 cd | Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.
This encoding has no ModR/M byte, so without instruction prefixes the full meaning is: E8 = CALL with a 32-bit displacement relative to the next instruction
Known bug in LLVM 7.1.0
According to https://discourse.llvm.org/t/emulated-tls-on-x86-64-linux-with-the-jit-engine/44208 and https://github.com/llvm/llvm-project/issues/3879
LLVM's JIT does not support the standard TLS implementation. For this code:
extern "C" {
static thread_local int tlsValue asm("tlsValue") = 0;
void pow3() {
    tlsValue = 2;
}
}
In absolute-address mode, the compiled relocation is:
~ readelf -r a.o
Relocation section '.rela.text' at offset 0x3a0 contains 10 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000006f 000600000017 R_X86_64_TPOFF32 0000000000000000 tlsValue + 0
In relative-address mode, the compiled relocation is:
~ readelf -r a.o
Relocation section '.rela.text' at offset 0x3a0 contains 10 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000053 000600000015 R_X86_64_DTPOFF32 0000000000000000 tlsValue + 0
Either mode fails with:
dbg|Relocation type not implemented yet!
dbg|UNREACHABLE executed at /root/llvm7_test/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:270!
Aborted
Because LLVM 7.1.0 does not implement the standard TLS relocations, -emulated-tls has to be specified when compiling to an object file, as follows:
clang++ -m64 -g -std=c++0x -fno-use-cxa-atexit -fnon-call-exceptions -c -emit-llvm a.cc
llvm-dis a.bc
# absolute-address mode
llc -filetype=obj -code-model=large a.bc -o a.o -emulated-tls
The relocation table then contains the following symbol, with type R_X86_64_64:
000000000070 001000000001 R_X86_64_64 0000000000000000 __emutls_get_address + 0
The section table also shows that there is no .tbss section. This is an emulation of TLS and is not fully compatible with it. With std::async, for example, loading fails with missing symbols:
Not found symbols:__emutls_v._ZSt11__once_call, __emutls_v._ZSt15__once_callable
Standard TLS is only supported starting with https://github.com/llvm/llvm-project/commit/a0a5964499816373c50d6d6a3a4b38c1b53f6714 in LLVM 14.0.0
References