iOS AOP Techniques
Aspect-Oriented Programming (AOP) has been a popular topic in recent years. Its promoters claim that the technique effectively mitigates two challenges that come with modern object-oriented programming: entanglement and dispersion.
- Entanglement: happens when a business module implements cross-cutting concerns that span multiple use cases. For example, a module may provide both networking and logging capabilities, which pulls it away from its original design intent.
- Dispersion: happens when the calls to such modules are spread throughout the entire application.
By adopting AOP, the developer no longer has to deliberately insert calls to those modules into the body of the business code. Instead, by declaring suitable insertion points, the relevant aspect invocations are "woven" into the code as it executes. Some of the core concepts are listed below:
- JoinPoint: Defines a point in the executing code where an aspect invocation can be inserted.
- PointCut: Defines how join points are recognised and how the aspects are inserted at them.
- Advice: Defines the implementation of the aspect invocation to be inserted.
- Weaving: Describes the overall process of identifying join points, inserting advice, and forming the final code. Weaving can happen either at compile time or dynamically at runtime.
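To make the four terms concrete, here is a minimal sketch in plain C. It is not taken from any AOP framework; every name in it (`charge`, `charge_with_logging`, `weave_logging`, `log_count`) is hypothetical. The join point is a call made through a replaceable function pointer, the advice is a logging wrapper, and weaving is done at runtime by redirecting the pointer. The pointcut is simply hard-coded here: "the `charge` slot".

```c
#include <stdio.h>

/* Join point: callers always go through this function pointer type. */
typedef int (*charge_fn)(int amount);

/* Business logic only; it knows nothing about logging. */
static int charge_impl(int amount) { return amount * 2; }

/* The slot callers use. This is where an aspect can be woven in. */
charge_fn charge = charge_impl;

/* State owned by the aspect (hypothetical names). */
static charge_fn original_charge = NULL;
int log_count = 0;

/* Advice: the cross-cutting behaviour that runs before the original call. */
static int charge_with_logging(int amount) {
    log_count++;
    printf("[log] charge(%d)\n", amount);
    return original_charge(amount);
}

/* Weaving at runtime: remember the original, then redirect the slot.
   Calling it twice is harmless because the slot is checked first. */
void weave_logging(void) {
    if (charge != charge_with_logging) {
        original_charge = charge;
        charge = charge_with_logging;
    }
}
```

Code that calls `charge(21)` gets the same result before and after `weave_logging()`; the only difference is that the logging advice now runs first. fishhook and Aspects, discussed below, apply the same idea to symbol pointers and to ObjC methods respectively.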
In the world of iOS programming, ObjC and Swift rely on different techniques to achieve these concepts. For Swift, because of its strong static typing, extensions and property wrappers are among the limited choices. For ObjC, thanks to its powerful runtime, the ways to achieve AOP are more versatile.
This article discusses two typical frameworks/techniques that can be used in ObjC projects: fishhook and Aspects.
2. Techniques
2.1 Fishhook
Fishhook is built on top of dyld (the dynamic linker), which is responsible for loading the app's Mach-O file and mapping its segments into memory. The library uses the following code to install its hooks:
// If this was the first call,
// register callback for image additions
// which is also invoked for existing images, otherwise,
// just run on existing images
if (!_rebindings_head->next) {
  // first call: dyld will invoke the callback for every loaded and future image
  _dyld_register_func_for_add_image(
    _rebind_symbols_for_image
  );
} else {
  // callback already registered: rebind the pointers in the images loaded so far
  uint32_t c = _dyld_image_count();
  for (uint32_t i = 0; i < c; i++) {
    _rebind_symbols_for_image(
      _dyld_get_image_header(i),
      _dyld_get_image_vmaddr_slide(i)
    );
  }
}
Either way, the function _rebind_symbols_for_image ends up being invoked for each Mach-O image: for every image that is already loaded and, through the dyld callback, for every image added later. The mach_header is the header of the Mach-O file; it gives the number and total size of the load commands that follow it, and those load commands describe the segments and sections of the file. The slide is the virtual memory address offset of the Mach-O image, which is used to calculate the actual addresses of the symbols in the image.
/*
  Refer to the mach header structure:
  The 32-bit mach header appears at the very beginning of the object file for 32-bit architectures.
*/
struct mach_header {
  uint32_t magic;       /* mach magic number identifier */
  int32_t cputype;      /* cpu specifier */
  int32_t cpusubtype;   /* machine specifier */
  uint32_t filetype;    /* type of file */
  uint32_t ncmds;       /* number of load commands */
  uint32_t sizeofcmds;  /* the size of all the load commands */
  uint32_t flags;       /* flags */
};

static void _rebind_symbols_for_image(const struct mach_header *header,
                                      intptr_t slide) {
  rebind_symbols_for_image(_rebindings_head, header, slide);
}
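The `!_rebindings_head->next` check in the registration code makes sense if every request to rebind symbols pushes a new entry onto the front of a linked list: a list with exactly one entry then means "this was the first call". The following is a minimal sketch of that bookkeeping under this assumption. The struct layouts are modelled on the field names visible in the excerpts (`rebindings`, `rebindings_nel`, `next`, `name`, `replacement`, `replaced`), and `is_first_registration` is a hypothetical helper added only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins modelled on the names used in the excerpts. */
struct rebinding {
    const char *name;    /* symbol to hook */
    void *replacement;   /* new implementation */
    void **replaced;     /* out: where to store the original */
};

struct rebindings_entry {
    struct rebinding *rebindings;
    size_t rebindings_nel;
    struct rebindings_entry *next;
};

/* Copy the caller's array and push it onto the front of the list.
   Returns 0 on success, -1 on failure. */
int prepend_rebindings(struct rebindings_entry **head,
                       const struct rebinding rebindings[], size_t nel) {
    if (nel == 0) return -1;
    struct rebindings_entry *entry = malloc(sizeof *entry);
    if (!entry) return -1;
    entry->rebindings = malloc(sizeof(struct rebinding) * nel);
    if (!entry->rebindings) {
        free(entry);
        return -1;
    }
    memcpy(entry->rebindings, rebindings, sizeof(struct rebinding) * nel);
    entry->rebindings_nel = nel;
    entry->next = *head;   /* older requests stay reachable behind the new one */
    *head = entry;
    return 0;
}

/* Hypothetical helper: true when the list holds exactly one entry. */
int is_first_registration(const struct rebindings_entry *head) {
    return head != NULL && head->next == NULL;
}
```

Because new entries go to the front, a later rebinding for the same symbol is found before an earlier one when the list is walked from the head.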
Now let's take a deeper look at the function rebind_symbols_for_image, which is the core of the fishhook library. The function is responsible for scanning the Mach-O file's segments and sections, and then rebinding the symbols based on the rebindings provided by the user.
static void rebind_symbols_for_image(struct rebindings_entry *rebindings,
                                     const struct mach_header *header,
                                     intptr_t slide) {
  Dl_info info;
  if (dladdr(header, &info) == 0) {
    return;
  }

  segment_command_t *cur_seg_cmd;
  segment_command_t *linkedit_segment = NULL;
  struct symtab_command* symtab_cmd = NULL;
  struct dysymtab_command* dysymtab_cmd = NULL;

  uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t);
  for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
      if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
        linkedit_segment = cur_seg_cmd;
      }
    } else if (cur_seg_cmd->cmd == LC_SYMTAB) {
      symtab_cmd = (struct symtab_command*)cur_seg_cmd;
    } else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
      dysymtab_cmd = (struct dysymtab_command*)cur_seg_cmd;
    }
  }

  if (!symtab_cmd || !dysymtab_cmd || !linkedit_segment ||
      !dysymtab_cmd->nindirectsyms) {
    return;
  }

  // Find base symbol/string table addresses
  uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
  nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
  char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);

  // Get indirect symbol table (array of uint32_t indices into symbol table)
  uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);

  // ....
}
In the above function, the symtab, dysymtab, and linkedit_segment are extracted from the Mach-O file's load commands. The symtab is the symbol table, which contains the symbols defined or referenced in the Mach-O file. The dysymtab is the dynamic symbol table information used by the dynamic linker; among other things it gives the location of the indirect symbol table. The linkedit_segment is the segment that contains the link-edit information, which includes the symbol table, the string table, and the indirect symbol table.
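The `linkedit_base` arithmetic is easy to misread. `symoff`, `stroff`, and `indirectsymoff` are offsets from the start of the file, not from the start of the link-edit segment, so the code first computes the runtime address that corresponds to file offset 0 and only then adds the table offsets. The sketch below reproduces that arithmetic with simplified, hypothetical structs (`fake_linkedit` is not a Mach-O type) and made-up numbers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two segment fields the calculation needs. */
struct fake_linkedit {
    uint64_t vmaddr;   /* address of the segment as recorded in the file */
    uint64_t fileoff;  /* offset of the segment from the start of the file */
};

/* Runtime address that corresponds to file offset 0 for this image. */
uintptr_t linkedit_base(intptr_t slide, const struct fake_linkedit *seg) {
    return (uintptr_t)slide + (uintptr_t)seg->vmaddr - (uintptr_t)seg->fileoff;
}

/* Runtime address of a table given its offset from the start of the file. */
uintptr_t table_address(uintptr_t base, uint32_t file_offset) {
    return base + file_offset;
}
```

With a slide of 0x4000, a segment recorded at vmaddr 0x20000, and fileoff 0x8000, the base is 0x4000 + 0x20000 - 0x8000 = 0x1C000. A symbol table at file offset 0x8100 (that is, 0x100 bytes into the segment) then resolves to 0x24100, which is exactly slide + vmaddr + 0x100.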
For each segment named SEG_DATA or SEG_DATA_CONST (the data segments, which hold the lazy and non-lazy symbol pointer sections, rather than the executable code), scan its sections via:
cur = (uintptr_t)header + sizeof(mach_header_t);
for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
  cur_seg_cmd = (segment_command_t *)cur;
  if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
    // skip every segment that is neither `DATA` nor `DATA_CONST`
    if (strcmp(cur_seg_cmd->segname, SEG_DATA) != 0 &&
        strcmp(cur_seg_cmd->segname, SEG_DATA_CONST) != 0) {
      continue;
    }
    // iterate through the sections of the current data segment
    for (uint j = 0; j < cur_seg_cmd->nsects; j++) {
      section_t *sect = (section_t *)(cur + sizeof(segment_command_t)) + j;
      // scanning for lazy symbol pointers
      if ((sect->flags & SECTION_TYPE) == S_LAZY_SYMBOL_POINTERS) {
        perform_rebinding_with_section(
          rebindings, sect,
          slide, symtab,
          strtab, indirect_symtab
        );
      }
      // scanning for non-lazy symbol pointers
      if ((sect->flags & SECTION_TYPE) == S_NON_LAZY_SYMBOL_POINTERS) {
        perform_rebinding_with_section(
          rebindings, sect,
          slide, symtab,
          strtab, indirect_symtab
        );
      }
    }
  }
}
Now let's take a look at the final step, the function pointer rebinding. For each pointer slot in the section, use the section's starting index in the indirect symbol table (reserved1) to find the slot's symbol table index, and from that symbol table entry derive the symbol's name in the string table. The symbol name is compared against each rebinding target; if there is a match, the slot's function pointer is replaced with the replacement function pointer.
uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
// ---------------------------------------------------------------------------------------
// For every pointer in an S_LAZY_SYMBOL_POINTERS / S_NON_LAZY_SYMBOL_POINTERS section:
//
// 1. Walk the section and examine each pointer slot in it.
// 2. Compute the slot's index in the indirect symbol table.
// 3. Use the value stored at that index as the index into the symbol table.
// 4. Use the symbol table entry to get the symbol's offset in the string table.
// 5. Compare the symbol's name with every rebinding in the current rebindings;
//    on a match, continue with step 6.
// 6. Request a change to the memory protection of `indirect_symbol_bindings`; if that
//    succeeds, replace `indirect_symbol_bindings[i]` with `cur->rebindings[j].replacement`.
// ---------------------------------------------------------------------------------------
for (uint i = 0; i < section->size / sizeof(void *); i++) {
  uint32_t symtab_index = indirect_symbol_indices[i];
  if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
      symtab_index == (INDIRECT_SYMBOL_LOCAL | INDIRECT_SYMBOL_ABS)) {
    continue;
  }
  uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
  char *symbol_name = strtab + strtab_offset;
  bool symbol_name_longer_than_1 = symbol_name[0] && symbol_name[1];
  struct rebindings_entry *cur = rebindings;
  while (cur) {
    for (uint j = 0; j < cur->rebindings_nel; j++) {
      if (symbol_name_longer_than_1
          && strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
        kern_return_t err;

        if (cur->rebindings[j].replaced != NULL
            && indirect_symbol_bindings[i] != cur->rebindings[j].replacement)
          *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];

        /**
         * 1. Moved the vm protection modifying codes to here to reduce the
         *    changing scope.
         * 2. Adding VM_PROT_WRITE mode unconditionally because vm_region
         *    API on some iOS/Mac reports mismatch vm protection attributes.
         * -- Lianfu Hao Jun 16th, 2021
         **/
        err = vm_protect (
          mach_task_self (),
          (uintptr_t)indirect_symbol_bindings,
          section->size,
          0,
          VM_PROT_READ | VM_PROT_WRITE | VM_PROT_COPY
        );
        if (err == KERN_SUCCESS) {
          /**
           * Once we failed to change the vm protection, we
           * MUST NOT continue the following write actions!
           * iOS 15 has corrected the const segments prot.
           * -- Lionfore Hao Jun 11th, 2021
           **/
          indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
        }
        goto symbol_loop;
      }
    }
    cur = cur->next;
  }
symbol_loop:;
}
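The chain of lookups above (pointer slot, indirect symbol table, symbol table, string table) can be followed more easily on a toy model. The sketch below is not fishhook code: `fake_nlist`, `FAKE_INDIRECT_LOCAL`, and `rebind_by_name` are hypothetical, the tables are plain arrays instead of Mach-O data, and the memory protection step is left out because the arrays are ordinary writable memory.

```c
#include <stdint.h>
#include <string.h>

/* Simplified symbol table entry: only the string table offset is modelled. */
struct fake_nlist { uint32_t n_strx; };

/* Placeholder marking indirect entries that have no symbol to look up. */
#define FAKE_INDIRECT_LOCAL 0x80000000u

/* Replace every slot whose symbol name, minus the leading '_', equals `name`.
   The previous pointer is stored through `replaced` when it is non-NULL.
   Returns the number of slots rewritten. */
int rebind_by_name(void **slots, size_t nslots,
                   const uint32_t *indirect_indices,
                   const struct fake_nlist *symtab,
                   const char *strtab,
                   const char *name, void *replacement, void **replaced) {
    int hits = 0;
    for (size_t i = 0; i < nslots; i++) {
        /* slot index -> symbol table index */
        uint32_t symtab_index = indirect_indices[i];
        if (symtab_index == FAKE_INDIRECT_LOCAL) continue;
        /* symbol table index -> name in the string table */
        const char *symbol_name = strtab + symtab[symtab_index].n_strx;
        if (symbol_name[0] == '\0' || symbol_name[1] == '\0') continue;
        /* skip the leading underscore before comparing */
        if (strcmp(symbol_name + 1, name) != 0) continue;
        if (replaced != NULL && slots[i] != replacement) *replaced = slots[i];
        slots[i] = replacement;
        hits++;
    }
    return hits;
}
```

As in the real code, the original pointer is saved before the slot is overwritten, so the replacement can still call through to the implementation it displaced.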
The fishhook README gives a more abstract description of the process above.
2.2 Aspects
Aspects utilizes ObjC message forwarding to encapsulate the "cross-cutting" concerns. It is similar to the OCMock framework in this respect, and compared to fishhook its internal implementation is more straightforward.
static id aspect_add(id self, SEL selector, AspectOptions options, id block, NSError * __autoreleasing *error) {
    // parameter check
    NSCParameterAssert(self);
    NSCParameterAssert(selector);
    NSCParameterAssert(block);

    // start performing the hook
    __block AspectIdentifier *identifier = nil;
    // lock
    aspect_performLocked(^{
        // refuse to hook runtime methods, dealloc methods, and classes that are already hooked.
        if (aspect_isSelectorAllowedAndTrack(self, selector, options, error)) {
            // lazily create the container, which holds all the hooks for a selector (one selector can be hooked multiple times).
            AspectsContainer *aspectContainer = aspect_getContainerForObject(self, selector);
            // assemble the identifier with `@xxx_prefix_@selector_name`
            identifier = [AspectIdentifier identifierWithSelector:selector object:self options:options block:block error:error];
            // insert the identifier into the container so that the hook takes effect
            if (identifier) {
                [aspectContainer addAspect:identifier withOptions:options];
                // modify the class to allow message interception.
                aspect_prepareClassAndHookSelector(self, selector, error);
            }
        }
    });
    return identifier;
}
Inside aspect_prepareClassAndHookSelector, a hook class is created via aspect_hookClass, which creates a subclass of the current object's class and reassigns the class of the current object to that hooked subclass. In the hooked class, every forwardInvocation: call is routed to the internal method __ASPECTS_ARE_BEING_CALLED__.
static void aspect_prepareClassAndHookSelector(NSObject *self, SEL selector, NSError **error) {
    NSCParameterAssert(selector);
    Class klass = aspect_hookClass(self, error);
    Method targetMethod = class_getInstanceMethod(klass, selector);
    IMP targetMethodIMP = method_getImplementation(targetMethod);
    if (!aspect_isMsgForwardIMP(targetMethodIMP)) {
        // Make a method alias for the existing method implementation, if not already copied.
        const char *typeEncoding = method_getTypeEncoding(targetMethod);
        SEL aliasSelector = aspect_aliasForSelector(selector);
        if (![klass instancesRespondToSelector:aliasSelector]) {
            __unused BOOL addedAlias = class_addMethod(klass, aliasSelector, method_getImplementation(targetMethod), typeEncoding);
            NSCAssert(addedAlias, @"Original implementation for %@ is already copied to %@ on %@", NSStringFromSelector(selector),
                      NSStringFromSelector(aliasSelector), klass);
        }

        // We use forwardInvocation to hook in, which calls
        // into the `__ASPECTS_ARE_BEING_CALLED__` method.
        class_replaceMethod(klass, selector, aspect_getMsgForwardIMP(self, selector), typeEncoding);
        AspectLog(@"Aspects: Installed hook for -[%@ %@].", klass, NSStringFromSelector(selector));
    }
}
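The "alias the original, then point the selector at a forwarder" pattern can be imitated in plain C with a small method table. This is a toy model only, written in C to stay in one language with the other sketches: `struct klass`, `add_method`, `send_message`, `forwarder`, and `hook_selector` are hypothetical, the alias prefix is an arbitrary choice, and nothing here uses the ObjC runtime.

```c
#include <stdio.h>
#include <string.h>

#define MAX_METHODS 8

struct klass;
/* Like an ObjC IMP, every implementation receives the selector it was sent. */
typedef int (*imp_t)(struct klass *cls, const char *sel, int arg);

struct method { char sel[48]; imp_t imp; };
struct klass {
    struct method methods[MAX_METHODS];
    int count;
    int before_calls;   /* how many times the "before" advice has run */
};

static imp_t lookup(struct klass *cls, const char *sel) {
    for (int i = 0; i < cls->count; i++)
        if (strcmp(cls->methods[i].sel, sel) == 0) return cls->methods[i].imp;
    return NULL;
}

/* Returns 1 if the method was added, 0 if it exists or does not fit. */
int add_method(struct klass *cls, const char *sel, imp_t imp) {
    if (cls->count == MAX_METHODS || lookup(cls, sel) != NULL ||
        strlen(sel) >= sizeof cls->methods[0].sel) return 0;
    strcpy(cls->methods[cls->count].sel, sel);
    cls->methods[cls->count++].imp = imp;
    return 1;
}

/* Dispatch by selector name; returns -1 when nothing responds. */
int send_message(struct klass *cls, const char *sel, int arg) {
    imp_t imp = lookup(cls, sel);
    return imp ? imp(cls, sel, arg) : -1;
}

/* Stands in for the forwarding path: run the advice, then call the alias. */
static int forwarder(struct klass *cls, const char *sel, int arg) {
    char alias[64];
    snprintf(alias, sizeof alias, "aspects__%s", sel);
    cls->before_calls++;
    return send_message(cls, alias, arg);
}

/* Copy the original under an alias, then replace the selector's IMP. */
int hook_selector(struct klass *cls, const char *sel) {
    imp_t original = lookup(cls, sel);
    if (original == NULL) return 0;
    if (original == forwarder) return 1;   /* already hooked */
    char alias[64];
    snprintf(alias, sizeof alias, "aspects__%s", sel);
    if (lookup(cls, alias) == NULL && !add_method(cls, alias, original)) return 0;
    for (int i = 0; i < cls->count; i++)
        if (strcmp(cls->methods[i].sel, sel) == 0) cls->methods[i].imp = forwarder;
    return 1;
}

/* Example method used to exercise the table. */
int double_it(struct klass *cls, const char *sel, int arg) {
    (void)cls; (void)sel;
    return arg * 2;
}
```

After `hook_selector(&cls, "compute")`, sending `compute` still returns the original result, but the call now passes through `forwarder` first. Hooking the same selector again is a no-op, which mirrors the `aspect_isMsgForwardIMP` guard in the function above.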
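There is a subtlety in the method aspect_getMsgForwardIMP: the message forwarding implementation can be one of two IMPs, _objc_msgForward or _objc_msgForward_stret. The latter is needed for methods that return a struct by value on architectures with a separate struct-return calling convention. Please refer to this article on the differences between the two.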