• Repo: ~/learning/gpucc (local)
  • Hardware: AMD EPYC (SEV-SNP), Intel Xeon (TDX), NVIDIA H100/A100

What is Confidential Computing?

  • Running workloads in hardware-enforced trusted execution environments (TEEs)
  • The host/hypervisor cannot read or tamper with guest memory
  • Attestation: cryptographic proof that the workload is running in a genuine TEE with expected code

AMD SEV-SNP

  • SEV = Secure Encrypted Virtualization
  • SNP = Secure Nested Paging (latest generation)
  • Each VM gets its own encryption key managed by a dedicated security processor (PSP)
  • Memory pages are encrypted transparently by hardware; plain SEV needs no guest changes, but SNP features (attestation, page validation) need an SNP-aware guest kernel
  • Attestation via /dev/sev-guest ioctl:
    • Send 64 bytes of user data (challenge/nonce)
    • Get back a signed attestation report from the AMD PSP (~1.2 KB of report, returned in a 4000-byte response buffer)
    • Report contains measurement of the VM, policy, platform info
// Request/response as defined in the guest kernel UAPI (<linux/sev-guest.h>):
struct snp_report_req {
    uint8_t  user_data[64];   // your challenge/nonce
    uint32_t vmpl;            // privilege level to attest (0 = most privileged)
    uint8_t  rsvd[28];        // reserved, must be zero
};
struct snp_report_resp {
    uint8_t data[4000];       // signed attestation report + padding
};
// Both are referenced by pointer from struct snp_guest_request_ioctl,
// which is what actually goes to ioctl(fd, SNP_GET_REPORT, &guest_req)
  • Kernel params needed: mem_encrypt=on iommu=pt amd_iommu=on kvm_amd.sev=1 kvm_amd.sev_snp=1
  • QEMU launch requires -object sev-snp-guest with cbitpos and policy
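The ioctl path above can be sketched end-to-end from Python with `ctypes`. The struct layouts and `SNP_GET_REPORT` number are my reading of the `<linux/sev-guest.h>` UAPI (u8 `msg_version`, then three u64s; ioctl type `'S'`, nr 0) — double-check against your kernel headers before relying on it:

```python
# Sketch: fetch an SEV-SNP attestation report via /dev/sev-guest.
# Struct layouts assumed to mirror <linux/sev-guest.h>; verify on your kernel.
import ctypes
import fcntl
import os

class SnpReportReq(ctypes.Structure):
    _fields_ = [
        ("user_data", ctypes.c_uint8 * 64),  # challenge/nonce
        ("vmpl", ctypes.c_uint32),           # privilege level to attest
        ("rsvd", ctypes.c_uint8 * 28),       # reserved, zeroed
    ]

class SnpReportResp(ctypes.Structure):
    _fields_ = [("data", ctypes.c_uint8 * 4000)]  # signed report + padding

class SnpGuestRequest(ctypes.Structure):
    _fields_ = [
        ("msg_version", ctypes.c_uint8),   # must be 1
        ("req_data", ctypes.c_uint64),     # pointer to SnpReportReq
        ("resp_data", ctypes.c_uint64),    # pointer to SnpReportResp
        ("exitinfo2", ctypes.c_uint64),    # firmware error code on failure
    ]

def _iowr(type_char: str, nr: int, size: int) -> int:
    """Recompute Linux _IOWR(): dir=read|write (3), 14-bit size, type, nr."""
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr

SNP_GET_REPORT = _iowr("S", 0x0, ctypes.sizeof(SnpGuestRequest))

def fetch_report(nonce: bytes, vmpl: int = 0) -> bytes:
    """Send a 64-byte nonce, return the raw signed report bytes."""
    assert len(nonce) == 64
    req = SnpReportReq(vmpl=vmpl)
    ctypes.memmove(req.user_data, nonce, 64)
    resp = SnpReportResp()
    arg = SnpGuestRequest(
        msg_version=1,
        req_data=ctypes.addressof(req),
        resp_data=ctypes.addressof(resp),
    )
    fd = os.open("/dev/sev-guest", os.O_RDWR)  # only exists in an SNP guest
    try:
        fcntl.ioctl(fd, SNP_GET_REPORT, arg)
    finally:
        os.close(fd)
    return bytes(resp.data)
```

This only runs inside an SNP guest; on any other machine the `os.open` fails because `/dev/sev-guest` does not exist.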

Intel TDX

  • TDX = Trust Domain Extensions
  • Similar concept to SEV-SNP but Intel’s approach
  • Uses a TDX Module running in a new CPU mode (SEAM)
  • Each TD (Trust Domain) has isolated memory, CPU state
  • Attestation via Intel’s SGX-style quoting infrastructure
  • Colleague's take (PhD): Intel hires lots of PhDs and implements everything, but the result is overcomplicated; AMD's stuff tends to be simpler and just works
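For comparison, the TDX guest side looks structurally similar: a `/dev/tdx_guest` device exposes a get-report ioctl that takes 64 bytes of report data and returns a 1024-byte TDREPORT. The layout below is my reading of `<linux/tdx-guest.h>` (`TDX_CMD_GET_REPORT0`, assumed type `'T'`, nr 1) — treat the ioctl number as an assumption and check your kernel headers:

```python
# Sketch: fetch a TDREPORT via /dev/tdx_guest inside an Intel TD.
# Struct assumed to mirror struct tdx_report_req in <linux/tdx-guest.h>.
import ctypes
import fcntl
import os

TDX_REPORTDATA_LEN = 64
TDX_REPORT_LEN = 1024

class TdxReportReq(ctypes.Structure):
    _fields_ = [
        ("reportdata", ctypes.c_uint8 * TDX_REPORTDATA_LEN),  # nonce in
        ("tdreport", ctypes.c_uint8 * TDX_REPORT_LEN),        # TDREPORT out
    ]

def _iowr(type_char: str, nr: int, size: int) -> int:
    """Recompute Linux _IOWR() for the given type/nr/size."""
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr

# Assumption: matches TDX_CMD_GET_REPORT0 in <linux/tdx-guest.h>
TDX_CMD_GET_REPORT0 = _iowr("T", 1, ctypes.sizeof(TdxReportReq))

def fetch_tdreport(nonce: bytes) -> bytes:
    """Send a 64-byte nonce, return the raw 1024-byte TDREPORT."""
    assert len(nonce) == TDX_REPORTDATA_LEN
    req = TdxReportReq()
    ctypes.memmove(req.reportdata, nonce, len(nonce))
    fd = os.open("/dev/tdx_guest", os.O_RDWR)  # only exists in a TD
    try:
        fcntl.ioctl(fd, TDX_CMD_GET_REPORT0, req)
    finally:
        os.close(fd)
    return bytes(req.tdreport)
```

Note the TDREPORT is only locally verifiable (MAC'd by the CPU); turning it into a remotely verifiable quote goes through the SGX-style quoting infrastructure mentioned above.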

GPU Confidential Computing

  • NVIDIA H100 supports CC mode – GPU memory is encrypted
  • GPU gets its own attestation report (firmware measurement, nonce, signature)
  • Combined attestation: SEV-SNP report (VM integrity) + GPU report (GPU integrity)
  • VFIO passthrough to give the confidential VM direct GPU access:
# unbind from host driver, bind to vfio-pci
echo "$GPU_PCI_ID" > /sys/bus/pci/devices/$GPU_PCI_ID/driver/unbind
echo "vfio-pci" > /sys/bus/pci/devices/$GPU_PCI_ID/driver_override
echo "$GPU_PCI_ID" > /sys/bus/pci/drivers/vfio-pci/bind
  • The whole pipeline: host setup -> guest image -> launch VM with GPU passthrough -> run CUDA inside confidential VM

Attestation Flow

  1. Verifier sends a nonce/challenge
  2. Guest requests attestation from SEV-SNP (CPU-level) and GPU
  3. Both return signed reports containing:
    • Platform identity
    • Firmware/code measurements
    • The nonce (proves freshness)
  4. Verifier checks signatures against known-good root of trust
  5. If valid, verifier trusts the computation results
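A toy version of steps 1-5 that runs anywhere: an HMAC over the report body with a pre-shared key stands in for the hardware signature (real SEV-SNP/GPU reports are signed with the AMD VCEK / NVIDIA device keys and verified against vendor cert chains), and `KNOWN_GOOD_MEASUREMENT` is a made-up expected value — all names here are hypothetical:

```python
# Toy attestation flow: nonce freshness + signature + measurement check.
# HMAC with a pre-shared key is a stand-in for the hardware signing key.
import hashlib
import hmac
import json
import os

TRUSTED_KEY = b"hypothetical-root-of-trust-key"          # stand-in for cert chain
KNOWN_GOOD_MEASUREMENT = hashlib.sha256(b"expected guest image").hexdigest()

def make_report(nonce: bytes, measurement: str) -> dict:
    """Guest side: bind the verifier's nonce and our measurement, then sign."""
    body = json.dumps({"measurement": measurement, "nonce": nonce.hex()}).encode()
    sig = hmac.new(TRUSTED_KEY, body, hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify(report: dict, nonce: bytes) -> bool:
    """Verifier side: steps 3-5 of the flow above."""
    expected_sig = hmac.new(TRUSTED_KEY, report["body"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(report["sig"], expected_sig):
        return False                       # signature must chain to root of trust
    claims = json.loads(report["body"])
    if claims["nonce"] != nonce.hex():
        return False                       # nonce proves freshness (no replay)
    return claims["measurement"] == KNOWN_GOOD_MEASUREMENT  # expected code only
```

Usage: the verifier draws `nonce = os.urandom(32)`, sends it across, and accepts the result only if `verify(report, nonce)` holds — a stale nonce or wrong measurement both fail.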

Workshop/Lab Ideas

  • TDX vs SEV-SNP comparison: setup, attestation, performance
  • GPU CC hello world: multiply two primes inside encrypted GPU memory, verify via attestation
  • Remote attestation demo: client sends challenge, CC VM responds with proof