SymPalantir RAG: Why RAG Security Requires Heterogeneous Compute

March 27, 2026

Secure RAG Groundwork

Amine Raji published two articles this month that together form a thorough treatment of RAG knowledge base poisoning. The first article, “RAG Poisoning: How Attackers Corrupt Your AI’s Knowledge Bases,” demonstrated that three fabricated financial documents injected into a ChromaDB knowledge base caused a RAG system to report $8.3 million in quarterly revenue when the actual figure was $24.7 million. The attack succeeded on 19 of 20 runs with no jailbreak and no software exploit, relying only on three well-written fakes carrying authoritative language. The second article, “RAG Security: Attacks, Defenses & Architecture,” proposed a five-layer defense stack and tested each layer independently, finding that embedding anomaly detection alone reduced attack success from 95% to 20%.

Raji’s work is important for two reasons. The lab demonstration is rigorous and the defense analysis is honest about what works and what does not, presenting measured success rates rather than categorical claims. Raji’s identification of semantic injection as “the hard one,” because the attack uses authoritative natural language with no detectable markers, correctly frames the central difficulty. Raji’s further observation that teams “wave through the document ingestion pipeline because that is all internal data” names the cultural blind spot making every subsequent technical vulnerability exploitable.

My build of the SymPalantir RAG system takes a fundamentally different approach to the same problem. Raji’s approach is not wrong; the two architectures serve different threat environments, and the differences between them show that a security architecture’s ceiling can be set by computational diversity, not merely by its number of layers. Raji identifies semantic injection as the hardest attack class because the attack uses authoritative natural language with no detectable surface markers. Raji’s full five-layer stack reduces semantic injection success from 70% to 15%. The 15% residual is still significant: in a financial context, roughly one successful semantic injection in seven attempts means the attacker needs only patience.
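The patience point can be made concrete with a little arithmetic. The sketch below assumes each injection attempt succeeds independently at the 15% residual rate quoted above; independence is an assumption made for illustration, not something either article measured, and the function name is hypothetical.

```python
def prob_at_least_one_success(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** attempts


RESIDUAL = 0.15  # semantic injection success rate against the five-layer stack

if __name__ == "__main__":
    for n in (1, 5, 7, 20):
        print(f"{n:>2} attempts -> {prob_at_least_one_success(RESIDUAL, n):.1%}")
```

Under that assumption, seven attempts already give the attacker roughly a two-in-three chance of landing at least one poisoned answer.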

The SymPalantir RAG stack reduced semantic injection success to 0% across all test runs. The difference lies in an architectural approach that changes the problem itself, not in some kind of categorically superior detection. Raji’s defenses operate at generation time, limiting damage from documents already present in the knowledge base and requiring the system to work around contaminated content. The SymPalantir RAG defenses operate at ingestion time, preventing poisoned documents from entering the knowledge base at all.

Layer 3’s semantic review caught every contradiction in the poisoned documents. When a document claimed Q4 revenue was $8.3M and the existing knowledge base contained documents reporting $24.7M, Nemotron-120B identified the contradiction, flagged the authority spoofing language (“CORRECTED,” “supersedes,” “CFO-approved”) and noted the coordinated submission pattern of three documents arriving simultaneously on the same topic. Layer 4’s neuromorphic anomaly detection independently classified all three poisoned documents as anomalous based on their embedding patterns. The two layers use fundamentally different computational approaches and agreed on the verdict through independent analysis.
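Two of the signals named above, authority-spoofing language and coordinated submission, can be illustrated with a deliberately crude stand-in. The sketch below is not how Nemotron-120B performs the review; it is a keyword and timestamp heuristic, and the `Submission` type, the 60-second window, and the function names are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Marker terms taken from the examples quoted in the article.
AUTHORITY_MARKERS = ("corrected", "supersedes", "cfo-approved")


@dataclass
class Submission:
    doc_id: str
    topic: str
    submitted_at: float  # seconds since epoch
    text: str


def authority_markers(text: str) -> list:
    """Return the marker terms that appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [m for m in AUTHORITY_MARKERS if m in lowered]


def coordinated_groups(subs, window_s: float = 60.0, min_docs: int = 3) -> list:
    """Return topics that received at least `min_docs` submissions within `window_s` seconds."""
    by_topic = defaultdict(list)
    for s in subs:
        by_topic[s.topic].append(s.submitted_at)

    flagged = []
    for topic, times in by_topic.items():
        times.sort()
        # Compare each timestamp with the one (min_docs - 1) positions later.
        for i in range(len(times) - min_docs + 1):
            if times[i + min_docs - 1] - times[i] <= window_s:
                flagged.append(topic)
                break
    return flagged
```

A heuristic like this would be trivial to evade on its own, which is part of why the article leans on a language model for the semantic judgment and on a separate layer for the geometric one.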

The independent convergence of fundamentally different computational approaches on the same verdict shows how heterogeneous compute is essential to secure RAG environments.

Monocultures Fall Together

An entire RAG security pipeline running on the same CPU architecture, the same Python runtime, the same vector database, and the same embedding model constitutes a computational monoculture. A single exploit or poisoning technique that accounts for the shared computational substrate can potentially compromise every layer simultaneously. The layers are logically independent but physically entangled, sharing memory models, floating-point behavior, library dependencies, and failure modes. An adversarial embedding that fools cosine similarity on one layer fools cosine similarity on every layer because the math runs on identical hardware with identical numerical properties.

By way of contrast, SymPalantir RAG uses fundamentally different compute capabilities for each trust decision.

The Symphony management host handles orchestration, workflow enforcement, and multi-party approval logic via x86. The trust decision at the orchestration layer is procedural, determining whether a given document carries the right signatures from the right parties.
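A procedural trust decision of this kind can be sketched in a few lines. The article does not specify the signature scheme, so the example below swaps in HMAC-SHA256 from the Python standard library purely for illustration; a production pipeline would more plausibly use asymmetric signatures. The party names and function names are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, document: bytes) -> str:
    """Produce a hex HMAC-SHA256 tag over the document (stand-in for a real signature)."""
    return hmac.new(key, document, hashlib.sha256).hexdigest()


def approved(document: bytes, signatures: dict, party_keys: dict, required: set) -> bool:
    """True only if every required party supplied a valid tag for this exact document."""
    for party in required:
        key = party_keys.get(party)
        tag = signatures.get(party)
        if key is None or tag is None:
            return False
        # compare_digest avoids leaking information through comparison timing.
        if not hmac.compare_digest(sign(key, document), tag):
            return False
    return True
```

The check never looks at what the document says. It only asks whether the right parties vouched for these exact bytes, which is what makes the decision procedural rather than semantic.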

Five NVIDIA RTX 3090 GPUs on one compute host run Nemotron-120B for semantic review. The trust decision at the semantic layer is linguistic, determining whether a document’s content contradicts what the knowledge base already knows. The computational substrate is a 120-billion-parameter transformer performing attention over token sequences, a fundamentally different operation from anything the other layers perform.

Ten BrainChip AKD1000 spiking neural network processors across ten Symphony edge nodes run trained CNN classifiers converted to spiking neural networks for anomaly detection. The trust decision at the anomaly detection layer is geometric, determining whether a document’s embedding pattern resembles the patterns the model learned from known-clean and known-poisoned training data. The AKD1000 is neuromorphic silicon that does not execute instructions sequentially but instead propagates spikes through a network of digital neurons with event-driven activation. The AKD1000’s computational model shares nothing with the GPU running the transformer or the CPU running the orchestration logic.
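To show what a geometric decision means in the simplest possible terms, the sketch below labels an embedding by whichever class centroid it is closer to in cosine similarity. This is a nearest-centroid toy in plain Python, not the CNN-to-SNN classifier that runs on the AKD1000, and the function names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(embedding, clean_centroid, poisoned_centroid):
    """Label the embedding by the centroid it is more similar to."""
    if cosine(embedding, poisoned_centroid) > cosine(embedding, clean_centroid):
        return "anomalous"
    return "clean"
```

The decision depends only on where the vector sits relative to learned examples, not on what the text argues, which is the property the anomaly layer contributes.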

CKKS homomorphic encryption protects the knowledge base embeddings during anomaly detection. The cosine similarity computation runs on encrypted ciphertexts using ring polynomial arithmetic at a dimension of 8192. An attacker who understands the embedding space is operating against vectors the attacker cannot observe because the vectors are encrypted with keys derived from quantum entropy and the anomaly math runs homomorphically without ever decrypting the data.
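One reason cosine similarity suits this setting is that schemes such as CKKS evaluate additions and multiplications natively, while division and square roots require approximation. If vectors are normalized before encryption, cosine similarity reduces to a dot product. The sketch below shows that reduction on plaintext numbers only; it performs no encryption, and whether the SymPalantir RAG build normalizes before encrypting is an assumption on my part.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length (done before encryption in this sketch)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot_add_mul_only(a, b):
    """Dot product built from the two operations an HE scheme evaluates natively."""
    total = 0.0
    for x, y in zip(a, b):
        total = total + x * y
    return total


def cosine_via_dot(a, b):
    """Cosine similarity as a plain dot product of pre-normalized vectors."""
    return dot_add_mul_only(normalize(a), normalize(b))
```

In a real deployment the loop body would operate on ciphertexts through a homomorphic encryption library, and only the final similarity score would be decrypted.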

IBM Quantum processors via IBM Cloud generated the entropy pool for cryptographic key material. The 609 quantum-random keys from ibm_fez were produced by Hadamard circuits measuring qubits in superposition, producing randomness derived from quantum mechanical processes that are provably non-deterministic. The signing nonces in Layer 1 carry provenance that no classical pseudorandom generator can provide.
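The reason a Hadamard circuit yields one unbiased bit per measured qubit can be checked with a two-element state vector. The sketch below is a classical simulation of the arithmetic, not a call to IBM hardware or to any quantum SDK, and the helper names are hypothetical.

```python
import math

# Hadamard gate as a 2x2 matrix.
H = [
    [1 / math.sqrt(2), 1 / math.sqrt(2)],
    [1 / math.sqrt(2), -1 / math.sqrt(2)],
]


def apply_gate(gate, state):
    """Multiply a 2x2 gate by a single-qubit state vector."""
    return [sum(gate[r][c] * state[c] for c in range(2)) for r in range(2)]


def measurement_probabilities(state):
    """Born rule: the probability of each outcome is the squared amplitude magnitude."""
    return [abs(amplitude) ** 2 for amplitude in state]


zero_state = [1.0, 0.0]  # qubit initialized to |0>
probs = measurement_probabilities(apply_gate(H, zero_state))
```

Both outcomes come out equally likely, so each measurement contributes one bit. A simulation like this can only describe the distribution; the randomness itself has to come from the physical device.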

Palantir Foundry serves as an independent system of record outside the compute stack. Every submission, every layer verdict, and every rejection reason is written to Foundry’s ontology across six object types and five link types with AES-256-GCM encrypted audit records. The audit trail is immutable, externally hosted, and queryable through Foundry’s own access controls. An attacker who compromises the entire compute stack still cannot alter the record of what was rejected and why the rejection occurred.

The difficulty of an attack across these paradigms compounds with each layer. An attacker who learns to fool the transformer running semantic review still has to fool a spiking neural network running on completely different silicon with a completely different computational model. An attacker who finds an adversarial embedding that evades the anomaly detector is operating against encrypted vectors the attacker cannot inspect. An attacker who somehow defeats both the LLM and the neuromorphic classifier still cannot bypass the cryptographic signature requirement without obtaining keys generated from quantum entropy and stored in hardware the attacker cannot access.
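The layered argument can be put in numbers. If the layers fail independently, the chance of evading all of them is the product of the per-layer evasion probabilities. If they share a substrate and fail together, the joint probability can be as high as the smallest single-layer value. The per-layer figures below are hypothetical placeholders, not measurements from either system.

```python
import math


def joint_evasion_independent(per_layer: dict) -> float:
    """Probability of evading every layer when layer failures are independent."""
    return math.prod(per_layer.values())


def joint_evasion_upper_bound(per_layer: dict) -> float:
    """Worst case when failures are fully correlated: bounded by the weakest-to-evade layer."""
    return min(per_layer.values())


# Hypothetical per-layer evasion probabilities, for illustration only.
hypothetical = {"signature": 0.001, "semantic_review": 0.15, "anomaly": 0.10}

if __name__ == "__main__":
    print("independent:", joint_evasion_independent(hypothetical))
    print("correlated upper bound:", joint_evasion_upper_bound(hypothetical))
```

The gap between the two results is the quantity that computational diversity is meant to protect: the closer the layers come to failing independently, the closer the joint probability gets to the product.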

Each layer demands a different class of expertise. Fooling a transformer requires adversarial NLP. Fooling a spiking neural network requires neuromorphic adversarial techniques that barely exist as a research field. Breaking CKKS encryption requires advances in lattice cryptography. Forging quantum-random signatures requires either stealing physical keys or breaking quantum mechanics. No individual attacker and very few organizations possess simultaneous expertise across all four domains.

The Biodiversity Analogy

The security property of computational diversity parallels biodiversity in ecosystems. A monoculture wheat field is efficient with identical plants, identical harvest timing, and identical processing, yet the field remains one pathogen away from total crop failure. The Irish Potato Famine resulted not from a particularly virulent blight but from genetic uniformity that gave the blight a single attack surface replicated across an entire food system.

Mixed-species ecosystems resist pathogens not because any individual species is more resistant but because a pathogen adapted to exploit one species’ biology gains no advantage against a species with different biology. The resistance is structural, arising from diversity itself rather than from the hardness of any single component.

Military defense doctrine applies the same principle. Mixed air defense systems layer radar-guided missiles, infrared-guided missiles, directed energy weapons, and electronic warfare not because any single system is impenetrable but because a countermeasure designed to defeat radar guidance is useless against an infrared seeker. The attacker must carry simultaneous countermeasures for every engagement type and the defender can present different engagement types unpredictably.

Homogeneous RAG security stacks are monocultures. Adding more Python-based detection layers on the same CPU with the same libraries and the same numerical precision does add depth but the depth remains within a single computational paradigm. A sufficiently sophisticated adversarial technique that accounts for the shared substrate can potentially cascade through every layer.

Heterogeneous compute creates security that no single layer provides independently, requiring the attacker to simultaneously defeat fundamentally different computational paradigms sharing no common exploit surface.

Beyond Trusted Execution Environments

Heterogeneous compute connects naturally to the broader trend of confidential computing and trusted execution environments, but it goes a step further. Intel SGX, AMD SEV, and similar TEEs protect data in use by encrypting memory on a single CPU architecture. Memory encryption prevents the cloud provider and co-tenants from observing computation, but the computation itself still runs on one architecture. A side-channel attack against the CPU potentially compromises everything the TEE protects.

Heterogeneous trust distributes the decision across architectures sharing no common side-channel surface. The AKD1000’s event-driven spike propagation does not exhibit the same cache-timing vulnerabilities as x86 instruction execution. The GPU’s SIMT execution model shares neither the AKD1000’s temporal dynamics nor the CPU’s branch prediction behavior. An attacker who discovers a side channel in one architecture learns nothing about the others.

The homomorphic encryption layer adds a second dimension of protection orthogonal to hardware diversity. Even if an attacker could observe the computation on every piece of hardware simultaneously, a scenario requiring extraordinary access, the embeddings themselves remain encrypted. The computation produces correct results on ciphertexts without the hardware ever processing plaintext vectors. The attacker sees encrypted polynomial operations, not embedding geometry.

Where Each Architecture Fits

The two architectures are not competing alternatives but serve different deployment contexts determined by the threat model and the infrastructure budget.

For a SaaS product with multi-tenant RAG, Raji’s architecture is appropriate. Cross-tenant ACL filtering is essential; the infrastructure requirements are modest and the defense retrofits into existing pipelines. For an internal knowledge base facing moderate threats, Raji’s stack similarly provides sufficient protection for many corporate environments.

For regulated financial services where the knowledge base informs decisions subject to audit, SymPalantir RAG’s cryptographic provenance, encrypted embeddings, and immutable audit trail in Foundry provide the compliance evidence regulators require. For defense and intelligence applications where the adversary is sophisticated and persistent, hardware-backed anomaly detection, homomorphic encryption, and quantum-random nonces provide defense-in-depth across computational paradigms the adversary cannot simultaneously master.

For healthcare with compliance requirements, either architecture could apply depending on the specific threat. Raji’s ACL filtering and output monitoring protect PHI at the response level. The SymPalantir RAG audit trail and cryptographic provenance provide compliance evidence at the ingestion level. Both Symphony and LSF have been used for training and inference in data obfuscation workflows that take PII and HIPAA-regulated data and anonymize it. The obfuscated datasets can then be used outside their private context for research and other purposes, while a gold copy of the original data is retained in a vault.

The two architectures are also not mutually exclusive. An ideal enterprise deployment could layer Raji’s retrieval-time ACL filtering and output monitoring on top of the SymPalantir RAG ingestion-boundary defense for coverage across the full RAG pipeline. Raji’s defenses handle what happens after documents enter the knowledge base, and the SymPalantir RAG ingestion defenses handle what happens before documents are admitted. Combined, the two architectures cover the complete lifecycle.
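The combined lifecycle can be sketched as a knowledge base with a gate on the way in and a filter on the way out. Everything below is a hypothetical toy: the `Doc` and `KnowledgeBase` types, the keyword retrieval, and the group-based ACL are stand-ins chosen for the example and do not reproduce either author's code.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_groups: frozenset


@dataclass
class KnowledgeBase:
    docs: list = field(default_factory=list)

    def ingest(self, doc: Doc, ingestion_checks) -> bool:
        """Ingestion boundary: admit the document only if every check passes."""
        if all(check(doc) for check in ingestion_checks):
            self.docs.append(doc)
            return True
        return False

    def retrieve(self, query: str, user_groups) -> list:
        """Retrieval time: naive keyword match, then ACL filtering per user."""
        groups = set(user_groups)
        return [
            d for d in self.docs
            if query.lower() in d.text.lower() and d.allowed_groups & groups
        ]
```

A rejected document never reaches `docs`, so the retrieval-time filter only ever has to reason about content that already passed the ingestion checks.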

Build Context

The full SymPalantir RAG stack was built in three days on the same commodity hardware described in earlier articles on SymWisdom and the non-extractive targeting architecture. The hardware comprises ten BrainChip AKD1000 boards on Symphony edge nodes, five RTX 3090 GPUs on a single 64-core EPYC compute host, IBM GPFS across all twelve nodes, and Palantir Technologies Foundry as the external system of record. The Akida anomaly model was trained on the cluster, quantized to 8-bit weights and 4-bit activations, converted to a spiking neural network, and deployed to all ten chips as a Symphony SOAM service. The 19 clean and 3 poisoned financial documents come from a synthetic Meridian Technologies dataset built specifically for the demonstration. The test could just as easily have used the financial documents assembled for my first Palantir-based project in December 2025, which covered annual reports from 37 financial firms.

The infrastructure cost for RAG security was zero because every component was already in place. The Akida chips were already running perception services. The GPUs were already serving Nemotron-120B for SymWisdom’s reflection engine. GPFS was already providing the shared filesystem. Symphony was already orchestrating the workload lifecycle. Foundry was already receiving ontology objects. Adding a RAG trust pipeline meant deploying new services on existing infrastructure and the dynamic compute platform rather than procuring new hardware.

The zero infrastructure cost reflects a broader advantage of heterogeneous compute worth emphasizing. The same hardware diversity that strengthens the security posture also enables infrastructure reuse. A heterogeneous cluster where neuromorphic chips handle anomaly detection, GPUs handle semantic review, and CPUs handle orchestration naturally distributes the security workload across hardware already present for other purposes.

What the Numbers Show

Against Raji’s poisoning scenario of three fabricated documents claiming Q4 revenue of $8.3M against a knowledge base containing $24.7M, the results are stark.

With no defense, the attack succeeds 95% of the time. The RAG system confidently reports the false figure.

With Raji’s full five-layer stack, marker injection drops to 0% success and knowledge poisoning drops to 10% but semantic injection retains a 15% success rate. The five-layer result is genuinely strong and represents a massive improvement over the undefended case.

With SymPalantir RAG’s four layers, all three attack types drop to 0% success. The documents never enter the knowledge base. The RAG system continues reporting $24.7M because the knowledge base was never modified.

Per-layer performance confirms the operational viability of the approach. Layer 1 signature verification runs in sub-millisecond time. Layer 3 semantic review on Nemotron-120B runs at 44.7 tokens per second (functional but slow only because commodity RTX 3090s are in play rather than enterprise GPUs). Layer 4 neuromorphic anomaly detection on AKD1000 completes classification in 9 milliseconds with 100% accuracy on poisoned documents and 93.5% accuracy on clean documents. The entire trust pipeline from document submission to final verdict completes in seconds rather than minutes.
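The per-layer figures translate into rough operational estimates. The sketch below uses the throughput, latency, and clean-document accuracy reported above; it ignores prompt processing time and reads the 93.5% figure as the share of clean documents correctly passed. Both are assumptions, as is any token count fed into it.

```python
TOKENS_PER_SECOND = 44.7   # Layer 3 generation throughput reported above
ANOMALY_LATENCY_S = 0.009  # Layer 4 classification time reported above
CLEAN_ACCURACY = 0.935     # share of clean documents classified as clean


def review_seconds(output_tokens: int) -> float:
    """Time for Layer 3 to generate a verdict of the given length (generation only)."""
    return output_tokens / TOKENS_PER_SECOND


def expected_false_flags(clean_docs: int) -> float:
    """Expected number of clean documents flagged as anomalous by Layer 4."""
    return clean_docs * (1.0 - CLEAN_ACCURACY)


if __name__ == "__main__":
    # 200 output tokens is a hypothetical verdict length, not a measured one.
    total = review_seconds(200) + ANOMALY_LATENCY_S
    print(f"semantic review plus anomaly check: about {total:.1f} s")
    print(f"clean documents flagged per 1,000 ingested: about {expected_false_flags(1000):.0f}")
```

The second figure is the operational cost of the anomaly layer: some clean documents will need a second look, which is where the semantic review and human approval steps earn their place.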

Raji’s stack is left with the 15% residual on semantic injection because the defenses operate within the same computational paradigm as the attack. The poisoned documents are designed to be linguistically convincing to language models, and a sufficiently convincing document can pass a language-model-based defense operating on the same principles. The SymPalantir RAG stack eliminates the residual not through a better language-model-based defense but through mechanisms operating on entirely different principles that the attacker’s linguistic sophistication does not address, specifically neuromorphic pattern classification and homomorphic distance computation.

Convergence

Raji’s work correctly identifies the problem and builds effective defenses within the constraints of accessible tooling. The contribution is real, and cutting semantic injection success from 70% to 15% with a deployable Python stack represents immediate practical value for teams running undefended RAG systems today.

However, the security ceiling of any homogeneous defense stack is bounded by the shared computational substrate. Adding layers within one paradigm adds depth. Adding layers across paradigms adds a qualitatively different kind of defense, one where the attacker’s expertise in defeating any single layer provides no leverage against the others.

Knowledge base poisoning will not be solved by better detection alone. Hardened solutions require trust architectures that make unauthorized ingestion structurally impossible and that distribute the remaining detection across computational paradigms sharing no common failure mode. Simultaneously defeating SymPalantir’s RAG protection means beating five distinct computational paradigms and presents an insurmountable problem even for the most sophisticated attackers. That’s the power of heterogeneous computing properly applied with tools like Symphony, GPFS, BrainChip’s Akida, IBM Quantum, and the homomorphic encryption it all enables.

This article extends the analysis in Amine Raji’s “RAG Poisoning: How Attackers Corrupt Your AI’s Knowledge Bases” and “RAG Security: Attacks, Defenses & Architecture” linked above. Raji’s original poisoning demonstration directly inspired the SymPalantir RAG project. Raji’s lab code is available at github.com.

The opinions expressed in this article are those of the author and do not necessarily reflect the views of the author’s employer.