This rule identifies code patterns where embeddings are stored in vector databases without validation.

Requires validation of embeddings before storage or similarity search.

📊 Rule Details

Property	Value
Type	suggestion
Severity	🟡 MEDIUM
OWASP LLM	LLM08: Vector & Embedding Weaknesses
CWE	CWE-20: Improper Input Validation
CVSS	5.5
Config Default	`off` (recommended), `error` (strict)

🔍 What This Rule Detects

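The rule flags calls that pass the result of an embedding call directly to a vector store without wrapping it in a validator function. Embedding calls are matched by `embeddingPatterns` and validators by `validatorFunctions` (see Options).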

❌ Incorrect Code

// Direct embedding without validation
await vectorStore.upsert({
  id: docId,
  embedding: await embed(text),
});

// Unvalidated createEmbedding
await index.insert({
  vector: await createEmbedding(input),
});

✅ Correct Code

// Validated embedding
await vectorStore.upsert({
  id: docId,
  embedding: validateEmbedding(await embed(text)),
});

// Normalized vector
await index.insert({
  vector: normalize(await createEmbedding(input)),
});
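
The examples above call `validateEmbedding` and `normalize` without showing them. A minimal sketch of what such helpers might look like follows. The specific checks (fixed dimension count, finite values, non-zero norm) and the `EXPECTED_DIMENSIONS` constant are assumptions for illustration, not requirements imposed by the rule.

```javascript
// Hypothetical helpers: the checks below are illustrative assumptions,
// not something the rule itself mandates.
const EXPECTED_DIMENSIONS = 4; // assumption: replace with your model's output size

function validateEmbedding(embedding, expectedDimensions = EXPECTED_DIMENSIONS) {
  if (!Array.isArray(embedding) && !(embedding instanceof Float32Array)) {
    throw new TypeError('Embedding must be an array of numbers');
  }
  if (embedding.length !== expectedDimensions) {
    throw new RangeError(
      `Expected ${expectedDimensions} dimensions, got ${embedding.length}`
    );
  }
  let sumOfSquares = 0;
  for (const value of embedding) {
    // Rejects NaN, Infinity, and non-number entries
    if (typeof value !== 'number' || !Number.isFinite(value)) {
      throw new TypeError('Embedding contains a non-finite value');
    }
    sumOfSquares += value * value;
  }
  if (sumOfSquares === 0) {
    throw new RangeError('Embedding is a zero vector');
  }
  return embedding;
}

// Validates the vector, then scales it to unit length (L2 norm of 1)
function normalize(embedding, expectedDimensions = EXPECTED_DIMENSIONS) {
  const checked = validateEmbedding(embedding, expectedDimensions);
  const norm = Math.sqrt(checked.reduce((sum, v) => sum + v * v, 0));
  return Array.from(checked, (v) => v / norm);
}
```

Both helpers return the vector, so they can wrap the embedding call inline as in the correct examples above. Throwing on invalid input keeps a bad vector from ever reaching the store.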

⚙️ Options

Option	Type	Default	Description
`embeddingPatterns`	`string[]`	`['embed', 'embedding', 'vector', 'encode']`	Patterns suggesting embedding calls
`validatorFunctions`	`string[]`	`['validate', 'verify', 'normalize']`	Functions that validate embeddings
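
As a rough illustration, the options might be set in a legacy `.eslintrc.js` like this. The `your-plugin` prefix is a placeholder for whichever plugin ships the rule, and `featurize` and `sanitizeEmbedding` are hypothetical project-specific names. The defaults are repeated on the assumption that a configured array replaces the default list rather than extending it.

```javascript
// .eslintrc.js (sketch; 'your-plugin' is a placeholder, not a real package name)
module.exports = {
  plugins: ['your-plugin'],
  rules: {
    'your-plugin/require-embedding-validation': [
      'error',
      {
        // Defaults plus a hypothetical project-specific embedding call
        embeddingPatterns: ['embed', 'embedding', 'vector', 'encode', 'featurize'],
        // Defaults plus a hypothetical project-specific validator
        validatorFunctions: ['validate', 'verify', 'normalize', 'sanitizeEmbedding'],
      },
    ],
  },
};
```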

🛡️ Why This Matters

Unvalidated embeddings can:

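Poison vector stores - Malicious or corrupted embeddings make similarity search return incorrect results
Cause DoS - Vectors with the wrong dimensions can crash indexing
Enable jailbreaks - Crafted embeddings can bypass safety measures
Leak information - Embedding inversion attacks can recover source content from stored vectors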

Known False Negatives

The following patterns are not detected due to static analysis limitations:

Validation in Embedding Function

Why: The rule inspects only the call site, so validation performed inside a called function is not visible to it.

// ❌ NOT DETECTED - Validation in embed function
await vectorStore.upsert({
  embedding: await safeEmbed(text), // Validates internally
});

Mitigation: Document that the function validates its output, and apply the rule to the code that defines the embedding function.

Custom Vector Store Methods

Why: Non-standard methods may not be recognized.

// ❌ NOT DETECTED - Custom store method
await myVectorDb.add(embedding); // Not in default patterns

Mitigation: Configure `embeddingPatterns` with custom method names.

Batch Embedding Operations

Why: Batch operations may obscure individual validations.

// ❌ NOT DETECTED - Batch operation
await vectorStore.batchUpsert(embeddings); // Are all validated?

Mitigation: Validate before batching. Review batch implementations.
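
One way to validate before batching is to check every vector and fail with the offending index. The sketch below assumes a hypothetical `validateBatch` helper and a caller-supplied dimension count. Whether the rule recognizes this wrapper depends on how `validatorFunctions` is matched, which is not specified here.

```javascript
// Hypothetical helper: checks each vector in a batch and reports the first bad index.
function validateBatch(embeddings, expectedDimensions) {
  return embeddings.map((embedding, index) => {
    const isValid =
      Array.isArray(embedding) &&
      embedding.length === expectedDimensions &&
      // Number.isFinite is false for NaN, Infinity, and non-numbers
      embedding.every((value) => Number.isFinite(value));
    if (!isValid) {
      throw new Error(`Invalid embedding at batch index ${index}`);
    }
    return embedding;
  });
}

// Usage with the store call from the example above
// (expectedDimensions is whatever your embedding model produces):
// await vectorStore.batchUpsert(validateBatch(embeddings, expectedDimensions));
```

Failing the whole batch on the first invalid vector is a design choice. Filtering out bad vectors and logging them is an alternative when partial writes are acceptable.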

Error Message Format

The rule provides LLM-optimized error messages in a compact two-line format with actionable security guidance:

🔒 CWE-20 OWASP:A06 CVSS:7.5 | Improper Input Validation detected | HIGH [SOC2,PCI-DSS,HIPAA,GDPR,ISO27001]
   Fix: Review and apply the recommended fix | https://owasp.org/Top10/A06_2021/

Message Components

Component	Purpose	Example
Risk Standards	Security benchmarks	`CWE-20 OWASP:A06 CVSS:7.5`
Issue Description	Specific vulnerability	`Improper Input Validation detected`
Severity & Compliance	Impact assessment	`HIGH [SOC2,PCI-DSS,HIPAA,GDPR,ISO27001]`
Fix Instruction	Actionable remediation	`Fix: Review and apply the recommended fix`
Technical Truth	Official reference	`https://owasp.org/Top10/A06_2021/`

require-embedding-validation