n8n + Email Validation: Automating List Hygiene Workflows

hangrydev

Your List Is Rotting While You Sleep

Email lists decay at roughly 2% per month. ZeroBounce’s annual Email List Decay Reports put annual decay at 28% for 2024 and 23% for 2025, based on billions of verified addresses. If you validated your list in January and haven’t touched it since, a quarter of those addresses could be dead by December.

Manual validation doesn’t scale. You’re not going to upload a CSV every Tuesday morning. Nobody does that for more than two weeks.

n8n fixes this. It’s the source-available (fair-code) workflow automation platform that developers actually self-host because they want to own their data and skip the per-task pricing of Zapier and Make. You build workflows visually, but the HTTP Request node, expressions engine, and code nodes give you the same control as writing it by hand.

This tutorial covers three workflows: validate on form submission, clean a Google Sheet on a schedule, and trigger validation from CRM events via webhook. All three use the email validation API through n8n’s HTTP Request node.

Setting Up the HTTP Request Node

Every workflow in this tutorial hits the same validation endpoint. The HTTP Request node config stays consistent across all three.

{
  "method": "POST",
  "url": "https://api.truemail.io/v1/verify",
  "headers": {
    "Authorization": "Bearer {{ $env.TRUEMAIL_API_KEY }}",
    "Content-Type": "application/json"
  },
  "body": {
    "email": "{{ $json.email }}"
  }
}

A few things matter here.

Store your API key in n8n’s environment variables or credentials, not hardcoded in the node. Self-hosted n8n can read from environment variables you set on your server or in Docker. Cloud n8n has a built-in credentials store. Either way, keep secrets out of the workflow JSON.

The response comes back with a status field: deliverable, undeliverable, risky, or unknown. The risky status usually means a catch-all domain that accepts mail for any address. You’ll route on these statuses in every workflow below.

Set the timeout to 30 seconds in the HTTP Request node’s Options. SMTP verification can take 500ms to 3 seconds per address, and greylisted servers push that higher. The node’s default timeout is generous (around 300 seconds), but setting an explicit lower value prevents a stalled request from blocking your whole batch run.
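If it helps to see the node’s job as plain code, here is a rough JavaScript sketch of the same request with an explicit 30-second timeout. The endpoint and headers come from the config above. The `{ data: { status } }` response shape is an assumption based on the expressions used later in this tutorial, and `verifyEmail` and `fetchImpl` are made-up names for illustration.

```javascript
// Sketch of the request the HTTP Request node sends, with an explicit timeout.
// Assumption: the API answers with { data: { status: "deliverable" | ... } }.
async function verifyEmail(email, { apiKey, fetchImpl = fetch, timeoutMs = 30000 } = {}) {
  const controller = new AbortController();
  // Abort the request if no response arrives within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl("https://api.truemail.io/v1/verify", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ email }),
      signal: controller.signal,
    });
    if (!res.ok) {
      throw new Error(`Validation API returned HTTP ${res.status}`);
    }
    const payload = await res.json();
    return payload.data.status;
  } finally {
    // Always clear the timer so it cannot fire after the call has settled.
    clearTimeout(timer);
  }
}
```

Passing `fetchImpl` in makes the function easy to exercise without touching the network. In n8n itself, the node’s Timeout option does this work for you.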

Workflow 1: Validate on Form Submission

Someone fills out your signup form. Before that email hits your database, you want to know if it’s real.

The Trigger

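Use n8n’s Webhook node as the entry point. It creates a URL you can POST to from your frontend or form service. The Webhook node puts the POST body under $json.body, so the submitted address arrives as $json.body.email; map it to a top-level email field with an Edit Fields (Set) node if you want the shared HTTP Request config above to read $json.email unchanged.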

{
  "node": "Webhook",
  "config": {
    "httpMethod": "POST",
    "path": "validate-signup",
    "responseMode": "lastNode"
  }
}

The responseMode: lastNode setting is the key detail. It means n8n waits for the entire workflow to finish before responding to the caller. Your form gets the validation result back in the same HTTP response. No polling. No callbacks.

The Flow

Webhook receives the email. HTTP Request node calls the validation API. An IF node splits on the result.

// IF node expression
{{ $json.data.status === "deliverable" }}

True branch: return a 200 with { "valid": true } to the form. The user proceeds.

False branch: return a 200 with { "valid": false, "reason": $json.data.status }. Your frontend shows an error before the bad address ever touches your database.

What about latency? The full SMTP check adds 1-3 seconds to your form submission. That’s noticeable. Two options: show a spinner with “Verifying your email…” or run only the MX check (50-200ms) synchronously and defer SMTP verification to a background workflow. The email validation API guide covers this latency tradeoff in detail.

Handling Edge Cases

Disposable emails need their own branch. The API returns a disposable: true flag alongside the status. Add a second IF node after the validation response:

// Check for disposable email providers
{{ $json.data.disposable === true }}

Think Tempmail, Guerrilla Mail, Mailinator. Industry data suggests 5% to 10% of new signups use disposable addresses, and the number keeps climbing. If you’re running a SaaS with a free trial, your share is probably higher. Route these to a rejection response or flag them for manual review.
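The two IF checks could also be collapsed into one small function, for example inside a Code node. This is only a sketch. It assumes the response looks like `{ data: { status, disposable } }`, as the expressions above imply. The function name and the "disposable" reason string are invented for illustration.

```javascript
// Sketch: turn the validation response into the body the form gets back.
// Assumption: apiResponse is { data: { status, disposable } }.
function buildSignupResponse(apiResponse) {
  const { status, disposable } = apiResponse.data;

  // Reject throwaway addresses even when the mailbox itself is deliverable.
  if (disposable === true) {
    return { valid: false, reason: "disposable" };
  }
  if (status === "deliverable") {
    return { valid: true };
  }
  // undeliverable, risky, and unknown all fall through to a rejection.
  return { valid: false, reason: status };
}
```

Checking the disposable flag first is a design choice: it blocks throwaway inboxes that would otherwise pass as deliverable. Swap the order if you would rather flag them than reject them.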

Workflow 2: Scheduled Batch Cleaning

Your marketing team keeps a Google Sheet of leads. Or an Airtable base. Or both. These lists grow stale. Contacts change jobs, companies shut down, mailboxes get deactivated.

This workflow runs on a schedule, pulls the list, validates every row, and writes the results back.

The Trigger

{
  "node": "Schedule Trigger",
  "config": {
    "rule": {
      "interval": [{ "field": "weeks", "weeksInterval": 1 }]
    }
  }
}

Weekly is the right cadence for most lists under 50,000 contacts. Monthly works for cold storage lists. Daily is overkill unless you’re adding hundreds of new contacts per day.

Reading from Google Sheets

The Google Sheets node pulls all rows. You want just the email column and a row identifier so you can write results back.

{
  "node": "Google Sheets",
  "operation": "getAll",
  "sheetId": "{{ $env.GOOGLE_SHEET_ID }}",
  "range": "A:D"
}

The Batch Problem

Here’s where it gets interesting. If your sheet has 5,000 rows and you send 5,000 HTTP requests in parallel, you’ll blow through API rate limits in seconds. MailCop’s standard tier allows 10 requests per second. At that rate, 5,000 rows take at least 500 seconds, a little over 8 minutes, however you schedule the requests.
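The arithmetic behind that figure, as a quick sketch. The function name is made up, and the numbers are the ones quoted above.

```javascript
// Lower bound on run time when the API caps you at a fixed request rate.
function minRunSeconds(rows, requestsPerSecond) {
  return Math.ceil(rows / requestsPerSecond);
}

const seconds = minRunSeconds(5000, 10);
console.log(`${seconds} s, about ${(seconds / 60).toFixed(1)} min`); // 500 s, about 8.3 min
```

Real runs take longer than this floor, because each batch also waits on the slowest response in it.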

n8n handles this with the Loop Over Items node (formerly called Split In Batches). Set the batch size to 10 and connect it to the HTTP Request node. Each batch fires 10 requests, waits for responses, then fires the next 10.

{
  "node": "Loop Over Items",
  "config": {
    "batchSize": 10,
    "options": {
      "reset": false
    }
  }
}

Add a Wait node after each batch with a 1-second delay. That keeps you under rate limits without wasting time.
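For intuition, this is roughly what the Loop Over Items plus Wait combination does, written as plain JavaScript. It is a sketch, not n8n’s implementation. `worker` stands in for the HTTP Request node, and all names are hypothetical.

```javascript
// Resolve after ms milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Process items in fixed-size batches with a pause between batches.
async function processInBatches(items, worker, { batchSize = 10, delayMs = 1000 } = {}) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Fire the whole batch in parallel and wait for every response.
    const batchResults = await Promise.all(batch.map((item) => worker(item)));
    results.push(...batchResults);
    // Pause between batches, but not after the last one.
    if (i + batchSize < items.length) {
      await sleep(delayMs);
    }
  }
  return results;
}
```

With a batch size of 10 and a 1-second pause, you never send more than 10 requests in any one-second window, which matches the rate limit quoted above.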

For lists over 10,000 rows, skip the row-by-row approach entirely. Use the batch validation endpoint and receive results via webhook. Submit the whole list in one API call, then process the callback when it arrives.

Writing Results Back

After validation, the Google Sheets Update node writes the status back to each row. Map the validation result to a “Status” column and the timestamp to a “Last Validated” column.

{
  "node": "Google Sheets",
  "operation": "update",
  "sheetId": "{{ $env.GOOGLE_SHEET_ID }}",
  "range": "E{{ $json.rowNumber }}:F{{ $json.rowNumber }}",
  "data": {
    "Status": "{{ $json.data.status }}",
    "Last Validated": "{{ $now.toISO() }}"
  }
}
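One thing to watch: the HTTP Request node outputs the API response, not the incoming row, so the row number has to be carried along somehow. Here is one way to pair them up, sketched as a plain function. The `rowNumber` field and the E:F columns follow the config above. The function name and the return shape are assumptions.

```javascript
// Sketch: combine the original sheet row with its validation result.
// Assumptions: row.rowNumber identifies the sheet row, and the API response
// is { data: { status } }.
function toSheetUpdate(row, apiResponse, now = new Date()) {
  return {
    // Columns E and F hold "Status" and "Last Validated".
    range: `E${row.rowNumber}:F${row.rowNumber}`,
    values: [[apiResponse.data.status, now.toISOString()]],
  };
}
```

Passing `now` as a parameter keeps the function deterministic, which makes it easier to test.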

Now your marketing team opens the sheet Monday morning and sees which contacts are deliverable, which bounced, and which are risky. No CSV exports. No manual uploads. It just runs.

Workflow 3: CRM Webhook Trigger

A new lead enters your CRM. A contact updates their email. A deal moves to a stage where you’re about to send an outbound sequence. These are all events that should trigger validation.

The Trigger

Most CRMs (HubSpot, Salesforce, Pipedrive) support outbound webhooks on record changes. Point that webhook at an n8n Webhook node.

{
  "node": "Webhook",
  "config": {
    "httpMethod": "POST",
    "path": "crm-contact-validate",
    "responseMode": "responseNode"
  }
}

Use responseMode: responseNode here instead of lastNode. This tells n8n to wait for a separate Respond to Webhook node to send the HTTP response, giving you full control over when and what you return. You want to acknowledge the CRM’s webhook immediately and then continue processing. CRM webhooks typically time out after 5-10 seconds. If validation takes longer, the CRM retries and you get duplicate events.

The Flow

Place a Respond to Webhook node right after the Webhook trigger. It fires the 200 response back to the CRM immediately. Everything downstream keeps running.

{
  "node": "Respond to Webhook",
  "config": {
    "respondWith": "json",
    "responseBody": { "received": true },
    "responseCode": 200
  }
}

After the Respond to Webhook node, the HTTP Request node validates the email. An IF node routes on the result. The true branch updates the CRM contact with a “verified” tag. The false branch flags the contact and optionally notifies a Slack channel.
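The acknowledge-first ordering is the whole point of this workflow, so here it is as a small JavaScript sketch. `respond` plays the role of the Respond to Webhook node. `validate` and `updateCrm` are hypothetical stand-ins for the HTTP Request and CRM nodes, and the `email` and `contactId` fields on the event are assumptions about your CRM’s payload.

```javascript
// Sketch of the acknowledge-then-process pattern.
async function handleCrmEvent(event, { respond, validate, updateCrm }) {
  // Acknowledge first so the CRM does not time out and retry.
  respond(200, { received: true });

  // Everything below runs after the CRM already has its 200.
  const status = await validate(event.email);
  const tag = status === "deliverable" ? "verified" : "flagged";
  await updateCrm(event.contactId, tag);
  return status;
}
```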

Conditional Routing

n8n’s Switch node handles multi-way routing better than chained IF nodes. Route on all four statuses:

// Switch node rules
[
  { "value": "deliverable", "output": 0 },
  { "value": "undeliverable", "output": 1 },
  { "value": "risky", "output": 2 },
  { "value": "unknown", "output": 3 }
]
  • Output 0 (deliverable): Update CRM contact with verified status. No further action needed.
  • Output 1 (undeliverable): Remove from active sequences. Tag as bounced. Alert the rep.
  • Output 2 (risky): Flag for review. Often a catch-all domain where individual verification isn’t possible.
  • Output 3 (unknown): Queue for retry in 24 hours. Temporary DNS issues or greylisting can cause this.
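The same four-way routing as a lookup table, in case you would rather do it in a Code node. The action labels are made up for illustration. Falling back to the unknown route for unexpected statuses is a choice of this sketch, not something the API specifies.

```javascript
// Map each status to a Switch-style output index and a follow-up action.
const ROUTES = new Map([
  ["deliverable", { output: 0, action: "mark-verified" }],
  ["undeliverable", { output: 1, action: "remove-from-sequences" }],
  ["risky", { output: 2, action: "flag-for-review" }],
  ["unknown", { output: 3, action: "retry-in-24h" }],
]);

function routeByStatus(status) {
  // Treat anything unexpected like "unknown" so no contact is silently dropped.
  return ROUTES.get(status) ?? ROUTES.get("unknown");
}
```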

The “risky” bucket deserves attention. Around 15-28% of B2B domains are catch-all, according to EmailListVerify’s 2024 domain analysis. That’s a lot of contacts sitting in limbo. Your workflow needs a policy for these: validate periodically, score based on engagement, or accept the risk and send anyway with lower volume.

Self-Hosted vs Cloud n8n

This matters for email validation workflows specifically. Not just in general.

Self-hosted n8n runs on your infrastructure. Your API keys stay on your servers. Webhook payloads containing email addresses (PII under GDPR) never touch a third-party cloud. If you’re validating customer lists in the EU, self-hosting removes one data processor from your chain.

Cloud n8n is simpler to operate. No Docker containers to manage, no updates to apply, no SSL certs to rotate. But your workflow executions (including every email address you validate) flow through n8n’s infrastructure. For a marketing list of newsletter subscribers, that’s probably fine. For a CRM full of enterprise contacts under NDA? Maybe not.

The performance difference matters too. Self-hosted n8n on a 4-core VPS handles roughly 50-80 workflow executions per minute with the HTTP Request node. Cloud n8n’s execution limits depend on your plan tier. For the batch cleaning workflow processing 5,000 contacts weekly, either works. For real-time validation on every form submission at 1,000+ signups per day, test your throughput first.

One more thing. Self-hosted n8n lets you install any community node from npm, no approval needed. If someone builds a dedicated MailCop node (instead of using the generic HTTP Request node), you can install it immediately on self-hosted. Cloud n8n only allows verified community nodes that have passed n8n’s security review.

Error Handling Across All Three Workflows

Validation API calls fail. DNS timeouts, rate limits, network blips. Every workflow needs an error branch.

n8n handles this through a separate Error Workflow. Create a new workflow with an Error Trigger node as its starting point. Then, in each validation workflow’s settings, set this new workflow as the Error Workflow. When any node in the main workflow fails, n8n runs the error workflow and passes the failure details to the Error Trigger node.

Connect the Error Trigger to a notification node (Slack, email, Discord) and a logging node (write to a Google Sheet or database).

{
  "node": "Error Trigger",
  "connected_to": [
    {
      "node": "Slack",
      "message": "Validation workflow failed: {{ $json.execution.error.message }}"
    },
    {
      "node": "Google Sheets",
      "operation": "append",
      "data": {
        "Timestamp": "{{ $now.toISO() }}",
        "Error": "{{ $json.execution.error.message }}",
        "Workflow": "{{ $json.workflow.name }}"
      }
    }
  ]
}
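If you want the Slack text and the log row built in one place, a Code node between the Error Trigger and the two outputs could do it. This is a sketch. It assumes the Error Trigger payload nests the message under `execution.error.message` and the workflow name under `workflow.name`, which is the shape n8n documents for that node. The function name and the fallback strings are invented.

```javascript
// Sketch: build the Slack message and the log row from one error payload.
// Assumption: payload is { execution: { error: { message } }, workflow: { name } }.
function formatErrorRecord(payload, now = new Date()) {
  // Fall back to placeholders so a malformed payload still gets logged.
  const message = payload.execution?.error?.message ?? "Unknown error";
  const workflow = payload.workflow?.name ?? "Unknown workflow";
  return {
    slackText: `Validation workflow failed: ${message}`,
    logRow: {
      Timestamp: now.toISOString(),
      Error: message,
      Workflow: workflow,
    },
  };
}
```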

For the batch workflow specifically, enable “Retry On Fail” on the HTTP Request node in its Settings tab. Set Max Tries to 3 with a wait between attempts. Note that n8n’s built-in retry uses a fixed delay, not exponential backoff. For true exponential backoff, you’d need a custom loop with a Wait node. A single failed validation shouldn’t stop the other 4,999 from processing.
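For reference, this is what exponential backoff looks like in plain JavaScript. In n8n you would rebuild the same idea with a loop and a Wait node. The function and option names are hypothetical, and the 1-second base delay is just a starting point.

```javascript
// Resolve after ms milliseconds.
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry fn up to maxTries times, doubling the delay after each failure.
async function retryWithBackoff(fn, { maxTries = 3, baseDelayMs = 1000, sleepFn = wait } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxTries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxTries) {
        // Delays grow as baseDelayMs * 1, 2, 4, ...
        await sleepFn(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

With the defaults, a call that keeps failing waits 1 second, then 2 seconds, then gives up after the third attempt.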

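What about partial failures in batch mode? If row 847 fails and the workflow stops, you lose progress on the remaining rows. The Loop Over Items node helps here, as long as the write-back to the sheet happens inside the loop and the HTTP Request node’s On Error setting is set to continue (optionally using the error output) rather than stop the workflow. With that in place, if batch 85 of 500 fails, batches 1-84 are already written back to the sheet, an error branch can log batch 85’s failure, and batches 86-500 still run.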
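The isolation idea in plain JavaScript: catch the failure per batch, record it, and keep going. This is a sketch of the pattern, not of n8n internals. `processBatch` is a hypothetical stand-in for “validate and write back one batch”.

```javascript
// Run batches one after another; a failing batch is logged, not fatal.
async function runBatchesIsolated(batches, processBatch) {
  const failures = [];
  let completed = 0;
  for (const [index, batch] of batches.entries()) {
    try {
      await processBatch(batch);
      completed += 1;
    } catch (err) {
      // Record which batch failed (1-based) and move on to the next one.
      failures.push({ batch: index + 1, message: err.message });
    }
  }
  return { completed, failures };
}
```

The returned `failures` list is what you would feed into a retry pass or a notification.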

Putting It All Together

Three workflows. Three triggers. Same validation API underneath.

The form submission workflow catches bad emails before they enter your system. The scheduled batch workflow cleans lists that are already there. The CRM webhook workflow validates on every data change event.

You could build all three in an afternoon. The HTTP Request node config is identical across workflows. The routing logic is a Switch node with four outputs. The only real difference is the trigger and what happens after validation.

Start with the one that hurts most. If your bounce rate is climbing, start with the batch cleaner. If fake signups are polluting your funnel, start with the form trigger. If your sales team is sending sequences to dead addresses, start with the CRM webhook.

Then add the other two. They’re workflows, not code deployments. No PRs, no CI pipeline, no deploy windows. Just import the JSON and activate.