Reliable SFTP Integration with Apache Camel: Idempotency, Retries & Dead Letters
Parsing a file correctly is the easy half. The hard half — the part that pages you at 2am — is everything around it: the same file getting picked up twice, a downstream system being briefly down, a malformed record that should be quarantined rather than crashing the whole batch. Here’s how I make Apache Camel file routes survive the real world.
1. Don’t process the same file twice
SFTP polling is at-least-once by nature. A connection drops after you’ve read a file but before you’ve moved it, and the next poll sees it again. Camel’s idempotent consumer prevents reprocessing by remembering what it has already seen:
    from("sftp://user@bank-host:22/outbound/ach"
            + "?password=RAW({{bank.sftp.password}})"
            // readLock=changed waits until the file stops growing; the idempotent-*
            // read locks are documented as file-component only, not FTP/SFTP.
            + "&readLock=changed"
            + "&move=.done&moveFailed=.error")
        .routeId("ach-inbound")
        // Skip any file whose name is already recorded in the shared repository.
        .idempotentConsumer(header("CamelFileName"))
            .idempotentRepository("fileIdempotentRepo")
            .log("Processing new file ${header.CamelFileName}")
            .to("direct:processAch");
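Conceptually, the idempotent consumer is a check-and-record step keyed on some attribute of the message. The plain-Java sketch below shows that idea with an in-memory set. It does not use Camel, and `SeenFiles` and `firstTime` are illustrative names, not Camel API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for an idempotent repository; not a Camel class.
final class SeenFiles {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // Returns true the first time a key is offered and false on every repeat.
    // The add is atomic, so two threads cannot both claim the same key.
    boolean firstTime(String key) {
        return seen.add(key);
    }
}
```

A route would only process a file when `firstTime` returns true. The set lives in memory, so everything it has recorded is lost when the process stops.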
For a single node an in-memory repo is fine, but it forgets on restart. In production — especially across multiple instances — back it with something shared:
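    import javax.sql.DataSource;
    import org.apache.camel.processor.idempotent.jdbc.JdbcMessageIdRepository;
    import org.apache.camel.spi.IdempotentRepository;
    import org.springframework.context.annotation.Bean;

    @Bean
    IdempotentRepository fileIdempotentRepo(DataSource ds) {
        // Survives restarts and is safe across multiple app instances.
        return new JdbcMessageIdRepository(ds, "ach-inbound");
    }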
Use the file name plus a content hash as the key if a filename can legitimately recur with new contents.
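One way to build such a key is sketched below, assuming the file contents fit in memory. `FileKeys` and `key` are hypothetical helper names, and SHA-256 is just one reasonable choice of hash.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: idempotency key = file name + ":" + SHA-256 of the contents.
final class FileKeys {
    static String key(String fileName, byte[] contents) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(contents);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return fileName + ":" + hex;
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting string could be stored in a message header and used as the expression handed to the idempotent consumer in place of the bare file name. The trade-off is that the whole file has to be read before the duplicate check can run.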
2. Retry transient failures, don’t retry bugs
When the downstream core-banking API blips, you want to back off and retry — not fail the batch. A redelivery policy with exponential backoff handles that:
    errorHandler(defaultErrorHandler()
        .maximumRedeliveries(5)
        .redeliveryDelay(2000)
        .backOffMultiplier(2) // 2s, 4s, 8s, 16s, 32s
        .retryAttemptedLogLevel(LoggingLevel.WARN));
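To see where the figures in that comment come from, here is the arithmetic as a small standalone sketch. It is illustrative only: Camel computes the delays internally, `BackoffSchedule` is a made-up name, and the `maxDelayMillis` cap is an assumed parameter standing in for a maximum-delay setting.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative arithmetic only; not Camel's implementation.
final class BackoffSchedule {
    // Delay before redelivery n is initialDelay * multiplier^(n-1), capped at maxDelayMillis.
    static List<Long> delays(long initialDelayMillis, double multiplier,
                             int maxRedeliveries, long maxDelayMillis) {
        List<Long> result = new ArrayList<>();
        double delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxRedeliveries; attempt++) {
            result.add(Math.min(Math.round(delay), maxDelayMillis));
            delay *= multiplier;
        }
        return result;
    }
}
```

With an initial delay of 2000 ms, a multiplier of 2 and five redeliveries, the delays are 2000, 4000, 8000, 16000 and 32000 ms. A record that never succeeds therefore holds its thread of work for about 62 seconds before the error handler gives up, which is worth keeping in mind when sizing batches.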
The key discipline: only transient errors should be retryable. A SocketTimeoutException deserves a retry; a RecordValidationException does not, because retrying it just burns five attempts on data that will never parse.
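The distinction can be made explicit in code. The sketch below is plain Java rather than Camel configuration. The `RecordValidationException` class is a stand-in, since the real one is not shown here, and which exception types count as transient is an assumption you would adapt to your own downstream systems.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;

// Stand-in for the application's validation exception; the real class is not shown.
class RecordValidationException extends RuntimeException {
    RecordValidationException(String message) {
        super(message);
    }
}

final class RetryClassifier {
    // Walks the cause chain so that a wrapped timeout still counts as a timeout.
    static boolean isTransient(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof RecordValidationException) {
                return false; // bad data will not fix itself
            }
            if (t instanceof SocketTimeoutException || t instanceof ConnectException) {
                return true; // network blips are worth another attempt
            }
        }
        return false; // unknown errors are treated as non-retryable
    }
}
```

Defaulting unknown errors to non-retryable is a deliberately conservative choice: a bug retried five times is still a bug, just a slower one.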
3. Quarantine poison messages with a dead-letter channel
Records that can’t be processed shouldn’t block the rest of the file or disappear. Route them to a dead-letter destination for inspection and reprocessing:
    onException(RecordValidationException.class)
        .handled(true)            // don't kill the route
        .maximumRedeliveries(0)   // no point retrying bad data
        .to("sftp://user@bank-host:22/quarantine")
        .log(LoggingLevel.ERROR,
            "Quarantined bad record from ${header.CamelFileName}: ${exception.message}");
Pair this with per-record splitting and stopOnException(false), which is also the splitter's default, so one bad row doesn’t abort the other 999,999 good ones:
    .split(body()).streaming().stopOnException(false)
        .to("bean:achPostingService")
    .end();
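Stripped of Camel, the pattern is a loop that sets failures aside instead of propagating them. The sketch below illustrates that behaviour in plain Java. `BatchRunner` and `Result` are hypothetical names, and records are modelled as strings purely for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java illustration of "keep going on a bad record"; not Camel API.
final class BatchRunner {
    // Holds the records that succeeded and the ones set aside for inspection.
    static final class Result {
        final List<String> processed = new ArrayList<>();
        final List<String> quarantined = new ArrayList<>();
    }

    static Result run(Iterable<String> records, Consumer<String> handler) {
        Result result = new Result();
        for (String rec : records) {
            try {
                handler.accept(rec);
                result.processed.add(rec);
            } catch (RuntimeException e) {
                // One bad record is set aside; the loop continues with the next one.
                result.quarantined.add(rec);
            }
        }
        return result;
    }
}
```

The important property is that the size of the quarantine list, not an exception, is what tells you the batch was imperfect.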
4. Make failures visible
Silent integrations are dangerous integrations. A few cheap habits pay off:
- Stable routeIds so logs and metrics are searchable per route.
- Camel’s Micrometer support for throughput, in-flight, and failure counts, surfaced in your dashboards.
- Alert on the quarantine folder — anything landing there means a human should look.
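The quarantine check can be as simple as counting files on a schedule. The sketch below assumes the quarantine folder is reachable as a local or mounted path, which may not match your setup. `QuarantineCheck` is a hypothetical name, and the scheduling and alerting themselves are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper for a scheduled health check.
final class QuarantineCheck {
    // Counts regular files directly inside the directory; subdirectories are ignored.
    static long pendingFiles(Path quarantineDir) {
        try (Stream<Path> entries = Files.list(quarantineDir)) {
            return entries.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Any result above zero is the signal to notify a human.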
The shape of a resilient route
Put together, every robust file route I build has the same skeleton:
pick up safely (read locks) → dedupe (idempotent consumer) → parse → split per record (don’t fail the batch) → process with bounded retries → quarantine poison data → archive & emit metrics.
Get those seven moves right and a file integration stops being a source of 2am pages and becomes the boring, reliable plumbing it’s supposed to be.
If you want the parsing side — modelling fixed-length and ACH/NACHA records with Bindy — see Parsing Fixed-Length and Dynamic Banking Files with Apache Camel.