
Electronic Submissions for Paper People – How Valid Is Your Validator?

October 15, 2015 | BJ Witkin, Senior Manager | Regulatory Operations

We’ve talked about what goes in an eCTD submission, what you (or your publisher) need to create one, what kind of information your publisher will ask you for at the start of a new application, and lifecycle and linking. This time we’re going to discuss another term thrown around the eCTD world a lot: Validation.

In the years before I became a submission publisher, I attended a lot of presentations about eCTD submissions. I heard a lot of presenters say things like, “Every submission is validated before it’s sent,” and “The FDA validates every submission before it’s uploaded to their servers.” Hearing these statements, I assumed that validation checks guaranteed every submission was perfect.

I mean, it makes it all sound very black-and-white, doesn’t it?

The reality of validation is that there are at least 50 shades of grey involved. I’m about to let you in on a secret:

The next submission I send that validates completely (that is, with no warnings) will be the FIRST submission I’ve ever sent that validates completely.

Yes, that’s right. I don’t know a single publisher who has sent a submission with zero warnings. You’re probably asking why—or maybe how can this be? Before I answer that, let me explain how validation tools work and what they do.

Not all validators are created equal

The FDA has guidances and specifications for eCTD submissions and PDF documents. Validation tools look at these guidances and check each submission against them to see if it complies. Sounds simple, but it isn’t. Here are a few examples.

Problem 1: The guidances are often open to interpretation or unclear (I know you’re shocked to hear that). Thus, just like regulatory people, the programmers writing the tool have to figure out what the guidance really means. And, just as three regulatory people may have three different interpretations of a guidance, three developers might too.

Problem 2: Guidances change. You’d expect that as the guidance evolved so would the validation tools, but this isn’t always the case. For many vendors the validation tool is a loss leader: it’s given away for free when you purchase the publishing tool. The vendor therefore doesn’t have any incentive to update the validator to keep up with the changes.

Problem 3: Some validators are just better at finding and reporting problems than others. Many validators are written by vendors who also sell publishing tools. These companies will often write validators that work really well with their own publishing software…and not as well with submissions published using another tool. So why not always use the validator that comes with your publishing tool, since they work together? Because your validation tool may be covering up errors made by the publishing tool…

Are you scared yet? Don’t worry. Your publisher probably has this under control.

Working around the Validator

Most, if not all, experienced publishers are aware of these issues and have come up with a pretty simple solution:

We validate our submissions using two (or more) validators.

Here are validation reports from two different validators applied to the same submission:

Validator 1 reported one error (in blue), while Validator 2 reported 27 (in red)! Validator 1 also reported 34 warnings (mostly about non-embedded fonts), while Validator 2 reported 18. Which one is correct? They both are…kinda.
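To make the comparison concrete, here’s a minimal Python sketch of lining up two validation reports for the same submission. It assumes each report has already been exported into a simple list of findings; the `Finding` structure, the file paths, and the sample data are all hypothetical and are not taken from any real validator or from the reports discussed in this post.

```python
from collections import Counter
from typing import NamedTuple


class Finding(NamedTuple):
    severity: str  # "error" or "warning"
    rule: str      # the validator's message text
    file: str      # path of the flagged file inside the submission


def summarize(findings):
    """Count findings per severity, e.g. {'error': 1, 'warning': 2}."""
    return dict(Counter(f.severity for f in findings))


def compare(report_a, report_b):
    """Split flagged files into: flagged by both, only by A, only by B."""
    files_a = {f.file for f in report_a}
    files_b = {f.file for f in report_b}
    return {
        "both": sorted(files_a & files_b),
        "only_a": sorted(files_a - files_b),
        "only_b": sorted(files_b - files_a),
    }


# Made-up sample data (assumption), not the real reports.
validator_1 = [
    Finding("error", "PDF file is not readable", "m1/us/form-1571.pdf"),
    Finding("warning", "Font not embedded", "m2/summary.pdf"),
    Finding("warning", "PDF is image only", "m5/ref-article.pdf"),
]
validator_2 = [
    Finding("error", "Link does not open in new window", "m2/summary.pdf"),
    Finding("error", "Link does not open in new window", "m3/quality.pdf"),
]

print(summarize(validator_1))
print(summarize(validator_2))
print(compare(validator_1, validator_2))
```

A file that only one validator flags isn’t automatically a real problem or a false alarm. It’s simply a prompt for a publisher to look at it.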

Fonts. Seriously?

Looking at the report for Validator 1, there are many warnings about non-embedded fonts. Yet Validator 2 didn’t report any font-embedding issues at all.


I know from experience that Validator 1 throws false positives about embedded fonts, so in this case I won’t worry about it. Frankly, even if it had found non-embedded fonts it wouldn’t bother me. This is one of the most common warnings, and most (maybe all) publishers ignore it.

PDFs should be text searchable, not just image only

I’d like you to look at the warning in green from Validator 1, “PDFs should be text searchable, not just image only.” That’s straight from the FDA’s guidance on PDFs, and it’s FDA’s way of asking us not to send scanned files.

Instead we’re supposed to use Optical Character Recognition (OCR) on any scanned documents. The documents flagged in this submission are all old journal articles. Since OCR almost always introduces errors in the content, I’d never OCR reference articles like these.

False positives

I’ve highlighted two warnings in a pinkish-brown color. The first time I got these warnings from this validator I panicked a little because I’d never seen them in any other validator.

When I couldn’t figure out what was wrong I called tech support for the validator and asked what was going on. They called me back about 30 minutes later and told me they’d determined that this was a false positive—their own validator threw these warnings even when validating submissions from their own publishing tool.

Every submission we validate with this tool generates those warnings.
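One way to keep track of warnings like these is a small list of known false positives, with a note on why each one is ignored. The Python sketch below is only an illustration of that idea: the validator name, the warning texts, and the `triage` helper are all hypothetical, and nothing here reflects how any real validation tool stores or reports its findings.

```python
# Hypothetical allowlist: (validator name, warning text) -> why it is ignored.
KNOWN_FALSE_POSITIVES = {
    ("Validator 1", "Example warning A"): "Vendor support confirmed a false positive",
    ("Validator 1", "Example warning B"): "Vendor support confirmed a false positive",
}


def triage(validator, warnings):
    """Separate warnings that need a human look from known false positives."""
    needs_review = []
    suppressed = []
    for text in warnings:
        reason = KNOWN_FALSE_POSITIVES.get((validator, text))
        if reason is None:
            needs_review.append(text)
        else:
            suppressed.append((text, reason))
    return needs_review, suppressed


# Made-up warnings (assumption) from one validation run.
sample = ["Example warning A", "Font not embedded", "Example warning B"]
needs_review, suppressed = triage("Validator 1", sample)
print("Review:", needs_review)
print("Known false positives:", [text for text, _ in suppressed])
```

Keying the list on both the validator and the warning text matters: a warning that’s a known false positive in one tool could be a real finding when a different tool reports it.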

How about those 27 High-level errors in Validator 2?

In May of 2015 FDA released a new specification for PDFs. In it they required all cross-document hyperlinks to open in a new window. Our publishing tool won’t make links open in an external window; until the tool is fixed we can’t comply.

Fortunately I’ve spoken to the eCTD folks at FDA and they’ve admitted that it’s not a reasonable requirement. Still, Validator 2 checks these links, and every link that isn’t set to open in a new window generates its own error.


Why didn’t Validator 1 find them? Because Validator 1’s vendor hasn’t updated their tool yet.

Then there’s the silly stuff

Those of you who’ve worked in Regulatory for a while will shake your head at this but I doubt you’ll be surprised. Ready?

Look at the purple-highlighted warning in Validator 1’s report, “File-level security or password protection is present (not applicable for FDA Forms).” The PDF guidance specifically states that PDFs may not have any kind of security…but FDA’s own forms—the 1571 and 3674—have security on them when they come from FDA and it can’t be removed.

A good validator is smart enough to recognize forms like the 3674 and ignore the password protection when it sees them. Validator 2 is smart enough to ignore the security setting; Validator 1 isn’t.
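As a rough illustration of what “smart enough” could mean, here’s a minimal Python sketch of a security check that makes an exception for FDA-supplied forms. It’s a guess at the general shape of such a rule, not how either validator actually works. Recognizing a form by its file name is an assumption made for the example, and the `FDA_FORM_MARKERS` list, the function names, and the file paths are hypothetical.

```python
# Hypothetical file-name fragments used to recognize FDA-supplied forms.
FDA_FORM_MARKERS = ("1571", "3674")


def is_fda_form(file_name):
    """Guess from the file name whether a PDF is an FDA form (assumption)."""
    name = file_name.lower()
    return "form" in name and any(marker in name for marker in FDA_FORM_MARKERS)


def security_warning(file_name, has_security):
    """Return a warning string, or None when nothing should be reported."""
    if not has_security:
        return None
    if is_fda_form(file_name):
        # The security ships with the form and can't be removed, so skip it.
        return None
    return f"File-level security or password protection is present: {file_name}"


print(security_warning("m1/us/form-3674.pdf", True))
print(security_warning("m2/summary.pdf", True))
```

With this rule the secured 3674 passes quietly, while any other secured PDF still gets the warning.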

Here’s another one. Look at the error highlighted in blue for Validator 1, “PDF file is not readable by eCTD Validator.” Column C tells me this error is in the 1571 form. That’s the fillable form supplied by FDA. So what’s wrong with it? Our sponsor applied a digital signature—which is recommended by FDA—but applying the digital signature made the file unreadable by the validator.

And don’t ask me why the 1571 didn’t generate the warning about “File-level security” too—the form has it but apparently the validator didn’t notice.

It’s stuff like this that drives publishers crazy. As always, if you’d like to have your submissions drive us crazy instead of you, don’t hesitate to contact us.

Category: Regulatory Operations
Keywords: eCTD lifecycle, eCTD publishing, electronic submission, eCTD validation
