In my last post I wrote about why intelligence agencies were apparently unable to “connect the dots” on the Nigerian underwear bomber. The post elicited a variety of comments, some from distinguished experts (Don Peppers in customer data, Sam Felton in strategic and competitive intelligence, Sanjay Poonen in enterprise integration, and Jeff Jonas in information security). Some felt that “connecting the dots” should have been an easy problem, but most agreed with me that it’s difficult.
In this post, I’ll describe what needs to happen if you want to connect disparate pieces of information about the same person, organization, or entity. I hope you’ll be further persuaded of both the difficulty and the value of such an effort. The need is widely felt, whether you’re trying to connect the dots on customers in business, terrorists in intelligence, patients in health care, or molecules in drug development.
Here’s what you need to pull off to make this happen:
1. The key terms in your database (names, attributes, and so on) either need to be rigidly controlled at data entry, or there must be some kind of semantic layer connecting different spellings and meanings of the same term. Jonas’ list of 24 spellings of the name “Abdulmutallab” is both amusing and tragic.
2. You have to be clear on what event or events you’re looking for. Are you trying to identify a likely customer purchase or attrition, a terrorist incident, a drug interaction, or something else?
3. You have to have a model of what pattern of variables and data values predicts that event. The model may be implicit in people’s heads, but that means that people will have to search through the data to find that pattern. If you want a computer to identify the pattern, the model and all variables have to be explicit and the data unambiguous.
4. Everybody who makes observations has to collaborate in entering the data. If a CIA field officer observes something about a terrorist, or a salesperson talks with a disgruntled customer, they have to get it into the database.
5. Note that I’ve said “the database.” There needs to be one place, not many, where the important information goes. One could conceivably set up a distributed database environment in which the relevant information would be synchronized and updated, but it would be a technical nightmare.
6. The database has to run continuous cross-checks to see if something new and important has been added to the record. (Jonas calls this “enterprise intelligence,” but I don’t think there is a widely accepted name for it.)
7. There has to be a clear plan of action for what all related parties should do when the pattern is identified and the key event is imminent. The decision rules are critical here. Should a potential terrorist go onto the no-fly list? Should a customer be offered a substantial discount if he or she is about to attrite?
8. There has to be a mechanism for turning the plan into action. Whether you’re talking about terrorism or commerce, response time is of the essence.
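To make a few of these requirements concrete, here is a minimal Python sketch of the customer-attrition case. It is only an illustration under assumptions of my own, not a description of any real system: the signal names, the weights, the similarity threshold, and the `Registry` class are all hypothetical. It links spelling variants of a name to a single record, keeps every observation in one place, re-checks the record each time something new arrives, and applies an explicit decision rule.

```python
import re
from dataclasses import dataclass, field
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase the name and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", name.lower())


def same_entity(a: str, b: str, threshold: float = 0.85) -> bool:
    """Crude stand-in for a semantic layer: two spellings are treated as
    the same entity when their normalized forms are similar enough.
    The 0.85 threshold is an arbitrary, illustrative choice."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


@dataclass
class Record:
    canonical_name: str
    observations: set = field(default_factory=set)


class Registry:
    """One place where all observations go (hypothetical design)."""

    def __init__(self, weights: dict, alert_threshold: int):
        self.weights = weights              # the explicit model: signal -> weight
        self.alert_threshold = alert_threshold
        self.records = []

    def _find_or_create(self, name: str) -> Record:
        # Reuse an existing record if the name is a close spelling variant.
        for record in self.records:
            if same_entity(record.canonical_name, name):
                return record
        record = Record(canonical_name=name)
        self.records.append(record)
        return record

    def score(self, record: Record) -> int:
        # Sum the weights of all known signals; unknown signals count as 0.
        return sum(self.weights.get(s, 0) for s in record.observations)

    def add_observation(self, name: str, signal: str) -> bool:
        """Store the observation, then immediately re-check the record.
        Returns True when the decision rule says it is time to act."""
        record = self._find_or_create(name)
        record.observations.add(signal)
        return self.score(record) >= self.alert_threshold


# Hypothetical attrition signals and weights, invented for this example.
WEIGHTS = {"complaint_call": 2, "late_payment": 1, "competitor_quote_request": 3}

registry = Registry(WEIGHTS, alert_threshold=4)
print(registry.add_observation("John Smith", "complaint_call"))
print(registry.add_observation("Jon  Smith", "competitor_quote_request"))
```

The second call returns `True` only because the two spellings resolve to the same record, so both observations count toward one score. The hard parts discussed above are exactly what this sketch assumes away: a similarity threshold loose enough to catch misspellings will also merge genuinely different people, and nothing here makes anyone enter the observation in the first place.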
There are probably a few other steps too, but perhaps you have realized how hard this is. Some steps are harder than others, of course. For example, the U.S. government has made substantial progress on developing a single database of potential terrorists (called TIDE, for Terrorist Identities Datamart Environment), but the Christmas Day incident suggests that steps 4, 6, 7, and 8 are not fully mature. Human behavior is almost always harder to change than information technology systems.
How about your organization? Are you ahead of or behind the U.S. intelligence community? Are you doing all eight steps with respect, for example, to your customers? If you are, I’d love to hear about it — and I’d like to become your customer!