Marketo over the past couple of years has emerged as one of the top "lead management" platforms in the industry. Most recently, Gartner has them in the top right of their Magic Quadrant for CRM Lead Management (along with Eloqua, now owned by Oracle). Marketo's Revenue Performance and Marketing Automation platform helps marketers manage multiple digital channels, including email, social, online events and mobile. It is a key component of strategies that develop leads and opportunities for sales teams.
As with most data-centric solutions, Marketo's platform is significantly enhanced by ensuring all of the contact data for customers and prospects at its core is correct, current, and comprehensive. At a high level, this significantly enhances the ROI that Marketo customers attain from its platform, and reduces missed opportunities due to inaccurate lead data.
At a more granular level, Address Verification, Phone Number Validation, and especially Real-Time Email Validation can critically improve Marketing Automation efforts. Email validation ensures that email addresses are active and receiving email, which matters most when the opt-ins happened a while ago. This is a critical part of professional lead management efforts. If invalid email addresses in a customer or prospect database are not removed, a large number of bounced emails will cause an organization to fall out of favor with ISPs and very likely end up on various spam lists, causing all marketing communications to arrive in spam folders.
Since Marketo was a natural for StrikeIron's capabilities, customers of ours suggested integrating the two platforms to enhance the value and experience of using Marketo. We jumped at the opportunity!
Late in 2012, Marketo introduced their innovative "Webhooks" capability for integrating third-party applications via XML-based interfaces. Using their "trigger" functionality (such as "new lead created" or "Web form filled out"), a call-out to a third-party API can be made as part of any Marketo process. The resultant response data can then be mapped into the Marketo system and become a part of individual lead record data. In the case of StrikeIron, this includes validated mailing addresses, phone numbers, and email addresses, as well as additional enhancement data that are also returned by the StrikeIron APIs.
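The mapping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`Status`, `CorrectedEmail`) and the lead field names are hypothetical, and are not taken from the Marketo or StrikeIron documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: XML element in the API response -> lead field name.
# Neither side of this mapping comes from Marketo or StrikeIron documentation.
RESPONSE_TO_LEAD_FIELDS = {
    "Status": "emailVerificationStatus",
    "CorrectedEmail": "email",
}


def map_response_to_lead(response_xml: str, lead: dict) -> dict:
    """Return a copy of the lead with non-empty response values applied."""
    root = ET.fromstring(response_xml)
    updated = dict(lead)  # leave the caller's record untouched
    for element_name, lead_field in RESPONSE_TO_LEAD_FIELDS.items():
        element = root.find(element_name)
        # Only overwrite a lead field when the response actually carries a value.
        if element is not None and element.text and element.text.strip():
            updated[lead_field] = element.text.strip()
    return updated
```

Skipping empty elements means a blank value in the response never erases data the lead record already holds.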
Since all of StrikeIron's Cloud APIs have the same data structure, behavior, and open XML protocol formats (in this case, REST was used), it is easy to integrate all of our various API products into the Marketo platform. In other words, since both support standard, open platforms designed with integration in mind, the integration was significantly less difficult, and went deeper, than with some other third-party platforms where we have integrated our customer data validation solutions. In fact, the affinity between the two platforms was significant enough that we went ahead and integrated our SMS Text Messaging API into the Marketo platform as well, supporting Marketo's mobile efforts.
After successfully testing our various Webhook integrations, including with production customers, Marketo then provided StrikeIron with the Marketo-Integrated designation that you can see here: http://launchpoint.marketo.com/strikeiron-inc/749-strikeiron-email-verification
We now have several solutions available pre-integrated within Marketo as part of Marketo's Launchpoint partner channel. If you search for "StrikeIron" here, you can see the various integrated solutions currently available.
The last thing anyone wants is a marketing automation platform that is relying on bad or otherwise incomplete data. This powerful, yet easy integration of two platforms ensures that won't be the case.

Point-of-Sale (POS) systems are continuing to evolve. Once only a mechanism for the "processing and recording of sales" (such as a cash register), a POS implementation is now a considerable competitive advantage in a retailer's strategy. As a result, retailers are always looking to get more long-term ROI out of their POS systems. They are determining how this interaction opportunity with a customer can support other parts of a company's multichannel sales strategy as well.
The operation of these systems is rapidly moving from a counter-based cash register to the retailer's floor, as credit card processing hardware increasingly supports this model. Mobile handheld devices, tablets and other consumer devices performing the actual sales transaction are now commonplace within a store, no longer tying personnel to the cash register. As a result, customers are now often asked to enter their own information via the on-the-floor transaction device, and an email address now more than ever is part of the collected customer information. Primarily, this is because customers increasingly prefer more efficient and environment-friendly e-receipts. Better yet, collecting an email address can then provide an opportunity for future communications via email for the retailer, and can result in increased ongoing customer engagement.
The value of collecting an email address can be considerable. Email-oriented and Web-based marketing are outpacing traditional advertising and direct mail methods, and multi-channel sales strategies are as critical to success as ever. Even for retailers where a traditional cash register still makes the most sense for POS, collecting email addresses is equally important for customer retention success.
However, there are a couple of challenges to collecting this kind of customer data on the floor. First, data entry can sometimes be a little tougher with these mobile devices. If the customer enters their own data, which is now often the case, the chances of collecting incorrect or invalid email addresses either by accident or otherwise can go up considerably. Even if the email address is collected properly, 30-40% of people change their email address every year. Simple typos or not, these issues can prevent e-mail receipts from being delivered, and a large incidence of invalid emails can also result in future marketing communications going into spam folders if Internet Service Providers (ISPs) detect enough email bounces and failed deliveries coming from the same sending source.
Real-time email validation that utilizes an instantaneous out-to-the-Cloud check to ensure email deliverability can substantially reduce the number of typos and otherwise bad email addresses getting into the system in the first place, right there on the floor at the point of data capture. Utilizing this kind of email address validation technology can reduce email bounces and failures by 90% or more. It can be an effective tool in ensuring the highest possible levels of data integrity when capturing customer details.
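As a rough sketch of how such a check might sit in a POS data-entry flow, the Python below first runs a cheap local syntax check and then defers to a deliverability check. The `is_deliverable` callable is a stand-in assumption for whatever cloud verification call a system actually uses; it is not a real StrikeIron client, and the reason strings are illustrative.

```python
import re
from typing import Callable, Tuple

# Deliberately loose shape check: exactly one "@", no whitespace,
# and at least one dot in the domain part.
_EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_email_at_capture(
    raw: str, is_deliverable: Callable[[str], bool]
) -> Tuple[bool, str]:
    """Return (accepted, reason) for an address typed at the point of sale."""
    email = raw.strip()
    if not _EMAIL_SHAPE.match(email):
        return False, "malformed"  # caught locally, no remote call needed
    if not is_deliverable(email):  # hypothetical remote verification call
        return False, "undeliverable"
    return True, "ok"
```

Returning a reason alongside the verdict lets the device prompt the customer to retype the address while they are still standing there.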
StrikeIron's Email Verification Solution is cloud-based, allowing it to determine in real-time if an email address is valid and deliverable before sending a message. It is used in many production POS systems today to ensure as often as possible that correct email addresses are obtained when collecting customer information.
The cloud-based, real-time approach is an important one, as StrikeIron is constantly evolving its algorithms in the background without requiring customers to update their POS integration in any way. Our team of email verification experts is constantly tweaking, enhancing, and otherwise modifying the algorithms that make these real-time checks as accurate as possible on an ongoing basis without any effort from the customer leveraging the technology.
Email Verification is only one easy-to-integrate API product available from StrikeIron. Others include Phone Number Validation, Address Verification, Do Not Call List checking, SMS Text Messaging and several more. For more information, please contact info@strikeiron.com.
The general premise of data warehousing hasn't changed much over the years. The idea is still to aggregate as much relevant data as possible from multiple sources, centralize it in a repository of some kind, catalog it, and then utilize it for reporting and analytics to make better business decisions. An effective data warehousing strategy seamlessly enables trend analysis, predictive analytics, forecasting, decision support, and just about anything else we now categorize under the umbrella of "data science."
The premise is not different these days, but rather, it is more the shifting nature of the data sources that the warehouse must draw from to capture as much useful information as possible. It's the data that's changed, not the goal.
First, there is the rapid proliferation of social-generated data in all of its unstructured forms, making the data extraction and transformation components of loading data to the warehouse more difficult than they have been in the past. But this isn't really groundbreaking for 2013, as social data and the various Big Data technologies its growth has spawned, such as Hadoop, have been emerging for several years now.
Instead, what will likely be significantly different in 2013 is the acceleration of the deployment of a multitude of SaaS applications within the enterprise, especially in the larger, often slower to adopt, companies that populate the Fortune 2000. As deal sizes grow, the SaaS footprint is clearly becoming significantly bigger.
This is where it becomes interesting. It's not just that an organization has several different SaaS applications such as Salesforce, Workday, and Success Factors in place and in use across the enterprise, with a single instance of each in use by all. Instead, due to the nature of the easier adoption of these SaaS applications, many of them have come in through the back door departmentally and at different times rather than through a centralized IT-controlled proliferation. This means that multiple instances of the same application are popping up everywhere.
For example, there are large enterprises that now have 10, 20, or even 50+ instances of Salesforce running across the entire organization. Each instance has its own set of customization of data collection and storage, separate add-on applications installed, different data feeding these applications, and unique implementation approaches. This could result in the old adage of solving old problems while creating new ones.
Some questions that could be asked: what kind of data collection and ETL challenges will this cause for those wishing to leverage a data warehousing strategy? Does the fact that the operational data from these various SaaS applications is stored and maintained by different vendors, each of which is incentivized to keep it that way, make things easier or more difficult for data warehousing and the analysis it enables? Will data fragmentation and the resultant data integration strategies scale across all of these instances of SaaS applications? It will be interesting to see organizations meet the "SaaS sprawl" challenge, especially as it relates to cross-enterprise data collection strategy.
Furthermore, SaaS applications have taken an ever-increasing hold of the enterprise as of late with larger and larger deals. With the Cloud and SaaS applications a major part of their 2013 strategies, Oracle, SAP, IBM, and the more traditional software vendors have taken notice. SAP's Business ByDesign, Oracle's Fusion Applications, and recent SaaS acquisitions will surely add to what could become a hodgepodge of SaaS applications across the enterprise.
To meet these challenges currently, cloud data warehousing offerings from companies like BitYota and Amazon's Redshift are beginning to emerge with a core theme of the cloud as the centralized data storage repository. ETL and data integration solutions such as Informatica's Cloud and Dell's Boomi are racing to meet these traditional data warehousing requirements in the cloud paradigm. Also, the traditional data cleansing requirements of data warehousing are being met with their cloud-based counterparts for better, more usable data in these new age warehouses. One thing that will never change is that bad data will always equal bad analysis, and the need for making investments in data quality strategies will continue to exist.
As the landscape of SaaS continues its rapid expansion, and the data within these applications continues to burgeon, 2013 will definitely be a pivotal year in the dawn of a new class of data warehousing technologies.
I've had an opportunity to work closely with ActivePrime lately as they are in the homestretch of beta-testing their CleanVerify product. CleanVerify is one of a suite of products they market to help companies improve, organize, and better utilize the data within Oracle CRM On Demand. I'm really excited about what they have accomplished.
ActivePrime has integrated several services from StrikeIron as part of their solution. I am helping them test the usability of the integration. Utilizing StrikeIron's Cloud-based address verification, email verification, and phone validation products, ActivePrime is able to deliver a powerful, high-performance, real-time data validation and data cleansing solution fully integrated within Oracle CRM On Demand. This enables customers to focus on closing business rather than the ongoing maintenance of data within their CRM system.
For example, a call center representative collecting contact information from a customer and entering it into the Oracle CRM On Demand system can now automatically validate that the collected mailing address, phone number, and email address are all correct (see screen shots below). Validating data at the point of collection before it ever gets into the CRM system can reduce the cost of downstream data cleansing efforts as much as 10x, ensuring a high level of quality data within the CRM system from the moment the data is entered.
In addition to validating the data in real-time, additional information, such as correct ZIP+4 and county, is also added to address records.
Another nice feature is a log that gets created that keeps track of every validation that occurs within the Oracle CRM On Demand system, ideal for administrators and other data stewards who like to stay on top of these kinds of things.
If you would like to try the integration during the beta period, or anytime after, ActivePrime is offering a CleanVerify trial here.
Validating a mailing address:

Validating a phone number:

Validating an email address:

View log/report:

As organizations move applications to the Cloud where it makes sense to do so, they should recognize that this is an ideal opportunity to improve the value of the underlying data assets that feed these applications. After all, any system, Cloud or otherwise, is only as good as the data within it.
A "move to the Cloud" provides a unique opportunity to both ensure existing data is of the highest possible quality and to also install mechanisms to ensure that all future data that enters the system is accurate, current, and complete. This is especially ideal if data is also being moved from an existing internal database to a Database-as-a-service (DBAAS) product like SQL Azure or Amazon RDS, or to a database that will be running on top of a Cloud service, such as MySQL or SQL Server on Amazon, Microsoft Azure, Rackspace, or any other Cloud platform.
As data is moving from its source database, where it currently exists, into its target Cloud database, you can take advantage of this ideal time to:
- Ensure all physical addresses are valid, accurate, current and complete
- Ensure all email addresses are live, working email addresses that have not been disabled or changed (otherwise, you could find yourself on spam lists simply by trying to contact your customers)
- Ensure all telephone numbers are valid, accurate, and current
- Ensure all data fields are consistent in content and individual data elements are non-ambiguous, making data analysis and the emerging field of data science much more effective
- Fill in all missing data where possible
- Eliminate duplicate contact and customer records
- Incorporate any other data-specific business rules and requirements that make sense for your organization
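A few of the steps in the list above (standardizing fields and eliminating duplicates) can be sketched as a pass over records while they are in transit between the source and target databases. The field names `name` and `email`, and the choice to key duplicates on a lowercased email address, are assumptions made for illustration; a real migration would add the address, phone, and business-rule checks appropriate to the organization.

```python
from typing import Dict, Iterable, List, Optional


def cleanse_for_migration(
    records: Iterable[Dict[str, Optional[str]]]
) -> List[Dict[str, str]]:
    """Standardize fields and drop duplicate contacts, keyed on email."""
    seen = set()
    cleaned = []
    for record in records:
        # Treat missing values as empty strings and trim stray whitespace.
        row = {key: (value or "").strip() for key, value in record.items()}
        row["email"] = row.get("email", "").lower()
        # Collapse runs of whitespace so "Pat   Lee" and "Pat Lee" match.
        row["name"] = " ".join(row.get("name", "").split())
        key = row["email"]
        if key:
            if key in seen:
                continue  # duplicate contact: keep only the first occurrence
            seen.add(key)
        cleaned.append(row)
    return cleaned
```

Records without an email address are kept rather than dropped, since this sketch has no other key to compare them on.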
Also, the wise organization puts real-time data quality and data enhancement mechanisms in place at the points of data collection, such as a data entry form or within a Web-to-lead process, to ensure that all new data coming into a system is of the highest possible quality. This also prevents degradation of data over time, so the same set of issues does not occur again a short time later. Otherwise, more cleansing effort and cost will follow downstream.
A significant part of the success of any Cloud initiative revolves around cleansing existing data during migration, getting real-time data quality mechanisms in place, and establishing an ongoing data management plan with metrics and goals for going forward. Don't let rare application migration opportunities such as this go to waste.
Many of StrikeIron's direct customers integrate our various API-delivered data services into applications, Web sites, and business processes entirely on their own, usually with a single line of code or two - a testament to how easy this is to do. These product offerings available on the Cloud can be integrated into anything that can consume a SOAP or REST-based Web service (which is just about anything).
However, StrikeIron has also developed technology integration partnerships with many of today’s top software and Internet solutions platforms, solutions which are all enhanced by integrating Data-as-a-Service capabilities from StrikeIron.
Having these capabilities, such as real-time address verification, email verification, sales tax rates, foreign currency rates, SMS text messaging, and phone verification, pre-integrated into various other platforms that are already in use by large customers every day can be a very compelling solution. It is a win-win-win scenario for our customers, partners, and our technology.
One such partner is Informatica. Informatica has integrated several StrikeIron services for the purposes of contact data validation within its Informatica Cloud platform, as data validation is a very important step in the integration of data between various platforms. These services can be used via the Informatica Cloud StrikeIron plug-in, or as directly integrated within the Informatica Cloud platform per our most recent partnership. In the latter case, some of our services are available for use simply by checking a box directly within Informatica's Cloud application. This makes it very easy to have high quality, validated data arriving at a target destination, having been cleansed as an intermediate step while in transit from its source. You can view a recorded Webinar here.


There are many different kinds of batch data cleansing processes that can be performed against large databases of existing customer information. Standardizing inconsistent data, removing duplicate records, validating columns against up-to-date reference data, filling in missing data, and appending new data to existing data are all examples of customer data processing that can help improve the value of internal data assets.
When data assets undergo these kinds of processes their value increases and they enable business intelligence applications to be more useful, operations to be more efficient, and customer communication efforts to be more effective. These are worthwhile endeavors indeed.
However, it can often be a considerable effort to do large, after-the-fact database cleanup jobs - not to mention the considerable costs and complexity associated with offline data processing. Also, batch jobs are rarely a one-time effort, as the same problems begin to appear soon after a mass cleansing, and then begin to build to troublesome levels again, putting the data stewards of the organization back to square one.
An alternative can be to leverage real-time data quality mechanisms at the point of data collection. This means validating data, filling in missing data, appending data, standardizing data, and comparing it to existing data for duplicates in real-time, before it ever gets into the database. This can eliminate or dramatically reduce the cost and effort associated with downstream batch cleanup processes, enabling the benefits of clean, complete, accurate data to appear immediately across the organization. It also prevents the build up of these kinds of data quality issues over time.
Real-time data quality can be achieved by integrating calls to data quality functions within business processes, Website data collection forms, customer-facing applications, call center applications where representatives speak with customers, and anywhere else that data is collected in real-time. Typically these programmatic calls are to Cloud-based APIs that are leveraging constantly refreshed reference data to ensure the highest possible data accuracy.
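One way to picture this integration is a small gate that runs a list of checks on a submitted form before anything is written to the database. The sketch below is an assumption-laden illustration: the check functions, field names, and messages are all hypothetical, and in practice some checks would call out to a Cloud-based API rather than run locally.

```python
from typing import Callable, Dict, List, Set

Check = Callable[[Dict[str, str]], List[str]]


def require_fields(*names: str) -> Check:
    """Build a check that reports any required field left blank."""
    def check(form: Dict[str, str]) -> List[str]:
        return [f"{n} is required" for n in names if not form.get(n, "").strip()]
    return check


def reject_duplicates(existing_emails: Set[str]) -> Check:
    """Build a check that flags an email already present in the database."""
    def check(form: Dict[str, str]) -> List[str]:
        email = form.get("email", "").strip().lower()
        return ["duplicate contact"] if email and email in existing_emails else []
    return check


def validate_before_insert(form: Dict[str, str], checks: List[Check]) -> List[str]:
    """Run every check; an empty result means the record may be stored."""
    problems: List[str] = []
    for check in checks:
        problems.extend(check(form))
    return problems
```

Because each check is just a function, a call to a remote validation service can be added to the list without changing the gate itself.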
Here more than ever, an ounce of prevention is worth a pound of cure.
One of the exciting things about SOAP and REST-based Web services protocols is that they are text-based, providing the platform independence necessary for broad machine-to-machine communication and open cloud computing models. In other words, describing data using a textual XML dialect allows iPhones to communicate with mainframes, as well as enabling Fortran-developed scientific instrumentation devices to communicate with Dell Server applications in the Cloud written in Java.
As long as both machines are aware of the "rules" of a given XML-dialect and how data is described, they can communicate and more importantly pass data back and forth to perform certain functions based on the resultant data. This is powerful and has really helped lay the groundwork for the success of the Cloud.
To demonstrate this concept, here is an example of an "Input" SOAP message to StrikeIron's Sales and Use Tax Basic service. Remember that XML is not meant to be human readable, but is rather the implementation of a set of XML dialect rules. However, if you look closely, you can see the actual data elements that are passed within the XML message sent by the calling entity and received by StrikeIron within our data centers:

Our application servers, which are always listening, receive the request, do some user authentication, and then perform the requested task and return the resultant data in the XML message below. It can then be used however necessary by the calling entity (to process an ecommerce transaction, for example). Here is an example of the "Output" XML message:

This communication and data transaction has occurred entirely without human intervention. It takes place between machines that could be located anywhere on the globe, each completely oblivious to the hardware and software that comprise the other entity.
Fortunately, humans rarely if ever need to interact at the XML-level (sometimes it might be useful for debugging). Instead, the creation, sending, receiving, and interpretation of these XML messages are handled by the software development environments that one is working in, abstracting a developer or application user away from the XML-based data exchange.
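To give a feel for what a development environment does on the developer's behalf, here is a minimal Python sketch that builds a SOAP 1.1 envelope and reads values back out of one, using only the standard library. The operation and parameter names in the usage note are hypothetical and do not reflect the actual schema of any StrikeIron service; real services also add their own namespaces and authentication headers, which are omitted here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(operation: str, params: dict) -> bytes:
    """Serialize an operation and its parameters as a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_body(xml_bytes: bytes) -> dict:
    """Flatten the first element inside the SOAP Body into a dict of text values."""
    root = ET.fromstring(xml_bytes)
    body = root.find(f"{{{SOAP_NS}}}Body")
    first = body[0]  # the operation (or result) element
    return {child.tag: child.text for child in first}
```

For example, `build_request("LookupTaxRate", {"ZipCode": "27513"})` (hypothetical names) produces the bytes that would be sent over HTTP, and `parse_body` recovers the data elements on the other side without either machine knowing anything about the other's hardware or software.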
This form of XML messaging is what makes companies like StrikeIron possible, opening up pre-built data processing, data validation, aggregated data sources, and other business functions available to the world. Regardless of what software and hardware environments a customer happens to be running, it's this approach that makes the ever-evolving "Great Data Highway" possible.
As you think about improving the quality of data within your organization, here are four quick and simple yet key tips that will assist in your approach and strategy on your way to success:
- Think of data as a strategic asset. Collecting and storing data alone is not enough. There must be a proactive plan in place to ensure that the data serving as the basis of decision-making, operations, and customer communication is treated as a strategic valuable asset. Effectively managing the quality, accuracy, and usability of this data on an ongoing, every day basis can translate into dramatic revenue opportunities and significant cost-saving efficiencies.
- Consistency is as important as accuracy. Accurate data is important, but so is consistency. Inconsistent representations of the same data content (such as variations of a company name, a lead source appearing six different ways, etc.) throughout data tables can make data very hard to analyze, and can even throw off analytics and business intelligence processes. This can result in decision-making (such as where to deploy marketing assets) based on faulty data points. A focus on data consistency can reduce the incidence of this substantially.
- Data quality is far cheaper transactionally. Improving the quality of data at the point of data collection (a Web form or a call center representative) is much less expensive than waiting for broad data quality issues to appear downstream that must be addressed en masse. The cost difference can sometimes even be a factor of ten. Also, in the downstream case, considerable use of inaccurate and incomplete data might already have occurred. Validating the accuracy of data before it ever gets into core customer databases is very important.
- Data quality is about more than technology. Tools can only do so much. Incentive programs for capturing complete and accurate data (such as bonuses for 98% or greater accurate customer data point collection) can go a long way toward better, more valuable organizational data, as can education in the importance of data as a key strategic asset across business units, not just IT. Any comprehensive data quality plan built for success will involve the entire organization.
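The consistency tip above (a lead source appearing six different ways) lends itself to a short illustration. The sketch below maps known variants to one canonical value; the variant table is a made-up example, and a real one would be built from the values actually found in the data.

```python
# Hypothetical variants observed in the data, mapped to one canonical spelling.
CANONICAL_LEAD_SOURCES = {
    "trade show": "Trade Show",
    "tradeshow": "Trade Show",
    "trade-show": "Trade Show",
    "web": "Web",
    "website": "Web",
}


def normalize_lead_source(raw: str) -> str:
    """Return the canonical lead source, or the trimmed input if unknown."""
    # Lowercase and collapse whitespace so spacing and case do not matter.
    key = " ".join(raw.strip().lower().split())
    return CANONICAL_LEAD_SOURCES.get(key, raw.strip())
```

Unknown values are passed through unchanged rather than guessed at, so they can be reviewed and added to the table deliberately.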
Late last week, Amazon released an update to its DynamoDB service, a fully managed NoSQL offering for efficiently handling extremely large amounts of data in Web-scale (generally meaning very high user volume) application environments. The DynamoDB offering was originally launched in beta back in January, so this is its first update since then.
The update is a "batch write/update" capability, enabling multiple data items to be written or updated in a single API call. The idea is to reduce Internet latency by minimizing trips back and forth to Amazon's various physical data storage entities from the calling application. According to Amazon, this was in response to developer forum feedback requests.
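The latency argument is easy to see in code: grouping items means far fewer round trips. The Python sketch below splits items into batches and counts the calls made. It assumes the limit of 25 put or delete requests per call documented for DynamoDB's BatchWriteItem operation, and the `send_batch` callable is a placeholder for whatever client call actually performs the write; no AWS SDK is used here.

```python
from typing import Callable, Iterable, Iterator, List

MAX_ITEMS_PER_BATCH = 25  # assumed per-call limit for BatchWriteItem


def batched(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` items."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def write_in_batches(
    items: Iterable[dict], send_batch: Callable[[List[dict]], None]
) -> int:
    """Send items in batches and return the number of calls made."""
    calls = 0
    for batch in batched(items, MAX_ITEMS_PER_BATCH):
        send_batch(batch)  # placeholder for the real batch-write call
        calls += 1
    return calls
```

Writing 60 items this way takes 3 calls instead of 60. A production version would also need to retry any items the service reports back as unprocessed, which this sketch leaves out.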
This update, which addresses what was already a key selling point of DynamoDB, tells us that latency is still a significant challenge for cloud-based storage. After all, one of the key attributes of DynamoDB when first launched was speed and performance consistency, something that their NoSQL precursor to DynamoDB, SimpleDB, was unable to deliver, at least according to some developers and users who claimed data retrieval response times ran unacceptably into the minutes. This also could have been a primary reason for SimpleDB's lower adoption rates. Amazon is well aware of these performance challenges, and hence the significance of its first DynamoDB update.
Another key tenet of DynamoDB is that it is a managed offering, meaning the details of data management requirements, such as moving data from one distributed data store to another, are completely abstracted away from the developer. This is great news, as the complexity of cloud environments was proving to be too challenging for many developers trying to leverage cloud storage capabilities. The masses were scratching their heads as to how to overcome storage performance bottlenecks, attain replication, achieve response latency consistency, and handle other operations-related data management challenges when it was in their purview to do so. By the way, management complexity will likely still be a major challenge for other NoSQL vendors who do not offer the same level of abstraction that DynamoDB offers, and there are many "big data" startups offering products in this category. It will be interesting to see if the launch of DynamoDB becomes a significant threat to many of these startups.
We learned this reduction of complexity lesson at StrikeIron within our own niche offerings as well. We gained a much bigger uptake of our simpler, more granular Web services APIs, such as email verification, address verification, and other products such as reverse address and telephone lookups as single, individual services, rather than complex services with many different methods and capabilities. This proved true even if the more complex services provided more advanced power within a single API. In other words, simplified remote controls for television sets are probably still the best idea for maximum television adoption, as initial confusion and frustration tend to be inversely proportional to the adoption of any technology.
Another interesting point is that this is the fifth class of database product offerings in Amazon's portfolio. Along with DynamoDB, there is also still the aforementioned SimpleDB, a schemaless NoSQL offering for "smaller" datasets. There is also the original S3 offering with a simple Web service based interface for storing, retrieving, and deleting data objects in a straightforward key/value pair format. Next, there is Amazon RDS for managed, relational database capabilities that utilize traditional SQL for manipulating data and are more applicable for traditional applications. Finally, there are the various Amazon Machine Image (AMI) offerings on EC2 (Oracle, MySQL, etc.) for those who don't want a managed relational database and would rather have complete control over their instances (and not have to utilize their own hardware) and the RDBMSs that run on them.
This tells us that the world is far from one-size-fits-all cloud database management systems, and we can all expect to be operating in hybrid storage environments that will vary from application to application for quite some time to come. I suppose that's good news for those who make a living on the operations teams of information technology.
And along with each new database offering from Amazon also comes a different business model. In the case of DynamoDB for example, Amazon has introduced the concept of "read and write capacity units", where charges will be based on the combination of frequency of usage and physical data size. This demonstrates that the business models are still somewhat far from optimal, and will likely change again in the future. Clearly the major vendors have not yet figured them out, as business model adjustments in the Cloud are not limited to Amazon.
In summary, following the Amazon database release timeline over the years yields some interesting information, namely that speed/latency, reduction of complexity, the likelihood of hybrid compute and storage environments for some time to come, and ever-changing cloud business models are the primary focus of cloud vendors responding to the needs of their users. And as any innovator knows, the challenges are where the opportunities are.