The basic premise of APIs (Application Programming Interfaces) is to make integration and customized usage with an application or between applications possible. These interfaces generally provide a pre-determined set of application communication "methods" for sending messages to and receiving messages from an application and for invoking various commands. Application APIs have been around for some time. However, Web-based APIs, which allow applications to be integrated over the Internet, are more recent.
Traditionally, application to application integration was difficult. Custom code usually had to be written to make applications work together. It was typically a requirement that the applications were available on the same network, limiting the scope of integration. Not to mention that every time a new application was to be tied in, more custom code had to be written. Overall, it was a fairly tedious and expensive proposition to integrate applications, and the task was rarely a repeatable exercise.
Web APIs solve most of these challenges. For example, using text-based data formats like XML or JSON enables platform independence: the concept that two applications can communicate and work in conjunction with each other even though they are running on entirely different hardware and software platforms.
This is a powerful concept, especially now that the Internet has eliminated geographical requirements. For example, platform independence is what enables a Windows-based PC to request a piece of information, such as a customer number, from a mainframe - even if that mainframe is running in a data center on the other side of the world. More recently, Web APIs enable mobile devices such as an iPhone to retrieve information from servers in real-time, such as currency rates that are being maintained and updated on a Linux server thousands of miles away.
So while many of the old challenges are being solved, we want to be very careful to ensure that an API-based integration approach does not introduce several new complexities, throwing a wrench into an otherwise well-thought-out integration strategy that an organization might currently be considering.
For instance, it is fairly common for an organization or a data provider to publish multiple APIs to its constituency, each one representing a certain function or type of data. While consistency generally makes common sense, the challenge in practice is that the data sources that will be published might be coming from different places with varying data content standards and data structures, making normalization harder.
If an organization has five different APIs to publish, there might be cases where constituents want to integrate several, if not all of these APIs. If each one requires a different integration approach due to inconsistency of implementation across these APIs, the likelihood of adoption decreases, and all of the additional complexity introduced could make the ongoing maintenance of the integration quite difficult - the exact opposite of what we want to achieve with API-based integration.
Years of experience both creating APIs and serving thousands of customers who have integrated these APIs into production have given StrikeIron a foundation of knowledge around the deployment and ongoing use of APIs that address these kinds of issues. When we help customers publish APIs through our IronCloud API Management platform, these are the typical areas of normalization we focus on as part of our best practices for APIs:
Data content normalization. This one might be the most difficult as it could require some manipulation of large content datasets. However, as with any dataset and database, data content consistency makes analysis and reporting much easier. A simple example is to ensure gender data is always "m" and "f" rather than multiple variations of "male", "female", etc. Content normalization requirements also could be more complex, like consistent product naming standards.
Data structure normalization. It's important that APIs delivering data follow the same structural formats to ensure that working with the resultant data within client applications is as usable as possible. For a basic example, if one API uses "Full Name" as a data parameter, and another API uses "Given Name" and "Last Name" as two separate parameters, the usefulness of the data can degrade as comparing data can be challenging.
Authentication. This is where utilizing third party platforms, such as IronCloud, to publish APIs can be exceptionally helpful, as they typically provide a de-coupled authentication layer allowing many different types of authentication (and protocols) to be leveraged. Examples include SOAP header-based authentication, SOAP parameter-based authentication, certificate-based/HTTP Secure REST, WS-Security, and several others. Here, not only is consistency important, but flexibility as well. It is hard to predict what IDEs or other development tools will be used when a client application is integrated with an API, and some of those tools might only support some authentication approaches. The important thing is that once an authentication method is decided upon, that same method can be repeated across all APIs that have been published by an organization.
Response code consistency. Most APIs have a set of response codes associated with them to relay important information back to the calling application after an API method has been invoked. For example, an API might need to respond with a "password not valid" or "data not found" response. In these cases, utilizing a response code such as "404" might be appropriate for data not found. The key thing is to ensure that these response codes are consistent from one API to the next. Otherwise, a developer will have to understand and write additional code to handle the various responses from each different API. This creates more complexity with each new API that is integrated.
API behavior consistency. If mechanisms such as timeouts or API usage reporting capabilities are present as parameters in an API, it's a good idea to ensure that they are available in multiple APIs in the same way. This can prevent unnecessary coding and unexpected client application behavior when developers try to leverage multiple APIs from the same organization.
Business model consistency. It is rare that there is not a usage control mechanism in place governing the use of an externally published API. Whether credits, hits, daily maximums, monthly usage, or some other accounting mechanism is in place governing usage, be sure that it is consistent across all of the APIs that you publish to minimize the usage contingency code that needs to be created by the developer. Inconsistency here can cause considerable challenges, as problems in usage governance code tend to be detected during production use, which is always undesirable. Foresight here can have long-term benefits in terms of adoption and overall stickiness of API usage.
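To make the first two areas above more concrete, here is a minimal sketch of a normalization step that sits between differently shaped APIs and the client application. It is only an illustration under assumptions: the field names, the alias table, and the naive name split are hypothetical and are not taken from any StrikeIron or IronCloud API.

```python
# Hypothetical adapter layer: none of these field names come from a real API.
GENDER_ALIASES = {
    "m": "m", "male": "m", "man": "m",
    "f": "f", "female": "f", "woman": "f",
}


def normalize_gender(value):
    """Map free-form gender values onto the canonical "m"/"f" codes."""
    if value is None:
        return None
    return GENDER_ALIASES.get(value.strip().lower())


def normalize_record(raw):
    """Return a record with one consistent structure regardless of source API."""
    if "Full Name" in raw:
        # Naive split on the first space; real names need more careful handling.
        given, _, last = raw["Full Name"].strip().partition(" ")
    else:
        given = raw.get("Given Name", "").strip()
        last = raw.get("Last Name", "").strip()
    return {
        "given_name": given,
        "last_name": last.strip(),
        "gender": normalize_gender(raw.get("Gender")),
    }


# Two hypothetical APIs describing the same person in different shapes.
api_a = {"Full Name": "Ada Lovelace", "Gender": "Female"}
api_b = {"Given Name": "Ada", "Last Name": "Lovelace", "Gender": "f"}
print(normalize_record(api_a) == normalize_record(api_b))  # prints True
```

Once both sources pass through the same function, client code can compare and report on the records without caring which API produced them.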
This is a basic model for API design best practices that StrikeIron has developed over the years helping customers and partners design and deploy APIs that are in production in thousands of organizations around the globe. While our IronCloud API Hosting and Management Platform handles a lot of these details, foresight and good API design can make a big long-term impact towards the success of an API Strategy.
If you are considering publishing data or other business functionality in the form of a Web API for integration by others, our experience and track-record of successful API deployment and integration could be helpful and a great choice to minimize complexity, accelerate speed to market, achieve scalability, control usage and ultimately achieve the benefits a well thought-out API strategy can provide. Let us know of your plans and we will gladly provide some initial consultation to see if IronCloud, our API management platform, is right for your needs.

Marketo over the past couple of years has emerged as one of the top "lead management" platforms in the industry. Most recently, Gartner has them in the top right of their Magic Quadrant for CRM Lead Management (along with Eloqua, now owned by Oracle). Marketo's Revenue Performance and Marketing Automation platform helps marketers manage multiple digital channels, including email, social, online events and mobile. It is a key component of strategies that develop leads and opportunities for sales teams.
As with most data-centric solutions, Marketo's platform is significantly enhanced by ensuring all of the contact data for customers and prospects at its core is correct, current, and comprehensive. At a high level, this significantly enhances the ROI that Marketo customers attain from its platform, and reduces missed opportunities due to inaccurate lead data.
At a more granular level, Address Verification, Phone Number Validation, and especially Real-Time Email Validation can critically improve Marketing Automation efforts. Email validation ensures email addresses, especially when the opt-ins may have been a while ago, are active and receiving email. This is a critical part of professional lead management efforts. If invalid email addresses existing in a customer or prospect database are not removed, a large number of bounced emails will cause an organization to fall out of favor with ISPs and very likely end up on various spam lists, causing all marketing communications to arrive in spam folders.
Since Marketo was a natural for StrikeIron's capabilities, customers of ours suggested integrating the two platforms to enhance the value and experience of using Marketo. We jumped at the opportunity!
Late in 2012, Marketo introduced their innovative "Webhooks" capability for integrating third party applications via XML-based interfaces. Using their "trigger" functionality (such as "new lead created" or "Web form filled out"), a call-out to a third-party API can be achieved as part of any Marketo process. The resultant response data can then be mapped into the Marketo system and become a part of individual lead record data. In the case of StrikeIron, this includes validated mailing addresses, phone numbers, and email addresses, as well as additional enhancement data that are also returned by the StrikeIron APIs.
Since all of StrikeIron's Cloud APIs have the same data structure, behavior, and open XML protocol formats (in this case, REST was used), it is easy to integrate all of our various API products into the Marketo platform. In other words, since both support standard, open platforms designed with integration in mind, the integration was significantly less difficult, and could go deeper, than with some other third-party platforms where we have integrated our customer data validation solutions. In fact, the affinity between the two platforms was significant enough that we went ahead and integrated our SMS Text Messaging API into the Marketo platform as well, supporting Marketo's mobile efforts.
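The trigger, call-out, and field-mapping flow described above can be sketched roughly as follows. This is an assumption-laden illustration, not Marketo's or StrikeIron's actual interface: the registry, the `fire_trigger` function, the field names, and the stand-in validator (which makes no network call) are all hypothetical.

```python
# Hypothetical sketch of the trigger -> call-out -> field-mapping flow.
# Nothing here is Marketo's or StrikeIron's actual API.

def fake_email_validation(lead):
    """Stand-in for a remote validation call; a real one would go over HTTP."""
    email = lead.get("email", "")
    status = "valid" if "@" in email else "invalid"
    return {"Email": email, "Status": status}


# Which call-out runs for which trigger, and how its response maps to lead fields.
WEBHOOKS = {
    "new lead created": {
        "call": fake_email_validation,
        "field_map": {"Status": "email_status"},
    },
}


def fire_trigger(trigger, lead, webhooks=WEBHOOKS):
    """Run the call-out registered for a trigger and merge mapped fields into the lead."""
    hook = webhooks.get(trigger)
    if hook is None:
        return dict(lead)  # no webhook registered; lead is unchanged
    response = hook["call"](lead)
    updated = dict(lead)
    for response_field, lead_field in hook["field_map"].items():
        if response_field in response:
            updated[lead_field] = response[response_field]
    return updated


lead = fire_trigger("new lead created", {"email": "jane@example.com"})
print(lead)
```

The point of the mapping table is that the response data ends up as ordinary fields on the lead record, so later steps in the process can use it like any other lead data.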
After successfully testing our various Webhook integrations, including with production customers, Marketo then provided StrikeIron with the Marketo-Integrated designation that you can see here: http://launchpoint.marketo.com/strikeiron-inc/749-strikeiron-email-verification
We now have several solutions available pre-integrated within Marketo as part of Marketo's Launchpoint partner channel. If you search for "StrikeIron" here, you can see the various integrated solutions currently available.
The last thing anyone wants is a marketing automation platform that is relying on bad or otherwise incomplete data. This powerful, yet easy integration of two platforms ensures that won't be the case.

Point-of-Sale (POS) systems are continuing to evolve. What was once only a mechanism for the "processing and recording of sales" (such as a cash register) is now a considerable competitive advantage in a retailer's strategy. As a result, retailers are always looking to get more long-term ROI out of their POS systems. They are determining how this interaction opportunity with a customer can support other parts of a company's multichannel sales strategy as well.
The operation of these systems is rapidly moving from a counter-based cash register to the retailer's floor, as credit card processing hardware increasingly supports this model. Mobile handheld devices, tablets and other consumer devices performing the actual sales transaction are now commonplace within a store, no longer tying personnel to the cash register. As a result, customers are now often asked to enter their own information via the on-the-floor transaction device, and an email address now more than ever is part of the collected customer information. Primarily, this is because customers increasingly prefer more efficient and environment-friendly e-receipts. Better yet, collecting an email address can then provide an opportunity for future communications via email for the retailer, and can result in increased ongoing customer engagement.
The value of collecting an email address can be considerable. Email-oriented and Web-based marketing are outpacing traditional advertising and direct mail methods, and multi-channel sales strategies are as critical to success as ever. Even with retailers where traditional methods of POS still make the most sense via a cash register, collecting email addresses is equally important for customer retention success.
However, there are a couple of challenges to collecting this kind of customer data on the floor. First, data entry can sometimes be a little tougher with these mobile devices. If the customer enters their own data, which is now often the case, the chances of collecting incorrect or invalid email addresses either by accident or otherwise can go up considerably. Even if the email address is collected properly, 30-40% of people change their email address every year. Whether caused by simple typos or not, these issues can prevent e-mail receipts from being delivered. In addition, a large number of invalid addresses can result in future marketing communications going into spam folders if Internet Service Providers (ISPs) detect enough email bounces and failed deliveries coming from the same sending source.
Real-time email validation that utilizes an instantaneous out-to-the-Cloud check to ensure email deliverability can substantially reduce the number of typos and otherwise bad email addresses getting into the system in the first place, right there on the floor at the point of data capture. Utilizing this kind of email address validation technology can reduce email bounces and failures by 90% or more. It can be an effective tool in ensuring the highest possible levels of data integrity when capturing customer details.
StrikeIron's Email Verification Solution is cloud-based, allowing it to determine in real-time if an email address is valid and deliverable before sending a message. It is used in many production POS systems today to ensure as often as possible that correct email addresses are obtained when collecting customer information.
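As a rough illustration of catching bad input at the point of capture, the sketch below runs a cheap local pre-check on a typed address before any remote deliverability check would be made. It is not StrikeIron's algorithm: the pattern is deliberately loose, and the typo table and function name are hypothetical. Passing this pre-check says nothing about whether a mailbox actually exists, which is what the cloud-based check is for.

```python
import re

# Deliberately loose pattern: it catches obvious typos, not every invalid address.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s.]+$")

# Hypothetical list of common domain typos and their likely corrections.
DOMAIN_TYPOS = {
    "gmial.com": "gmail.com",
    "gmail.con": "gmail.com",
    "yaho.com": "yahoo.com",
}


def precheck_email(raw):
    """Return (ok, suggestion) for an address typed at the point of sale."""
    email = raw.strip().lower()
    if not EMAIL_PATTERN.match(email):
        return False, None
    local, _, domain = email.rpartition("@")
    if domain in DOMAIN_TYPOS:
        # Likely typo: offer a correction instead of silently accepting it.
        return False, f"{local}@{DOMAIN_TYPOS[domain]}"
    return True, None


print(precheck_email("jane@gmial.com"))   # (False, 'jane@gmail.com')
print(precheck_email("jane@example"))     # (False, None)
print(precheck_email(" Jane@Example.com "))  # (True, None)
```

On a handheld device, the suggestion can be shown to the customer immediately, while they are still standing at the device and able to correct it.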
The cloud-based, real-time approach is an important one, as StrikeIron is constantly evolving its algorithms in the background without requiring customers to update their POS integration in any way. Our team of email verification experts is constantly tweaking, enhancing, and otherwise modifying the algorithms that make these real-time checks as accurate as possible on an ongoing basis without any effort from the customer leveraging the technology.
Email Verification is only one easy-to-integrate API product available from StrikeIron. Others include Phone Number Validation, Address Verification, Do Not Call List checking, SMS Text Messaging and several more. For more information, please contact info@strikeiron.com.
The general premise of data warehousing hasn't changed much over the years. The idea is still to aggregate as much relevant data as possible from multiple sources, centralize it in a repository of some kind, catalog it, and then utilize it for reporting and analytics to make better business decisions. An effective data warehousing strategy seamlessly enables trend analysis, predictive analytics, forecasting, decision support, and just about anything else we now categorize under the umbrella of "data science."
The premise is not different these days, but rather, it is more the shifting nature of the data sources that the warehouse must draw from to capture as much useful information as possible. It's the data that's changed, not the goal.
First, there is the rapid proliferation of social-generated data in all of its unstructured forms, making the data extraction and transformation components of loading data to the warehouse more difficult than it has been in the past. But this isn't really groundbreaking for 2013, as social data and the creation of various Big Data technologies its growth has spawned, such as Hadoop, have been emerging for several years now.
Instead, what will likely be significantly different in 2013 is the acceleration of the deployment of a multitude of SaaS applications within the enterprise, especially in the larger, often slower to adopt, companies that populate the Fortune 2000. As deal sizes grow, the SaaS footprint is clearly becoming significantly bigger.
This is where it becomes interesting. It's not just that an organization has several different SaaS applications such as Salesforce, Workday, and SuccessFactors in place and in use across the enterprise, with a single instance of each in use by all. Instead, because these SaaS applications are easier to adopt, many of them have come in through the back door departmentally and at different times rather than through a centralized IT-controlled rollout. This means that multiple instances of the same application are popping up everywhere.
For example, there are large enterprises that now have 10, 20, or even 50+ instances of Salesforce running across the entire organization. Each instance has its own set of customization of data collection and storage, separate add-on applications installed, different data feeding these applications, and unique implementation approaches. This could result in the old adage of solving old problems while creating new ones.
Some questions that could be asked: What kind of data collection and ETL challenges will this cause for those wishing to leverage a data warehousing strategy? Does the fact that the operational data from these various SaaS applications is stored and maintained by different vendors, each of which is incentivized to keep it that way, make things easier or more difficult for data warehousing and the analysis it enables? Will data fragmentation and the resultant data integration strategies scale across all of these instances of SaaS applications? It will be interesting to see how organizations meet the "SaaS sprawl" challenge, especially as it relates to cross-enterprise data collection strategy.
Furthermore, SaaS applications have taken an ever-increasing hold of the enterprise as of late with larger and larger deals. With the Cloud and SaaS applications a major part of their 2013 strategies, Oracle, SAP, IBM, and the more traditional software vendors have taken notice. SAP's Business ByDesign, Oracle's Fusion Applications, and recent SaaS acquisitions will surely add to what could become a hodgepodge of SaaS applications across the enterprise.
To meet these challenges currently, cloud data warehousing offerings from companies like BitYota and Amazon's Redshift are beginning to emerge with a core theme of the cloud as the centralized data storage repository. ETL and data integration solutions such as Informatica's Cloud and Dell's Boomi are racing to meet these traditional data warehousing requirements in the cloud paradigm. Also, the traditional data cleansing requirements of data warehousing are being met with their cloud-based counterparts for better, more usable data in these new age warehouses. One thing that will never change is that bad data will always equal bad analysis, and the need for making investments in data quality strategies will continue to exist.
As the landscape of SaaS continues its rapid expansion, and the data within these applications continues to burgeon, 2013 will definitely be a pivotal year in the dawn of a new class of data warehousing technologies.
I've had an opportunity to work closely with ActivePrime lately as they are in the homestretch of beta-testing their CleanVerify product. CleanVerify is one of a suite of products they market to help companies improve, organize, and better utilize the data within Oracle CRM On Demand. I'm really excited about what they have accomplished.
ActivePrime has integrated several services from StrikeIron as part of their solution. I am helping them test the usability of the integration. Utilizing StrikeIron's Cloud-based address verification, email verification, and phone validation products, ActivePrime is able to deliver a powerful, high-performance, real-time data validation and data cleansing solution fully integrated within Oracle CRM On Demand. This enables customers to focus on closing business rather than the ongoing maintenance of data within their CRM system.
For example, a call center representative collecting contact information from a customer and entering it into the Oracle CRM On Demand system can now automatically validate that the collected mailing address, phone number, and email address are all correct (see screen shots below). Validating data at the point of collection before it ever gets into the CRM system can reduce the cost of downstream data cleansing efforts as much as 10x, ensuring a high level of quality data within the CRM system from the moment the data is entered.
In addition to validating the data in real-time, additional information, such as correct ZIP+4 and county, is also added to address records.
Another nice feature is a log that gets created that keeps track of every validation that occurs within the Oracle CRM On Demand system, ideal for administrators and other data stewards who like to stay on top of these kinds of things.
If you would like to try the integration during the beta period, or anytime after, ActivePrime is offering a CleanVerify trial here.
Validating a mailing address:

Validating a phone number:

Validating an email address:

View log/report:

As organizations move applications to the Cloud where it makes sense to do so, they should recognize that this is an ideal opportunity to improve the value of the underlying data assets that feed these applications. After all, any system, Cloud or otherwise, is only as good as the data within it.
A "move to the Cloud" provides a unique opportunity both to ensure existing data is of the highest possible quality and to install mechanisms that ensure all future data entering the system is accurate, current, and complete. This is especially ideal if data is also being moved from an existing internal database to a Database-as-a-Service (DBaaS) product like SQL Azure or Amazon RDS, or to a database that will be running on top of a Cloud service, such as MySQL or SQL Server on Amazon, Microsoft Azure, Rackspace, or any other Cloud platform.
As data is moving from its source database, where it currently exists, into its target Cloud database, you can take advantage of this ideal time to:
- Ensure all physical addresses are valid, accurate, current and complete
- Ensure all email addresses are live, working email addresses that have not been disabled or changed (otherwise, you could find yourself on spam lists simply by trying to contact your customers)
- Ensure all telephone numbers are valid, accurate, and current
- Ensure all data fields are consistent in content and individual data elements are non-ambiguous, making data analysis and the emerging field of data science much more effective
- Fill in all missing data where possible
- Eliminate duplicate contact and customer records
- Incorporate any other data-specific business rules and requirements that make sense for your organization
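A few of the items in this list (consistent field content, and eliminating duplicates) can be handled in a simple pass over the rows while they are in transit. The sketch below is a minimal illustration under assumptions: the field names and the duplicate rule are hypothetical, and it does not verify addresses, emails, or phone numbers against reference data, which would require calls to a validation service.

```python
def clean_for_migration(rows):
    """Yield cleansed, de-duplicated contact rows on their way to the target database."""
    seen = set()
    for row in rows:
        # Standardize content so the same value always looks the same.
        email = (row.get("email") or "").strip().lower()
        phone = "".join(ch for ch in (row.get("phone") or "") if ch.isdigit())
        name = " ".join((row.get("name") or "").split())
        # Hypothetical duplicate rule: same email, or same name and phone if no email.
        key = email or (name.lower(), phone)
        if key in seen:
            continue
        seen.add(key)
        yield {"name": name, "email": email or None, "phone": phone or None}


source_rows = [
    {"name": "Jane  Doe", "email": " Jane@Example.com ", "phone": "(919) 555-0100"},
    {"name": "Jane Doe", "email": "jane@example.com", "phone": "9195550100"},
    {"name": "Bob Roe", "email": None, "phone": "555 0101"},
]
for cleaned in clean_for_migration(source_rows):
    print(cleaned)
```

Because the function is a generator, it can sit between the read from the source database and the write to the target without holding the whole dataset in memory, apart from the set of keys already seen.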
Also, the wise organization puts real-time data quality and data enhancement mechanisms in place at the points of data collection, such as a data entry form or within a Web-to-lead process, to ensure that all new data coming into a system is of the highest possible quality. This also prevents degradation of data over time, so the same set of issues does not occur again a short time later and lead to more cleansing efforts and cost downstream.
A significant part of the success of any Cloud initiative revolves around cleansing existing data during migration, getting real-time data quality mechanisms in place, and establishing an ongoing data management plan with metrics and goals for going forward. Don't let rare application migration opportunities such as this go to waste.
Many of StrikeIron's direct customers integrate our various API-delivered data services into applications, Web sites, and business processes entirely on their own, usually with a single line of code or two - a testament to how easy this is to do. These product offerings available on the Cloud can be integrated into anything that can consume a SOAP or REST-based Web service (which is just about anything).
However, StrikeIron has also developed technology integration partnerships with many of today’s top software and Internet solutions platforms, solutions which are all enhanced by integrating Data-as-a-Service capabilities from StrikeIron.
Having these capabilities, such as real-time address verification, email verification, sales tax rates, foreign currency rates, SMS text messaging, and phone verification, pre-integrated into various other platforms that are already in use by large customers every day can be a very compelling solution. It is a win-win-win scenario for our customers, partners, and our technology.
One such partner is Informatica. Informatica has integrated several StrikeIron services for the purposes of contact data validation within its Informatica Cloud platform, as data validation is a very important step in the integration of data between various platforms. These services can be used via the Informatica Cloud StrikeIron plug-in, or as directly integrated within the Informatica Cloud platform per our most recent partnership. In the latter case, some of our services are available for use simply by checking a box directly within Informatica's Cloud application. This makes it very easy to have high quality, validated data arriving at a target destination, having been cleansed as an intermediate step while in transit from its source. You can view a recorded Webinar here.


There are many different kinds of batch data cleansing processes that can be performed against large databases of existing customer information. Standardizing inconsistent data, removing duplicate records, validating columns against up-to-date reference data, filling in missing data, and appending new data to existing data are all examples of customer data processing that can help improve the value of internal data assets.
When data assets undergo these kinds of processes their value increases and they enable business intelligence applications to be more useful, operations to be more efficient, and customer communication efforts to be more effective. These are worthwhile endeavors indeed.
However, it can often be a considerable effort to do large, after-the-fact database cleanup jobs - not to mention the considerable costs and complexity associated with offline data processing. Also, batch jobs are rarely a one-time effort, as the same problems begin to appear soon after a mass cleansing, and then begin to build to troublesome levels again, putting the data stewards of the organization back to square one.
An alternative can be to leverage real-time data quality mechanisms at the point of data collection. This means validating data, filling in missing data, appending data, standardizing data, and comparing it to existing data for duplicates in real-time, before it ever gets into the database. This can eliminate or dramatically reduce the cost and effort associated with downstream batch cleanup processes, enabling the benefits of clean, complete, accurate data to appear immediately across the organization. It also prevents the build up of these kinds of data quality issues over time.
Real-time data quality can be achieved by integrating calls to data quality functions within business processes, Website data collection forms, customer-facing applications, call center applications where representatives speak with customers, and anywhere else that data is collected in real-time. Typically these programmatic calls are to Cloud-based APIs that are leveraging constantly refreshed reference data to ensure the highest possible data accuracy.
Here more than ever, an ounce of prevention is worth a pound of cure.
One of the exciting things about SOAP and REST-based Web services protocols is that they are text-based, providing the platform independence necessary for broad machine-to-machine communication and open cloud computing models. In other words, describing data using a textual XML dialect allows iPhones to communicate with mainframes, and enables Fortran-developed scientific instrumentation devices to communicate with Dell Server applications in the Cloud written in Java.
As long as both machines are aware of the "rules" of a given XML-dialect and how data is described, they can communicate and more importantly pass data back and forth to perform certain functions based on the resultant data. This is powerful and has really helped lay the groundwork for the success of the Cloud.
To demonstrate this concept, here is an example of an "Input" SOAP message to StrikeIron's Sales and Use Tax Basic service. Remember that XML is not meant to be human readable, but rather the implementation of a set of XML dialect rules. However, if you look closely then you can see the actual data elements that are passed within the XML message received by StrikeIron within our data centers by the calling entity:

Our application servers, which are always listening, receive the request, do some user authentication, and then perform the requested task and return the resultant data in the XML message below. It can then be used however necessary by the calling entity (to process an ecommerce transaction, for example). Here is an example of the "Output" XML message:

This communication and data transaction has occurred entirely without human intervention. It takes place between machines that could be located anywhere on the globe, each completely oblivious to the hardware and software that comprise the other entity.
Fortunately, humans rarely if ever need to interact at the XML-level (sometimes it might be useful for debugging). Instead, the creation, sending, receiving, and interpretation of these XML messages are handled by the software development environments that one is working in, abstracting a developer or application user away from the XML-based data exchange.
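To give a feel for what a development environment does on the developer's behalf, here is a small sketch that builds a SOAP-style XML message and reads a value back out of it using Python's standard library. The operation and element names (`GetTaxRate`, `ZipCode`) are made up for illustration and are not the schema of StrikeIron's Sales and Use Tax Basic service; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(zip_code):
    """Build a SOAP-style request; GetTaxRate and ZipCode are hypothetical names."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, "GetTaxRate")
    ET.SubElement(call, "ZipCode").text = zip_code
    return ET.tostring(envelope, encoding="unicode")


def read_zip_code(message):
    """Pull the ZipCode value back out of a message, as a receiver would."""
    root = ET.fromstring(message)
    return root.findtext(f"{{{SOAP_NS}}}Body/GetTaxRate/ZipCode")


message = build_request("27709")
print(message)                 # plain text that any platform can parse
print(read_zip_code(message))  # 27709
```

The message is plain text, so the receiver needs only an XML parser and knowledge of the element names, regardless of what hardware or language produced it.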
This form of XML messaging is what makes companies like StrikeIron possible, opening up pre-built data processing, data validation, aggregated data sources, and other business functions available to the world. Regardless of what software and hardware environments a customer happens to be running, it's this approach that makes the ever-evolving "Great Data Highway" possible.
As you think about improving the quality of data within your organization, here are four quick and simple yet key tips that will assist in your approach and strategy on your way to success:
- Think of data as a strategic asset. Collecting and storing data alone is not enough. There must be a proactive plan in place to ensure that the data serving as the basis of decision-making, operations, and customer communication is treated as a valuable strategic asset. Effectively managing the quality, accuracy, and usability of this data on an ongoing, everyday basis can translate into dramatic revenue opportunities and significant cost-saving efficiencies.
- Consistency is as important as accuracy. Accurate data is important, but so is consistency. Inconsistent representations of the same data content (such as variations of a company name, a lead source appearing six different ways, etc.) throughout data tables can make data very hard to analyze, and can even throw off analytics and business intelligence processes. This can result in decision-making (such as where to deploy marketing assets) based on faulty data points. A focus on data consistency can reduce the incidence of this substantially.
- Data quality is far cheaper transactionally. Improving the quality of data at the point of data collection (a Web form or via a call center representative) is much less expensive than waiting for broad data quality issues to appear downstream that must be addressed en masse. The cost difference can sometimes even be a factor of ten. Also, in the downstream case, considerable use of inaccurate and incomplete data might already have occurred. Validating the accuracy of data before it ever gets into core customer databases is very important.
- Data quality is about more than technology. Tools can only do so much. Incentive programs for capturing complete and accurate data (such as bonuses for 98% or greater accurate customer data point collection) can go a long way in better, more valuable organizational data, as well as education in the importance of data as a key strategic asset across business units, not just IT. Any comprehensive data quality plan built for success will involve the entire organization.