As organizations move applications to the Cloud where it makes sense to do so, they should recognize that this is an ideal opportunity to improve the value of the underlying data assets that feed these applications. After all, any system, Cloud or otherwise, is only as good as the data within it.
A "move to the Cloud" provides a unique opportunity both to ensure existing data is of the highest possible quality and to install mechanisms that ensure all future data entering the system is accurate, current, and complete. The timing is especially good if data is also being moved from an existing internal database to a Database-as-a-Service (DBaaS) product like SQL Azure or Amazon RDS, or to a database such as MySQL or SQL Server that will run on top of a Cloud service such as Amazon, Microsoft Azure, Rackspace, or any other Cloud platform.
As data is moving from its source database, where it currently exists, into its target Cloud database, you can take advantage of this ideal time to:
- Ensure all physical addresses are valid, accurate, current and complete
- Ensure all email addresses are live, working email addresses that have not been disabled or changed (otherwise, you could find yourself on spam lists simply by trying to contact your customers)
- Ensure all telephone numbers are valid, accurate, and current
- Ensure all data fields are consistent in content and individual data elements are non-ambiguous, making data analysis and the emerging field of data science much more effective
- Fill in all missing data where possible
- Eliminate duplicate contact and customer records
- Incorporate any other data-specific business rules and requirements that make sense for your organization
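As a rough illustration of a cleansing pass like the one listed above, here is a minimal Python sketch that normalizes contact records and drops duplicates as rows move from source to target. The field names (`name`, `email`, `phone`) and the helper names are assumptions made for this example. The email test only checks the shape of the address; confirming that an address is live requires an external verification service, which is not shown.

```python
import re

# Shape check only; it does not confirm that the mailbox exists.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def clean_record(record):
    """Return a normalized copy of one contact record (hypothetical fields)."""
    email = (record.get("email") or "").strip().lower()
    phone = re.sub(r"\D", "", record.get("phone") or "")  # keep digits only
    return {
        "name": " ".join((record.get("name") or "").split()),
        "email": email if EMAIL_SHAPE.match(email) else None,
        "phone": phone or None,
    }

def migrate(source_rows):
    """Clean each row and skip duplicates before loading into the target."""
    seen = set()
    cleaned = []
    for row in source_rows:
        rec = clean_record(row)
        # Prefer email as the identity key; fall back to name plus phone.
        key = rec["email"] or (rec["name"].lower(), rec["phone"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

Records that fail the shape check keep a `None` email rather than being dropped, so they can be routed to a follow-up step instead of silently disappearing.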
Also, the wise organization puts real-time data quality and data enhancement mechanisms in place at the points of data collection, such as a data entry form or a Web-to-lead process, to ensure that all new data coming into a system is of the highest possible quality. This also prevents degradation of data over time, so the same set of issues does not recur a short time later and lead to more cleansing effort and cost downstream.
A significant part of the success of any Cloud initiative revolves around cleansing existing data during migration, getting real-time data quality mechanisms in place, and establishing an ongoing data management plan with metrics and goals for going forward. Don't let rare application migration opportunities such as this go to waste.
2011 has been the year of the Cloud database. The ideas of shared database resources and abstraction of the underlying hardware seem to be catching on. Just as with Web and application servers, paying as you go and eliminating unused database resources, licenses, hardware, and all of the associated cost is proving to be an attractive enough business model that the major vendors are betting on it in significant ways.
The recent excitement has not been limited to the fanfare around "big data" technologies. Lately, most of the major announcements have centered on the traditional relational, table-driven SQL environments that Web applications use much more widely than the key-value data storage mechanisms of the "NoSQL" technology behind Web-scale, data-intensive applications such as Facebook and Netflix.
Here are some of the new Cloud database offerings for 2011:
Salesforce.com has launched Database.com, enabling developers in other Cloud server environments such as Amazon's EC2 and the Google App Engine to utilize its database resources, not just users of Salesforce's CRM and Force.com platforms. You can also build applications in PHP or on the Android platform and utilize Database.com resources. The idea is to reach a broader set of developers and application types than just CRM-centric applications.
At Oracle Open World a couple of weeks ago, Oracle announced the Oracle Database Cloud Service, a hosted database offering running Oracle's 11gR2 database platform available in a monthly subscription model, accessible either via JDBC or its own REST API.
Earlier this month, Google announced Google Cloud SQL, a database service that will be available as part of its App Engine offering based on MySQL, complete with a Web-based administration panel.
Amazon, to complement its other Cloud services and highly used EC2 infrastructure, has made the Amazon Relational Database Service (RDS) available to enable SQL capabilities from Cloud applications, giving you a choice of underlying database technology to use such as MySQL or Oracle. It is currently in beta.
Microsoft also has its SQL Azure Cloud Database offering available, generally positioned for developers of applications on the Microsoft stack who want to leverage some of the benefits of the Cloud.
"Marketecture?"
Some of the above offerings have only been announced so far and not actually launched, or have only limited preview access available now. In some cases even the business models have not been completely divulged, or are very likely to change.
Clearly there is a considerable market-share land grab under way. All of the major vendors recognize that traditional-SQL Cloud storage infrastructure will be an important technology going forward. Adding a solid database layer to the Cloud architecture story seems like an important step in the continuing enterprise and commercial software move to the Cloud, and these new vendor offerings should in turn accelerate this move.
Latency?
So, is this really the wave of the future? Some of the major questions that will have to be answered are those around latency. When data requests have to hop from a client application to the application server, then to the database, and then back to the server and client, sometimes several times within a single request, the result can be quite a performance hit. These machines are likely to be far from each other geographically, which can really slow things down and annoy an end user with slow page loads. This is probably why most infrastructure providers realize that they need corresponding database capabilities that are available and accessed natively, to reduce this latency. However, performance, along with security issues (perceived or otherwise), could still be a significant barrier to mainstream adoption.
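To make the round-trip cost concrete, here is a back-of-the-envelope Python sketch. The model is deliberately simple (one client round trip plus one application-server-to-database round trip per query, ignoring processing time), and all of the numbers are assumed for illustration rather than measured.

```python
def page_latency_ms(client_rtt_ms, db_queries, app_to_db_rtt_ms):
    """Network time for one page: one client round trip plus one
    app-server-to-database round trip per query (processing time ignored)."""
    return client_rtt_ms + db_queries * app_to_db_rtt_ms

# Assumed numbers: 60 ms client round trip, 10 database queries per page.
remote_db = page_latency_ms(60, 10, 40)  # database in a distant data center
local_db = page_latency_ms(60, 10, 1)    # database next to the app server
print(remote_db, local_db)  # 460 70
```

Under these assumptions the distant database makes the page more than six times slower, which is the kind of gap that pushes providers to offer a database inside the same Cloud.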
Also, most of the relational database environments that exist in the Cloud only have a subset of SQL capabilities available and in some cases can be quite limited. For example, many of these Cloud SQL platforms don't support cross-table joins, at least not yet. This is a very common requirement for SQL applications. The lack of support is primarily because joins can consume a lot of resources, another performance-killer in shared environments.
Next?
Once most of this storage and Cloud database infrastructure gets in place however, incorporating more content-oriented data services such as customer data verification will become commonplace and easy to leverage. We may even see them incorporated into the database offerings themselves as they look to differentiate themselves from vendor to vendor. Cloud-based database offerings have the advantage of making much larger libraries of data-oriented add-on capabilities available right out of the box, so the story here is much more than just cost.
While SQL Cloud offering announcements are all the rage in 2011, 2012 will undoubtedly tell the adoption tale. No doubt these offerings will be ideal and cost-effective for many use cases out there. But will demand be large enough quickly enough to support all of these vendors and drive the innovation at a speed that will make these platforms viable in the near future for enterprise and commercial applications? The answer is likely yes, but the next twelve months or so will give us a lot of the supporting data to measure the extent of the trend.

StrikeIron, the cloud leader in data quality and data communications, announced today the roll-out of Trident, the next generation email verification technology and the most accurate solution in the market.
Read the full press release here.
At the very heart of any CRM system is the data within it. To communicate effectively with customers and prospects requires a high level of current, accurate data about each and every contact record in the system. The same is true in order to have the best possible business intelligence as the foundation of decision making.
There are traditional software methods for achieving high quality data, but they typically can be expensive and time-consuming to implement, and even more costly to maintain. Using a "Data-as-a-Service" (DaaS) approach provides an effective model for incorporating real-time data validation checks into a CRM system, including verifying the accuracy of email addresses, phone numbers and mailing addresses. A solution in the "Cloud" like DaaS allows an easy implementation because the integration is completed in advance by the vendor, hence the "turnkey" notion.
This Web-based approach also eliminates the need to update the data reference sources that serve as the basis of these data validation checks. This is because the frequent updates occur at a master data center that is pre-wired into the solution via the Cloud. The reference data is accessed over the Internet as needed, rather than stored as a separate, full copy and maintained at each and every site where the software is in use. Since data reference updates are automatic in the world of DaaS, there is no risk that the reference data becomes aged and gradually loses its usefulness, as often happens with on-premise software solutions.
Finally, the Web/Cloud-based solutions have the advantage of usage-based business models rather than large, upfront software investments that come with a high degree of risk and typically include implementation costs. Rather, the DaaS business example enables a grow-as-you-succeed model which is generally best for an organization.
In the case of Oracle CRM On Demand, StrikeIron has collaborated with our partner ActivePrime to help deliver this type of turnkey, pre-integrated solution, improving the quality and usability of data within Oracle CRM On Demand. Within this solution, available now, a single click validates the critical points of contact data via the Web, with all reference data updated automatically. This enables a much better foundation of high quality data on which to operate a sales force or marketing organization, ultimately resulting in a much better CRM ROI. You can see the screen shots below for email address validation, phone number validation and mailing address validation.
Verify Email

Verify Address

Verify Phone

It has been five years since Oracle CRM On Demand was released and now finally there is an integrated real-time data quality solution that is easy to turn on and put to use.
There is often a need within an organization to move data from point A to point B. One example is when user-submitted data, collected from a Web site, is moved into a CRM system. This typically results in a "lead" being created. Contact data from the user is collected in response to a form being filled out requesting more information about a certain offering, a question needing answering or another action indicating some level of interest in a company's products. All of these types of inquiries are fundamental interests of sales professionals.
The actual data could be moved to a CRM system in a nightly batch load or as each single lead is collected. Either way, it is important to ensure that only valid, complete information is loaded into the CRM system in order to optimize the time and increase the likelihood of success for the sales professional on the other end of the system.
A sales organization can waste a lot of time following up on phantom leads or leads with incorrect information. Ongoing communication and lead nurturing can also be severely affected if contact information is not valid or current. And finally, expensive, time-consuming "data cleansing" activities might have to be initiated downstream if an organization waits until volumes of incomplete or inaccurate data collect and build within a CRM system over a long period of time.
One way to prevent this from happening is by using Informatica's Cloud product in conjunction with StrikeIron's Contact Record Verification Suite. StrikeIron has developed a plug-in for Informatica to manage this data migration process. The Cloud product uses Informatica's classic data integration technology in a SaaS scenario, enabling data to be loaded from many different systems, including Web-to-lead data into Salesforce.com. StrikeIron's Contact Record Verification Suite plug-in performs the actual phone number, address and email validation checks along the way. The joint offering is very easy to get up and running: no software to install, no hardware to prepare and no reference data to acquire.
This Cloud-based load-and-validate approach ensures that more accurate, complete and validated data actually gets into the CRM system with minimal effort, optimizing the time of sales executives. This process provides better communication and access with customers and prospective customers while avoiding costly data cleansing activities down the road.
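The general load-and-validate pattern can be sketched in a few lines of Python. This is not the Informatica or StrikeIron API; `load_and_validate`, the `checks` mapping, and the `load` callable are hypothetical names for this example, and in practice each check would call out to a verification service rather than run locally.

```python
def load_and_validate(leads, checks, load):
    """Run every named check on each lead; load only leads that pass all.

    `checks` maps a check name to a callable returning True or False.
    `load` is whatever writes one lead into the CRM (hypothetical).
    """
    loaded, rejected = [], []
    for lead in leads:
        failed = [name for name, check in checks.items() if not check(lead)]
        if failed:
            # Keep the reasons so rejected leads can be reviewed or repaired.
            rejected.append((lead, failed))
        else:
            load(lead)
            loaded.append(lead)
    return loaded, rejected
```

The same function serves both a nightly batch (pass the whole list) and per-lead loading (pass a one-element list), which mirrors the two loading styles described above.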
- Here is a video demonstration showing the joint solution: http://www.youtube.com/watch?v=c4-s6kRam6c
- Here is more information on the joint solution: http://www.strikeiron.com/Partners/PremierPartners/Informatica.aspx
Contact us at sales@strikeiron.com for more information.
Amazon's new SES (Simple Email Service) product is a scalable, transaction-based offering for programmatically sending large amounts of email. This is accomplished using Amazon's Web-scale architecture, most especially for applications that already use EC2 (server rental) and S3 (storage rental). By utilizing SES you are essentially leveraging the "Cloud" to send emails from applications and Web sites rather than investing in your own software and hardware infrastructure to do so. This process substantially reduces cost and complexity as do most Cloud services and in this case requires only a simple API call. There is no network configuration or email server setup required in this process.
However, there are some significant restrictions to consider that Amazon has imposed on the user. The SES service will only let you start out with a limited quota until you build a "good reputation" within their system. The initial limit is 200 emails per day. This will increase substantially once you build your reputation within Amazon’s service.
One criterion used to "build your reputation" within Amazon is the number of bouncebacks, or emails that could not be delivered because the email address is invalid or has been disabled. Having a clean, verified email list prior to and during your ongoing use of the service is extremely important to minimize the number of bouncebacks you receive. If your use of SES returns a large number of bouncebacks from non-working email addresses, your quota will not be raised and you may be disqualified from using the system. Multiple bouncebacks can really hurt your reputation and will prevent you from getting the most out of Amazon's SES product.
Fortunately, you can use another Cloud-based service (available from StrikeIron) to verify the validity of an email address before using Amazon's SES service (or any other email service). It is another simple API check that will indicate if a given email address is not valid (an actual non-intrusive, real-time check across the Web without ever sending an actual email). This is exactly one of the primary uses of StrikeIron's Email Verification Service: building email service provider reputation by significantly reducing bouncebacks and staying off spam lists, which can kill your ability to communicate with customers and prospective customers electronically.
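As a sketch of how verification might sit in front of a quota-limited sender, the Python below filters a list through a verification callable, splits what remains into daily batches, and computes a bounce rate to watch. The function names are invented for this example and `is_deliverable` stands in for a call to a verification service; neither the SES API nor StrikeIron's API is shown. The default of 200 comes from the initial daily limit mentioned above.

```python
def verified_only(addresses, is_deliverable):
    """Keep only addresses the verification check accepts."""
    return [a for a in addresses if is_deliverable(a)]

def daily_batches(addresses, daily_quota=200):
    """Split a list into chunks no larger than the daily sending quota."""
    return [addresses[i:i + daily_quota]
            for i in range(0, len(addresses), daily_quota)]

def bounce_rate(sent, bounced):
    """Fraction of sent messages that bounced (0.0 when nothing was sent)."""
    return bounced / sent if sent else 0.0
```

Filtering before batching means none of the limited daily quota is spent on addresses already known to be bad.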

As email technology becomes more sophisticated, so should those of us who make use of it, especially when it is so easy to do. The business upside can be dramatic and produce great results for companies.
There are three primary points of communication with customers and potential customers: the physical address (mail), the email address, and the telephone number. Often a contact has more than one of each.
All businesses aren't the same, but in general, how important is it to communicate regularly with customers and contacts? What value can you place on the accuracy of data about your customers? Does it mirror the value of the customers themselves?
Most would agree that these data points about contact data are important enough to ensure resources are available to ensure this contact information is current, accurate, and complete. After all, these are the gateways to those who drive the bottom line. Can you afford for this information to be wrong or incomplete?
So what are some of the threats to "Big Three" accuracy?
One threat is that email addresses change regularly, often leaving existing addresses disabled. This can happen when someone changes jobs or leaves a company. And in an era where 95% or more of the email reaching an address can be spam once the spam kings get hold of it, some addresses are changed just to get relief from this electronic deluge of junk email.
Also, at least 40 million Americans change their mailing address at least once each year, and this usually results in one or more phone numbers being changed. And of course with the skyrocketing popularity of smartphones, keeping up with a contact's various telephone points of contact can be a bear.
These are just some examples of how contact data can degrade over time.
Take these "facts of life", combine them with the large number of typos that can occur while these data elements are being collected, especially over the Web, and you have a recipe for a significant data accuracy problem.
Getting the "Big Three" right isn't always easy, but in most cases, investing effort and resources in this issue, along with applying solutions designed for these kinds of problems, can pay significant dividends, both short-term and long-term. Focusing on these three primary points of contact, and greatly improving the validity and accuracy of that information, can go a long way toward getting the results you are looking for when communicating with customers and potential customers.
And of course, perhaps our Contact Record Verification Suite can help. We'd be happy to talk with you about it and help address your particular situation. After all, that's what we do every day.
CRM success is heavily dependent on the accuracy and comprehensiveness of data within the CRM system. Incomplete or inaccurately collected data can significantly impact CRM ROI if account reps have to spend a lot of their time tracking down correct information about a prospect or chasing down prospects who are difficult to find or no longer employed by the organization being pursued.
StrikeIron has several applications available on the AppExchange that are natively integrated with Salesforce.com using the Force.com Cloud platform. These solutions can go a long way in helping an organization greatly improve the quality and completeness of the contact data within Salesforce.com, making verification an easy and natural part of the data collection process.
You can find out more about these solutions here: http://crm.strikeiron.com/Home/Live-Data-for-Salesforce-CRM.aspx
In addition to the ability to validate and correct mailing addresses both in the US and Canada as well as 200 other countries, verify email addresses, and check phone numbers for Do Not Call list compliance, our solutions provide custom mapping capabilities to ensure that the data returned from each verification call ends up in the correct field within your customized Salesforce.com application. The application simply hits our data center with contact record data, validates it, and then brings back any additional enhanced data about that contact that goes straight into the account or contact record. This integration, including the custom field mapping, is a big selling point of the solution.
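The idea behind field mapping can be sketched in Python as a simple dictionary lookup. The verification result keys (`street`, `city`, `postal_code`, `address_status`) and the custom field `Address_Status__c` are hypothetical names chosen for this example; `MailingStreet`, `MailingCity`, and `MailingPostalCode` are standard Salesforce Contact fields, and the `__c` suffix is the Salesforce convention for custom fields. This is an illustration of the concept, not the way the AppExchange applications are implemented.

```python
# Maps a verification result key to the CRM field it should land in.
FIELD_MAP = {
    "street": "MailingStreet",
    "city": "MailingCity",
    "postal_code": "MailingPostalCode",
    "address_status": "Address_Status__c",  # hypothetical custom field
}

def apply_mapping(verification_result, field_map):
    """Build a CRM field update from a verification result.

    Keys missing from the result are skipped, and keys without a
    mapping are ignored, so unmapped data never overwrites a field.
    """
    return {
        crm_field: verification_result[source_key]
        for source_key, crm_field in field_map.items()
        if source_key in verification_result
    }
```

Keeping the mapping as data rather than code is what lets each customized CRM instance route the same verification output to different fields.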
Here are a couple basic screen shots showing how to utilize the mapping capabilities:
(Mapping data from Salesforce.com that will always be validated by StrikeIron)
(Mapping data from StrikeIron back to fields, including custom fields within Salesforce.com)
Also, if you want to see these solutions in action and how they provide for a solid foundation of clean, accurate, and complete data within Salesforce.com, visit us at our booth at DreamForce next week at the Moscone Center in San Francisco.
StrikeIron is going to be sending a large contingent of team members out to the Salesforce.com Dreamforce event December 6th-9th at the Moscone Center in San Francisco. It is being billed by Salesforce as the "Cloud Computing Event of the Year".
We will be showcasing our native Force.com applications, where we have seamlessly integrated several of our data verification offerings into the Salesforce.com CRM platform, including address verification, email verification, and the Do Not Call list (checking in real-time for outbound compliance).
We also will be showing our Informatica Cloud Contact Record Verification plug-in, where data being loaded into Salesforce.com from various sources can be validated and enhanced as it is being loaded into the system (daily lead loads for example). This can provide for dramatically better data quality within Salesforce, which is often cited as the #1 problem with CRM ROI.
And then of course we have several other data-as-a-service and data verification offerings that are easy to integrate into any application. While the underlying technology for cloud-based name, address, email, and telephone verification is the same, there are of course many cases where you would want to do this outside of Salesforce, but still to the benefit of CRM and other applications.
We will have engineering (including our CTO), marketing, and business development folks (including myself) available for anyone who wants to explore our technology, ask questions, and discuss partnership opportunities.
We hope to see you there!
The debates rage on about "Public Clouds" and "Private Clouds" and which is more appropriate for serious computing efforts, including in business systems and all across the universe of applications.
Most vendors, not surprisingly, line up behind the approach that best suits their product offerings.
For example, SaaS vendors (Salesforce, NetSuite, SuccessFactors) say that multi-tenant applications are the Cloud, arguing that a business solution with shared, multi-tenant software resources, including databases, is needed to truly make the Cloud useful. Yet many of these vendors are criticized for not providing "open" models, so some long-term questions remain. Yes, these Clouds are easy to get into, but how do you get out of them if necessary?
The infrastructure-as-a-service crowd (Amazon's EC2, Google App Engine, Rackspace) will suggest that only infrastructure is the "true" Cloud, meaning essentially renting clean servers by the minute and storage by the byte represent the original "open" Cloud vision, enabling applications to be moved from Cloud to Cloud without difficulty. However, this is just servers and storage in the end (at least for now), so the user still has to build everything themselves. Ok for some, not entirely useful for most.
And of course the enterprise software folks (Oracle, SAP, IBM) often claim that the Cloud can and should be "Private" because it's a better security model and enables you to manage it within the organization. This enables them to capitalize on the hype of the Cloud without having to change too much of their actual offerings. Of course, the challenge with this model is that, without sharing licenses or hardware across organizations, it becomes quite expensive, and quite frankly we have had this model before under other names such as "mainframe", "client-server" and other "in-house" architectures. Sure, there is some incremental innovation and usefulness, but it is not much different from what has always been offered, just another iteration.
So while there are valid use cases for each of the above scenarios, there is one thing I want to point out with Public versus Private Cloud discussions when businesses are unsure which route to go. It goes all the way back to the birth of the Cloud as a concept itself.
The reason we even have the Cloud in the first place is that heavily-trafficked Web sites such as Google and Amazon found they had to build massive, high performance, scalable systems to be able to handle the processing load at peak times (Amazon at Christmas for example). This meant that during non-peak times, they found themselves with lots of excess, unused computing capacity.
This of course spawned the idea that they could leverage this excess capacity, as well as their expertise in managing high-performance, distributed, "Web scale" computing technology, as an additional line of revenue, possibly launching a brand new industry of opportunities. Hence, the Cloud was born.
The one key piece of this Cloud concept is "expertise". This is something that you get in Public Cloud environments that you don't get in Private Clouds. With Private Clouds, you get all of the hardware and software (and the corresponding purchased licenses) that you need, but you don't have a team of experts that have been running that platform for years monitoring, managing, and supporting that platform in real-time while you use it, including having visibility into it as it runs. By definition you therefore don't have engineers supporting the success of your application systems on a minute-by-minute basis.
This real-time team of experts, and their associated expertise developed over time, is something you get inherently in the Public Cloud scenario. The folks who run these systems have as their core mission in life to keep the platform up and running, battle test it over time, improve it, enhance it, test it, analyze operational data, review performance charts, improve and enhance it again, and on and on, day after day.
Although a bit overused, the electric generator is a good example for demonstrating the difference. If you have your own electrical generator powering your home, it doesn't matter that thousands of other people have one just like it in their homes. If it goes down, you are on your own, and it's your responsibility to keep the electricity flowing from room to room. But if you plug into the electric grid run by your local power company, and there is an outage while you are having dinner somewhere, likely it will be fixed before you even get home from the restaurant. And you might not even notice there was a problem since you weren't at home (you were out dining in the "Dinner Cloud" and outsourcing the washing of dishes). This is because the system was monitored, a problem was detected, and a team was ready to spring into action once the outage occurred.
How long would it have taken to get a generator repairman scheduled to come out after an outage in your own generator? There's a reason electricity grids have evolved the way they have.
Oh, and all of the innovation occurring behind the scenes at the power company on a day-to-day basis? It comes to you automatically, often while you sleep, as opposed to a new giant chunk of hardware arriving every 18-24 months that you have to figure out how to configure and get up and running again.
So how is this relevant to StrikeIron?
Well, the same is also true in our case. While we are more the Software-as-a-Service variety of Cloud Computing (and in our case "data-as-a-service"), we recognize that users have a choice in how to obtain the type of functionality we offer. Many of the powerful capabilities in our Cloud-managed Contact Record Verification Suite, such as real-time telephone, address, and email verification, could also be purchased and brought in-house as software applications and raw data sources, and a similar result could be achieved in terms of better, more usable customer data assets. The approach would just be a heck of a lot different.
In the latter scenario, all of the verification reference data would have to be managed and maintained internally. One would have to acquire the software and data files, and then get the functionality up and running. It would then have to be designed and delivered in such a way to be able to handle the various loads of data verification that might appear from different applications at different times, and often in high volume scenarios. Also, all of the other expertise around availability, testing, updating, and the usual effort associated with in-house solutions would have to be developed internally.
With us, all we do day in and day out is focus on verifying and delivering our real-time data verification capabilities to thousands of applications simultaneously with a very high level of performance at all times, delivering 24x7x365. All you need to do, just like the electric company, is plug into us. All of the data management, updating, software maintenance, and performance testing and improving is done by us, with all of the heavy lifting abstracted from you.
Since we launched our system in 2005, we have constantly improved our finely-tuned delivery and fault-tolerant capabilities, including load-balancing, high speed data I/O, redundancy, external monitoring, and everything else we have to provide to be able to support our customers and their production applications. And we are getting smarter and better about how we go about it every day. This expertise is something that each and every one of our customers gets to leverage with every single call to our system. This is why we have only had minutes of downtime over the last four years.
So could in-house solutions provide the same end result? Maybe in the sense that yes you could end up with good clean customer data somehow on your own. But at what cost, effort, and with what missed opportunities? Focus on your core business, and leave the external data verification effort to us. We will keep the lights on. Guaranteed.