The 2012 UP-START Cloud Awards (@UP_Con) has named StrikeIron as a finalist for the Best Cloud Broker Award, recognizing the company for its advancement and contribution to the cloud computing business and technology communities. StrikeIron is one of five finalists.
Other enterprise companies competing for awards include Salesforce.com, DellCloud Applications, Amazon Web Services, Rackspace and Apple, to name a few.
A group of cloud experts determined the winners in each of 22 categories after tallying votes based on the Borda count election method. Final voting will take place during the Third Annual Global UP 2012 Cloud Computing Conference, which is December 12, 2012, at the South San Francisco Conference Center, with more than 150,000 attendees expected.
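For readers unfamiliar with the Borda count, here is a minimal Python sketch of one common variant. The ballots and candidate names are invented for illustration; the announcement does not describe the awards' actual ballots or tie-breaking rules. With n candidates on a ballot, a first choice earns n-1 points, a second choice n-2, and so on down to 0.

```python
from collections import defaultdict

def borda_count(ballots):
    """Return {candidate: points} for ranked ballots (best choice first)."""
    scores = defaultdict(int)
    for ballot in ballots:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            # First place gets n-1 points, last place gets 0.
            scores[candidate] += n - 1 - position
    return dict(scores)

# Hypothetical ballots from three judges ranking candidates A, B, and C.
ballots = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
]
scores = borda_count(ballots)
winner = max(scores, key=scores.get)
```

Because every position on a ballot earns points, a candidate who is ranked second by many voters can beat one who is ranked first by a few.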
“We are thrilled to add another award to our portfolio,” says Justin Helming, StrikeIron’s VP of Marketing. “StrikeIron is the best cloud broker service, because our IronCloud™ platform has been proven over nearly 10 years for thousands of customers, who have successfully processed billions of transactions.”
StrikeIron was founded in 2003 by Bob Brauer (founder of DataFlux), Richard Holcomb (co-founder of HAHT Software and Q+E Software), and Robert Dale. As a result, StrikeIron has developed a solid infrastructure and has won many accolades, including recognition from Inc. Magazine, Business Leader Magazine, Gartner and Forrester Research.
"Big Data" has not only become one of the hottest terms in information technology today, it has also become one of the most confusing. This is partly because of the many definitions of Big Data floating around, and no real consensus as to which sources are authoritative. And of course, once the marketing masses get hold of a valuable term like this, the definition-stretching of a valuable search term gets leveraged to modernize the productizing of legacy solutions and adds to the perplexity, which is already happening a great deal.
Many people think Big Data refers strictly to very large amounts of data. While this is somewhat true, there is no real size characteristic that would identify a company, a product, or a solution as Big Data. Big Data to one might not be Big Data to another. For example, there have been large volumes of data around for a long time, so in of and itself, the idea of data volume would not be anything new.
I think the easiest way to understand the meaning of Big Data and its implications, at least in the here and now of late 2012, is to instead think of it as an umbrella term for a collection of technologies and various architectures (that are creatively inexpensive) for the handling of large, otherwise unwieldy datasets, where data is being added at very high rates of speed, largely as the result of all of the data being generated on the Internet.
Some historical context may help.
Up until several years ago, there were essentially two classes of applications: what are now known as PC-scale (single computer) applications and Enterprise-scale (across the company) networked applications. A standard relational database such as Oracle, MySQL, SQL Server, or even in some cases dBase or Microsoft Access, was adequate to meet even the heaviest data requirements for applications at these scales.
Then, as the Internet began to mature, a third class of applications came along known as “Web-scale” applications. These include the various applications made available by Google, Facebook, Amazon, Netflix and others, whose ongoing usage around the globe was generating far more data than any single enterprise had ever seen, essentially too much for any standard relational database to handle.
Alternatively, even if one of the major commercial database vendors could handle it, price tags would be in the millions of dollars, along with the cost of all of the hardware to achieve it (probably another $10M). These heavy data and corresponding performance requirements, combined with a need to solve this problem in a lower-cost way, spawned a completely new set of database technologies that could handle this scale of operational data without the otherwise absurd price tags.
Some of these architectures include “Hadoop/HBase,” “Cassandra,” “Bigtable,” “Dynamo” and other distributed, text-based solutions that could in some cases use distributed standard PCs to handle these complex requirements. These solutions have been architected to handle the requirements of “Web-scale” applications, each with its own advantage (I/O-centric like Cassandra, versus analytics-centric like Hadoop, for example), without having to invite the commercial database salesperson to visit. The umbrella term Big Data covers not only storage and retrieval mechanisms for huge datasets, but also other capabilities such as capturing, searching, analyzing, and otherwise manipulating these gigantic streams of data, all, of course, in a reasonable, useful timeframe.
Therefore, when the industry talks about “Big Data,” it is these technology approaches that most of the experts are referring to and discussing, usually in the context of Web-scale data architectures. Companies that are focusing on these particular technology solutions are the ones that rightfully fall into the Big Data category. Unfortunately, many companies offering some level of data volume are often saying “Hey! We are doing Big Data, too!” However, this is somewhat like someone saying they are doing the Cloud because they have a website. You cannot stop them from saying this, but it helps to be able to recognize when this occurs.
Hopefully, thinking of Big Data in these technology terms helps clear up some of the mass confusion that seems to be out there around this emerging concept, enables you to recognize solutions and concepts that fit within this realm versus those that would merely like to, and allows you to be a contributing member of the discussion.
The reasons why you should be doing mobile marketing are numerous and profound. It’s been well documented how text messaging is a pervasive, attractive channel.
Mobile has been championed largely for its ability to serve as an advertising platform for marketers. However, mobile does not need to be all about flashy ads or innovative promotions and campaigns.
SMS text messaging remains one of the simplest, yet most effective ways to increase revenue and brand engagement. While text alerts and notifications are not a glamorous or groundbreaking mobile tactic, they yield great results. Studies show that SMS has an unprecedented 98% open rate (almost all text messages are usually read within five seconds!).
While SMS alerts and notifications are simple, they are excellent at ensuring great customer service, turning consumers into brand evangelists.
As a marketer and brand advocate, I can appreciate a well-executed SMS campaign. I thought I would highlight a few of my personal favorite SMS alerts and notifications:
- CVS/pharmacy: CVS/pharmacy rolled out their SMS program known as Order Ready Text Messaging, which notifies users when their prescription is ready to be picked up at their local pharmacy.
“CVS/pharmacy launched Order Ready Text Messaging to allow customers the convenience of receiving an immediate notification when their prescription is ready to be picked up,” said Erin Pensa, director of public relations for CVS/pharmacy.
“This program enhances communication with our customers by letting them know their orders are ready when they would not otherwise be aware, eliminating the need for them to call-in to check and, ultimately, save the customer time,” she said.
Text alerts are available at more than 7,300 locations in the U.S., including the one I personally use. I opted in to the Order Ready Text Messaging program, which I have found beneficial. It has definitely saved me the time and hassle of calling the pharmacy to confirm pickup time. I never have to wait around the pharmacy, because I now know exactly when a prescription is ready. Whether CVS/pharmacy fulfills an order early or late, I can adjust my time accordingly.

Time is critical not only for the sake of convenience, but also for customers’ health. This is why the immediacy of text messaging allows for a great customer experience at CVS/pharmacy.
- Barnes & Noble: While Amazon.com is the largest retailer on the Internet, Barnes & Noble remains the largest book retailer in the U.S. For those of us who still prefer to have a brick-and-mortar store, Barnes & Noble is a great option due to its “Pick Me Up” program.
Barnes & Noble created this program to blend its online and offline presence. The premise behind it is simple: “Reserve Online, Pick Up at a Store Near You.”
The “Pick Me Up” program is ideal for cross-channel consumerism. The days are long gone when people researched and purchased products in the same channel. Barnes & Noble’s mobile program helps account for this shift in buyers.
Customers receive a text notification of when a book is ready to be picked up at their local store. In short, mobile serves as the bridge between online and the brick-and-mortar store.
As a customer, I like this mobile program for the times when I simply do not want to pay for shipping or wait a week to get the juicy bestseller.
Receiving an SMS alert makes the whole buying process run very smoothly. When you arrive at the store, you go straight to the checkout line and let the cashier know you have a pickup order. They’ll grab the items from behind the registers for you, allowing a quick and convenient buying process. Now you do not have to waste time searching the store for a book only to find it is out of stock.
This Tweet succinctly sums up one customer’s satisfaction with the “Pick Me Up” program:

- Verizon Wireless: Verizon Wireless can send an email or a text message to your phone to notify you when your bill is available, and can also send usage warning texts for your account.

According to the L.A. Times, Verizon, AT&T and other major cellphone providers have agreed with U.S. regulators to end bill shock by sending warning text messages to subscribers who are approaching monthly voice, text and data limits.
I like this SMS use case, because it is proactive by creating a dialogue between brands and customers. Remember: a well-informed customer equals a happy customer.
These three use cases demonstrate that mobile can be simple, yet powerful in driving customer engagement and satisfaction. Hopefully, they have inspired you to start implementing SMS messaging as part of your mobile strategy. Feel free to share any other use cases that you have encountered and enjoyed.
DataWeek 2012 is a conference happening this week in San Francisco. The theme of the conference is a focus on the data revolution that is occurring across businesses. This includes the growth of data-related technologies such as "big data", "data as a service", and the Cloud and how they are creating paradigm changes in product engineering, marketing, and customer relationships. Areas discussed will be how "data science", "data analytics", "open data initiatives", and data platforms will be useful in 2013 and beyond as organizations more and more recognize data as a strategic asset and a critical driver for business growth.
As businesses become more and more data-driven, to compete they will have to become more adept at understanding their customers and customer behavior, product usage patterns, and opportunities and risks that may not be so apparent until operational, customer and other data within the enterprise is leveraged to illuminate what is really happening within our businesses. This can provide the groundwork for insight and the strategic decision-making that follows that insight. In other words, "how does data drive the business forward?"
StrikeIron's CTO Bob Brauer will be moderating a panel on "Making Data Products with Data-as-a-Service", where we will delve into what makes a successful data product, win-win business models, how high quality data is at the core of any data driven initiative, and what the future holds for data-as-a-service products. Tom Carlock from D&B, Stephane Dubois from Xignite, and Brian Wilcove from Sofinnova Ventures will also be on the panel.
Come join us for the session, Tuesday, September 25th at noon PDT at the DataWeek event.

InsideFacebook reported Friday that Facebook will be introducing a new offering allowing advertisers to target ads based on their customers' email addresses or phone numbers.

This is a very interesting and granular offering. It effectively enables you to target your existing online and offline customer base directly, 1:1, on Facebook.
There are many very compelling use cases for the savvy marketer. For example, retailers can target customers based on past purchase behaviors; brands can create compelling, multi-touch, retargeting campaigns; and CPL advertising networks can effectively cross sell and even turn co-reg leads into more valuable offer specific leads. The possibilities are endless.
Of course, all of this is premised on having the correct phone number or email address associated with the target customers. Many people have multiple email addresses and phone numbers, and these can change frequently. This means the email or phone numbers in your lead database, if you have them at all, may or may not match the information in Facebook.
One option is StrikeIron's Phone and Email Append solutions. Reverse Phone and Address Lookup appends phone numbers, even cellular numbers, to existing names or addresses. Additionally, Phone Append will append demographic information such as estimated income and homeowner probability. Email Append will append email addresses to your existing customer data, including providing alternative email addresses, which is useful if your customers are using a different address on Facebook.
Facebook's new targeting is a step closer to 1:1 marketing where you can now target based on your current customer or prospect list. Append solutions from StrikeIron can help you enhance the ability to reach your audience.
How will you use 1:1 customer targeting options like Facebook's?
I've had an opportunity to work closely with ActivePrime lately as they are in the homestretch of beta-testing their CleanVerify product. CleanVerify is one of a suite of products they market to help companies improve, organize, and better utilize the data within Oracle CRM On Demand. I'm really excited about what they have accomplished.
ActivePrime has integrated several services from StrikeIron as part of their solution. I am helping them test the usability of the integration. Utilizing StrikeIron's Cloud-based address verification, email verification, and phone validation products, ActivePrime is able to deliver a powerful, high-performance, real-time data validation and data cleansing solution fully integrated within Oracle CRM On Demand. This enables customers to focus on closing business rather than the ongoing maintenance of data within their CRM system.
For example, a call center representative collecting contact information from a customer and entering it into the Oracle CRM On Demand system can now automatically validate that the collected mailing address, phone number, and email address are all correct (see screen shots below). Validating data at the point of collection, before it ever gets into the CRM system, can reduce the cost of downstream data cleansing efforts by as much as 10x and ensures a high level of data quality within the CRM system from the moment the data is entered.
In addition to validating the data in real-time, additional information, such as correct ZIP+4 and county, is also added to address records.
Another nice feature is a log that gets created that keeps track of every validation that occurs within the Oracle CRM On Demand system, ideal for administrators and other data stewards who like to stay on top of these kinds of things.
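To make the idea of validating at the point of collection more concrete, here is a minimal Python sketch. It is not ActivePrime's or StrikeIron's API: the function names, the regular expressions, and the log structure are all assumptions made for illustration. It also only checks format, whereas a real verification service checks values against reference data.

```python
import re
from datetime import datetime, timezone

# Format checks only; these hypothetical rules are not a real verification service.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")

validation_log = []  # every check is recorded, mirroring the idea of a validation log

def _record(field, value, ok):
    """Append one log entry and pass the result through."""
    validation_log.append({
        "field": field,
        "value": value,
        "valid": ok,
        "checked_at": datetime.now(timezone.utc).isoformat(),
    })
    return ok

def validate_email(value):
    return _record("email", value, bool(EMAIL_RE.match(value.strip())))

def validate_us_phone(value):
    digits = re.sub(r"\D", "", value)          # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                    # drop a leading country code
    return _record("phone", value, len(digits) == 10)

def validate_zip(value):
    return _record("zip", value, bool(ZIP_RE.match(value.strip())))

def validate_contact(contact):
    """Return the names of the fields that failed validation."""
    checks = {
        "email": validate_email,
        "phone": validate_us_phone,
        "zip": validate_zip,
    }
    return [name for name, check in checks.items()
            if not check(contact.get(name, ""))]

# A record as a call center representative might key it in (ZIP is one digit short).
errors = validate_contact({
    "email": "jane.doe@example.com",
    "phone": "(919) 555-0100",
    "zip": "2751",
})
```

Here `errors` lists only the ZIP field, so the representative can correct it while the customer is still on the line, and `validation_log` holds one entry per check for later review.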
If you would like to try the integration during the beta period, or anytime after, ActivePrime is offering a CleanVerify trial here.
Validating a mailing address:

Validating a phone number:

Validating an email address:

View log/report:

As you may know, StrikeIron is an Informatica Cloud partner. We recently won another customer account that will be using the StrikeIron Contact Record Verification suite to clean their records as they move between Salesforce.com, a proprietary marketing database, and Eloqua via Informatica Cloud. To help this customer get started, we wanted to be able to run Informatica Cloud on a Mac as well as have a test platform that was remotely accessible from anywhere.
Running Informatica Cloud on AWS accomplished both of these goals. We could run the secure agent on the EC2 instance and then access the Informatica Cloud web front end from a Mac or any of our customer's computers without worrying about firewalls, etc.
This tutorial will go step-by-step through how to create an AWS EC2 Windows Server instance and install the Informatica Cloud Secure agent.
The first step is to create your Amazon AWS account on this page by clicking the “Sign Up” button in the top right corner. The instance created in this tutorial will run in the free tier so if you are a new user, it should not cost you anything. Once your account is created and approved we are ready to start.
Create the instance:
1) Log into your AWS account at: https://console.aws.amazon.com/console/home
2) You should be on the AWS Management Console screen. Click the EC2 icon. This will take you to the EC2 Console Dashboard.
3) Click the “Launch Instance” button to display the Create New Instance Dialog.
4) Make sure the Quick Launch Wizard radio button is selected. There are three key pieces of information you will enter on this screen:
- In the “Name your Instance” field, type “InfaCloudTest” or whatever you would like to call this instance.
- In the “Choose Your Key Pair” section, select the "Create New" radio button and name your security key pair “InfaCloudTest”. The key pair is used to create a secure password for your remote desktop. Click “Download” to download your PEM file to your computer. Note the location as you will need it later.
- Finally, select the instance configuration. Choose “Microsoft Server 2008 Base” with the 64-bit option selected.
Your "Create New Instance" dialog box should now look like this:

5) Click “Continue” to see the next step in the "Create a New Instance" process.
6) The next dialog should look like the following. You should not need to change anything, but there are two important settings to note. First, make sure the Shutdown behavior is set to “Stop”. “Stop” means that if you shut down the instance, all of your data will persist – just like a normal PC. If this option is set to “Terminate”, your instance will be effectively formatted and will also disappear from your instance table next time Amazon does a cleanup sweep.
The next important item is the Security Group. Amazon creates a default security group for you. Depending on what endpoints you connect to, you may need to open up ports in the security group later.

7) Click “Launch” to continue. You will receive a confirmation box saying that your instance is launching. Click “Close”.
8) You will be taken back to the EC2 Management Console. On the top-right hand side, you will see a section called “My Resources”. It should now show that you have 1 running instance (you may need to wait up to 2 minutes then click refresh for it to show up).
9) Click “1 Running Instance” and you will be taken to the “My Instances” page as seen below. Click the check box to the left of your instance name (InfaCloudTest) to display the instance information in the bottom pane. Take a look at this information which includes the full domain name, security groups, and elastic IP if you have linked one (note: we do not need an elastic IP for running Informatica Cloud).
10) Right click on the instance and select “Connect” as seen below:

11) You will see a dialog box like below which contains the remote desktop login details for your instance.
12) Click the “Retrieve Password” link. You may get a warning saying “Not Available Yet”. If so, you will need to wait up to 15 minutes.
13) Click “Choose File” and find the PEM file you downloaded in step 4.
14) Click “Decrypt Password”. This will display a dialog box with the login information.
15) Note the Public DNS, username, and password as you will need this information to Remote Desktop into the machine. You can download a shortcut file to a Remote Desktop Instance as well.
16) Now open your Microsoft Remote Desktop Application. This will be in the Application Folder if you are on a Mac (RDS comes with Office or you can download from: http://www.microsoft.com/mac/remote-desktop-client) or access via "Program Files | Accessories | Remote Desktop Connection" if you are on a PC.
17) For the computer name, enter the Public DNS entry (note: this will change each time you stop and restart an instance).
18) Remote Desktop will pop up a login box. Enter “Administrator” as the User Name and the password you copied from step 15 above. Leave the domain field blank. Click the “Add this information in your keychain” if you are on a Mac to remember your password.
19) You may receive a warning that the server name on the certificate is invalid. Click “Connect”.

20) You should now be logged into your AWS Windows instance and see a Windows desktop.
Installing Informatica Cloud:
21) Start up Internet Explorer. Select “Don’t use recommended settings” if prompted. Internet Explorer comes with very tight security settings on Windows Server, so I suggest you navigate to http://google.com/chrome and download Chrome to save some time and frustration. You will likely have to add several Google domains to the Trusted Sites list when prompted to download.
22) Navigate to www.informaticacloud.com and click “Login Here” in the top right corner.
23) Login using your Informatica Cloud credentials.
24) Click “Configuration”. Click “Agents”.
25) Click the yellow “Download Agent” button.
26) Select “Windows” as the platform and click “Download”.

27) When the agent_install.exe download is complete, click agent_install.exe and then “Run” in the Windows security box.
28) Select the default values for the Informatica Cloud Agent install wizard and click “Done” when complete.
29) Enter your Informatica Cloud credentials and click “Register” in the setup box.
30) After approximately 30 seconds, you should see that the Secure Agent is up and running on the Windows Server.
31) You should see the Agent populate on the Informatica Cloud site in the Configuration | Agents section.
32) If you are going to use files or a database on the AWS Windows Server, you will also need to add a connection to the EC2 instance. For example, to read/write flat files on the Windows Server, in the Informatica Cloud web app, click “Configuration”, and then “Connections”. Click the yellow “New” button:
33) Create a target directory on the Windows Server, "c:\infacloud" in this case, and fill out the new connection information as seen below:

Your Informatica Cloud instance is now ready. You can create Contact Validation, Data Synchronization, and other tasks.
I hope you found this tutorial helpful. Please leave any questions or comments below or feel free to drop us an email at info@strikeiron.com
"Cassandra" is one of the various "Big Data" data storage and retrieval platforms that are available currently, especially for use with Cloud-based applications. Many companies with Web applications serving many simultaneous users that have significant data requirements are now utilizing the platform to serve as the data foundation for these applications. It helps to achieve the performance levels they require to support their large, active user bases, while also available at an attractive price (zero licensing costs, minimal hardware costs) as compared to some of the commercial offerings currently available.
Cassandra was originally developed by Facebook to handle the massive number of parallel reads and writes that were required by their user base when interacting with various Facebook pages, especially when searching. As you can imagine, every time you pull up your Facebook page, many different tables of data are accessed for various purposes to provide the content for a given page, with potentially millions of people accessing this same data at the same time. Performance was obviously a key challenge that required a non-traditional solution. Hence, Cassandra was born.
The key to the Cassandra approach is the elimination of the SQL query language (data is instead written as simple key-value pairs and accessed through its own reduced query language), no support for data joins, and the elimination of other performance-heavy database "overhead" features that are present within Oracle, SQL Server, MySQL, and other traditional database platforms. This feature reduction makes it ideal for storing and retrieving data at high speeds in Web applications with heavy data access loads. It is also architected and optimized for running in the Cloud within multi-tenant, heavy-use applications. Ultimately, it is a sacrifice of features and capability in exchange for speed and simplicity.
Cassandra is especially ideal for Web-scale applications where extensive/high levels of I/O (disk reads and writes) are required. This is different than Hadoop "Big Data" applications (another Big Data platform you might have heard of) for example, where the optimization is more around number-crunching and analytics rather than mass data reads and writes. In fact, these two popular Big Data platforms are more complementary than they are competitive.
Cassandra is an open-source platform available here:
http://cassandra.apache.org/
Here are some benefits of Cassandra:
- Massively scalable
- High performance
- Highly reliable and available
- Redundant: distributed node approach guards against failure/data loss (data is replicated across multiple nodes)
- No single point of failure
- All of the distributed data storage is abstracted away from the applications, so more distributed nodes can be added at any time for increased performance, and the interface to the data access remains the same and very simple
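The replication idea behind these benefits can be sketched with a toy in-memory model. This is not Cassandra's implementation or API: the `ToyCluster` class, the hash-based placement, and the replication factor of 2 are assumptions chosen for illustration. It only shows how an application can keep using the same simple `put`/`get` interface while copies of each key live on more than one node.

```python
import hashlib

class ToyCluster:
    """Toy replicated key-value store (illustrative only, not Cassandra)."""

    def __init__(self, node_names, replication_factor=2):
        if replication_factor > len(node_names):
            raise ValueError("replication factor exceeds node count")
        self.node_names = list(node_names)
        self.replication_factor = replication_factor
        self.nodes = {name: {} for name in self.node_names}  # per-node storage
        self.down = set()                                     # simulated failures

    def _replicas(self, key):
        # Hash the key to pick a starting node, then take the next nodes in order.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.node_names)
        return [self.node_names[(start + i) % len(self.node_names)]
                for i in range(self.replication_factor)]

    def put(self, key, value):
        # Write the value to every replica that is currently reachable.
        for name in self._replicas(key):
            if name not in self.down:
                self.nodes[name][key] = value

    def get(self, key):
        # Read from the first reachable replica that holds the key.
        for name in self._replicas(key):
            if name not in self.down and key in self.nodes[name]:
                return self.nodes[name][key]
        raise KeyError(key)

cluster = ToyCluster(["node-a", "node-b", "node-c"])
cluster.put("user:42", {"name": "Ada"})
first_replica = cluster._replicas("user:42")[0]
cluster.down.add(first_replica)    # simulate losing one node
value = cluster.get("user:42")     # still served by the second replica
```

The simple modulo placement used here would reshuffle most keys whenever a node is added. Cassandra itself uses consistent hashing so that adding a node moves only a small share of the data.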
There are four primary reasons people use Cassandra:
- High volume performance needed for massive number of reads and writes in multi-tenant Web applications.
- Data architecture is fairly simple, not requiring extensive querying capabilities.
- Cost versus commercial database platforms (commercial providers for RDBMS and storage platforms would charge $$$$ for anything near these kinds of performance results).
- Cassandra works across commodity hardware (PCs) - no high end RAID servers, etc. required, keeping costs on the hardware side very low as well.
The popularity of Cassandra is exploding. At the Cassandra Summit last week in Santa Clara, hosted by DataStax (one of the commercial entities focused on Cassandra, along with Acunu), there were over 800 attendees, twice as many as last year and four times as many as at the initial event two years ago. Over 1000 companies (such as Netflix) have Cassandra-based applications in production now. Other companies with Cassandra in production environments include Constant Contact, Twitter, Digg, Walmart Labs, and Cisco Webex. It’s clearly finding usage scenarios and catching on.
Here is a simple tutorial on building a Cassandra-based application from scratch. It will help you understand the basics of implementing Cassandra and demonstrates how easy it is to put to use:
http://www.slideshare.net/patrickmcfadin/cassandra-summit-2012-building-a-cassandra-based-app-from-scratch
So if you are developing Web applications that have a significant associated data requirement, it might be time to give Cassandra a look.

Data cleansing is the process of detecting, diagnosing, and editing faulty data. It deals with data problems once they have occurred. Error prevention strategies can reduce many problems, but cannot eliminate them.
This does not mean we can forgo strategy altogether, though. Without a data cleansing strategy, the data warehouse will suffer from the following:
- lack of quality
- loss of trust
- decrease in business sponsorship and funding
Since data cleansing is tedious and time consuming, a sound methodical strategy is pivotal. A rule-based strategy for data cleansing begins with the understanding that there are really only two options for data cleansing – clean the source data or clean the warehouse data.
When it comes to the latter, the first thought among many organizations is a DIY approach involving manual data cleansing. Manual cleansing makes sense when erroneous data cannot be fixed programmatically, or when the data volumes to be cleansed are so small that automation would be a poor investment.
For the majority of companies, a better suited strategy is automated data cleansing, which handles the cleaning of both warehouse data and source data. As compared to manual cleansing, an automated process can be done on the front- and back-end. Depending on your data, you probably will want to cleanse data as it is collected, as well as later during periodic intervals. An automated process can easily enhance a database by doing timely scheduled cleanses. This is very useful since data quality naturally erodes over time.
Automation should be part of your data cleansing strategy if you have a large-scale database. Manual cleansing is costly and slow compared with an automated process. All or most data errors can be fixed programmatically by applying a cloud-based solution like StrikeIron’s that uses logical rules to cleanse data in real-time.
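As a rough illustration of what "logical rules" can look like in an automated cleanse, here is a minimal Python sketch. The rules, field names, and sample record are hypothetical and are not StrikeIron's rules; a production service would also check values against reference data rather than just tidying their format.

```python
import re

# Each rule is (field, function). All rules here are illustrative assumptions.
RULES = [
    ("name", lambda v: " ".join(v.split()).title()),   # collapse spaces, fix casing
    ("email", lambda v: v.strip().lower()),
    ("state", lambda v: v.strip().upper()),
    ("phone", lambda v: re.sub(r"\D", "", v)),          # keep digits only
]

def cleanse(record):
    """Return a cleaned copy of one record; missing fields are left alone."""
    cleaned = dict(record)
    for field, rule in RULES:
        if field in cleaned and isinstance(cleaned[field], str):
            cleaned[field] = rule(cleaned[field])
    return cleaned

def cleanse_batch(records):
    """Back-end use: run the same rules over rows already in the database."""
    return [cleanse(r) for r in records]

dirty = {
    "name": "  jane   DOE ",
    "email": " Jane.Doe@Example.COM ",
    "state": "nc",
    "phone": "(919) 555-0100",
}
clean = cleanse(dirty)
```

The same `cleanse` function can be called on the front-end as each record is collected and from a scheduled job via `cleanse_batch`, which is the front- and back-end pattern described above.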
See how StrikeIron can fit into any automated data cleansing strategy. Drop us an email at info@strikeiron.com or give us a call at 919.467.4545 to learn more.
Data scrubbing, also known as data cleansing, is the process of changing or removing data in a database that is incorrect, incomplete, improperly formatted, or duplicated. Data scrubbing focuses on cleaning up data by making it more consistent and accurate.
All organizations deal with data, so scrubbing can be useful for a variety of industries. However, certain data-intensive fields such as banking, insurance, retailing, and telecommunications may find it particularly beneficial.
Database errors are prevalent for a variety of reasons. They typically result from human error in entering the data, merging of databases, a lack of company-wide or industry-wide data standards, or old systems that contain outdated data. Before technology had the capability and sophistication to sort and cleanse data, data scrubbing was done by hand. Not only was this time consuming and expensive, but it oftentimes led to even more human error.
This created the need for, and subsequent emergence of, data scrubbing tools, which systematically examine data for flaws by using rules, algorithms, and lookup tables. However, a better alternative is today’s cloud-based solutions that work in real-time. As opposed to on-premise data scrubbing tools, cloud solutions can capture and cleanse data on the front-end. This saves a database administrator a significant amount of time and resources. It is less costly to correct errors from the get-go than to fix them manually on the back-end.
While small errors may seem like a trivial problem, merging corrupt or erroneous data magnifies the problem and makes it far more troublesome. It is so burdensome that it is affectionately called the “dirty data” problem, which has existed for as long as there have been computers. Experts argue that the dirty data problem costs companies from millions to trillions of dollars each year. The problem is becoming increasingly critical as businesses are becoming more complex with more data and systems. There is no point in having a comprehensive database if that database is filled with errors and inaccuracies.
Look for a vendor like StrikeIron that offers cloud-based data quality solutions, not software, that go through a process of using algorithms to standardize, correct, match, and consolidate data.
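The standardize, match, and consolidate steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not StrikeIron's algorithm: records are matched on a normalized email address only, consolidation keeps the first non-empty value seen for each field, and the correction step is left out because it depends on reference data.

```python
def standardize(record):
    """Trim whitespace on string fields and lowercase the email."""
    out = {k: (v.strip() if isinstance(v, str) else v) for k, v in record.items()}
    if out.get("email"):
        out["email"] = out["email"].lower()
    return out

def consolidate(records):
    """Merge records that share the same standardized email (the match key)."""
    merged = {}
    for record in map(standardize, records):
        key = record.get("email")
        if not key:
            # No match key: keep the record on its own under a unique placeholder.
            key = ("no-email", len(merged))
        target = merged.setdefault(key, {})
        for field, value in record.items():
            # Keep the first non-empty value seen for each field.
            if value and not target.get(field):
                target[field] = value
    return list(merged.values())

# Hypothetical rows: the first two describe the same customer.
rows = [
    {"email": "Jane.Doe@example.com ", "name": "Jane Doe", "phone": ""},
    {"email": "jane.doe@example.com", "name": "", "phone": "919-555-0100"},
    {"email": "sam@example.com", "name": "Sam Lee", "phone": ""},
]
result = consolidate(rows)
```

The two Jane Doe rows collapse into a single record that carries both the name and the phone number, which is the kind of duplicate a merged database tends to accumulate.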
Data scrubbing is sometimes skipped as part of a data warehouse or MDM project, but it is one of the most critical steps to having a good, accurate end-product. Since mistakes will always be made in data entry, the need for data scrubbing will always be present. Therefore, implement a cloud solution that can easily adapt as your company evolves and grows with time.