Splunk Makes Sense of Machine Data
The body of information targeted by Big Data has two primary sources: humans and machines. On one hand, Big Data involves exploring data created directly by people. Social media content, newsletters, blog posts, and other forms of electronic correspondence are actively created by individuals, and when analyzed for patterns and trends can have significant implications for marketing strategies, branding, and customer engagement.
The other, perhaps less glamorous side of Big Data is the “machine” side, which involves indexing and analyzing computer-generated data produced behind the scenes. Well-deciphered machine data can yield useful insights into customer transactions, security threats, fraudulent activity, and application behavior, among other valuable metrics. San Francisco-based Splunk offers a leading solution for monitoring, searching, and analyzing machine data, and the company’s vast client list and impressive IPO earlier this year are a testament to the importance of harnessing machine data.
Key to understanding what Splunk does is understanding machine data itself. All the devices and applications we use, and much of the infrastructure that supports them, including servers and network switches, automatically create information when in use. This information could be application logs, Web access logs, clickstream data, packet data, or call detail records. This data is a definitive record of user, device, and application activity.
For example, any time a person visits a Web site, their interaction with that site is recorded as clickstream data in a variety of locations, such as routers, proxy servers, and ad servers. This information has value for marketing and usability analysis, but it’s useless unless it’s extracted, indexed, analyzed, and presented in a useful way. This is where Splunk comes in.
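To make the extract-index-analyze idea concrete, here is a minimal Python sketch. It is not how Splunk works internally; it only illustrates the general pattern of turning raw Web access log lines into structured events and then summarizing them. The sample lines, the Common Log Format assumption, and the names parse_line and summarize are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical sample lines, assumed to be in Common Log Format.
# Real data would be read from log files rather than embedded here.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2012:13:55:36 -0700] "GET /products HTTP/1.1" 200 2326
203.0.113.5 - - [10/Oct/2012:13:55:41 -0700] "GET /cart HTTP/1.1" 200 512
198.51.100.7 - - [10/Oct/2012:13:56:02 -0700] "GET /products HTTP/1.1" 200 2326
198.51.100.7 - - [10/Oct/2012:13:56:09 -0700] "GET /checkout HTTP/1.1" 500 87
not a log line
"""

# Extract: one named group per field of interest.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def summarize(text):
    """Count hits per path and server errors (5xx) per path."""
    events = [e for e in map(parse_line, text.splitlines()) if e]
    hits = Counter(e["path"] for e in events)
    errors = Counter(e["path"] for e in events if e["status"].startswith("5"))
    return hits, errors


hits, errors = summarize(SAMPLE_LOG)
for path, count in hits.most_common():
    print(path, count, "errors:", errors[path])
```

Lines that do not match the assumed format are skipped rather than treated as failures, since real machine data is rarely uniform.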
Users download Splunk, point it toward their sources of machine data, and use it to troubleshoot application bugs, investigate security threats, reduce operational costs, and gain real-time visibility into customer behavior. Splunk is free for individuals, and available to corporations through a low-cost enterprise license.
Splunk touts fast time-to-value and return on investment as one of its primary benefits. By offering real-time visibility into the operation of one’s IT infrastructure, Splunk can lead to greater productivity, fewer penalties for SLA violations, fewer instances of fraud, less downtime, and real-time business insight.
Founded in 2003, Splunk was one of the first strictly Big Data companies to go public. Its April 2012 IPO was met with great enthusiasm, and the stock jumped 109% on its first day of trading. Splunk is currently run by CEO Godfrey Sullivan, the former Hyperion chief who sold that company to Oracle in 2007 for $3.3B. Splunk’s customers come from a wide range of industries and include Adobe, Comcast, the United States Department of Defense, Visa, Morgan Stanley, LinkedIn, Salesforce.com, and Raytheon.