Tuesday, December 10, 2013

Data Visualization

In previous posts, we looked at storing several kinds of data (unstructured, semi-structured, and structured) and at the platforms used to store them. This data can be the operational data of an organization or a collection of messages from social media such as Facebook or Twitter. Once we have collected and stored the data, we can move on to studying it to uncover useful information. There are several ways to study data, such as statistical analysis and predictive analysis.
One such method is visualizing the data. Many tools are available for data visualization, such as Tableau, Spotfire, Gephi, and Raphael, along with many other JavaScript (JS) libraries that are free to use. But in order to visualize the available data, you first have to understand it and recognize which patterns you are looking for. Without an understanding of the data, you can easily get lost while studying it and creating the visualization. For example, let's say you want to see the Twitter activity of a user over time. You need to define the unit of time (weeks, days, or hours), which is basically the granularity of the attribute. You can then create a two-dimensional graph, with time on one axis and the number of tweets on the other, to view the pattern of tweets posted by the user.
A visualization can be as simple as a single graph, and it gets more complex as the number of attributes you want to consider grows. You will also have to research the available tools and decide which one fulfills your requirements. Tableau is a good option for chart-based visualization such as bar charts and heat maps, while Gephi is good for network visualization.
Data visualization is a method of presenting your data so that it reveals visually appealing patterns, which can help you tell stories in the context of the data. Going back to the Twitter example, you might find that the user's tweet activity is high during the day, which would be normal, or you might observe that it increases during the evening and peaks after midnight. This would suggest that the user is active at night and sleeps or works during the daytime. If you want to dig deeper, you might look at what kind of tweets the user posts at peak time, and consider direct marketing based on the interests inferred from the user's tweet activity and the accounts they follow.
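The day-versus-night pattern can be checked with a small hour-of-day profile. Again, this is just a sketch under my own assumptions: `SAMPLE_TIMES` is made-up data, and `hourly_profile` and `peak_hour` are hypothetical helper names.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps for one user; real data would come from stored tweets.
SAMPLE_TIMES = [
    "2013-12-02 21:30:00",
    "2013-12-02 23:40:00",
    "2013-12-03 00:05:00",
    "2013-12-03 00:50:00",
    "2013-12-04 00:20:00",
    "2013-12-04 10:00:00",
]


def hourly_profile(timestamps):
    """Return a 24-item list: tweets posted in each hour of the day, across all dates."""
    counts = Counter(
        datetime.strptime(text, "%Y-%m-%d %H:%M:%S").hour for text in timestamps
    )
    return [counts.get(hour, 0) for hour in range(24)]


def peak_hour(profile):
    """Hour of day with the most tweets (the earliest hour wins ties)."""
    return max(range(24), key=lambda hour: profile[hour])


profile = hourly_profile(SAMPLE_TIMES)
print("tweets per hour:", profile)
print("peak hour:", peak_hour(profile))
```

With this sample, the peak falls in the hour after midnight. A handful of tweets is of course far too little to conclude anything about a real user, so treat the result as a pattern to investigate, not a fact.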
Depending on the visualization you intend to create, you might also have to prepare the data for loading into the tools you will be using. Some BI and data warehousing tools, such as OBIEE, provide a built-in capability to present selected data visually. Visualized data is most effective in dashboards and reports, which display several types of data side by side and help someone, typically a manager, monitor the operational performance of an organization and make appropriate decisions.
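As one example of data preparation, here is a minimal sketch that turns aggregated counts into CSV text. I am assuming the target tool accepts a plain CSV file with a header row. The column names `date` and `tweet_count`, the sample counts, and the helper `counts_to_csv` are all hypothetical.

```python
import csv
import io

# Hypothetical aggregated data: number of tweets per day.
daily_counts = {"2013-12-02": 2, "2013-12-03": 2, "2013-12-09": 1}


def counts_to_csv(counts, key_header, value_header):
    """Serialize a {key: count} mapping as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([key_header, value_header])
    for key, value in sorted(counts.items()):
        writer.writerow([key, value])
    return buffer.getvalue()


csv_text = counts_to_csv(daily_counts, "date", "tweet_count")
print(csv_text)
```

To load the result into a tool, you would write `csv_text` to a file and import that file. The exact import steps depend on the tool, so check its documentation for the expected columns and types.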
To conclude, I would like to remind you of the phrase "A picture is worth a thousand words", which holds true here: a visualization explains data better than tables of raw values.
