Data Analysis Using Tableau
There is no question that we are experiencing a data revolution. Data is collected on everything we do and think. Whether your practice focuses on business valuation, litigation support or fraud investigation, you can’t avoid the onslaught of data that is available to you. The questions we all face, however, are what to do with all this data and what is the most efficient way to analyze it.
There are lots of methodologies and applications currently available for data analysis making the task of determining what to use and how to use it even more frustrating. In this short article, I would like to
introduce you to Tableau, a visualization data analytics tool. I’ve used Tableau to analyze data for
litigation support and fraud investigations.
Since a picture is worth a thousand words, I have put together several Tableau exercises using 2016 City of Austin Crime Statistics. The visualizations and the steps required for preparing them are included in this article. You can try creating the visualizations on your own by downloading a copy of Tableau Public which is a free version of Tableau available to the public for learning to use the application https://public.tableau.com/en-us/s/. The 2016 City of Austin Crime data can be downloaded as either a CVS file or Excel file from the City of Austin Crime Data site https://data.austintexas.gov/Public- Safety/2016-Annual-Crime-Data/8iue-zpf6. I’d suggest downloading the data as an Excel file.
After downloading the file as an Excel file, I extracted the first 1,500 rows of data for my analysis, copied the rows into a separate spreadsheet which I named 2016 Austin Crime Data Sample and cleaned up the spreadsheet a little by getting rid of rows with missing addresses.
When you open Tableau Public, you will see on the left side of your screen a menu of file types including Excel and Text (for CVS files). I chose Excel and opened my 2016 Austin Crime Data Sample. Tableau pulled the spreadsheet into the application and displayed the data as shown below:
My first analysis is visual tracking of 2016 crime offenses committed in the City of Austin Zip Codes. On the left side of my visualization, you will see two lists, Dimensions, and Measures. Dimensions are categories or in statistical terms independent variables. Measures are numbers or dependent variables. So, for example, to analyze what zip codes have the most crime you pair up the number of records with zip codes. Records are dependent on zip codes for determining how much crime is occurring in any one zip code.
The right side of the map has a drop down box for selecting the type of offense. By selecting an offense, the viewer can see in which zip code a crime has occurred and how often it has occurred.
To create this analysis in Tableau Public, I used the following procedure:
- I opened Tableau Public and selected my Excel “2016 Austin Crime Data Sample” by clicking Excel from the left side
- Next, I clicked on Sheet 1 to open the Tableau Public workspace. My Dimensions and Measures
are shown on the Left and my workspace is in the middle.
- From Measures, I dragged Longitude onto Columns and Latitude onto Rows. In the Marks Box, I clicked the down arrow and changed Automatic to Map.
- Next, I dragged Go Location Zip onto the Label square (also in the Marks box). The Go Location Zip defined my zip code areas on the map and assigned each area its respective zip
- I also dragged Go Location Zip to the Color square in the Marks Box which resulted in a different color for each zip code area. Since I felt this view was distracting, I changed the multi-colors to a green gradient. To do this, I clicked the Color square and selected edit colors. In the color palette window, I clicked the dropdown arrow and selected green, then clicked on Assign Palette and OK. The color palette selection resulted in a green gradient for the zip code area making the individual zip codes easier to
- Next, I dragged Go Highest Offense from the Dimensions list onto the Filters Box and selected use all and OK. By placing my cursor over the Go Highest offense, a dropdown arrow appeared. I clicked the arrow and selected Show Filter. A filter box appeared on the right side of the screen with all the offense descriptions displayed. I clicked on the dropdown arrow and selected Show Single Value (dropdown). Each type of offense can now be individually selected
by the viewer. When an offense is selected, the interactive zip code map displays which zip codes the offense took place and how many times the offense occurred.
- I dragged Number of Records from the Measures list onto Label to display the total number of offenses per zip
- On the right side of the workspace, the Go Location Zip codes are displayed. Since the zip codes show on the map, this list is unnecessary. To remove it, I placed my mouse over the Go Location Zip title, clicked the down arrow and selected Hide Card. This action removed the Go Location Zip list from
- Finally, to edit the title from Sheet 1 to Offense by Zip I double clicked the Sheet 1 tab to highlight the title and typed over it with the title Offense by Zip.
This type of analysis adapts easily to a vendor fraud investigation. The geographic location of each vendor can be tracked on an interactive map. The dollar amount of vendor purchases, the purchasing agent assigned to each vendor and the type and quantity of the product purchased could all be analyzed and visualized.
The next analysis is a time series example which measures the type and number of offenses committed during a period; in this case, the period is a month.
To create this analysis, I deployed the following Dimensions and Measures:
- I dragged from Measures the Number of Records onto Columns and from Dimensions the Highest NIBRS/UCR Offense onto Rows. This action gave me a one-color bar chart with minimal information.
- To distinguish between the different Offense categories, I held down the control key and dragged Highest NIBRS/UCR Offense from the rows shelf onto Colors. By holding the control key, a copy of the Dimension is placed on Color, and the original Dimension remained on the Row shelf. This action also resulted in a color legend of offenses on the right side of my workspace which I decided was helpful to the
- I also held down the control key and dragged Sum(Number of Records) to the Labels square. This action placed a total number of offenses next to each bar, again aiding the viewers of this visualization.
- As a final step, I dragged the Go Report Date to the Filters Box and selected Month as my filter. By hovering over Month(Go Report Date) a down arrow is available. I clicked the down arrow and selected Show Filter. The Month(Go Report Date) filter appears on the right side of my workspace. Again, by hovering over the title, I clicked the drop-down arrow and selected Single Value (dropdown).
Both of these examples are easy to create and provide an interesting picture of the data employed in the analysis. The concepts and techniques used to create the visualizations are easily transferable to other analysis. The example is only the tip of the iceberg with what is possible with Tableau. I find the time savings to be well worth the time it takes to learn the application. If you want to learn more about how to use Tableau, I’d suggest exploring the Tableau website. There are lots of free training videos to help you get started. https://public.tableau.com/en-us/s/resources