Concepts of Tableau
Basic Theory and Terminologies of Tableau data visualization software
While accessing Tableau basic concepts, I found it difficult to get the relevant content in a single place or in a single website, so I came up with an idea to curate it in the form of simple and yet effective reading material.
Data visualization is the graphical representation of information and data. Data visualization uses statistical graphics, plots, information graphics, and other tools to communicate information clearly and efficiently. Effective visualization helps users analyze and reason about data and evidence. It makes complex data more accessible, understandable, and usable.
Tableau Software is an interactive data visualization software company focused on business intelligence. Tableau is one of the data visualization tools which uses visual elements like charts, graphs, and maps, and provides an accessible way to see and understand trends, outliers, and patterns in data. In the commercial environment, data visualization is often referred to as dashboards. Infographics are another very common form of data visualization.
A dashboard is a presentation of any number of related views and other elements (such as text or images) arranged together as a cohesive whole to communicate a message to an audience. Dashboards are often designed to be interactive.
In this article, we will have an overview of basic concepts and terminologies associated with Tableau Desktop. The purpose is to cover limited and most important topics, not the exhaustive topics of Tableau, in a short span of time.
*The scope is not to cover all the concepts.
Tableau Query Language
The unique and exciting experience of working with data in Tableau is a result of (VizQL, Visual Query Language).
Based on the requirement, it can then be translated into flavors of SQL (Structured Query Language), MDX (Multidimensional Expressions), and TQL (Tableau Query Language, used for extracted data). Tableau automatically performs the translation of VizQL into a native query to be run by the source data engine.
The web data connector allows you to write a connector for any online data you wish to retrieve. Additionally, for any database without a native connection, Tableau gives you the ability to use a generic ODBC connection. The Extract API allows you to programmatically extract and combine any data sources for use in Tableau.
Measure & Dimension
- Measures are values that are aggregated. For example, they are summed, averaged, counted, or have a minimum or a maximum.
- Dimensions are values that determine the level of detail at which measures are aggregated. The combination of dimensions used in the view defines the view’s basic level of detail.
Live & Extract data connection
Live connections offer the convenience of real-time updates, with any changes in the data source reflected in Tableau. But live connections also rely on the database for all queries. With live connections, your data queries are only as fast as the database itself.
Tableau Data Extracts (.tde or .hyper file) are snapshots of data optimized for aggregation and loaded into system memory to be quickly recalled for visualization. Extracts tend to be much faster than live connections, especially in more complex visualizations with large data sets, filters, calculations, etc.
You may choose to use extracts to increase performance over traditional databases as live connection databases are not always optimized for fast performance.
Extracts are advantageous for several reasons:
• Supports large data sets: You can create extracts that contain billions of rows of data.
• Help improve performance: When you interact with views that use extract data sources, you generally experience better performance than when interacting with views based on connections to the original data.
• Support additional functionality: Extracts allow you to take advantage of Tableau functionality that’s not available or supported by the original data, such as the ability to compute Count Distinct.
- Offline data access: Extract provides offline access to the data.
- Aggregate data to be used over transaction data.
Join & Blend
Joins are performed to link tables of data together on a row-by-row basis.
From Tableau 10, it is possible to join tables of data across various data connections for many different data sources.
Joining across different data connections is referred to as a cross-database join. For example, you can join SQL Server tables with text files or Excel files, or tables in one database with tables in another, even if those are on a different server.
Blends are performed to link multiple data sources at an aggregate level.
It allows you to use data from multiple data sources in the same view. Often these sources may be different types. For example, you can blend data from Oracle with data from Excel. Data blending is done at an aggregate level and involves different queries sent to each data source, unlike joining, which is done at a row-level and involves a single query to a single data source.
Data blending occurs by issuing two separate queries and then blending the aggregate results. Data blending simulates a traditional left join.
If you’re working with large sets of data, joins can put a strain on the database and significantly affect performance. In this case, data blending might help. Because Tableau handles combining the data after the data is aggregated, there are fewer data to combine.
Filter
The filtering capability can serve a variety of purposes including minimizing the size of the data for efficiency purposes, cleaning up underlying data, removing irrelevant dimension members, and setting measure or date ranges for what you want to analyze.
6 types of Filters
i. Data Source Filters are useful when you want to limit your analysis to a subset of data, used mainly to restrict sensitive data from the data viewers. These filters are applied before any other filters.
Custom SQL Filter can be accomplished using a live connection with custom SQL, which has a Tableau parameter in the WHERE clause.
ii. Extract Filter limits the data that is stored in an extract (.tde or .hyper). Extract filters are used to filter the extracted data from the data source saving a snapshot of its current state in your workbook and hence reducing the number of times Tableau queries the data source.
iii. Context Filter is a discrete, independent filter on its own, creating datasets based on the original datasheet and the preset chosen for compiling the data. Any other filters that you set are defined as dependent filters because they process only the data that passes through the context filter, and the context filter would ensure that it is first to get processed.
iv. Dimension Filter is another name for non-aggregated filters (blue pills), such as Dimensions, Groups, Bins, Sets, etc. In the filter dialog that pops up, there are four tabs for General, Wildcard, Condition, and Top.
v. Measure Filters are aggregated filters and are applied after non-aggregated filters. When dragging it on, Tableau will automatically aggregate by Sum. The second step gives four options: Range of values, At least, At most, and Special.
vi. Table Calculation Filters is the last filter applied and it applies to the filter after the view has been created. So, if you want to filter the view without filtering the underlying data, Table Calculations Filters are the way to go. Table Calculations are certain functions used when creating Calculated Fields such as LOOKUP, WINDOW_SUM, WINDOW_AVG, etc.
Quick Filters/Filter can be either Dimension Filter or Measure Filter and can control the appearance and interaction of your filter card in the view by selecting a filter card mode.
User filters secure the row-level data published in a server.
Calculations
Calculations allow you to create new data from data that already exists in your data source, as well as perform computations on your data. This allows you to perform complex analyzes and add fields to your data source on your own and on the fly.
4 types of Calculations
i. Row-level calculations - Basic expressions allow you to transform values or members at the data source level of detail (a row-level calculation).
ii. Aggregate-level calculations - Basic expressions allow you to transform values or members at the visualization level of detail (an aggregate calculation).
iii. Level Of Detail (LOD) calculations - Level of Detail or LOD is one of the types of tableau calculations. LOD expressions allow you to compute values at the data source level and the visualization level and can be computed at a level or level (dimension) using LOD.
3 Types of LOD calculations
a. Fixed - Compute a value using the specified dimensions, without reference to the dimensions in the view.
b. Include - Compute a value using the specified dimensions in addition to whatever dimensions are in the view.
c. Exclude - Compute a value not using the specified dimensions and declare dimensions to omit from the view level of detail.
- The syntax for LOD calculation-
{FIXED|INCLUDE|EXCLUDE [Dim 1],[Dim 2],..,[Dim n]: AGG([Measure])}
iv. Table calculations - Table calculations allow you to transform values at the level of detail of the visualization only. It enables solutions that really couldn’t be achieved any other way (writing a custom application or complex custom SQL scripts). Table Calculations are performed after the initial query on the aggregate table of data in Tableau’s cache right before the data visualization is rendered.
4 types of Table Calculations
a. Quick Table Calculation: Quick table calculations allow you to quickly apply a common table calculation to your visualization using the most typical settings for that calculation type.
The following quick table calculations are available in Tableau for you to use:
- Running total
- Difference
- Percent difference
- Percent of total
- Rank
- Percentile
- Moving average
- YTD total
- Compound growth rate
- Year of year growth
- YTD growth
b. Relative Table Calculation: The Table Calculation will be computed relative to the layout of the table. Scope and direction are terms that describe how a Table Calculation is computed relative to the table.
Scope options: Table, pane, and cells.
Direction options: Down, across, down then across, and across then down.
c. Fixed Table Calculation: The Table Calculation will be computed using one or more dimensions. Addressing and partitioning are very similar to scope and direction but are most often used to describe how Table Calculations are computed with absolute reference to certain fields in the view.
d. Custom Table Calculation: Custom Table Calculations are one of the most advanced concepts in Tableau. With nested table calculations, you can set Compute Using configurations for individual calculations independently.
The following custom table calculations are available in Tableau for you to use:
- Meta table functions - Index, First, Last, Size, Lookup, and Previous value.
- Rank functions
- Running functions - Running_Count(), Running_Sum(), Running_Avg(), Running_Min() and Running_Max().
- Window functions - Window_Sum, Window_Avg, Window_Max, and Window_Min.
- Total functions - Total(SUM([Sales])) gives the same result as Window_Sum(SUM([Sales])), but Total(AVG([Sales])) will give a different result from Window_AVG(SUM([Sales])) because Total is giving you the actual average of underlying rows, while the Window function is averaging the sums.
- Script functions - These functions allow for integration with the R analytics platform or Python.
- Late filtering (late filter) - LOOKUP(ATTR([Dimension]), 0).
To see the percentage of the total measure for all dimensions, also exclude some from the display.
Action
An action is a user-initiated event that triggers a response from Tableau.
6 types of Actions
i. Filter - The user’s action causes data from one view to filter data in another one or more views.
ii. Highlight - The user’s action causes specific marks and headers to be highlighted in one or more views and dimming others.
iii. Go to URL - The user’s action creates hyperlinks and a specific URL gets opened (either in a browser, a new tab, or in an embedded web object).
iv. Go to Sheet - The user’s action causes navigation to other worksheets, dashboards, or stories.
v. Change Set values - The user’s action changes the value of a set. Sets may be used in calculations, filters, and on shelves to change visual attributes.
vi. Change Parameter - The user’s action changes the value of a parameter. This allows the user to visually interact with parameters.
Running Actions
i. Hover - Rest the pointer over a mark in the view to run the action. This option works well for highlight and filter actions within a dashboard.
ii. Select - Click on a mark in the view to run the action. This option works well for all types of actions.
iii. Menu - Click on a selected mark in the view and then select an option on the custom or user-defined context menu. This option works well for filter and URL actions.
I believe this article will help the individuals, who are trying to get basic theoretical knowledge and familiar with Tableau as a data visualization solution.
Reference -
• Wikipedia
• Tableau