Artificial intelligence in the analysis of customer data – meet Salesforce Einstein Discovery
Salesforce Einstein Discovery is a tool that uses embedded data analysis models supported by artificial intelligence. It is a part of the Salesforce Einstein Analytics package you can read about on our blog.
Einstein Discovery enables Salesforce users to identify patterns and dependencies in the collected data. You do not need to build advanced statistical models to find them. Einstein Discovery analyzes the data provided and generates business stories consisting of descriptions and charts. These help you understand what is happening in the company and why, and then decide which actions are best to take in the future.
Later in the article, we describe Salesforce Einstein Discovery using a company that supplies car parts as an example. Its products are sold to end customers through distributors operating in various regions and industries. The company noticed that the average margins on the products sold were decreasing and decided to investigate why.
Getting ready to work with Salesforce Einstein Discovery
Including Einstein Analytics in a production environment requires additional paid licenses. To try out Einstein Discovery’s capabilities (part of the Einstein Analytics package), you must sign up on this website. After completing the form, you get access to the Salesforce Developer Edition environment, containing the Einstein Discovery module.
How does Salesforce Einstein Discovery work?
There are two stages in the process of creating data analysis in Einstein Discovery:
1. preparing the data set: data import, data processing
2. creating the analysis: indicating what we are looking for, setting the analysis parameters
Preparing the data set
The data set is all the data we have and want to use in the analysis.
To create a data set, we can use the data that we already have in Salesforce. It can come from any single object, either a standard one or one created by us, or it can combine data from several objects, e.g. business opportunities with the accounts and contacts they concern.
Data import – possible data sources
Salesforce Einstein Discovery also allows you to connect directly to external data sources and import data from them, such as:
- Microsoft SQL Server
Importing from a CSV file is also available. This allows you to import data from practically any source.
After loading the data, Salesforce Einstein Discovery automatically analyzes its quality, suggests what errors it may contain and proposes how to correct them.
Examples of tool suggestions:
- Discovery finds values that are very rare. Their small number makes it very difficult to look for patterns between them and the rest of the data. The tool recommends checking that the values are correct, that they do not contain typos and that they are not alternative names for other data. If they turn out to be wrong, you can easily convert them into valid values. It is also possible to delete the indicated rows from the data set.
- If a record contains both the date and the time, Discovery suggests checking whether the exact time matters for the analysis or whether it is better to drop it.
- If two columns have the same name, Discovery recommends renaming one of them.
Errors in data
In our example, Discovery says that the name “Bea-rings” occurs in a small number of records. We can then change it to the correct and more common “Bearings”. It would be difficult to detect this manually when operating on tens of thousands of rows of data.
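The rare-value check described above can be illustrated with a minimal Python sketch. It is not how Einstein Discovery works internally; the 1% threshold, the function names and the sample data are assumptions made only for illustration.

```python
from collections import Counter


def find_rare_values(values, max_share=0.01):
    """Return {value: count} for values whose share of all rows is at most max_share.

    The 1% default is an arbitrary, hypothetical threshold.
    """
    counts = Counter(values)
    total = len(values)
    return {v: n for v, n in counts.items() if n / total <= max_share}


def replace_values(values, mapping):
    """Replace every value found in mapping; leave all other values untouched."""
    return [mapping.get(v, v) for v in values]


# Hypothetical sample column: 300 rows, two of them with a typo.
categories = ["Bearings"] * 198 + ["Bea-rings"] * 2 + ["Alternators"] * 100

rare = find_rare_values(categories)
cleaned = replace_values(categories, {"Bea-rings": "Bearings"})
```

Here `rare` contains only the misspelled name, and after the replacement all 200 bearing rows share one spelling.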
Salesforce Einstein Discovery automatically assigns one of three types to each column: text, number or date. In the analyses, Discovery will examine how numbers change over time and depending on the text data corresponding to them. Before confirming the data, make sure that the types have been assigned correctly. For example, keep in mind that when importing a product ID consisting only of digits, Discovery may mistakenly treat it as a number, not text.
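To see why an all-digit product ID is easy to misclassify, consider a naive type-guessing routine. The sketch below is an illustration only and does not reproduce Discovery's detection logic; the function name, the single date format and the sample IDs are assumptions.

```python
from datetime import datetime


def infer_type(values):
    """Guess 'number', 'date' or 'text' for a column of strings (naive heuristic)."""

    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    def is_date(s):
        try:
            # Only one date format is checked here, to keep the sketch short.
            datetime.strptime(s, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in values):
        return "number"
    if all(is_date(v) for v in values):
        return "date"
    return "text"


# Hypothetical product IDs made only of digits.
product_ids = ["00123", "00456", "10789"]
inferred = infer_type(product_ids)   # the heuristic sees a number

# Treating the ID as a number loses the leading zeros, so the type is
# overridden manually before the data is confirmed.
as_number = int(product_ids[0])
column_types = {"Product ID": inferred}
column_types["Product ID"] = "text"
```

The manual override at the end corresponds to checking and correcting the assigned types before confirming the data.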
In addition, Salesforce Einstein Discovery allows you to:
- add a column containing the average, maximum, minimum, or the result of basic mathematical operations on numeric fields
- for date type fields, add or subtract a given number of days, or check whether the value is before, after or equal to another selected date
- add additional columns from other previously created data sources
- group the data by indicated fields and values
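As a rough illustration of the derived columns listed above, the following Python sketch adds a date offset, a date comparison and a simple arithmetic column to a few rows. Only "Close Date" comes from the example file; the "Price" and "Cost" fields, the new column names, the cutoff date and the values are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical rows; "Price" and "Cost" are assumed fields.
rows = [
    {"Close Date": date(2019, 3, 15), "Price": 120.0, "Cost": 90.0},
    {"Close Date": date(2019, 4, 20), "Price": 80.0, "Cost": 70.0},
]

cutoff = date(2019, 4, 1)  # an arbitrary date to compare against

for row in rows:
    # Date field: add a number of days.
    row["Follow-up Date"] = row["Close Date"] + timedelta(days=30)
    # Date field: check whether the value is before another date.
    row["Closed Before Cutoff"] = row["Close Date"] < cutoff
    # Numeric fields: result of a basic mathematical operation.
    row["Margin"] = row["Price"] - row["Cost"]
```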
In our example company, we will import data from a csv file containing 7 columns of data:
- Close Date
- Discount Level
Our company has 32212 rows of data to import.
Creating the analysis
Having a prepared data source, the Salesforce user can run the analysis on the topic of interest in just a few minutes. No technical programming knowledge or statistical data models are necessary.
To create it, select the data source and click “Create Story”. This opens the settings page, where we indicate which value we want to maximize or minimize.
Additionally, we can:
- indicate which fields of the data source we want to use to build the analysis – by default, all the fields are selected
- decide, if there are date type fields in the source data, whether we want to analyze how the values change over time and, if so, over what range
- indicate the fields that are under our control and that we can modify; Einstein Discovery will point out the ones that have the greatest impact on the results
- change the settings of the analysis model: switch to predicting what results we can obtain in the future, or choose the statistical method we want to use
The selection screen of the main analyzed variable and the selection of the analysis option
After selecting the right parameters, click “Create Story” again. Einstein Discovery will then start analyzing the data. The analysis time depends on the amount of data provided and the options selected.
For the example company, we choose to maximize the margin, as this is the issue of greatest interest. We leave the remaining parameters at their default settings.
Einstein Discovery analyzes the data
What do we get as a result of the analysis?
Salesforce Einstein Discovery presents results in the form of short text tips and charts referring to the parameter that we wanted to investigate (maximize or minimize).
The tool selects the tips it considers the most important and creates a business story, indicating what the parameter being tested depends on and how we can influence its value.
In addition, Discovery lists the tips that the analysis ranks as less important.
We can edit the story by deleting tips or adding them. When we think that it is ready and answers the questions of interest to us, we can view it as a finished presentation and export it to Microsoft PowerPoint or Word.
We can also choose the question that the tips should answer. Depending on previously specified settings, we can choose:
- What happened?
- What has changed over time?
- Why did this happen?
- What can happen?
- How can I improve this?
Einstein Discovery shows how successive fields affect the indicated parameter. It is possible to compare the results depending on one parameter, a combination of several parameters, or a combination of parameters with specific values.
For each result, we’ll get:
- a chart
- the main conclusions in text form
- numerical data showing the value for the selected parameter, the difference from the mean, standard deviation, the number of records
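The numerical data listed above (group value, difference from the mean, standard deviation, number of records) can be reproduced for a small data set with a few lines of Python. This is a minimal sketch, not Discovery's implementation: the field names "Product Type" and "Margin" and the sample values are hypothetical, and the population standard deviation is used as an assumption because the article does not say which variant the tool reports.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_by(rows, group_field, value_field):
    """Per-group mean, difference from the overall mean, std deviation and count."""
    overall = mean(r[value_field] for r in rows)
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_field]].append(r[value_field])
    return {
        name: {
            "mean": mean(values),
            "diff_from_overall": mean(values) - overall,
            "std_dev": pstdev(values),  # population std deviation (assumption)
            "count": len(values),
        }
        for name, values in groups.items()
    }


# Hypothetical sample rows.
rows = [
    {"Product Type": "Seat belts", "Margin": 30.0},
    {"Product Type": "Seat belts", "Margin": 34.0},
    {"Product Type": "Alternators", "Margin": 10.0},
    {"Product Type": "Alternators", "Margin": 14.0},
]

summary = summarize_by(rows, "Product Type", "Margin")
```

In this sample the overall mean margin is 22, so seat belts come out 10 above it and alternators 10 below.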
After the first analysis of the car parts company’s data, Salesforce Einstein Discovery pointed out that the product category had the biggest impact on the margin. The graph and descriptions show that the biggest margins were for seat belts and screws for rotors of brake discs, and the smallest were for alternators and brake pads.
Data analysis – a graph presenting the average margin (vertical axis) depending on the type of product (horizontal axis)
The largest and smallest values are marked in blue.
The product selected by the user is shown in orange. Below it are its value, the standard deviation, the difference from the average value of all products, and the number of records in which it occurs.
Below the graph are the main conclusions found by Salesforce Einstein Discovery.
In the next story, Discovery advises looking at one of the distributors, the company Nisizu. It sells the most products, and for most of them it has above-average margins. However, it has poor results for alternators (below the average margin for this product).
Comparison of the margin depending on the type of product (vertical axis), when the distributor is Nisizu (bars on the left side of the given type of product) and all other companies (bars on the right)
In the upper right corner of the analysis, we can change the question to “Why is this happening?” and indicate the product type “Alternator” as the variable tested. We will get a waterfall chart whose outermost bars show the average result for all data and the result for the selected parameter, with the fields and values that explain the difference between them shown in between.
The graph shows that Nisizu is responsible for selling 35% of all products and for 52% of the sold alternators. The low Nisizu margin for alternators can, therefore, have a large impact on the average margin of the entire company.
The leftmost bar is the average margin for all products, the rightmost bar is the average margin of alternators at Nisizu, and the bars between them show the reasons for the difference. For example, the second bar from the left shows the drop in the margin when we limit the records to alternators. The next bar shows the drop when the customer gets a ‘Gold’ discount.
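The way such a waterfall chart is read can be mimicked with a simple calculation: start from the mean of all records, narrow the data one condition at a time, and record how much the mean changes at each step. The sketch below is an illustration under that assumption only; Discovery's own statistical model is not described in this article, and the field names and values are hypothetical.

```python
from statistics import mean


def waterfall_steps(rows, value_field, filters):
    """Return (label, value) pairs: the starting mean, one delta per filter, the final mean.

    Filters are applied cumulatively. A filter that leaves no rows would make
    mean() raise StatisticsError; this sketch does not handle that case.
    """
    current = list(rows)
    current_mean = mean(r[value_field] for r in current)
    steps = [("All records", current_mean)]
    for label, predicate in filters:
        current = [r for r in current if predicate(r)]
        new_mean = mean(r[value_field] for r in current)
        steps.append((label, new_mean - current_mean))
        current_mean = new_mean
    steps.append(("Result", current_mean))
    return steps


# Hypothetical sample rows.
rows = [
    {"Product Type": "Seat belts", "Distributor": "Nisizu", "Margin": 30.0},
    {"Product Type": "Seat belts", "Distributor": "Dimago", "Margin": 34.0},
    {"Product Type": "Alternators", "Distributor": "Dimago", "Margin": 16.0},
    {"Product Type": "Alternators", "Distributor": "Nisizu", "Margin": 8.0},
    {"Product Type": "Alternators", "Distributor": "Nisizu", "Margin": 12.0},
]

steps = waterfall_steps(
    rows,
    "Margin",
    [
        ("Product Type = Alternators", lambda r: r["Product Type"] == "Alternators"),
        ("Distributor = Nisizu", lambda r: r["Distributor"] == "Nisizu"),
    ],
)
```

The first and last entries correspond to the outer bars of the chart, and the entries in between add up to the difference between them.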
When we change the question to “How do we improve this?”, Discovery advises shifting Nisizu’s sales from alternators to seat belts. In addition, Discovery notes that the company Dimago has above-average margins but only a small share in the sale of our company’s products.
After reviewing all of the above stories, we can come to the following conclusions:
- Nisizu is the main distributor and achieves good results, but we need to work with it on sales and the margin for alternators
- marketing expenditures should be increased for the sale of seat belts
- cooperation with the company Dimago should be strengthened
The results of the analysis can easily be presented to the Management Board. To do this, simply export the charts and tips that have been marked as the most important to PowerPoint and you will receive a finished presentation.
Streamlining the analysis
During data analysis, Salesforce Einstein Discovery continues to evaluate the quality of the data and indicates ways to improve it. After reviewing the analysis and the recommendations for improvement, we can decide whether we want to apply them. If so, Discovery will perform another analysis.
Examples of improvements:
- Removing columns that copy values, if more than one column contains the same information (e.g. the data source may contain the name and ID associated with it).
- Removing records with values that differ greatly from the majority. These are extreme, very rare cases that may distort the mean and the standard deviation.
- Removing fields that, according to the first analysis, have the greatest impact on the tested parameter, but for which we believe there is no real dependence, or whose values are themselves a result of the parameter being investigated.
Recommended ways to improve the analysis
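Two of the improvements listed above, dropping duplicated columns and dropping extreme records, can be sketched in Python as follows. This is only an illustration of the idea: the z-score rule, the threshold of 3, the function names and the sample data are assumptions, not Discovery's documented behavior.

```python
from statistics import mean, pstdev


def is_redundant(rows, col_a, col_b):
    """True if col_a and col_b map one-to-one, e.g. a name and its ID."""
    pairs = {(r[col_a], r[col_b]) for r in rows}
    distinct_a = {a for a, _ in pairs}
    distinct_b = {b for _, b in pairs}
    return len(pairs) == len(distinct_a) == len(distinct_b)


def remove_outliers(rows, field, max_z=3.0):
    """Keep rows whose value lies within max_z standard deviations of the mean."""
    values = [r[field] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(rows)  # all values identical, nothing to remove
    return [r for r in rows if abs(r[field] - mu) / sigma <= max_z]


# Hypothetical data: a distributor name with its ID, and one extreme margin.
rows = [
    {"Distributor": "Nisizu", "Distributor ID": "D1", "Margin": 10.0}
    for _ in range(10)
] + [
    {"Distributor": "Dimago", "Distributor ID": "D2", "Margin": 10.0}
    for _ in range(10)
] + [
    {"Distributor": "Dimago", "Distributor ID": "D2", "Margin": 1000.0}
]

duplicate_columns = is_redundant(rows, "Distributor", "Distributor ID")
filtered = remove_outliers(rows, "Margin")
```

With this sample, the name and ID columns are flagged as carrying the same information, and the single record with a margin of 1000 is dropped.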
Salesforce Einstein Discovery is a tool for analyzing large data sets, using advanced statistical models supported by artificial intelligence.
These analyses can be prepared by people who do not have specialized technical knowledge, in a shorter time than if these models were built from scratch.
In addition, the results obtained are immediately visualized with charts and descriptions, which significantly helps in drawing conclusions.