Finding success in data science? Find a mentor
Most of us don't feel the need for one, but most of us really could use one. Because data science professionals often work in isolation, getting an unbiased perspective is not easy. It is also often hard to see what a data science career progression will look like. A network of mentors addresses these issues: it gives data professionals an outside perspective and an unbiased ally. Successful data science professionals should build a mentor network and use it throughout their careers.
[ DATA SCIENCE Q&A]
Q: Do we always need the intercept term in a regression model?
A: * It guarantees that the residuals have a zero mean
* It guarantees that the least squares slope estimates are unbiased
* The regression line floats up and down, by adjusting the constant, to a point where the mean of the residuals is zero
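To see the first and third points concretely, here is a minimal sketch (toy numbers, plain-Python least squares) comparing a fit with an intercept against a fit forced through the origin:

```python
# With an intercept, OLS residuals sum to zero; without one, they
# generally do not. Toy data for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# OLS with intercept: slope b and constant a
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
a = my - b * mx
resid_with = [y - (a + b * x) for x, y in zip(xs, ys)]

# OLS through the origin (no intercept): slope only
b0 = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
resid_without = [y - b0 * x for x, y in zip(xs, ys)]

print(round(sum(resid_with), 10))   # 0.0 -- the intercept absorbs the mean
print(round(sum(resid_without), 4)) # typically nonzero
```

The intercept "floats" the line so the residuals average out to zero; forcing the line through the origin removes that degree of freedom.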
Everybody has an opinion about Steve Jobs. Please tell me how he has impacted your life in this brief survey.
I have read more about Steve Jobs after his passing than before. The outpouring of emotion and words of remembrance for him on the Web reflects the impact that he had on people who knew him and people who just used his products. I am part of the latter group.
Writing and Creating
I purchased my first computer, the Macintosh Plus, while I was in graduate school. I was amazed at the things I could do with this machine. I could write, play games (okay, mostly solitaire) and make art. I wrote my first book, Measuring Customer Satisfaction and Loyalty, on that little magical box. My Mac allowed me to create everything for that book, from text and tables to fancy figures, helping me to describe complex ideas like sampling error. Sixteen years later, those exact figures still appear in the third edition of my book.
That book has greatly impacted my life and career. The process of writing the book helped me through a personal breakup. It helped me learn about the topic on which I was writing. It made me a better writer. The book itself even led me into a career in helping companies improve the quality of the relationships they have with their customers. Without the computer that Steve Jobs created, I know my life would have been different from what it is today.
Writing and creating art are a big part of my life. To some degree, I have Steve Jobs to thank for that. I created the word cloud you see in this post, combining the words used to describe him after his passing with the image of him on the Apple.com site. The words are based on many articles/quotes I found online today. Some words represented in this picture are from quotes from President Obama, Mark Zuckerberg, Guy Kawasaki, and Bill Gates, to name a few. The larger the font size, the more frequently that word was used to describe him. This picture represents how people define him, remember him.
I will leave you with words from Steve Jobs. I recently watched a recording of his 2005 commencement address to the graduating class of Stanford. While I enjoyed his entire address, one particular passage resonated with me.
“Remembering that I’ll be dead soon is the most important tool I’ve ever encountered to help me make the big choices in life. Because almost everything – all external expectations, all pride, all fear of embarrassment or failure – these things just fall away in the face of death, leaving only what is truly important. Remembering that you are going to die is the best way I know to avoid the trap of thinking you have something to lose. You are already naked. There is no reason not to follow your heart.”
A strong business case could save your project
Like anything in corporate culture, a project is often about the business, not the technology. The same thinking applies to data analysis: it is not always about the technicalities but about the business implications. Data science success criteria should therefore include project management success criteria as well. This ensures smooth adoption, easier buy-in, room for wins, and cooperative stakeholders. So a good data scientist should also possess some qualities of a good project manager.
[ DATA SCIENCE Q&A]
Q: Is it better to design robust or accurate algorithms?
A: A. The ultimate goal is to design systems with good generalization capacity, that is, systems that correctly identify patterns in data instances not seen before
B. The generalization performance of a learning system strongly depends on the complexity of the model assumed
C. If the model is too simple, the system can only capture the actual data regularities in a rough manner. In this case, the system has poor generalization properties and is said to suffer from underfitting
D. By contrast, when the model is too complex, the system can identify accidental patterns in the training data that need not be present in the test set. These spurious patterns can be the result of random fluctuations or of measurement errors during the data collection process. In this case, the generalization capacity of the learning system is also poor. The learning system is said to be affected by overfitting
E. Spurious patterns, which are only present by accident in the data, tend to have complex forms. This is the idea behind the principle of Occam's razor for avoiding overfitting: simpler models are preferred if more complex models do not significantly improve the quality of the description of the observations
Quick response: Occam's razor. It depends on the learning task; choose the right balance
F. Ensemble learning can help balance bias and variance (several weak learners combined form a strong learner)
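A small sketch of points C and D, fitting a toy noisy sine curve with numpy's polynomial tools: a degree-1 model underfits, while a high-degree model chases the noise. The data, seed and degrees are illustrative choices, not from the source.

```python
import numpy as np

# Underfitting vs overfitting on a noisy sine: compare a too-simple
# (degree-1) and a too-complex (degree-9) polynomial fit.
rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 15)
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test)  # noiseless ground truth

def fit_mse(degree):
    coefs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train_mse, test_mse

simple_train, simple_test = fit_mse(1)   # too simple: underfits
complex_train, complex_test = fit_mse(9) # too complex: memorizes noise

# The complex model always fits the training data at least as well,
# but its advantage on unseen data is not guaranteed -- that gap is
# exactly the generalization problem described above.
print(simple_train, complex_train)
```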
A quarter of decision-makers surveyed predict that data volumes in their companies will rise by more than 60 per cent by the end of 2014, with the average of all respondents anticipating a growth of no less than 42 per cent.
This is Part 2 of a series on the development of the Customer Sentiment Index (see introduction, and Part 1). The CSI assesses the extent to which customers describe your company/brand with words that reflect positive or negative sentiment. This post covers the development of a judgment-based sentiment lexicon and compares it to empirically-based sentiment lexicons.
Last week, I created four sentiment lexicons for use in a new customer experience (CX) metric, the Customer Sentiment Index (CSI). The four sentiment lexicons were empirically derived using data from a variety of online review sites: IMDB, Goodreads, OpenTable and Amazon/Tripadvisor. This week, I develop a sentiment lexicon using a non-empirical approach.
Human JudgmentÂ Approach to Sentiment Classification
The judgment-based approach does not rely on data to derive the sentiment values; rather, this method requires subject matter experts to classify words into sentiment categories. This approach is time-consuming, requiring the subject matter experts to manually classify each of the thousands of words in our empirically-derived lexicons. To minimize the work required of the subject matter experts, an initial set of opinion words was generated using two studies.
In the first study, as part of an annual customer survey, a B2B technology company included an open-ended survey question: “Using one word, please describe COMPANY’S products/services.” Of 1619 completed surveys, 894 customers provided an answer to the question. Many respondents used multiple words or the company’s name as their response, reducing the number of useful responses to 689. These responses contained a total of 251 usable unique words.
Also, the customer survey included questions that required customers to provide ratings on measures of customer loyalty (e.g., overall satisfaction, likelihood to recommend, likelihood to buy different products, likelihood to renew) and satisfaction with the customer experience (e.g., product quality, sales process, ease of doing business, technical support).
In the second study, as part of a customer relationship survey, I solicited responses from customers of wireless service providers (B2C sample). The sample was obtained using Mechanical Turk by recruiting English-speaking participants to complete a short customer survey about their experience with their wireless service provider. In addition to the standard rated questions in the customer survey (e.g., customer loyalty, CX ratings), the following question was used to generate the one-word opinion: “What one word best describes COMPANY? Please answer this question using one word.”
Of 469 completed surveys, 429 customers provided an answer to the question. Many respondents used multiple words or the company’s name as their response, reducing the number of useful responses to 319. These responses contained a total of 85 usable unique words.
Sentiment Rating of Opinion Words
The list of customer-generated words for each sample was independently rated by two experts. I was one of those experts; my good friend and colleague was the other. We both hold a PhD in industrial-organizational psychology and specialize in test development (him) and survey development (me). We have extensive graduate-level training on the topics of statistics and psychological measurement principles. Also, we have applied experience helping companies gain value from psychological measurements. We each have over 20 years of experience in developing/validating tests and surveys.
For each list of words (N = 251 and N = 85), each expert was given the list of words and was instructed to “rate each word on a scale from 0 to 10, where 0 is most negative sentiment/opinion, 10 is most positive sentiment/opinion, and 5 is the midpoint.” After providing a first rating for each word, each of the two raters was then given the opportunity to adjust their initial ratings. For this process, each rater was given the list of words with their initial ratings and was asked to make any adjustments to them.
Results of Human Judgment Approach to Sentiment Classification
Descriptive statistics of and correlations among the expert-derived sentiment values of customer-generated words appear in Table 1. As you can see, the two raters assigned very similar sentiment ratings to words in both sets, and average ratings were similar. The inter-rater agreement between the two raters was r = .87 for the 251 words and r = .88 for the 85 words.
After slight adjustments, the inter-rater agreement between the two raters improved to r = .90 for the list of 251 words and .92 for the list of 85 words. This high inter-rater agreement indicated that the raters were consistent in their interpretation of the two lists of words with respect to sentiment.
Because of the high agreement between the raters and the comparable means between raters, an overall sentiment score for each word was calculated as the average of the raters’ second/adjusted ratings (see Table 1 or Figure 2 for descriptive statistics for this metric).
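As an illustrative sketch of this workflow (the words and ratings below are invented, not the study's data), the inter-rater correlation and the averaged sentiment score can be computed like this:

```python
import math

# Two hypothetical experts rate a handful of words on the 0-10 scale;
# we check inter-rater agreement (Pearson r), then average the two
# ratings into a single sentiment score per word.
rater1 = {"great": 9, "awful": 1, "okay": 5, "slow": 3, "reliable": 8}
rater2 = {"great": 10, "awful": 2, "okay": 5, "slow": 2, "reliable": 8}

words = sorted(rater1)
a = [float(rater1[w]) for w in words]
b = [float(rater2[w]) for w in words]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

r = pearson(a, b)  # inter-rater agreement
sentiment = {w: (rater1[w] + rater2[w]) / 2 for w in words}
print(round(r, 2), sentiment["great"])
```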
Comparing Empirically-Derived and Expert-Derived Sentiment
In all, I have created five lexicons; four lexicons are derived empirically from four data sources (i.e., OpenTable, Amazon/Tripadvisor, Goodreads and IMDB) and one lexicon is derived using subject matter experts’ sentiment classification.
I compared these five lexicons to better understand the similarities and differences among them. I applied the four empirically-derived lexicons to each list of customer-generated words. So, in all, for each list of words, I have five sentiment scores.
The descriptive statistics of and correlations among the five sentiment scores for the 251 customer-generated words appear in Table 2. Table 3 houses the same information for the 85 customer-generated words.
As you can see, there is high agreement among the empirically-derived lexicons (average correlation = .65 for the list of 251 words and .79 for the list of 85 words).
There are statistically significant mean differences across the empirically-derived lexicons; Amazon/Tripadvisor has the highest average sentiment value and Goodreads has the lowest. Lexicons from IMDB and OpenTable provide similar means. The expert judgment lexicon provides the lowest average sentiment ratings for each list of customer-generated words. The absolute sentiment value of a word is dependent on the sentiment lexicon you use. So, pick a lexicon and use it consistently; changing your lexicon could change your metric.
Looking at the correlations of the expert-derived sentiment scores with each of the empirically-derived sentiment scores, we see that the OpenTable lexicon had a higher correlation with the experts than the Goodreads lexicon did. This pattern of results makes sense. The OpenTable sample is much more similar to the sample on which the experts provided their sentiment ratings: OpenTable represents a customer/supplier relationship regarding a service, while the Goodreads sample represents a different type of relationship (customer/book quality).
Summary and Conclusions
These two studies demonstrated that subject matter experts are able to scale words along a sentiment scale. There was high agreement among the experts in their classification.
Additionally, these judgment-derived lexicons were very similar to the four empirically derived lexicons. Lexicons based on subject matter experts’ sentiment classification/scaling of words are highly correlated with empirically-derived lexicons. It appears that each of the five sentiment lexicons tells you roughly the same thing as the others.
The empirically-derived lexicons are less comprehensive than the subject matter experts’ lexicon with respect to customer-generated words. By design, the subject matter experts classified all words that were generated by customers; some of the words used by customers do not appear in the empirically-derived lexicons. For example, the OpenTable lexicon covers only 65% (164/251) of the customer-generated words for Study 1 and 71% (60/85) of the customer-generated words for Study 2. Empirically-derived lexicons used to calculate the Customer Sentiment Index could therefore be augmented with lexicons based on subject matter experts’ classification/scaling of words.
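The coverage comparison above can be sketched as a simple membership check; the lexicon and word list here are invented for illustration:

```python
# What fraction of the customer-generated words appear in an
# empirically derived lexicon? Toy lexicon and word list.
lexicon = {"great": 0.8, "slow": -0.4, "reliable": 0.7, "awful": -0.9}
customer_words = ["great", "buggy", "reliable", "awful", "pricey", "slow"]

covered = [w for w in customer_words if w in lexicon]
coverage = len(covered) / len(customer_words)
print(f"coverage: {coverage:.0%}")  # 67% here; 65% and 71% in the studies
```

Words missing from the lexicon ("buggy", "pricey" above) are exactly the gap that an expert-scaled lexicon can fill.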
In the next post, I will continue presenting information about validating the Customer Sentiment Index (CSI). So far, the analysis shows that the sentiment scores of the CSI are reliable (we get similar results using different lexicons). We now need to understand what the CSI is measuring. I will show this by examining the correlation of the CSI with other commonly used customer metrics, including likelihood to recommend (e.g., NPS), overall satisfaction and CX ratings of important customer touch points (e.g., product quality, customer service). Examining correlations of this nature will also shed light on the usefulness of the CSI in a business setting.
A confluence of new capabilities is creating an innovative, more precise approach to performance improvement. New approaches include advanced analytics, refined sales competency and behavioral models, adaptive learning, and multiple forms of technology enablement. In a prior post (The Myth of the Ideal Sales Profile) we explored an emerging new paradigm that is disrupting traditional thinking with respect to best practices: the world according to YOU.
However, with only 17% of sales organizations leveraging sales talent analytics (TDWI Research), it seems that most CSOs and their HR business partners are gambling, using intuition as the basis for making substantial investments in sales development initiatives. If the gamble doesn't pay off, the investment is wasted.
Is your sales talent aligned to your company's strategy of increasing revenue? According to the Conference Board, 73% of CEOs say no. This lack of alignment is the main reason why 86% of CSOs expect to miss their 2015 revenue targets (CSO Insights). The ability to properly align your sales talent to your company's business goals is the difference between being in the 86% or the 14%.
What Happens When You Assume?
Historically, sales and Human Resource leaders based sales talent alignment decisions — both development of the existing team and acquisition of future talent — on assumptions and somewhat subjective data.
Common practices include:
Polling the field to determine the focus for sales training
Hiring sales talent based largely on the subjective opinion of interviewers
Defining your “ideal seller profile” based on the guidance of industry pundits
Making a hiring decision based on the fact that the candidate made Achiever's Club 3 of the last 5 years at their previous company
Deploying a sales training program based on what a colleague did at their last company
Aligning sales talent based on any of the above is likely to land your company in the 86% because these approaches fail far more often than they succeed. They fail to consider the many cause-and-effect elements that impact success in your company, in your markets, for your products, and for your customers. As proof of their low success rate, a groundbreaking study by ES Research found that 90% of sales training [development initiatives] had no lasting impact after 120 days. And the news isn't any better when it comes to sales talent acquisition; Accenture reports that the average ramp-up time for new reps is 7-12 months.
Defining YOUR Ideal Seller Profile(s)
So how does your organization begin to apply the “new way” (see illustration below) as an approach to optimizing sales performance? It begins with zeroing in on the capabilities of your salespeople that align most closely with the specific goals of your business. In essence, it means understanding what YOUR ideal seller profiles are.
Applying the new way begins with the specific business goals of your company. What if market share growth were the preeminent strategic goal for your organization? Would it not be extremely valuable to understand which sales competencies were most likely to impact that aspect of your corporate strategy? The obvious answer is yes; and the obvious question is how to align and optimize sales to drive increased market share.
How does a CSO identify where to target development in order to have the biggest impact on business results?
By using facts as the basis for these substantial investments. Obtaining facts requires several essential ingredients. The first is a rigorous, comprehensive model of sales competencies; that is, a well-defined model of “what good looks like” for a broad range of sales competencies. This model can be adapted to a specific selling organization and provides the baseline for sales-specific assessments (personality, knowledge, cognitive ability, behavior, etc.).
Then, by applying advanced analytics, including Structural Equation Modeling (SEM), we can begin to identify cause-and-effect relationships between specific competencies and the metrics and goals of YOUR organization. With SEM, CSOs can statistically identify the knowledge and behaviors that set top performers apart from the rest of their team. With this valuable insight, the organization can align both talent development and acquisition to the company's most important business goals.
Sales Talent Analytics Provide Proof
Times have changed. The days of aligning sales talent based on gut feel, assumptions or generally accepted best practices are over. By leveraging sales talent analytics, today's sales leader can apply a proven three-step approach to stop gambling and get the facts: statistically pinpoint where to focus development of the sales team, quantifiably measure the business impact/ROI of that development, and improve the quality of new hires. But buyer beware: not all analytical approaches are equal. The vast majority leverage correlation-based analytics, which can lead to erroneous conclusions.
By the way, we're not eschewing well-designed research that provides insights into the broader application of best practices. Aberdeen Group found that best-in-class sales teams that leverage data and analytics increased team quota attainment 12.3% year over year (vs. 1% for an average company) and increased average deal size 8% year over year (vs. 0.8%).
It's time to define the ideal seller profile for YOUR company. In our next post in this series, we answer the question: how do we capitalize on that understanding to drive the highest impact on our business goals?
Data aids judgment, but does not replace it
Data is a tool, a means to help build consensus and facilitate human decision-making, not to replace it. Analysis converts data into information; information, placed in context, leads to insight; insights lead to decisions, and decisions lead to outcomes that bring value. So data is just the start: context and intuition also play a role.
[ DATA SCIENCE Q&A]
Q: What does NLP stand for?
A: * Natural Language Processing: the interaction between computers and human (natural) languages
* Involves natural language understanding
– Machine translation
– Question answering: what's the capital of Canada?
– Sentiment analysis: extract subjective information from a set of documents, to identify trends or public opinion on social media
Technological innovation has given Human Resources the ability to predict the future, and has moved HR into the boardroom. But it's up to data-savvy HR professionals to make that move permanent.
“Why should I, as a managing director or a CEO, give as much credibility to HR as I do to finance, operations, procurement, sales and marketing?” Personnel Today asked last week. “[Because] those functions are data led; they can provide me with numeric business cases, forecasts and scenarios; [and] I know where I stand with them.”
“This is an absolutely exciting time to be in human resources,” SAP's David Swanson said Monday in Las Vegas, ahead of SuccessConnect 2015. “I've been in HR for the better part of 20 years, and I really feel that for the first time HR is front and center at the executive table.”
CEOs want to learn from past hiring successes and failures, a job that's perfect for analytics-enabled HR departments, according to Swanson. The end goal is to use analytics for predicting the future: knowing whom to hire, and which new hires will most quickly become productive.
Swanson leads a global team of product evangelists. He and his teammates explain to customers and prospects how his company runs cloud-based human capital management software.
“One of the beauties of my role is that I get to go out and talk to companies about how we use the SuccessFactors solutions … to really take a look at what's making a difference in the workplace,” Swanson said. “We can use it to predict future success, versus just saying, ‘Well, I think this might happen.’”
That data-driven confidence will help HR professionals identify behaviors and interview styles that attract better employees, as well as qualities that make effective workers and lead to faster promotions.
HRâs Turning Point
The promise of big data and analytics brought HR to the table, but that promise alone won't keep it there. HR professionals must learn to embrace the technology and wield it effectively.
“We have an opportunity to help make the strategy, not just execute the strategy,” Swanson said. “The way that we can do that is using data and analytics to be able to predict success.”
A successful customer experience management (CEM) program requires the collection, synthesis, analysis and dissemination of customer metrics. Customer metrics are numerical scores or indices that summarize customer feedback results for a given customer group or segment. Customer metrics are typically calculated using customer ratings of survey questions. I recently wrote about how you can evaluate the quality of your customer metrics and listed four questions you need to ask, including how the customer metric is calculated. There needs to be a clear, logical method for how the metric is calculated, including all items (if there are multiple items) and how they are combined.
Calculating Likelihood to Recommend Customer Metric
Let's say that we conducted a survey asking customers the following question: "How likely are you to recommend COMPANY ABC to your friends/colleagues?" Using a rating scale from 0 (not at all likely) to 10 (extremely likely), customers are asked to provide their loyalty rating. How should you calculate a metric to summarize the responses? What approach gives you the most information about the responses?
There are different ways to summarize these responses to arrive at a customer metric. Four common ways to calculate a metric are:
Mean Score: This is the arithmetic average of the set of responses. The mean is calculated by summing all responses and dividing by the number of responses. Possible scores can range from 0 to 10.
Top Box Score: The top box score represents the percentage of respondents who gave the best responses (a 9 or 10 on a 0-10 scale). Possible percentage scores can range from 0 to 100.
Bottom Box Score: The bottom box score represents the percentage of respondents who gave the worst responses (0 through 6 on a 0-10 scale). Possible percentage scores can range from 0 to 100.
Net Score: The net score represents the difference between the Top Box Score and the Bottom Box Score. Net scores can range from -100 to 100. While the net score was made popular by the Net Promoter Score camp, others have used a net score to calculate a metric (please see Net Value Score). While the details might differ, net scores take the same general approach in their calculations (percent of good responses minus percent of bad responses). For the remainder of this post, I will focus on the Net Promoter Score methodology.
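The four calculations above can be sketched as follows, using an invented set of 0-10 ratings:

```python
# The four summary metrics computed from raw 0-10
# likelihood-to-recommend ratings (toy responses).
ratings = [10, 9, 9, 8, 7, 7, 6, 5, 3, 10]

n = len(ratings)
mean_score = sum(ratings) / n
top_box = 100 * sum(r >= 9 for r in ratings) / n     # % of 9s and 10s
bottom_box = 100 * sum(r <= 6 for r in ratings) / n  # % of 0 through 6
net_score = top_box - bottom_box                     # NPS-style net

print(mean_score, top_box, bottom_box, net_score)  # 7.4 40.0 30.0 10.0
```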
Comparing the Customer Metrics
To study these four different ways to summarize the “Likelihood to recommend” question, I wanted to examine how these metrics varied over different companies/brands. Toward that end, I re-used some prior research data by combining responses across three data sets. Each data set is from an independent study about consumer attitudes toward either their PC Manufacturer or Wireless Service Provider. Here are the specifics for each study:
PC manufacturer: Survey of 1058 general US consumers in Aug 2007 about their PC manufacturer. All respondents for this study were interviewed to ensure they met the correct profiling criteria, and were rewarded with an incentive for filling out the survey. Respondents were ages 18 and older. GMI (Global Market Insite, Inc., www.gmi-mr.com) provided the respondent panels and the online data collection methodology.
Wireless service provider: Survey of 994 general US consumers in June 2007 about their wireless provider. All respondents were from a panel of general consumers in the United States ages 18 and older. The potential respondents were selected from a general panel which is recruited through a double opt-in process; all respondents were interviewed to ensure they met the correct profiling criteria. Respondents were given an incentive on a per-survey basis. GMI (Global Market Insite, Inc., www.gmi-mr.com) provided the respondent panels and the online data collection methodology.
Wireless service providers: Survey of 5686 worldwide consumers from Spring 2010 about their wireless provider. All respondents for this study were rewarded with an incentive for filling out the survey. Respondents were ages 18 or older. Mob4Hire (www.mob4hire.com) provided the respondent panels and the online data collection methodology.
From these three studies, across nearly 8000 respondents, I was able to calculate the four customer metrics for 48 different brands/companies. Companies that had 30 or more responses were used for the analyses. Of the 48 different brands, most were from the wireless service provider industry (N = 41); the remaining seven were from the PC industry. Each of these 48 brands had the four different metrics calculated on the "Recommend" question. The descriptive statistics of the four metrics and the correlations across the 48 brands appear in Table 1.
As you can see in Table 1, the four different customer metrics are highly related to each other. The correlations among the metrics vary from .85 to .97 (the negative correlations with the Bottom Box score indicate that it is a measure of badness; higher scores indicate more negative customer responses).
These extremely high correlations tell us that the four metrics say roughly the same thing about the 48 brands. That is, brands with high Mean Scores are those with high Net Scores, high Top Box Scores and low Bottom Box Scores. These metrics are largely redundant.
When you plot the relationship between the Mean Scores and Net Scores, you can clearly see the close relationship between the two metrics (see Figure 1). In fact, the relationship between the Mean Score and NPS is so high that you can, with great accuracy, predict your NPS score (y) from your Mean Score (x) using the regression equation in Figure 1.
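A hedged sketch of how such a regression line could be fit, using ordinary least squares on hypothetical brand-level values (the actual coefficients in Figure 1 come from the 48 brands and are not reproduced here):

```python
# Fit NPS (y) on Mean Score (x) by ordinary least squares across brands.
# The brand-level numbers below are invented for illustration only.
mean_scores = [6.2, 6.8, 7.1, 7.5, 8.0, 8.4]        # x: brand mean scores
nps_scores = [-25.0, -10.0, 0.0, 12.0, 30.0, 44.0]  # y: brand NPS values

n = len(mean_scores)
mx = sum(mean_scores) / n
my = sum(nps_scores) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(mean_scores, nps_scores))
         / sum((x - mx) ** 2 for x in mean_scores))
intercept = my - slope * mx

# Predict NPS for a hypothetical brand with a mean score of 7.8
predicted_nps = intercept + slope * 7.8
print(round(slope, 2), round(predicted_nps, 1))
```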
Mean Score vs Net Promoter Score vs Top/Bottom Box
The “Likelihood to Recommend” question is a commonly used question in customer surveys. I use it as part of a larger set of customer loyalty questions. What is the most efficient way to summarize the results? Based on the analyses, here are some conclusions regarding the different methods.
1. NPS does not provide any additional insight beyond what we know from the Mean Score. Recall that the correlation between the Mean Score and the NPS across the 48 brands was .97! Both metrics tell you the same thing about how the brands are ranked relative to each other. The mean score uses all the data to calculate the metric while the NPS ignores specific customer segments. So, what is the value of the NPS?
2. The NPS is ambiguous and difficult to interpret. An NPS value of 15 could be derived from different combinations of promoters and detractors. For example, one company could arrive at an NPS of 15 with 40% promoters and 25% detractors while another company could arrive at the same NPS of 15 with 20% promoters and 5% detractors. Are these two companies with the same NPS score really the same?
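This ambiguity is easy to demonstrate with the net score formula itself:

```python
# Two very different customer mixes produce the identical NPS of 15.
def nps(promoters_pct, detractors_pct):
    return promoters_pct - detractors_pct

company_a = nps(40, 25)  # 40% promoters, 25% detractors
company_b = nps(20, 5)   # 20% promoters,  5% detractors
print(company_a, company_b)  # 15 15 -- same score, different realities
```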
Also, more importantly, the ambiguity of the NPS lies in the lack of a scale of measurement. While the calculation of the NPS is fairly straightforward (e.g., take the difference of two values to arrive at a score), the score itself becomes meaningless because the difference transformation creates an entirely new scale that ranges from -100 to 100. So, what does a score of zero (0) indicate? Is that a bad score? Does it mean a majority of your customers would not recommend you?
Understanding what an NPS of zero (0) indicates can only occur when you map the NPS value back to the original scale of measurement (the 0 to 10 likelihood scale). A scatterplot (and corresponding regression equation) of NPS and Mean Score is presented in Figure 2. If we plug zero (0) into the equation, the expected Mean Score is 7.1, indicating that a majority of your customers would recommend you (the mean score is above the midpoint of the rating scale). If you know your NPS score, you can estimate your mean score using this formula. Even though it is based on a narrowly defined sample, I think the regression model is more a function of the constraints of the calculations than a characteristic of the sample, so it should provide a good approximation. If you try it, let me know how accurate it is.
3. Top/Bottom Box provides information about clearly defined customer segments.Â Segmenting customers based on their survey responses makes good measurement and business sense. Using top box and bottom box methods helps you create customer segments (e.g., disloyal, loyal, very loyal) that have meaningful differences across segments in driving business growth. So, rather than creating a net score from the customer segments (see number 2), you are better off simply reporting the absolute percentages of the customer segments.
Communicating survey results requires the use of summary metrics. Summary metrics are used to track progress and benchmark against loyalty leaders. There are a variety of ways to calculate summary metrics (e.g., mean score, top box, bottom box, net score), yet the results of my analyses show that these metrics are telling you the same thing. All metrics were highly correlated with each other.
There are clear limitations to the NPS metric. The NPS does not provide any additional insight about customer loyalty beyond what the mean score tells us. The NPS is ambiguous and difficult to interpret. Without a clear unit of measurement for the difference score, the meaning of an NPS score (say, 24) is unclear. The components of the NPS, however, are useful to know.
I typically report survey results using mean scores and top/middle/bottom box results. I find that combining these methods helps paint a comprehensive picture of customer loyalty. Figure 3 includes a graph that summarizes the results of responses across three different types of customer loyalty. I never report Net Scores, as they do not provide any additional insight beyond the mean score or the customer segment scores.
An Analytics Strategy that is Startup-Compliant
With the right tools, capturing data is easy, but not being able to handle that data can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, The Ultimate Metric: the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it sharpens your goals, it inspires innovation, and it helps you understand the entire quantified business.
[ DATA SCIENCE Q&A]
Q: What is cross-validation? How do you do it right?
A: It's a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction and one wants to estimate how accurately a model will perform in practice. The goal of cross-validation is to define a data set to test the model on during the training phase (i.e., the validation data set) in order to limit problems like overfitting, and to gain insight into how the model will generalize to an independent data set.
* The training and validation data sets have to be drawn from the same population
* Predicting stock prices: for a model trained on a certain 5-year period, it's unrealistic to treat the subsequent 5-year period as a draw from the same population
* Common mistake: steps such as choosing the kernel parameters of an SVM should be cross-validated as well
Bias-variance trade-off for k-fold cross validation:
Leave-one-out cross-validation: gives approximately unbiased estimates of the test error, since each training set contains almost the entire data set (n−1 observations).
But: we average the outputs of n fitted models, each of which is trained on an almost identical set of observations, hence the outputs are highly correlated. Since the variance of a mean of quantities increases when the correlation among those quantities increases, the test error estimate from LOOCV has higher variance than the one obtained with k-fold cross-validation.
Typically, we choose k=5 or k=10, as these values have been shown empirically to yield test error estimates that suffer neither from excessively high bias nor from excessively high variance.
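A minimal k-fold sketch (k=5) on toy data, using a mean-predicting baseline as the "model", just to show the split-train-validate mechanics:

```python
import numpy as np

# k-fold cross-validation: split the data into k folds, hold each fold
# out once as the validation set, fit on the rest, and average the errors.
rng = np.random.default_rng(42)
y = rng.normal(5.0, 1.0, 100)  # toy target values

k = 5
folds = np.array_split(rng.permutation(100), k)
errors = []
for i in range(k):
    val_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    prediction = y[train_idx].mean()  # "model" fit on the training folds
    errors.append(np.mean((y[val_idx] - prediction) ** 2))

cv_mse = float(np.mean(errors))  # cross-validated estimate of test error
print(round(cv_mse, 3))
```

Any hyperparameter choice (e.g., an SVM kernel parameter, per the common mistake noted above) would have to be made inside this loop, using only the training folds.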