Jun 28, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Data analyst  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Why Using the ‘Cloud’ Can Undermine Data Protections by analyticsweekpick

>> Big, Bad Data: How Talent Analytics Will Make It Work In HR by analyticsweekpick

>> The Definitive Guide to Do Data Science for Good by analyticsweekpick

Wanna write? Click Here

[ NEWS BYTES]

>> Doing Converged Infrastructure Right: A Practical Approach – Data Center Frontier (blog) Under Data Center

>> Machine Learning: Increasing Engagement Within Loyalty Programs – Colloquy.com Under Machine Learning

>> Windows Server 2019 embraces hybrid cloud, hyperconverged data … – Network World Under Hybrid Cloud

More NEWS ? Click Here

[ FEATURED COURSE]

Probability & Statistics

image

This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

How to Create a Mind: The Secret of Human Thought Revealed

image

Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to achieving comparable enterprise adoption. One of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users and community members need to step up and create awareness within the organization. An aware organization goes a long way toward quick buy-ins and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q:Why is naive Bayes so bad? How would you improve a spam detection algorithm that uses naive Bayes?
A: Naïve: the features are assumed independent/uncorrelated
This assumption rarely holds in practice
Improvement: decorrelate the features (transform the covariance matrix into the identity matrix, i.e., whiten the data)
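A minimal sketch of that decorrelation step, using ZCA whitening on synthetic data (the two-feature setup and the 0.1 noise level are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two strongly correlated features: exactly the case that violates
# the naive Bayes independence assumption.
x1 = rng.normal(size=500)
X = np.column_stack([x1, x1 + 0.1 * rng.normal(size=500)])

# ZCA whitening: rotate and rescale so the sample covariance becomes identity.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
X_white = Xc @ W

# The whitened features are uncorrelated with unit variance, so the
# independence assumption is a much better fit for them.
print(np.round(np.cov(X_white, rowvar=False), 2))
```

A classifier would then be fit on `X_white` instead of `X`; the same whitening matrix `W` must be applied to new data at prediction time.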

Source

[ VIDEO OF THE WEEK]

#HumansOfSTEAM feat. Hussain Gadwal, Mechanical Designer via @SciThinkers #STEM #STEAM


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

@AnalyticsWeek #FutureOfData with Robin Thottungal(@rathottungal), Chief Data Scientist at @EPA


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

The largest AT&T database boasts titles including the largest volume of data in one unique database (312 terabytes) and the second largest number of rows in a unique database (1.9 trillion), which comprises AT&T’s extensive calling records.

Sourced from: Analytics.CLUB #WEB Newsletter

Four Ways Big Data Can Improve Customer Surveys

Customer surveys remain the primary source of customer feedback for many companies, despite the growth in adoption of other customer feedback sources like social media, call center conversations and emails. Customer surveys typically contain structured questions, asking customers to rate their level of satisfaction with their experience. Two popular customer surveys are relationship and transactional surveys. The primary difference between the two surveys is that relationship surveys measure attitudes about the experience (looking back over a long period) and transactional surveys measure attitudes in the experience (in the moment). These survey approaches are used to help businesses improve strategic and tactical decision-making, respectively.

Big Data

Customer surveys can provide rich information about customers. They provide the data that fuel the customer experience analytics that generate the customer insights to drive the business forward. In this new Big Data world, however, it’s clear that businesses are now able to use different sources of data (e.g., buying behavior, number of calls) as well as the accompanying technology (e.g., quick processing, text analytics) to get additional customer insights. Here are four things companies can do to improve their customer surveys.

1. Have Clear Analytic Goals

Data Definition Framework
Businesses, having access to so much data, can easily get lost in a cycle of never-ending analysis. Be sure to be clear about your analytic goals. Click image to read about the what and where of Big Data.

Companies are merging different data sources to get better insight about their customers. They are analyzing data from CRM systems, public data (e.g., weather) and inventory data to make better predictions about their customers, resulting in an explosion of possible analyses they can run on the combined data set. With hundreds of variables to consider, it’s easy to get stuck in a never-ending circle of analysis, never acting on any given result (aka analysis paralysis).

Be sure to set clear goals for your analysis. Approach your data with a specific problem you are trying to solve and tailor analytics to solve that problem. Generally speaking, analytics is a way of helping you identify how to optimize an outcome, any outcome. Pick your key outcome (e.g., loyalty, sentiment, see point 4 below) and identify the situations where that outcome is optimized. For example, in call centers, looking at relationships between operational metrics (what happens to the customers) and customer satisfaction ratings (what customers feel) can provide new insights about how to improve the operations to increase customer satisfaction.

2. Personalize Reporting

Role-based_reporting
Provide recipients with reports that are relevant to the decisions they have to make.

With more data come more analytics and metrics. Big Data technology can quickly sift through different metrics and present them in a variety of ways. Because not everybody needs the same reports with the same metrics, make sure your analysis and reporting are tailored to your audiences’ needs. Executives, mid-managers and front-line staff require different types of information to assist them in role-specific decision-making. Executives need high-level reports about their organization to help them make better strategic decisions (e.g., allocating resources to the right areas to improve customer loyalty). Managers need to look at the CX results to help them make better tactical decisions (e.g., knowing what to do to improve the customer experience). Still further, front-line staff in call centers may need more detailed, real-time results to manage callers. The process of quickly pulling, integrating and presenting the right summary helps each group make better decisions.

3. Leverage Text Analytics

Sentiment Distributions of Four Lexicons
Text analytics can be used to tell you what customers are talking about and how they feel (e.g., customer sentiment) about it.  Click image to read about the Customer Sentiment Index for customer surveys.

Data can come in two formats, structured and unstructured. Because much of the data generated today is unstructured, companies have developed ways of extracting value from what customers say about them. Text analytics can be applied to customer surveys to help you get insights about a couple of things.

First, text analytics can categorize comments into general themes, helping you identify common customer pain points. Knowing what customers are saying about you is a good first step toward understanding how to improve their experience.

Second, text analytics provide an effective way of extracting customers’ sentiment or attitudes. The use of sentiment analysis has been driven primarily by attempts to extract attitudes from social media sources like Twitter and Facebook. However, businesses can apply sentiment analysis principles in customer surveys to measure customer sentiment, helping them to identify at risk customers.
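As a rough illustration of lexicon-based sentiment scoring on survey comments (the word lists below are a toy lexicon invented for this sketch; production systems use validated lexicons or trained models):

```python
# Toy sentiment lexicon: illustrative only, not a validated word list.
POSITIVE = {"great", "helpful", "fast", "easy", "love"}
NEGATIVE = {"slow", "confusing", "broken", "rude", "hate"}

def sentiment(comment):
    # Score = positive word count minus negative word count.
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = [
    "Support was fast and helpful",
    "The portal is slow and confusing",
]
for c in comments:
    print(c, "->", sentiment(c))
```

Comments scoring well below zero would flag at-risk customers for follow-up.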

4. Use Real Loyalty Behaviors in your Analysis

IntegratingBusinessData
Customer loyalty is about what customers actually do. Include objective measures of customer loyalty in your CX analytics.

Customer relationship surveys typically include questions asking customers to indicate their loyalty intentions (e.g., likelihood to… recommend, buy, buy more). In the world of Big Data, companies have a history of customer loyalty behaviors in their CRM systems, making self-reported loyalty questions obsolete. In your analytic efforts, use objective loyalty metrics as the criteria along with CX satisfaction questions as the predictors. We are more interested in predicting real loyalty behaviors than behavioral intentions. If we already have access to the metric of interest (behavior), why ask a question about it (intentions)?

Also, using objective loyalty metrics in your analysis reduces the problem of common method variance, which can artificially inflate the correlation between satisfaction and loyalty. The correlation between CX satisfaction and loyalty is used to denote the importance of satisfaction in improving customer loyalty. The size of that correlation is due, in part, to the fact that the same method is used to measure both satisfaction and loyalty (i.e., self-reports using a similar rating scale). Because of common method variance, I think we have been over-inflating the importance of the customer experience, perhaps contributing to unrealistic expectations on the part of executives of what CX improvements will do. Maybe the executives’ lack of action stems from their lack of belief in what ROI analyses of CX improvements typically show.

Summary

Customer experience programs need to adapt in this world of Big Data. I took a look at customer surveys to outline some ways companies can optimize this process. Companies need to incorporate four things in their customer survey process: identify business goals of analytics, personalize reports, leverage text analytics and use objective loyalty behaviors in their analysis. Adopting these practices will help improve the value companies can get from their customer surveys.

This article was originally published on CustomerThink.

Source: Four Ways Big Data Can Improve Customer Surveys

Jun 21, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Data analyst  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> The Question to Ask Before Hiring a Data Scientist by michael-li

>> Digital Marketing 2.0 – Rise of the predictive analytics by analyticsweekpick

>> When will internet see Zetabyte Era? [Infographic] by v1shal

Wanna write? Click Here

[ NEWS BYTES]

>> ‘Sexiest job’ ignites talent wars as demand for data geeks soars – Chicago Tribune Under Data Scientist

>> NB-IoT big guns team up in Europe for international roaming – Siliconrepublic.com Under IOT

>> ‘Killer robots’: AI experts call for boycott over lab at South Korea university – The Guardian Under Artificial Intelligence

More NEWS ? Click Here

[ FEATURED COURSE]

Machine Learning

image

6.867 is an introductory course on machine learning which gives an overview of many concepts, techniques, and algorithms in machine learning, beginning with topics such as classification and linear regression and ending … more

[ FEATURED READ]

Data Science from Scratch: First Principles with Python

image

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn … more

[ TIPS & TRICKS OF THE WEEK]

Save yourself from zombie apocalypse from unscalable models
One living and breathing zombie in today’s analytical models is the pulsating absence of error bars. Not every model is scalable or holds ground with increasing data. Error bars attached to almost every model should be duly calibrated. As business models rake in more data, error bars keep them sensible and in check. If error bars are not accounted for, our models become susceptible to failure, leading to a Halloween we never want to see.

[ DATA SCIENCE Q&A]

Q:How would you come up with a solution to identify plagiarism?
A: * Vector space model approach
* Represent documents (the suspect and original ones) as vectors of terms
* Terms: n-grams; n=1 to as much we can (detect passage plagiarism)
* Measure the similarity between both documents
* Similarity measure: cosine distance, Jaro-Winkler, Jaccard
* Declare plagiarism at a certain threshold

Source

[ VIDEO OF THE WEEK]

Decision-Making: The Last Mile of Analytics and Visualization


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

Without big data, you are blind and deaf and in the middle of a freeway. – Geoffrey Moore

[ PODCAST OF THE WEEK]

#FutureOfData with @theClaymethod, @TiVo discussing running analytics in media industry


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

By 2020, we will have over 6.1 billion smartphone users globally (overtaking basic fixed phone subscriptions).

Sourced from: Analytics.CLUB #WEB Newsletter

Simplifying Loyalty Driver Analysis

Customer Experience Management (CEM) programs use customer feedback data to help understand and improve the quality of the customer relationship. In their attempts to improve systemic problems, companies use these data to identify where customer experience improvement efforts will have the greatest return on investment (ROI). Facing a tidal wave of customer feedback data, how do companies make sense of the data deluge? They rely on Loyalty Driver Analysis, a business intelligence solution that distills the feedback data into meaningful information. This method provides most of the insight you need to direct your customer experience improvement efforts to business areas (e.g., product, service, account management, marketing) that matter most to your customers.

The Survey Data

Let’s say our customer experience management program collects customer feedback using a customer relationship survey that measures satisfaction with the customer experience and customer loyalty. Specifically, these measures are:

  1. Satisfaction with the customer experience for each of seven (7) business areas: Measures that assess the quality of the customer experience. I focus on these seven customer experience areas: 1) Ease of Doing Business, 2) Account Management, 3) Overall Product Quality, 4) Customer Service, 5) Technical Support, 6) Communications from the Company and 7) Future Product/Company Direction. Using a 0 (Extremely Dissatisfied) to 10 (Extremely Satisfied) scale, higher ratings indicate a better customer experience (higher satisfaction).
  2. Customer Loyalty: Measures that assess the likelihood of engaging in different types of loyalty behaviors. I use three measures of customer loyalty: 1) Advocacy Loyalty, 2) Purchasing Loyalty and 3) Retention Loyalty. Using a 0 (Not at all likely) to 10 (Extremely likely) scale, higher ratings indicate higher levels of customer loyalty.

Summarizing the Data

You need to understand only two things about each of the seven business areas: 1) How well you are performing in each area and 2) How important each area is in predicting customer loyalty:

  1. Performance:  The level of performance is summarized by a summary statistic. Different approaches provide basically the same results; pick one that senior executives are familiar with and use it. Some use the mean score (sum of all responses divided by the number of respondents). Others use the “top-box” approach which is simply the percent of respondents who gave you a rating of, say, 9 or 10 (on the 0-10 scale).  So, you will calculate seven (7) performance scores, one for each business area. Low scores reflect a poor customer experience while high scores reflect good customer experience.
  2. Impact:  The impact on customer loyalty can be calculated by simply correlating the ratings of the business area with the customer loyalty ratings. This correlation is referred to as the “derived importance” of a particular business area. So, if the survey measures seven (7) business areas, we will calculate seven (7) correlations. The correlation between the satisfaction scores of a business area and the loyalty index indicates the degree to which performance in the business area has an impact on customer loyalty behavior. Correlations can be calculated using Excel or any statistical software package. Higher correlations (max is 1.0) indicate a strong relationship between the business area and customer loyalty (e.g., the business area is important to customers). Low correlations (near 0.0) indicate a weak relationship between the business area and customer loyalty (e.g., the business area is not important to customers).
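Both summary statistics take only a few lines to compute; the 0-10 survey responses below are simulated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated 0-10 ratings for 200 respondents (illustrative data):
# a loyalty index and one business-area satisfaction score related to it.
loyalty = rng.integers(0, 11, size=200)
area = np.clip(loyalty + rng.integers(-2, 3, size=200), 0, 10)

mean_score = area.mean()                    # performance: mean score
top_box = (area >= 9).mean() * 100          # performance: % rating 9 or 10
impact = np.corrcoef(area, loyalty)[0, 1]   # impact: "derived importance"

print(f"mean={mean_score:.2f}  top-box={top_box:.0f}%  impact={impact:.2f}")
```

Repeating this for each of the seven business areas yields the seven (performance, impact) pairs that get plotted in the matrix.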
Figure 1. Loyalty Driver Matrix is a Business Intelligence Solution

Graphing the Results: The Loyalty Driver Matrix

So, we now have the two pieces of information for each business area: 1) Performance and 2) Impact. Using both the performance index and derived importance for a business area, we plot these two pieces of information for each business area.

The abscissa (x-axis) of the Loyalty Driver Matrix is the performance index (e.g., mean score, top box percentage) of the business areas. The ordinate (y-axis) of the Loyalty Driver Matrix is the impact (correlation) of the business area on customer loyalty.

The resulting matrix is referred to as a Loyalty Driver Matrix (see Figure 1). By plotting all seven datapoints, we can visually examine all business areas at one time, relative to each other.

Understanding the Loyalty Driver Matrix: Making Your Business Decisions

The Loyalty Driver Matrix is divided into quadrants using the average score for each of the axes. Each of the business areas will fall into one of the four quadrants. The business decisions you make about improving the customer experience will depend on the quadrant in which each business area falls:

  1. Key Drivers: Business areas that appear in the upper left quadrant are referred to as Key Drivers. Key drivers reflect business areas that have both a high impact on loyalty and have low performance ratings relative to the other business areas. These business areas reflect good areas for potential customer experience improvement efforts because we have ample room for improvement and we know business areas are linked to customer loyalty; when these business areas are improved, you will likely see improvements in customer loyalty (attract new customers, increase purchasing behavior and retain customers).
  2. Hidden Drivers: Business areas that appear in the upper right quadrant are referred to as Hidden Drivers. Hidden drivers reflect business areas that have a high impact on loyalty and have high performance ratings relative to other business areas. These business areas reflect the company’s strengths that keep the customer base loyal. Consider using these business areas in marketing and sales collateral in order to attract new customers, increase purchasing behaviors or retain customers.
  3. Visible Drivers: Business areas that appear in the lower right quadrant are referred to as Visible Drivers. Visible drivers reflect business areas that have a low impact on loyalty and have high performance ratings relative to other business areas. These business areas reflect the company’s strengths. These areas may not impact loyalty but they are areas in which you are performing well. Consider using these business areas in marketing and sales collateral in order to attract new customers.
  4. Weak Drivers: Business areas that appear in the lower left quadrant are referred to as Weak Drivers. Weak drivers reflect business areas that have a low impact on loyalty and have low performance ratings relative to other business areas. These business areas are the lowest priorities for investment: although performance is low in these areas, they do not have a substantial impact on whether customers will be loyal to your product/company.
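The quadrant logic above can be sketched as follows; the business areas and their (performance, impact) scores are invented for illustration, not taken from the article's data:

```python
# Hypothetical (performance, impact) pairs for four business areas.
areas = {
    "Ease of Doing Business":  (8.0, 0.45),
    "Account Management":      (6.5, 0.70),
    "Overall Product Quality": (8.2, 0.65),
    "Customer Service":        (7.4, 0.50),
}

# Quadrant boundaries are the average score on each axis.
perf_avg = sum(p for p, _ in areas.values()) / len(areas)
impact_avg = sum(i for _, i in areas.values()) / len(areas)

def quadrant(perf, impact):
    if impact >= impact_avg:
        return "Key Driver" if perf < perf_avg else "Hidden Driver"
    return "Weak Driver" if perf < perf_avg else "Visible Driver"

for name, (p, i) in areas.items():
    print(f"{name}: {quadrant(p, i)}")
```

With these invented numbers, Account Management lands in the Key Driver quadrant (high impact, below-average performance), mirroring the software-company example below.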

Example

Figure 2. Loyalty Driver Matrix for Software Company

A software company wanted to understand the health of their customer relationship. Using a customer relationship survey, they collected feedback from nearly 400 of their customers. Applying driver analysis to this set of data resulted in the Loyalty Driver Matrix in Figure 2. The results of this driver analysis show that Account Management is a key driver of customer loyalty; this business area is the top candidate for potential customer experience improvement efforts: it has a large impact on advocacy loyalty AND there is room for improvement.

While the Loyalty Driver Matrix helps steer you in the right direction with respect to making improvements, you must consider the cost of making improvements. Senior management needs to balance the insights from the feedback results with the cost (labor hours, financial resources) of making improvements happen. Maximizing ROI occurs when you are able to minimize the costs while maximizing customer loyalty. Senior executives of this software company implemented product training for their Account teams. This solution was inexpensive relative to the expected gains they would see in new customer growth (driven by advocacy loyalty). Additionally, the company touted the ease of doing business with them as well as the quality of their products, customer service and technical support in their marketing and sales collateral to attract new customers.

Although not presented here, the company also calculated two additional driver matrices based on the results using the other two loyalty indices (purchasing loyalty and retention loyalty). These three Loyalty Driver Matrices provided the foundation for making improvements that would impact different types of customer loyalty.

Summary

Loyalty Driver Analysis is a business intelligence solution that helps companies understand and improve the health of the customer relationship. The Loyalty Driver Matrix is based on two key pieces of information: 1) Performance of the business area and 2) Impact of that business area on customer loyalty. Using these two key pieces of information for each business area, senior executives are able to make better business decisions to improve customer loyalty and accelerate business growth.

Source by bobehayes

Jun 14, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Conditional Risk  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Which Machine Learning to use? A #cheatsheet by v1shal

>> Low Hanging Fruit: Seizing Immediate Business Value from Artificial Intelligence by jelaniharper

>> Join View from the Top 500 (VFT-500) to Share with and Learn from your CEM Peers by bobehayes

Wanna write? Click Here

[ NEWS BYTES]

>> Data Science for the 99%: helping everyone with decision-making – National Science Foundation (press release) Under Data Science

>> Marketing analytics startups Radius and … – Business Insider Under Sales Analytics

>> New Study Focusing on Hadoop-as-a-Service Market CAGR of +70% by 2025: Regulative Landscape, Newly Invented … – Business Services Under Hadoop

More NEWS ? Click Here

[ FEATURED COURSE]

Statistical Thinking and Data Analysis

image

This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and n… more

[ FEATURED READ]

Antifragile: Things That Gain from Disorder

image

Antifragile is a standalone book in Nassim Nicholas Taleb’s landmark Incerto series, an investigation of opacity, luck, uncertainty, probability, human error, risk, and decision-making in a world we don’t understand. The… more

[ TIPS & TRICKS OF THE WEEK]

Winter is coming, warm your Analytics Club
Yes and yes! As we head into winter, what better time to talk about our increasing dependence on data analytics in decision making. Data- and analytics-driven decision making is rapidly sneaking its way into our core corporate DNA, and we are not churning out practice grounds to test those models fast enough. Snug-looking models have hidden nails that can induce uncharted pain if they go unchecked. This is the right time to start thinking about putting an Analytics Club [Data Analytics CoE] in your workplace to lab out the best practices and provide a test environment for those models.

[ DATA SCIENCE Q&A]

Q:What is the Law of Large Numbers?
A: * A theorem that describes the result of performing the same experiment a large number of times
* Forms the basis of frequency-style thinking
* It says that the sample mean, the sample variance and the sample standard deviation converge to what they are trying to estimate
* Example: roll a die; the expected value is 3.5. For a large number of rolls, the average converges to 3.5
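A quick simulation of the die-roll example (the seed and sample sizes are arbitrary choices for this sketch):

```python
import random

random.seed(42)
# The average of n fair die rolls approaches the expected value 3.5
# as n grows: the Law of Large Numbers in action.
for n in (100, 10_000, 1_000_000):
    mean = sum(random.randint(1, 6) for _ in range(n)) / n
    print(f"n={n:>9}: average={mean:.3f}")
```

The spread around 3.5 shrinks roughly as 1/sqrt(n), so each hundredfold increase in rolls tightens the average by about a factor of ten.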

Source

[ VIDEO OF THE WEEK]

Understanding Data Analytics in Information Security with @JayJarome, @BitSight


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

He uses statistics as a drunken man uses lamp posts—for support rather than for illumination. – Andrew Lang

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Eloy Sasot, News Corp


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

Within five years there will be over 50 billion smart connected devices in the world, all developed to collect, analyze and share data.

Sourced from: Analytics.CLUB #WEB Newsletter

Genomics England exploits big data analytics to personalise cancer treatment

In August 2014, UK prime minister David Cameron announced £300m government funding for a four-year project to map 100,000 people’s genomes by 2017. This will be undertaken through a partnership between a government-owned organisation, Genomics England, set up in 2013, and US biotechnology firm Illumina.

The aim of the project is to use big data and genetics to develop personalised medicine. This will not, at least for the time being, mean the development of bespoke treatments for individuals. Instead, it will mean that patients with the same condition, who would previously have received the same therapy, will receive the most appropriate of a range of treatments. Hence terms such as “stratified medicine” and “precision medicine” are used, particularly in the project’s current early stages.


Researchers believe it has enormous potential: “Personalised medicine is the most exciting change in cancer treatment since the invention of chemotherapy,” says Peter Johnson, chief clinician at charity Cancer Research UK.

Anthea Martin, science communications manager at Cancer Research UK, says standard treatments based on type and stage of cancer fail many: “We know it doesn’t work for everyone, because not everyone survives. Every cancer is different.”

As a result, a few cancers are already treated using basic stratified medicine. Tests showing a harmful mutation in BRCA1 or BRCA2 genes – which produce proteins that suppress tumours – lead some women to have mastectomies to avoid breast cancer. The technique is also used for those who develop breast cancer. The drug Herceptin is only useful for patients with a high level of the HER2 receptor, while Tamoxifen is used for those with high levels of oestrogen or progesterone.

Sequencing cancerous DNA

But cancer researchers are aiming to use the DNA of an individual’s cancer cells, rather than the “germline” – the patient’s original, inherited DNA – for the stratification. This requires genetic sequencing of the three billion base pairs of the cancer’s DNA, an IT-dependent process that has become much faster and cheaper since the first draft sequence of the human genome was produced in 2001.

“The computer power has advanced, allowing us to analyse that massive pile of data,” says Martin. “All that has combined to let us look at the genetic nature of cancers.”

Cancer Research UK is taking part in the National Lung Matrix Trial, which will see National Health Service (NHS) patients given one of a number of drugs, based on DNA tests of cancerous cells. This will be funded by the charity and pharmaceutical companies AstraZeneca and Pfizer, which have developed the early-stage medicines that will be used in the trial. So far, work has focused on setting up the logistics – including the ability to run the complex tests quickly, send the data back to hospitals and also store it for future research – with patient treatments due to start towards the end of 2014.

Martin says IT is central to testing and research, and the substantial amounts of data involved present their own problems – for example, some of the charity’s researchers send and receive hard drives by post, because computer networks cannot handle transfers fast enough. Such problems are familiar to Jim Davies, professor of software engineering at the University of Oxford and chief technology officer of Genomics England, where he is working to specify the ICT requirements of the organisation.

Working with human genomic sequences of three billion items generates specific challenges. “You’re dealing with files of 150Gbytes each,” says Davies, adding that files of this size bring demands for data transfer capacity only needed in a few other industries, such as film and video production. Furthermore, he says: “The application programming interfaces (APIs) and abstractions of that data are still under development – there is a lot of work going on with global standards.”

This work is taking place through the Global Alliance for Genomics and Health but, until a standard way of abstracting data from a genomic sequence has been decided, “we don’t yet have a definitive stable data management architecture,” he says. There is a similar issue with the annotation of genetic data.

Privacy and data security

Aside from the size of data and the lack of standards, Genomics England has to establish systems that allow researchers to work effectively while protecting the security of people’s genomic and medical data.

“The idea we’d be giving people a download is almost unthinkable,” says Davies. “We have to keep it in a managed environment where access is recorded.” This will take the form of a virtual desktop, where users do not have programmatic access and everything they do is tracked: “The data stays in the datacentre, you take the results.”

This makes for a challenging procurement process for Genomics England, although it hopes to have contracts in place by next spring. It may use accredited G-Cloud providers for the datacentres – these are already cleared to hold sensitive personal information – but the quantities of data mean services will need to be specifically configured. “We couldn’t stick it on Amazon’s cloud or Azure, as the default configurations of their machines will not match the requirements of the bigger genomic data,” says Davies.

However, he remains confident it will be possible to buy the hosting facility and configure the management of the hardware and lower levels of software for Genomics England through a normal government procurement process, which will start soon.

“At the higher levels of the software stack, we are just going to have to do it with the people who do it at the moment,” he says, referring to those who already serve research institutions including Oxford and Cambridge universities, US institutions and the European Bioinformatics Institute, based at the Wellcome Trust Genome Campus near Cambridge.

Genomics England will also support – although not necessarily pay for – the development of the higher-level software needed, and hopes this could help people working in science research programmes who launch startups for their software.

The organisation is also working on how it will handle patient data from NHS trusts, which will be accredited by NHS England, the management body for the health service in England, to participate in the work (UK devolution means that Scotland, Wales and Northern Ireland will organise their own health services).

A quarter of the scoring as to whether a trust can take part is based on data informatics, so only trusts with good IT are likely to be involved. NHS England is currently part-funding improvements through its £500m Integrated Digital Care Technology fund, which matches spending by trusts.

Patient data and consent

But that still leaves issues over record standards, Davies says. Many NHS pathology reports are dictated verbally by staff: “While formulaic, the data hasn’t been captured in a structured form,” he notes. And parts of the NHS still have legacy communications technology, meaning data recorded in a computerised form can get lost. Davies knows of one multi-disciplinary NHS cancer team meeting whose output is recorded in Microsoft Word, faxed to another department, scanned, then added to the patient’s record as an image. However, he adds: “There’s a push from NHS England to raise your game if we’re going to accredit you as a genomic medicine centre.”

Privacy campaigners have concerns about such work, which they believe are shared by many. “I think people are generally nervous about this shift to genomic data,” says Phil Booth, co-ordinator of MedConfidential. Genetic data can be used for a wide range of purposes and cover more people than just the individual: “It’s not just your genes, it’s your family’s,” points out Booth, giving the example of the potential of DNA records to reveal previously unknown fathers.

Booth believes combining genetic and patient record information “blows out any anonymisation”, with the full set of data making it possible to work out the patient’s identity, even if names and full addresses are removed. “It’s going to need to be managed very carefully indeed,” he says.

While Genomics England intends to store data securely, Booth says there are issues over how consent will be sought. “Absolute clarity is necessary and absolutely rock-solid processes have to be in place. It has to be accepted that this stuff cannot be anonymised. It should be consensual, safe and transparent. If you have those three together, people can make an informed choice,” he says.

Genomics England’s Davies agrees the security of patient data is vital: “There is as great an expectation on us to manage the confidentiality and integrity of this data as with patient data in the NHS. We have an extra obligation, as we’re taking it out of the care pathway. There is potential controversy.”

However, he says patients involved so far have been supportive, with particular enthusiasm from parents of ill children wanting to enrol them in the work: “The only negative feedback we’ve had from patients is: ‘Why is this taking so long?’”

Originally posted via “Genomics England exploits big data analytics to personalise cancer treatment”

 

Source by analyticsweekpick

The Differences Between a Business Analyst & a Data Analyst

The terms business analyst and data analyst are often used interchangeably. In smaller organizations, these positions are indeed the same, and “business analyst” becomes the generic title for tasks that involve data or system analysis. In larger organizations, though the roles sometimes blur in that analysts in both categories access data, what the analysts do with that data is quite different. Their skill sets, and sometimes even the people with whom they work, also differ.

Business Analyst Tasks

Tasks assigned to business analysts include assessing the requirements of the organization with respect to its operations and functions. The analysts then translate the requirements into physical and financial aspects, such as software and hardware specifications, and often help find the best financing options. The design and implementation of new systems also fall under the job description of business analysts. The lines between a business analyst and a programmer are often blurred, as business analysts sometimes provide coding for the new systems and applications they design. Business analysts also test legacy and new systems and recommend changes after their assessment. Thus, their work relates to the organization’s overall systems, ensuring that these systems meet stakeholder requirements.

Business Analyst Characteristics

Business analysts are experts in the industry in which they operate, be it finance, manufacturing, retail or research. For example, a business analyst working in the finance industry must understand calculations for a payback period and internal rate of return, as both calculations are required to calculate returns on an investment. Business analysts must also be comfortable with data manipulation and often must be able to perform analyses using spreadsheets and database tools. Communication skills are critical for business analysts, as they must be able to convey technical messages in layman’s terms so that the receiver can understand, regardless of their technical aptitude. As they must resolve issues relating to end-user interactions with computer systems, business analysts must also be strong problem solvers.

Data Analyst Tasks

The main tasks of data analysts are to collect, manipulate and analyze data. They prepare reports, which may be in the form of visualizations such as graphs, charts and dashboards, detailing the significant results they deduced. Data analysts guard and protect the organization’s data, making sure that the data repositories produce consistent, reusable data. They carry out all these tasks in a technical and systematic way, using the standard formulas and methods that are common in the industry and relevant to the current data. For example, data analysts might perform basic statistics such as variances and averages. They also might predict yields or create and interpret histograms. They use standard methods in all stages, including collection, analysis and reporting.

Data Analyst Skills

Like business analysts, data analysts often possess sharp technical acumen complemented by strong industry knowledge. As the gatekeepers of the company’s data, they often have a thorough understanding of the relationships that exist among the organization’s various databases. They extract information from these databases using complex query statements and advanced database tools. As industry experts, these data miners use their skills to provide data analyses comparing the company’s results to those of competitors and the industry. Sometimes called statisticians or even data scientists, they draw insights from the organization’s data.

Differences

When the roles are segregated, you’ll find differences in the skill sets the analysts employ and the people with whom they interact. As the bridge between the business and the company’s information systems department, business analysts interact with system developers and computer system users, which is why they are often called systems analysts. Data analysts use their skill set to help the company’s management interpret data and test hypotheses. They use their reports and analyses to help management make decisions and set goals. Business analysts and data analysts must both be proficient in modeling, but while business analysts model company infrastructure, data analysts model business data structures.

Originally posted via “The Differences Between a Business Analyst & a Data Analyst”

Source: The Differences Between a Business Analyst & a Data Analyst

Jun 07, 18: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

image
Data interpretation  Source

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ NEWS BYTES]

>>
 Buried in Data? Prescriptive Analytics Helps Fashion Retailers Sift Through the Numbers – WWD Under  Analytics

>>
 Bitcoin Cash [BCH] to fall below $600? – Sentiment Analysis – April 11 – AMBCrypto Under  Sentiment Analysis

>>
 Prediction, anticipation and influence: The importance of AI and … – MarTech Today Under  Machine Learning

More NEWS ? Click Here

[ FEATURED COURSE]

CS109 Data Science

image

Learning from data in order to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data managem… more

[ FEATURED READ]

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World

image

In the world’s top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. In The Mast… more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but not being able to handle that data can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric. This is the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation and it helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q: What is: collaborative filtering, n-grams, cosine distance?
A: Collaborative filtering:
– Technique used by some recommender systems
– Filtering for information or patterns using techniques involving collaboration between multiple agents: viewpoints, data sources
1. A user expresses his/her preferences by rating items (movies, CDs, etc.)
2. The system matches this user’s ratings against other users’ ratings and finds the people with the most similar tastes
3. For those similar users, the system recommends items that they have rated highly but that this user has not yet rated
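The three steps above can be sketched as a minimal user-based collaborative filter in Python. The ratings data and user names below are hypothetical, for illustration only:

```python
# Minimal user-based collaborative filtering sketch (hypothetical data).
ratings = {
    "alice": {"Matrix": 5, "Titanic": 1, "Inception": 4},
    "bob":   {"Matrix": 4, "Titanic": 2, "Inception": 5, "Up": 4},
    "carol": {"Matrix": 1, "Titanic": 5, "Up": 2},
}

def similarity(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = sum(ratings[u][i] ** 2 for i in common) ** 0.5
    norm_v = sum(ratings[v][i] ** 2 for i in common) ** 0.5
    return dot / (norm_u * norm_v)

def recommend(user):
    """Score items the user has not rated, weighted by user similarity."""
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(user, other)
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice"))  # ['Up'] — the item alice's nearest neighbours rated but she hasn't
```

Real systems add mean-centring, neighbourhood cut-offs and matrix factorisation, but the step-by-step logic is the same.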

n-grams:
– Contiguous sequence of n items from a given sequence of text or speech
– “Andrew is a talented data scientist”
– Bi-grams: “Andrew is”, “is a”, “a talented”, …
– Tri-grams: “Andrew is a”, “is a talented”, “a talented data”, …
– An n-gram model models sequences using the statistical properties of n-grams; see: Shannon Game
– More concisely, an n-gram model gives P(X_i | X_{i-(n-1)} … X_{i-1}): a Markov model
– N-gram model: each word depends only on the last n-1 words

Issues:
– When facing infrequent n-grams
– Solution: smooth the probability distributions by assigning non-zero probabilities to unseen words or n-grams
– Methods: Good-Turing, Backoff, Kneser-Ney smoothing
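As a concrete sketch, here is n-gram extraction over the example sentence from above, plus an unsmoothed maximum-likelihood bigram estimate (the smoothing methods listed are deliberately omitted for brevity):

```python
from collections import Counter

def ngrams(text, n):
    """Return the contiguous n-grams of a whitespace-tokenised text."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "Andrew is a talented data scientist"
print(ngrams(sentence, 2))      # ['Andrew is', 'is a', 'a talented', 'talented data', 'data scientist']
print(ngrams(sentence, 3)[:2])  # ['Andrew is a', 'is a talented']

# Unsmoothed maximum-likelihood bigram estimate:
#   P(w_i | w_{i-1}) = count(w_{i-1} w_i) / count(w_{i-1})
bigram_counts = Counter(ngrams(sentence, 2))
unigram_counts = Counter(sentence.split())
p_a_given_is = bigram_counts["is a"] / unigram_counts["is"]
print(p_a_given_is)  # 1.0 — 'is' is always followed by 'a' in this tiny sample
```

On a real corpus the unseen-bigram counts of zero are exactly what Good-Turing, Backoff or Kneser-Ney smoothing would correct.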

Cosine distance:
– How similar are two documents?
– Perfect similarity/agreement: 1
– No agreement: 0 (orthogonality)
– Measures orientation, not magnitude

Given two vectors A and B representing word frequencies:
cosine-similarity(A, B) = ⟨A, B⟩ / (‖A‖ · ‖B‖)
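The formula can be implemented directly over word-frequency vectors built from raw text; this minimal sketch uses two hypothetical example documents:

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity of two documents via word-frequency vectors."""
    a, b = Counter(doc_a.split()), Counter(doc_b.split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())  # <A, B>
    norm_a = math.sqrt(sum(c * c for c in a.values()))   # ||A||
    norm_b = math.sqrt(sum(c * c for c in b.values()))   # ||B||
    return dot / (norm_a * norm_b)

print(cosine_similarity("data science is fun", "data science is hard"))  # 0.75
print(cosine_similarity("data science is fun", "data science is fun"))   # 1.0
```

Because only the angle between the vectors matters, repeating a document’s text verbatim (doubling every count) leaves its similarity scores unchanged — cosine measures orientation, not magnitude.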

Source

[ VIDEO OF THE WEEK]

Agile Data Warehouse Design for Big Data Presentation


Subscribe to  Youtube

[ QUOTE OF THE WEEK]

The data fabric is the next middleware. – Todd Papaioannou

[ PODCAST OF THE WEEK]

@ReshanRichards on creating a learning startup for preparing for #FutureOfWork #JobsOfFuture #Podcast


Subscribe 

iTunes  GooglePlay

[ FACT OF THE WEEK]

The data generated worldwide each day is equivalent to every person in the world having more than 215m high-resolution MRI scans a day.

Sourced from: Analytics.CLUB #WEB Newsletter

Know your options when choosing the appropriate social media platforms for advertising

Social media is attractive to marketers not only for its ability to support marketing but for advertising too. Although social media has no direct impact on search results, it can complement SEO in many ways. In fact, using social media in tandem with a search engine optimization campaign constitutes a forceful marketing package. Although you can adopt hundreds of social media marketing strategies, none of them can assure a consistent inflow of revenue. However, social media advertising is so high-impact that it can ensure consistent sales right from the first day of a campaign. It provides unique opportunities for targeting the audience and driving them to take some positive action. By gathering user information from the social media networks, it is possible to create high-impact advertisements, based on user interaction, that have strong appeal.

Targeted advertising is the specialty of social media and is not possible when you advertise on any other medium. Since the advertisements have their roots in the data derived from user interaction on the platform, they cater to specific groups of the audience to satisfy their needs. Interestingly, you achieve this at a much lower acquisition cost.

Why advertise on the social media

Social media is unmatched in its ability to provide fast ROI to marketers. Whether you take up content marketing or search engine optimization, you have to give it considerable time for the results to show. Quick returns on sales are possible with influencer marketing, but despite the low effort, the cost would be high. Moreover, the results do not keep accruing over time, as sales happen against specific posts and there is no assurance of continuity. Some online advertising channels like AdWords can provide returns, but they take time. When you take the social media advertising route, you can receive consistent sales right from the first day the advertisement appears. The way to do it becomes clear on reading this article.

Choose the popular networks

The popular social media platforms, from Facebook to Instagram, Twitter to Pinterest and LinkedIn to Snapchat, are the best ones for advertising. All these networks have a high potential for allowing websites to gain traction from advertising. Once you have tasted success by advertising on the popular social media platforms, you can experiment with some lesser-known networks. Betting on the popular platforms is a wise decision because of their proven capabilities in hosting high-performing advertising campaigns. Targeted advertising saves cost and assures returns as your customer and follower base keeps increasing. You have better control over the campaigns and can enhance them on an ongoing basis through continuous testing. Scaling up the advertisements is easy, and there is no limit to it.

Facebook advertising

To present your business to almost a quarter of the world’s population, advertise on Facebook. It is possible to reach almost anyone and everyone by placing ads on Facebook. For online businesses engaged in e-commerce, Facebook is excellent at generating leads, and each lead can cost less than a dollar. It is also the go-to place for those seeking email addresses. E-books, white papers, product coupons, limited-time offers, free shipping and site-wide discounts are some typical methods of advertising on Facebook. Once you have the leads, they need nurturing for familiarization with the brand and products, for which you can use an autoresponder.

Instagram advertising

Instagram provides the highest audience engagement rates among all social media networks, which is why advertisers vie to host their ads on Instagram. It would not be an exaggeration to say that Instagram is the king of social media advertising. Its engagement rate is 2,000% higher than Twitter’s and 58% higher than Facebook’s, and it has an impressive number of active users (800 million). If your products lend themselves to compelling visuals, then Instagram, which is built on images and videos, is your best choice. If women, minorities and the 18-29 age group are your targets, then Instagram is the obvious choice.

Twitter advertising

Although Twitter is better known for creating organic engagement and does not have the advertising appeal of Facebook and Instagram, it is still a viable option for running paid advertisements. Riding on its stronger organic engagement, Twitter advertising can be quite remunerative. It is particularly attractive for SMBs, because 60% of Twitter users prefer to buy from them, and Twitter users are 6.9 times more inclined to make online purchases than those who do not use Twitter.

Pinterest advertising

Pinterest is a visual social media network with a huge female following: eighty-one percent of Pinterest users are women. Pinterest also creates high engagement, and the platform is a favorite among users who have an inclination for creative products and have already made up their minds to buy. If your product is custom-built, women-centric and innovative, you should not miss any opportunity to advertise on Pinterest.

LinkedIn advertising

LinkedIn is an attractive platform for advertising, especially for those engaged in B2B marketing. Among its 546 million users as of March 2018, the majority (61%) are in the 30-64 age bracket. LinkedIn is where you find the individuals with the highest average disposable income, with 75% of users earning $50,000 or more annually. Here you can look for the highest-quality leads in certain industry niches.

Snapchat advertising

Snapchat, though a newcomer, has made its mark among the leading social media networks. With 41% of Americans aged 18-34 using the platform, it is a choice for advertisers seeking that reach. The pricing, though, could concern some advertisers.

When social media advertising aligns the target market with the user demographics of the platform, it enhances the conversions that lead to better sales.

Source by thomassujain