Tips to Help Get the Most Value from Data Analytics and Database Support

Given the huge amount of data being generated today from multiple sources, businesses must learn to analyze this data properly or it will be of little use. However, extracting value from a variety of data types requires adopting a diverse set of data analytics techniques. Good analytics can help improve the performance of a business. But how exactly do you get started with big data analytics? This post looks at some of the best practices that will help you achieve success with big data analytics.

Have a business problem in mind

Exploring enormous amounts of data with advanced analytics tools can be fun, but it can also be a waste of your team's time and resources if the end results don't translate into something that helps your company solve a problem. This is why, before you get started with big data analytics, the first thing you should do is identify the problem your business needs to solve.

You have to know which problems big data analytics can help you solve. This means that before you even start considering data analytics, you must make sure you acquire the right data. For example, the most important source of data for most businesses is consumer transactions, which yield structured data. Speeches and videos yield unstructured data, which might not even be relevant to your organization.

The rule of thumb here is: before you start with data analytics, find out what kind of business challenge or problem you can address with the data you have. You also need to make sure that the data you end up analyzing is not only accurate but also current and capable of offering real insight. You will need reliable database support if you want to work with quality data at all times.

Focus on deployment

To achieve real value, you have to operationalize the results of big data analytics. The last thing you want is to abandon a project midway; the cost would be immense. The right selection of data is essential: some data may not be available, while other sources may be too expensive to use. There are also industry regulations governing data gathering to keep up with. The analytics development team has to consider how the models they choose will be published and used by the customer service, marketing, operations, and product development teams. A streamlined analytical method will save you time and money and make analytics easier.

Leverage innovation in analytics

Keeping up with the trends in big data analytics is a must. It is important that you invest in the right analytics tools and make sure that you are up-to-date with the data processing techniques being used by other analysts. The right tools and infrastructure will make your work easier and the analytics results more valuable.

There is much more you can do, including leveraging cloud services, balancing automation with expertise, and embracing analytic diversity. When it comes to database management and data analytics, you have to keep learning.

Author Bio:

Sujain Thomas is a data IT professional who works closely with DBA experts to provide her clients with fantastic solutions to their data problems. If you need data IT solutions, she is the person for the job.

Source: Tips to Help Get the Most Value from Data Analytics and Database Support

10 Things to Know about the Technology Acceptance Model

A usable product is a better product.

But even the most usable product isn’t adequate if it doesn’t do what it needs to.

Products, software, websites, and apps need to be both usable and useful for people to “accept” them, both in their personal and professional lives.

That’s the idea behind the influential Technology Acceptance Model (TAM). Here are 10 things to know about the TAM.

1. If you build it, will they come? Fred Davis developed the first incarnation of the Technology Acceptance Model over three decades ago at around the time of the SUS. It was originally part of an MIT dissertation in 1985. The A for “Acceptance” is indicative of why it was developed. Companies wanted to know whether all the investment in new computing technology would be worth it. (This was before the Internet as we know it and before Windows 3.1.) Usage would be a necessary ingredient to assess productivity. Having a reliable and valid measure that could explain and predict usage would be valuable for both software vendors and IT managers.

2. Perceived usefulness and perceived ease of use drive usage. What are the major factors that lead to adoption and usage? There are many variables but two of the biggest factors that emerged from earlier studies were the perception that the technology does something useful (perceived usefulness; U) and that it’s easy to use (perceived ease of use; E). Davis then started with these two constructs as part of the TAM.

Figure 1: Technology Acceptance Model (TAM) from Davis, 1989.

3. Psychometric validation from two studies. To generate items for the TAM, Davis followed the Classical Test Theory (CTT) process of questionnaire construction (similar to our SUPR-Q). He reviewed the literature on technology adoption (from 37 papers) and generated 14 candidate items each for usefulness and ease of use. He tested them in two studies. The first study was a survey of 120 IBM participants on their usage of an email program, which revealed six items for each factor and ruled out negatively worded items that reduced reliability (similar to our findings). The second was a lab-based study with 40 grad students using two IBM graphics programs. This provided 12 items (six for usefulness and six for ease).

       Usefulness Items

1. Using [this product] in my job would enable me to accomplish tasks more quickly.
2. Using [this product] would improve my job performance.*
3. Using [this product] in my job would increase my productivity.*
4. Using [this product] would enhance my effectiveness on the job.*
5. Using [this product] would make it easier to do my job.
6. I would find [this product] useful in my job.*

       Ease of Use Items

7. Learning to operate [this product] would be easy for me.
8. I would find it easy to get [this product] to do what I want it to do.*
9. My interaction with [this product] would be clear and understandable.*
10. I would find [this product] to be flexible to interact with.
11. It would be easy for me to become skillful at using [this product].
12. I would find [this product] easy to use.*

*indicate items that are used in later TAM extensions
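For illustration, the two constructs are typically scored by aggregating their items. Here is a minimal sketch assuming a simple per-construct mean; the function name and the mean-based scoring rule are illustrative assumptions, not Davis's published scoring procedure:

```python
# Sketch: scoring the two TAM constructs from 7-point item responses.
# Item numbering follows the list above: items 1-6 measure usefulness,
# items 7-12 measure ease of use.

def tam_scores(responses):
    """responses: dict mapping item number (1-12) to a 1-7 rating."""
    usefulness = [responses[i] for i in range(1, 7)]
    ease = [responses[i] for i in range(7, 13)]
    return {
        "usefulness": sum(usefulness) / len(usefulness),
        "ease_of_use": sum(ease) / len(ease),
    }

# One respondent who finds the product useful but somewhat hard to use:
ratings = {1: 7, 2: 6, 3: 7, 4: 6, 5: 7, 6: 7,
           7: 3, 8: 4, 9: 3, 10: 4, 11: 3, 12: 4}
print(tam_scores(ratings))  # usefulness ≈ 6.67, ease_of_use = 3.5
```

A pattern like this fits TAM's "high usefulness, low ease" profile: per Davis's observation quoted below in point 5, such a user may still adopt the product.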

4. Response scales can be changed. The first study described by Davis used a 7-point Likert agree/disagree scale, similar to the PSSUQ. For the second study, the scale was changed to a 7-point likelihood scale (from extremely likely to extremely unlikely) with all scale points labeled.

Figure 2: Example of the TAM response scale from Davis, 1989.

Jim Lewis recently tested (in press) four scale variations with 512 IBM users of Notes (yes, TAM and IBM have a long and continued history!). He modified the TAM items to measure actual rather than anticipated experience (see Figure 3 below) and compared different scaling versions. He found no statistical differences in means between the four versions and all predicted likelihood to use equally. But he did find significantly more response errors when the “extremely agree” and “extremely likely” labels were placed on the left. Jim recommended the more familiar agreement scale (with extremely disagree on the left and extremely agree on the right) as shown in Figure 3.

Figure 3: Proposed response scale change by Lewis (in press).

5. It’s an evolving model and not a static questionnaire. The M is for “Model” because the idea is that multiple variables will affect technology adoption, and each is measured using different sets of questions. Academics love models, and the reason is that science relies heavily on models to both explain and predict complex outcomes, from the probability of rolling a 6 to gravity to human attitudes. In fact, there are multiple TAMs: the original TAM by Davis, a TAM 2 that includes more constructs put forth by Venkatesh (2000) [pdf], and a TAM 3 (2008) that accounts for even more variables (e.g. subjective norm, job relevance, output quality, and results demonstrability). These extensions to the original TAM show the increasing desire to explain the adoption (or lack thereof) of technology and to define and measure the many external variables. One finding that has emerged across multiple TAM studies is that usefulness dominates and ease of use functions through usefulness. Or as Davis said, “users are often willing to cope with some difficulty of use in a system that provides critically needed functionality.” This can be seen in the original TAM in Figure 1, where ease of use operates through usefulness in addition to usage attitudes.

6. Items and scales have changed. In the development of the TAM, Davis winnowed the items from 14 to 6 for the ease and usefulness constructs. The TAM 2 and TAM 3 use only four items per construct (the ones with asterisks above and a new “mental effort” item). In fact, another paper by Davis et al. (1989) also used only four. There’s a need to reduce the number of items because as more variables get added, you have to add more items to measure these constructs and having an 80-item questionnaire gets impractical and painful. This again emphasizes the TAM as more of a model and less of a standardized questionnaire.

7. It predicts usage (predictive validity). The foundational paper (Davis, 1989) showed a correlation between the TAM and higher self-reported current usage (r = .56 for usefulness and r = .32 for ease of use), which is a form of concurrent validity. Participants were also asked to predict their future usage, and this prediction had a strong correlation with ease and usefulness in the two pilot studies (r = .85 for usefulness and r = .59 for ease). But these correlations were derived from the same participants at the same time (with no longitudinal component), which has the effect of inflating the correlation. (People say they will use things more when they rate them higher.) Another study by Davis et al. (1989), however, did have a longitudinal component. It used 107 MBA students who were introduced to a word processor and answered four usefulness and four ease of use items; 14 weeks later the same students answered the TAM again along with self-reported usage questions. Davis reported a modest correlation between behavioral intention and actual self-reported usage (r = .35), and the model explained 45% of the variance in behavioral intention, establishing some level of predictive validity. Later studies by Venkatesh et al. (1999) also found a correlation of around r = .5 between behavioral intention and both actual usage and self-reported usage.
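The validity checks described here boil down to computing a Pearson r between construct scores and usage. A minimal sketch with invented per-respondent data; only the method, not the numbers, matches the studies above:

```python
# Pearson correlation between a TAM construct score and self-reported usage.

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores: mean usefulness rating (1-7) per respondent
# and self-reported hours of use per week.
usefulness = [6.5, 5.0, 3.2, 6.8, 4.1, 2.5]
usage_hours = [10, 6, 2, 12, 5, 1]
print(round(pearson_r(usefulness, usage_hours), 2))  # strongly positive
```

A longitudinal check like the Davis et al. (1989) study would instead correlate intention measured at time 1 with usage measured weeks later, avoiding the same-time inflation noted above.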

8. It extends other models of behavioral prediction. The TAM was an extension of the popular Theory of Reasoned Action (TRA) by Ajzen and Fishbein but applied to the specific domain of computer usage. The TRA is a model that suggests that voluntary behavior is a function of what we think (beliefs), what we feel (attitudes), our intentions, and subjective norms (what others think is acceptable to do). The TAM posits that our beliefs about ease and usefulness affect our attitude toward using, which in turn affects our intention and actual use. You can see the similarity in the TRA model in Figure 4 below compared to TAM in Figure 1 above.

Figure 4: The Theory of Reasoned Action (TRA), proposed by Ajzen and Fishbein, of which the TAM is a specific application for technology use.

9. There are no benchmarks. Despite its wide usage, there are no published benchmarks for TAM total scores or for the usefulness and ease of use constructs. Without a benchmark, it is difficult to know whether a product (or technology) scores above a sufficient threshold – that is, whether potential or current users find it useful and will adopt it or continue to use it.

10. The UMUX-Lite is an adaptation of the TAM. We discussed the UMUX-Lite in an earlier article. It has only two items, whose wording is similar to the original TAM items: [This system’s] capabilities meet my requirements (which maps to the usefulness component), and [This system] is easy to use (which maps to the ease component). Our earlier research has found that even single items are often sufficient to measure a construct (like ease of use). We expect the UMUX-Lite to increase in usage in the UX industry and help generate benchmarks (which we’ll help with too!).

Thanks to Jim Lewis for providing a draft of his paper and commenting on an earlier draft of this article.

Source: 10 Things to Know about the Technology Acceptance Model by analyticsweek

IBM Invests to Help Open-Source Big Data Software — and Itself

The IBM “endorsement effect” has often shaped the computer industry over the years. In 1981, when IBM entered the personal computer business, the company decisively pushed an upstart technology into the mainstream.

In 2000, the open-source operating system Linux was viewed askance in many corporations as an oddball creation and even legally risky to use, since the open-source ethos prefers sharing ideas rather than owning them. But IBM endorsed Linux and poured money and people into accelerating the adoption of the open-source operating system.

On Monday, IBM is to announce a broadly similar move in big data software. The company is placing a large investment — contributing software developers, technology and education programs — behind an open-source project for real-time data analysis, called Apache Spark.

The commitment, according to Robert Picciano, senior vice president for IBM’s data analytics business, will amount to “hundreds of millions of dollars” a year.

Photo courtesy of Pingdom via Flickr

In the big data software market, much of the attention and investment so far has been focused on Apache Hadoop and the companies distributing that open-source software, including Cloudera, Hortonworks and MapR. Hadoop, put simply, is the software that makes it possible to handle and analyze vast volumes of all kinds of data. The technology came out of the pure Internet companies like Google and Yahoo, and is increasingly being used by mainstream companies, which want to do similar big data analysis in their businesses.

But if Hadoop opens the door to probing vast volumes of data, Spark promises speed. Real-time processing is essential for many applications, from analyzing sensor data streaming from machines to sales transactions on online marketplaces. The Spark technology was developed at the Algorithms, Machines and People Lab at the University of California, Berkeley. A group from the Berkeley lab founded a company two years ago, Databricks, which offers Spark software as a cloud service.

Spark, Mr. Picciano said, is crucial technology that will make it possible to “really deliver on the promise of big data.” That promise, he said, is to quickly gain insights from data to save time and costs, and to spot opportunities in fields like sales and new product development.

IBM said it will put more than 3,500 of its developers and researchers to work on Spark-related projects. It will contribute machine-learning technology to the open-source project, and embed Spark in IBM’s data analysis and commerce software. IBM will also offer Spark as a service on its programming platform for cloud software development, Bluemix. The company will open a Spark technology center in San Francisco to pursue Spark-based innovations.

And IBM plans to partner with academic and private education organizations including UC Berkeley’s AMPLab, DataCamp, Galvanize and Big Data University to teach Spark to as many as 1 million data engineers and data scientists.

Ion Stoica, the chief executive of Databricks, who is a Berkeley computer scientist on leave from the university, called the IBM move “a great validation for Spark.” He had talked to IBM people in recent months and knew they planned to back Spark, but, he added, “the magnitude is impressive.”

With its Spark initiative, analysts said, IBM wants to lend a hand to an open-source project, woo developers and strengthen its position in the fast-evolving market for big data software.

By aligning itself with a popular open-source project, IBM, they said, hopes to attract more software engineers to use its big data software tools, too. “It’s first and foremost a play for the minds — and hearts — of developers,” said Dan Vesset, an analyst at IDC.

IBM is investing in its own future as much as it is contributing to Spark. IBM needs a technology ecosystem, where it is a player and has influence, even if it does not immediately profit from it. IBM mainly makes its living selling applications, often tailored to individual companies, which address challenges in their business like marketing, customer service, supply-chain management and developing new products and services.

“IBM makes its money higher up, building solutions for customers,” said Mike Gualtieri, an analyst for Forrester Research. “That’s ultimately why this makes sense for IBM.”

To read the original article on The New York Times, click here.

Source by analyticsweekpick

The Difference Between Big Data and Smart Data in Healthcare

“Physicians are baffled by what feels like the ‘physician data paradox,’” CMS Acting Administrator Andy Slavitt said earlier this spring.

“They are overloaded on data entry and yet rampantly under-informed. And physicians don’t understand why their computer at work doesn’t allow them to track what happens when they refer a patient to a specialist when their computer at home connects them everywhere.”

Spotty health information exchange and insufficient workflow integration are two of the major concerns when it comes to accessing the right data at the right time within the EHR.

A new survey from Quest Diagnostics and Inovalon found that 65 percent of providers do not have the ability to view and utilize all the patient data they need during an encounter, and only 36 percent are satisfied with the limited abilities they have to integrate big data from external sources into their daily routines.

On the surface, it appears that more data sharing should be the solution. If everyone across the entire care continuum allows every one of its partners to view all its data, shouldn’t providers feel more equipped to make informed decisions about the next steps for their patients?

Yes and no. As the vast majority of providers have already learned to their cost, more data isn’t always better data – and big data isn’t always smart data. Even when providers have access to health information exchange, the data that comes through the pipes isn’t always very organized, or may not be in a format they can easily use.

“We’re going through very profound business model changes in healthcare right now, and providers are targeting processes that will help them with the transition from volume to value.”

Scanning through endless PDFs recounting ten-year-old blood tests and x-rays for long-healed fractures won’t necessarily help a primary care provider diagnose a patient’s stomach ailment or figure out why they are reacting negatively to a certain medication.

Actionable insights are the key to using big data analytics effectively, yet they are as rare and elusive as a patient who always takes all her medications on time and never misses a physical.

Source by analyticsweek

Predictive Workforce Analytics Studies: Do Development Programs Help Increase Performance Over Time?

Greta Roberts, CEO, Talent Analytics, Corp.
Chair – Predictive Analytics World for Workforce

At Talent Analytics, our predictive workforce engagements yield staggering results, saving or making businesses millions of real, measurable dollars – often in a single project. The business ROI of these predictive projects is so significant that I wanted to share some of our findings, as they may challenge some concepts we hold closely.

Large organizations spend millions on training, coaching, mentoring, re-training, and competency development programs. We do this believing that employee development programs can train someone to better or even great performance. It seems to make sense; after all, once someone is hired, trying to develop them to greatness is the only option a business has. Better training, better managers, better raises, better perks, better culture, better benefits, more time off, better work-life balance – more and more that the organization needs to do to prop up and, hopefully, “develop” the employee into a better performer. But does it work?

Here’s the thing.

Data Science Studies: Do Development Programs Help Increase Performance Over Time?

We recently completed two analytics studies quantitatively analyzing sales rep performance. Among other goals, we analyzed performance over time – both before and after training was completed. We documented reps' sales performance as new hires, during training, and finally after they reached full self-sufficiency in their roles.

  • Both sets of analyses were for sales roles at vastly different types of companies
  • The first sales group consisted of financial advisors
  • The second sales group sold web/email and internet connectivity into consumer homes
  • Each business measured real sales performance (not performance ratings) over time, during and beyond formal and informal training
  • Both companies had made a significant, long-term financial investment in formal training and coaching development programs to help sales reps optimize their sales performance

Results: Two Data Science Studies Show Sales Rep. Performance Does Not Measurably Increase Over Time — Even With Substantial Training and Development by the Business

What we found is perhaps shocking to many, though we see this time and again in our predictive analytics work across many roles. For the purposes of this paper, we’ll show graphs from the work we did studying Underwriter sales performance.

  • Figure 1 shows one example of insurance underwriters. In this case, performance is measured by sales and bookings – a cleanly quantitative measure. This is not performance-review information; it is actual sales performance in the job.
  • Breaking out the data by performance clusters, we begin to see the true stratification that exists inside this group. Figure 1 shows the five levels, from the 5th-percentile to the 95th-percentile performers. Important lessons are evident:
  • Relative performance levels are evident very early in tenure, even in the first quarters
  • Growth after initial training is only gradual (if at all) – with the exception of the top performers after quite some time
  • At the individual level, we do not see underwriters crossing categories or increasing after the first year. Performance patterns are set very early.
  • The first year, or even the first six months, of performance appears to lay the baseline for the rest of an underwriter's performance

Figure 1

In both of the sales projects referenced here, the analytics results showed that sales rep performance did not measurably increase over time – despite multiple millions being spent on development efforts including training, coaching, competency development, and the like.
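The analysis pattern behind these findings can be sketched as a rank-stability check: bucket reps into performance quintiles by early-tenure sales, then see whether the same ordering holds later. All numbers below are invented; only the approach mirrors the studies described.

```python
def quintile(value, values):
    """Performance quintile (1 = bottom, 5 = top) of value within values."""
    rank = sorted(values).index(value)
    return rank * 5 // len(values) + 1

# Hypothetical (first-year sales, year-3 cumulative sales) per rep, in $k.
reps = {
    "A": (120, 380), "B": (95, 300), "C": (60, 190), "D": (58, 170),
    "E": (45, 150),  "F": (44, 140), "G": (30, 95),  "H": (28, 100),
    "I": (15, 60),   "J": (12, 55),
}

year1 = [v[0] for v in reps.values()]
year3 = [v[1] for v in reps.values()]

# Count how many reps sit in the same quintile early and late in tenure.
stable = sum(
    quintile(y1, year1) == quintile(y3, year3)
    for y1, y3 in reps.values()
)
print(f"{stable}/{len(reps)} reps stay in their first-year quintile")
```

With data like the studies describe, nearly every rep stays in the cluster where they started, which is exactly the "performance patterns are set very early" lesson above.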

Top performers began as top performers – and continued to be top performers.

  • Note that development and experience made a significant impact only on top performers' sales and bookings over time
  • Average performers began and continued to be average
  • Bottom performers began and continued to be bottom performers
  • Training might have had a short-term or small impact, but long-term business performance was "consistent" from the beginning throughout their tenure – with the single exception of the top performers, who excelled even more once nurturing was applied to the ideal nature for the role

Business Cost, of Developing “Bottom Performers” With Hopes of Turning them Around

In many of today’s businesses around the world, when a bottom performer unfortunately enters as an employee, massive support systems are engaged to prop up, support, train, coach, prompt, and cajole that bad hire into some kind of average performance.

The support systems required are extraordinary — and very, very, very expensive. The best possible outcome is that you can nurture them to averageness, not to greatness.

The only thing your organization can do once someone is hired is try to develop that low or average performer, hoping to squeeze some kind of value out of them.

The greatest cost to the business, seen in Figure 2, is the difference that could lead a company to either mediocrity or wild success.

  • After 5 years, differing performance adds up significantly. Figure 2 illustrates the multi-million-dollar spread between the trajectories of the top and bottom quantiles of underwriter performance. This is a hypothetical plot, as we did not have 12 years' worth of transaction data; it is based on the average book for each performance group, extended linearly to 12 years.
  • The results illustrate the power of top underwriters: the top performers each achieve 6.79 times as much revenue for this client as the bottom performers over the course of 12 years (despite the same training and development being applied and available to all underwriters).

This difference is worth many millions of dollars (we are purposely not revealing too much, to protect the identity of our client).

Figure 2
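The Figure 2 extrapolation can be sketched numerically: extend each performance group's average yearly book linearly to 12 years and compare cumulative revenue. The per-year figures below are invented for illustration; the 6.79x top/bottom ratio reported above comes from the client's actual data, not from these numbers.

```python
def cumulative_revenue(avg_book_per_year, years=12):
    # Linear extension: the same average yearly book, summed over the horizon.
    return avg_book_per_year * years

top = cumulative_revenue(3.4)     # hypothetical top-group book, $M per year
bottom = cumulative_revenue(0.5)  # hypothetical bottom-group book, $M per year

print(f"top/bottom ratio over 12 years: {top / bottom:.2f}x")
print(f"cumulative spread: ${top - bottom:.1f}M")
```

Even modest per-year differences compound into a many-million-dollar spread over a 12-year horizon, which is the business case the figure makes.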

Options? Predict Top and Bottom Performers – Before You Hire Them

The financially optimal solution is to predict and screen in top performers, and screen out bottom performers, before they are hired. Today there are people in your organization who perform very well without needing an expensive and extensive support organization. Yes, they need some managing and coaching here and there, and they needed time to ramp up to full productivity, but the guidance they need now is minimal. They don't need propping up. You are not their crutch.

You and your teams can tell that it's in their nature to excel in this role. Despite your competitor being able to pay more, despite not having a manager for a few months, despite not having training on their first day, or a raise in more than a year, or time off – they continue to consistently outperform.

You need more of these employees and predictive models can find them – pre-hire.

Conversely, there are employees in this same role that require extensive coaching, training, mentoring, special perks and other types of support. Our data consistently shows that all this development will have little impact on their ability to perform on their own, without the extensive support network being provided to them.

Predicting top / bottom / average performance is a perfect situation to apply data science. A data science approach helps investigate potential differences in the nature of the successful and unsuccessful performers. Findings help predict the performance you’re looking for, pre-hire.

Nature and Nurture are Both Important. But Nature Comes First

Nurture is irrelevant if the nature of your employees continues to fight your nurturing. (People don't want to be changed.) You can't change the nature of your employees. Period. If you could, you could stop interviewing; you could hire anyone for any job and train them to be a top performer at anything.

Marketers and political candidates would laugh at this concept.

  • Imagine trying to convince a consumer to care about buying based on price – when their nature is to be brand sensitive.
  • Imagine a political candidate trying to convince voters to care about a different issue. Preposterous.
  • It doesn’t work. Yet we do this all the time with employees.

You can’t fix an employee attrition or performance problem by simply increasing the size of your development / support / training / mentoring / managing departments. That simply increases the financial spend in this area.

The key is to start with predicting the nature that is optimized for your role – and develop from there. Every other human domain uses this approach, except the employee/job-candidate domain.

Consumers are people. Voters are people. Employees are people too. Predicting behavior and optimizing performance is the same in all these domains. Begin by understanding the human’s nature, then align the offer (a coupon, a political candidate, or a job) with their nature and finally nurture – the right nature – to greatness.

Predicting Pre-hire Solves Attrition and Performance Problems

Our work consistently shows that top and bottom performers in a specific role have different natures. Not only sales reps, but call center reps, bank tellers, financial advisors, truck drivers, insurance agents, engineers and the like. It’s not random that people excel in their role. They excel because it is their nature to excel in the role. They’re built for it, learn quickly, feel valued and satisfied.

They gobble up the training and quickly implement it. They love that their nature is valued – just as it is.

We're data scientists. We analyze data, see what we see, and build predictive models when there is a strong prediction. Predictions are deployed quickly on our light-touch cloud solution, Advisor™, making it easy for recruiters and hiring managers to predict performance pre-hire. Using machine learning, our algorithms learn and get smarter over time (like the recommendation engines in Netflix or Amazon).

Imagine . . .

  • Imagine finally solving the attrition problem in your high-turnover, high-volume roles
  • Imagine reducing your expensive, bloated support and training organizations
  • Imagine classes of new hires who have the nature and desire to perform in the role, versus people fighting your development efforts every step of the way
  • Imagine succession plans that allow you to predict the optimal role for your top performers

Leading organizations are using a predictive workforce analytics approach now, to solve these challenges. They are competing on talent analytics. It’s easier than you think. It’s more respectful of your employees. It’s less costly and stops the farce of thinking we are powerful enough to develop anyone, to be anything we need them to be (whether they like it or not).

Greta Roberts is the CEO and co-founder of Talent Analytics, Corp., Chair of Predictive Analytics World for Workforce, and a faculty member of the International Institute for Analytics. Follow her on Twitter @gretaroberts.

Source: Predictive Workforce Analytics Studies: Do Development Programs Help Increase Performance Over Time?

7 Steps to Start Your Predictive Analytics Journey

Predictive analytics is transforming software in every industry. Healthcare applications use predictive analytics to reduce hospital readmissions and prioritize high-risk patients. In the manufacturing supply chain, it’s forecasting demand and reducing machine downtime. Businesses of all types are using predictive analytics to reduce customer churn.

Predictive analytics makes applications infinitely more valuable, separates software from competitors, and offers new revenue streams. But it’s also a vastly complex undertaking. Embedding accurate, effective predictive solutions in your application requires expert resources and taking the time to get it right. 

>> Related: What Is Predictive Analytics? <<

Where should you start your predictive analytics journey? Follow these seven steps to add valuable insights to your application.

Step 1: Find a promising predictive use case

Choosing the use case is a critical part of the project. Pick a business use case that your company already recognizes as a problem requiring a solution. Start by identifying the top three priorities of your executives and pick the one that’s most realistic to achieve in your timeframe. If you don’t have a specific use case in mind, start with a common business issue such as customer churn or late payments. You can also use the PADS framework to identify a strong predictive analytics initiative for your company.
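To make the churn use case concrete, here is a minimal, purely illustrative scoring sketch in Python. The feature names, weights, bias, and threshold are invented for this example; in practice they would come from a model trained on your own customer data.

```python
import math

# Hypothetical feature weights; a real deployment would learn these
# from historical churn data rather than hand-picking them.
WEIGHTS = {
    "days_since_last_login": 0.04,
    "support_tickets_last_90d": 0.30,
    "monthly_spend_drop_pct": 0.02,
}
BIAS = -3.0

def churn_risk(customer):
    """Return a churn-risk score between 0 and 1 via a logistic function."""
    z = BIAS + sum(WEIGHTS[k] * customer.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

# A disengaged customer vs. an active one (made-up values).
at_risk = churn_risk({"days_since_last_login": 60,
                      "support_tickets_last_90d": 4,
                      "monthly_spend_drop_pct": 30})
engaged = churn_risk({"days_since_last_login": 2,
                      "support_tickets_last_90d": 0,
                      "monthly_spend_drop_pct": 0})
```

The point is not the arithmetic but the shape of the deliverable: a per-customer score your application can surface and act on, which is exactly what a trained model would produce.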

Step 2: Identify the data you need

In many cases, the data you need for your chosen use case is either not readily available or has quality issues. Consider using a predictive analytics tool to automatically cleanse common data problems, but don’t feel you have to wait for everything to align before you get started. Take an 80/20 approach: if 80 percent of your data is clean, move forward with what you have and optimize from there. You can always roll out an update after the initial release.
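The 80/20 check can be made concrete with a simple completeness measure. A minimal sketch in Python, where the required field names and the 0.8 threshold are placeholders for whatever your own use case demands:

```python
REQUIRED_FIELDS = ["customer_id", "signup_date", "last_purchase"]  # placeholders

def is_clean(record):
    """A record counts as 'clean' if every required field is present and non-empty."""
    return all(record.get(f) not in (None, "") for f in REQUIRED_FIELDS)

def clean_fraction(records):
    """Fraction of records that pass the completeness check."""
    if not records:
        return 0.0
    return sum(is_clean(r) for r in records) / len(records)

# Made-up sample: two of the five records have missing values.
records = [
    {"customer_id": 1, "signup_date": "2023-01-05", "last_purchase": "2024-02-11"},
    {"customer_id": 2, "signup_date": "2023-03-19", "last_purchase": ""},
    {"customer_id": 3, "signup_date": "2023-07-02", "last_purchase": "2024-01-30"},
    {"customer_id": 4, "signup_date": "2023-09-14", "last_purchase": "2024-03-01"},
    {"customer_id": 5, "signup_date": None, "last_purchase": "2024-02-20"},
]

fraction = clean_fraction(records)
ready = fraction >= 0.8  # the 80/20 rule: proceed once ~80% of data is clean
```

A check like this turns "is the data good enough?" from a debate into a number you can track release over release.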

Step 3: Gather a team of beta testers

Beta testers are the end users of your product. If you’re working on a customer-facing project, engage directly with a few key customers or partners who will be using the predictive analytics in your application. If you have an enterprise application used by internal employees or partners, gather a core group of your top users across a variety of teams or departments. One important note: take care to pick the right combination of people so you get a variety of feedback. Don’t just pick the customers who will agree with you on everything; if all you get is positive feedback, your project will never succeed.

In addition to your end users, talk to the users who will be maintaining and deploying the predictive analytics solution in your software. Conduct focus group interviews and hands-on training sessions with your development and product management teams. If the application team isn’t on board with the solution, it will end up going nowhere.

Step 4: Create rapid proofs of concept

Create simple prototypes and get them to your end users and stakeholders for feedback. Your first few iterations will most likely be way off the mark, but that’s to be expected. It commonly takes a dozen or more iterations to find the right design.

Step 5: Integrate predictive analytics into your operations

The most valuable predictive analytics solutions are integrated into existing workflows, processes, and decision-making steps. Users get forward-looking insights in the context of the applications they already use, and they can act on those insights without jumping into another system. As more users benefit from predictions, even more users will want to adopt the application. The key is to make it easy for end users to see predictions and take action all within the same application.

Step 6: Partner with stakeholders

Stakeholders may be skeptical of predictive analytics at first. But by doing your due diligence, gathering feedback from end users, sharing sample prototypes, and examining the competitive landscape, you can build a business case to move your project forward. Partner with stakeholders at every step of the journey: you need them to succeed, and they need you to achieve successful outcomes as well.

Step 7: Update regularly

End users like to see new features added and bugs fixed at a reasonable pace. Plan a predictive analytics roadmap to make small updates every two to three months and significant updates every six to nine months. Maintain constant communication with end users and continuously respond to their needs to keep your project on the right path.

Logi Predict is the only predictive analytics solution built for product managers and developers, and made to embed in existing applications. See Logi Predict in action in a free demo.


Source: 7 Steps to Start Your Predictive Analytics Journey by analyticsweek

Analytics And The IRS: A New Way To Find Cheaters


While most of us are worried about staying within the law at tax time, the IRS itself has the opposite problem. Its staff is responsible for identifying those who cheat the system, figuring out what’s owed and collecting.

In 2014 alone, the IRS identified and prevented payment of refunds for more than 2 million confirmed cases of identity theft or fraud in tax returns. That’s about 1 in every 100 returns, and a total of more than $15 billion. An audit also found that millions of dollars in refunds were issued for potentially fraudulent returns.

Who knows how much more fraud may go undetected, cheating the public out of money and public services? The IRS wants to know, and it’s using advanced analytics to seek out hidden fraud.

Sophisticated analytics is nothing new to the agency. It maintains a staff of economists and statisticians, and has done so for decades. SAS founder Jim Goodnight points out that the IRS uses SAS analytics products for fraud detection, as do the Medicaid programs of every state. Nor is the agency a one-product shop; it’s also a long-time user of IBM’s SPSS tools…



Originally Posted at: Analytics And The IRS: A New Way To Find Cheaters

Doctors store 1,600 digital hearts for big data study

Doctors in London have stored 1,600 beating human hearts in digital form on a computer.

The aim is to develop new treatments by comparing the detailed information on the hearts and the patients’ genes.

It is the latest project to make use of advances in storing large amounts of information.

The study is among a wave of new “big data” ventures that are transforming the way in which research is carried out.

Scientists at the Medical Research Council’s Clinical Sciences Centre at Hammersmith Hospital are scanning detailed 3D videos of the hearts of 1,600 patients and collecting genetic information from each volunteer.

Dr Declan O’Regan, who is involved in the heart study, said that this new approach had the potential to reveal much more than normal clinical trials, in which relatively small amounts of health information are collected from patients over the course of several years.

He added: “There is a really complicated relationship between people’s genes and heart disease, and we are still trying to unravel what that is. But by getting really clear 3D pictures of the heart we hope to be able to get a much better understanding of the cause and effect of heart disease and give the right patients the right treatment at the right time.”

Subtle signs

The idea of storing so much information on so many hearts is to compare them and to see what the common factors are that lead to illnesses. Dr O’Regan believes that this kind of analysis will increasingly become the norm in medicine.

“There are often subtle signs of early disease that are really difficult to pick up even if you know what to look for. A computer is very sensitive to picking up subtle signs of a disease before they become a problem.”


The Big Data idea is sweeping across a range of scientific research fields, and, as you would expect, there are some very large numbers involved.

Computers at the European Bioinformatics Institute (EBI) in Cambridge store the entire genetic code of tens of thousands of different plants and animals. The information occupies the equivalent of more than 5,000 laptops.

And to find out how the human mind works, researchers at the Institute for Neuroimaging and Informatics at the University of Southern California are storing 30,000 detailed 3D brain scans, requiring the space equivalent to 10,000 laptops.

The Square Kilometre Array, a radio telescope being built in Africa and Australia, will collect enough data in one year to fill 300 million million laptops. That is 150 times the current total annual global internet traffic.

Data revolution

Researchers at the American Association for the Advancement of Science (AAAS) meeting in San Jose are discussing just how they are going to store and sift through this mass of data.

According to Prof Ewan Birney at the EBI, big data is already beginning to transform the way research is done.

“Suddenly, we don’t have to be afraid of measuring lots and lots of things – about humans, about oceans, about the Universe – because we know we can be confident that we can collect that data and extract some knowledge from it,” he told BBC News.


The falling cost of storage has helped those developing systems to manage big data research, but when faced with an imminent tsunami of information, they will have to run to stand still and find ever more intelligent ways to compress and store the information.

The other main issue is how to organise and label the data.

Just as librarians have found ways to classify books by subject or by author, a whole new science is emerging in how to classify research data logically so that teams can find the things they want to find. But when one considers the trillions of pieces of information involved and the complexity of the scientific fields involved, the task is much harder than organising a library.

At the AAAS meeting, the UK’s Biotechnology and Biological Sciences Research Council (BBSRC) announced a £7.45m investment in the design of big data infrastructures.

The emergence of big data can be thought of as being similar to the development of the microscope: a powerful new tool for scientists to study intricate processes in nature that they have never been able to see before.

Approaching omniscience

Those involved in developing big data infrastructure believe that the investment will lead to a radical shift in the way research across a variety of disciplines is carried out. They sense that a step toward omniscience is within reach: a way of seeing the Universe as it really is, rather than the distorted view even scientists have through the filter of our limited brains and senses.

According to Paul Flicek of the EBI, big data could potentially lift a veil that has been shrouding important avenues of research.

“One of the things about science is that you don’t always discover the important things; you discover what you can discover. But by using larger amounts of data, we can discover new things, and so what will be found? That is an open question,” he told BBC News.

The challenge is for scientists to find new ways to manage this data and new ways to analyse it. Just collecting data does not solve any problems by itself.

But properly organised and managed, the data could enable scientists to identify rare, subtle events in nature that have a big effect on our lives. The Higgs boson was discovered in this way.

“We are not going to slow down generating new data,” says Prof Flicek. “We have demonstrated that we can generate a lot of this data; we can sequence these genomes. We are never going to stop doing that, and so it opens up so many more exciting things.

“We can learn new things and we can see things we have never seen before.”


Originally Posted at: Doctors store 1,600 digital hearts for big data study by analyticsweekpick

To Trust A Bot or Not? Ethical Issues in AI

Given that we see fake profiles, and chatbots that misfire and miscommunicate, we would like your thoughts on whether there should be some sort of government registry for robots so that consumers know whether they are legitimate. If we had a registry for trolls and chatbots, would that ensure that people could feel more comfortable that they are dealing with a legitimate business, or would know whether a profile, troll, or bot is fake? Is it time for a Good Housekeeping seal of approval for AI?

These are provocative questions, and they are so new and so undefined that I am not sure there is a single answer. What do you think? Who should create such standards? Perhaps we should start by categorizing the types of AI?

Source: To Trust A Bot or Not? Ethical Issues in AI by tony