Big Data’s big libertarian lie: Facebook, Google and the Silicon Valley ethical overhaul we need

The tech world talks of liberty and innovation while invading privacy and surveilling us all. It must end now.

Why has Big Data so quickly become a part of industry dogma? Beyond the tremendous amounts of money being thrown at Big Data initiatives, both in research dollars and marketing efforts designed to convince enterprise clients of Big Data’s efficacy, the analytics industry plays into long-held cultural notions about the value of information. Despite Americans’ overall religiosity, our embrace of myth and superstition, our surprisingly enduring movements against evolution, vaccines, and climate change, we are a country infatuated with empiricism. “A widespread revolt against reason is as much a feature of our world as our faith in science and technology,” as Christopher Lasch said. We emphasize facts, raw data, best practices, instruction manuals, exact directions, instant replay, all the thousand types of precise knowledge. Even our love for gossip, secrets, and conspiracy theories can be seen as a desire for more privileged, inside types of information—a truer, more rarefied knowledge. And when this knowledge can come to us through a machine—never mind that it’s a computer program designed by very fallible human beings—it can seem like truth of the highest order, computationally exact. Add to that a heavy dollop of consumerism (if it can be turned into a commodity, Americans are interested), and we’re ready to ride the Big Data train.

Information is comforting; merely possessing it grounds us in an otherwise unstable, confusing world. It’s a store to draw on, and we take threats to it seriously. Unlike our European brethren, we evince little tolerance for the peculiarities of genre or the full, fluid spectrum between truth and lies. We regularly kick aside cultural figures (though, rarely, politicians) who we’ve determined have misled us.

Our bromides about information—it’s power, it wants to be free, it’s a tool for liberation—also say something about our enthusiasm for it. The smartphone represents the coalescing of information into a single, personal object. Through the phone’s sheer materiality, it reminds us that data is now encoded into the air around us, ready to be called upon. We live amid an atmosphere of information. It’s numinous, spectral, but malleable. This sense of enchantment explains why every neoliberal dispatch from a remote African village must note the supposedly remarkable presence of cell phones. They too have access to information, that precious resource of postindustrial economies.
All of this is part of what I call the informational appetite. It’s our total faith in raw data, in the ability to extract empirical certainties about life’s greatest mysteries, if only one can deduce the proper connections. When the informational appetite is layered over social media, we get the messianic digital humanitarianism of Mark Zuckerberg. Connectivity becomes a human right; Facebook, we are told, can help stop terrorism and promote peace. (Check out to see this hopeless naïveté in action.) More data disclosure leads to a more authentic self. Computerized personal assistants and ad networks and profiling algorithms are here to learn who we are through information. The pose is infantilizing: we should surrender and give them more personal data so that they can help us. At its furthest reaches, the informational appetite epitomizes the idea that we can know one another and ourselves through data alone. It becomes a metaphysic. Facebook’s Graph Search, arguably the first Big Data–like tool available to a broad, nonexpert audience, shows “how readily what we ‘like’ gets translated into who we are.”

Zuckerberg is only one exponent of what has become a folkway in the age of digital capitalism. Ever connected, perhaps fearing disconnection itself more than the fear of missing out, we live the informational appetite. We have internalized and institutionalized it by hoarding photos we’ll never organize, much less look at again; by tracking ourselves relentlessly; by feeling a peculiar anxiety whenever we find ourselves without a cell phone signal. We’ve learned to deal with information overload by denying its existence or adopting it as a sociocultural value, sprinkled with a bit of the martyrdom of the Protestant work ethic. It’s a badge of honor now to be too busy, always flooded with to-do items. It’s a problem that comes with success, which is why we’re willing to spend so much time online, engaging in, as Ian Bogost called it, hyperemployment.

There’s an inherent dissonance to all this, a dialectic that becomes part of how we enact the informational appetite. We ping-pong between binge-watching television and swearing off new media for rustic retreats. We lament our overflowing in-boxes but strive for “in-box zero”—temporary mastery over tools that usually threaten to overwhelm us. We subscribe to RSS feeds so as to see every single update from our favorite sites—or from the sites we think we need to follow in order to be well-informed members of the digital commentariat—and when Google Reader is axed, we lament its loss as if a great library were burned. We maintain cascades of tabs of must-read articles, while knowing that we’ll never be able to read them all. We face a nagging sense that there’s always something new that should be read instead of what we’re reading now, which makes it all the more important to just get through the thing in front of us. We find a quotable line to share so that we can dismiss the article from view. And when, in a moment of exhaustion, we close all the browser tabs, this gesture feels both like a small defeat and a freeing act. Soon we’re back again, turning to aggregators, mailing lists, Longreads, and the essential recommendations of curators whose brains seem somehow piped into the social-media firehose. Surrounded by an abundance of content but willing to pay for little of it, we invite into our lives unceasing advertisements and like and follow brands so that they may offer us more.

In the informational appetite, we find the corollary of digital detox and its fetishistic response to the overwhelming tide of data and stimulus. Information is power, particularly in this enervated economy, so we take on as much of it as we can handle. We succumb to what Evgeny Morozov calls “the many temptations of information consumerism.” This posture promises that anything is solvable—social and environmental and economic problems, sure, but more urgent, the problem of our (quantified) selves, since the appetite for information is ultimately a self-serving one. How can information make us richer? Smarter? Happier? Safer? How can we get better deals on gadgets and kitchenware? The informational appetite is the hunger for self-help in disguise.

Viral media thrives because it insists on both its newness and relevance—two weaknesses of the informational glutton. A third weakness: because it’s recent and seemingly unfiltered, it must be accurate. Memes have the lifespan and cultural value of fruit flies, but they’re infinite, and they satisfy our obsession with shared reference and cheap parody. We consume them with the uncaring aggression of those who blow up West Virginia mountains to get at the coal underneath. They exhaust themselves quickly, messily, when the glistening viral balloon is deflated by the revelation that the ingredients of the once-tidy story don’t add up. But no matter. There is always another to move on to, as well as someone (Wikipedia, Know Your Meme, Urban Dictionary, et al.) to catalog it.

The informational appetite is the never-ending need for more page views. It’s the irresistible compulsion to pull out your phone in the middle of a conversation to confirm some point of fact, because it’s intolerable not to know right now. It’s the smartphone as a salve for loneliness amid the crowd. It’s the “second screen” habit, in which we watch TV while playing games on our iPhone, tweeting about what we’re seeing, or looking up an actor on IMDb. It’s Google Glass and the whole idea of augmented reality, a second screen over your entire life. It’s the phenomenon of continuous partial attention, our focus split among various inputs because to concentrate on one would reduce our bandwidth, making us less knowledgeable citizens.

The informational appetite, then, is a cultural and metaphysical attitude as much as it is a business and technological ethic. But it also has three principal economic causes: the rapid decrease in the cost of data storage, the rising belief that all data is potentially useful, and the consolidation of a variety of media and communication systems into one global network, the Internet. With the ascension of Silicon Valley moguls to pop culture stardom, their philosophy has become our aspirational ideal—the key to business success, the key to self-improvement, the key to improving government and municipal services (or doing away with them entirely). There is seemingly no problem, we are told, that cannot be solved with more information and no aspect of life that cannot be digitized. As Katherine Losse noted, “To [Zuckerberg] and many of the engineers, it seemed, more data is always good, regardless of how you got it. Social graces—and privacy and psychological well-being, for that matter—are just obstacles in the way of having more information.”

The CIA’s chief technology officer isn’t immune. “We fundamentally try to collect everything and hang on to it forever,” he said at a 2013 conference sponsored by GigaOm, the technology Web site. He too preached the Big Data gospel, telling the crowd: “It is really very nearly within our grasp to be able to compute on all human generated information.” How far gone must you be to see this as beneficial?

Compared to this kind of talk, Google’s totalizing vision—“to organize the world’s information and make it universally accessible and useful”—sounds like a public service, rather than a grandiose, privacy-destroying monopoly. Google’s mission statement, along with its self-inoculating “Don’t Be Evil” slogan, has made it acceptable for other companies to speak of world-straddling ambitions. LinkedIn’s CEO describes his site thusly: “Imagine a platform that can digitally represent every opportunity in the world.” Factual wants to identify every fact in the world. Whereas once we hoped for free municipal Wi-Fi networks, now Facebook and Cisco are providing Wi-Fi in thousands of stores around the United States, a service free so long as you check into Facebook on your smartphone and allow Facebook to know whenever you’re out shopping. “Our vision is that every business in the world who have people coming in and visiting should have Facebook Wi-Fi,” said Erick Tseng, Facebook’s head of mobile products. Given that Mark Zuckerberg has said that connectivity is a human right, does requiring patrons to log into Facebook to get free Wi-Fi impinge on their rights, or does it merely place Facebook access on the same level as humanitarianism?

All of the world’s information, every opportunity, every fact, every business on earth. Such widely shared self-regard has made it seem embarrassing to claim more modest goals for one’s business. A document sent out to members of Y Combinator, the industry’s most sought-after start-up incubator, instructed would-be founders: “If it doesn’t augment the human condition for a huge number of people in a meaningful way, it’s not worth doing.”

As long as we have the informational appetite, more data will always seem axiomatic—why wouldn’t one collect more, compute more? It’s the same absolutism found in the mantra “information wants to be free.” We won’t consider whether some types of data should be harder to find or whether the creation, preservation, and circulation of some data should be subjected to a moral calculus. Nor will we be able to ignore the data sitting in front of us; it would be irresponsible not to leverage it. If you think, as the CEO of ZestFinance does, that “all data is credit data, we just don’t know how to use it yet,” then why would you not incorporate anything—anything—you can get on consumers into your credit model? As a lender, you’re in the business of risk management, precisely the field which, in David Lyon’s view, is so well attuned to Big Data. The question of collecting and leveraging consumer data, and making decisions based on it, passes from the realm of ethics to a market imperative. It doesn’t matter if you’re forcing total transparency on a loan recipient while keeping your algorithm and data-collection practices totally hidden from view. It’s good for the bottom line. You must protect your company. That’s business.

This is how Big Data becomes a religion, and why, as long as you’re telling yourself and your customers that you’re against “evil,” you can justify building the world’s largest surveillance company. You’re doing it for the customers. The data serves them. It adds relevance to their informational lives, and what could be more important than that?

Ethics, for one thing. A sense of social responsibility. A willingness to accept your own fallibility and that, just as you can create some pretty amazing, world-spanning technological systems, these systems might produce some negative outcomes, too—outcomes that can’t be mitigated simply by collecting more data or refining search algorithms or instituting a few privacy controls.

Matt Waite, the programmer who built the mugshots site for the Tampa Bay Times, only later to stumble upon some complications, introduced some useful provocations for every software engineer. “The life of your data matters,” he wrote. “You have to ask yourself, Is it useful forever? Does it become harmful after a set time?”

Not all data is equal, nor are the architectures we create to contain them. They have built-in capacities—“affordances” is the sociologist’s term d’art—and reflect the biases of their creators. Someone who’s never been arrested, who makes $125,000 a year, and who spends his days insulated in a large, suburban corporate campus would approach programming a mugshot site much differently from someone who grew up in an inner-city ghetto and has friends behind bars. Whatever his background, Waite’s experience left him cognizant of these disparities, and led him to some important lessons. “What I want you to think about, before you write a line of code, is what does it mean to put your data on the Internet?” he said. “What could happen, good and bad? What should you do to be responsible about it?”

The problems around reputation, viral fame, micro-labor, the lapsing of journalism into a page-view-fueled horse race, the intrusiveness of digital advertising and unannounced data collection, the erosion of privacy as a societal value—nearly all would be improved, though not solved, if we found a way to stem our informational appetite, accepting that our hunger for more information has social, economic, and cultural consequences. The solutions to these challenges are familiar but no easier to implement: regulate data brokers and pass legislation guarding against data-based discrimination; audit Internet giants’ data-collection practices and hand down heavy fines, meaningful ones, for unlawful data collection; give users more information about where their data goes and how it’s used to inform advertising; don’t give municipal tax breaks (Twitter) or special privileges at local airfields (Google) to major corporations that can afford to pay their share of taxes and fees to the cities that have provided the infrastructure that ensures their success. Encrypt everything.

Consumers need to educate themselves about these industries and think about how their data might be used to their disadvantage. But the onus shouldn’t lie there. We should be savvy enough, in this stage of late capitalism, to be skeptical of any corporate power that claims to be our friend or acting in our best interests. At the same time, the rhetoric and utility of today’s personal technology is seductive. One doesn’t want to think that a smartphone is also a surveillance device, monitoring our every movement and communication. Consumers are busy, overwhelmed, lacking the proper education, or simply unable to reckon with the pervasiveness of these systems. When your cell carrier offers you a heavily subsidized, top-of-the-line smartphone—a subsidy that comes from locking you into a two-year contract, throughout which they’ll make a bundle off of your data production—then you take it. It’s hard not to.

“If you try to keep up with this stuff in order to stop people from tracking you, you’d go out of your mind,” Joseph Turow said. “It’s very complicated and most people have a difficult time just getting their lives in order. It’s not an easy thing, and not only that, many people when they get online, they just want to do what they do and get off.”

He’s right, except that, when we now think we’re offline, we’re not. The tracking and data production continues. And so we shouldn’t be too hard on one another, particularly those less steeped in this world, when they suddenly raise fears over privacy or data collection. Saying “What did you expect?” is not the response of a seasoned technology user. It’s a cynical displacement of blame.

It’s long past time for Silicon Valley to take an ethical turn. The industry’s increasing reliance on bulk data collection and targeted advertising is at odds with its rhetoric of individual liberty and innovation. Adopting other business models is both an economic and ethical imperative; perpetual surveillance of customers cannot endure, much less continue growing at such a pace.

The tech companies might have to give up something in turn, starting with their obsession with scale. Katherine Losse, the former Facebook employee, again: “The engineering ideology of Facebook itself: Scaling and growth are everything, individuals and their experiences are secondary to what is necessary to maximize the system.” Scale is its own self-sustaining project. It can’t be sated, because it doesn’t need to be. The writer and environmentalist Edward Abbey said, “Growth for the sake of growth is the ideology of the cancer cell.” The informational appetite is cancerous; it devours.

When the CEO of Factual speaks of forming a repository of all of the world’s facts, including all human beings’ “genetic information, what they ate, when and where they exercised,” he may indeed be on the path to changing the world, as he claims to be. But for whose benefit? If the world’s information is measured, collected, computed, and privatized en masse, what is being improved, beyond the bottom lines of a coterie of venture capitalists and tech industry executives? What social, economic, or political problem can be solved by this kind of computation? Will Big Data end a war? Will it convince us to convert our economy to sustainable energy sources, even if it raises gas prices?

Will it stop right-wing politicians from gutting the social safety net? Which data-driven insight will convince voters and legislators to accept gay marriage or to stop imprisoning nonviolent drug offenders? For all their talk of changing the world, technology professionals rarely get into even this basic level of specificity. Perhaps it’s because “changing the world” simply means creating a massive, rich company. These are small dreams. The dreamers “haven’t even reached the level of hypocrisy,” as the avuncular science fiction author Bruce Sterling told the assembled faithful at SXSW Interactive, the industry’s premier festival, in March 2013. “You’re stuck at the level of childish naïveté.”

Adopting a populist stance, some commentators, such as Jaron Lanier, say that to escape the tyranny of a data-driven society, we must expect to be paid for our data. We should put a price on it. Never mind that this is to give in to the logic of Big Data. Rather than trying to dismantle or reform the system—one which, as Lanier acknowledges in his book Who Owns the Future?, serves the oligarchic platform owners at the expense of customers—they wish to universalize it. We should all have accounts from which we can buy and sell our data, they say. Or companies will be required to pay us market rates for using our private information. We could get personal data accounts and start entertaining bids. Perhaps we’d join Miinome, “the first member-controlled, portable human genomics marketplace,” where you can sell your genomic information and receive deals from retailers based on that same information. (Again, I think back to HSBC’s “Your DNA will be your data” ad, this time recognizing it not as an attempt at imparting a vaguely inspirational, futuristic message, but as news of a world already here.) That beats working with 23andMe, right? That company already sells your genetic profile to third parties—and that’s just in the course of the (controversial, non-FDA-compliant) testing they provide, for which they also charge you.

Tellingly, a version of this proposal for a data marketplace appears in the World Economic Forum (WEF) paper that announced data as the new oil. Who could be more taken with this idea than the technocrats of Davos? The paper, written in collaboration with Bain & Company, described a future in which “a person’s data would be equivalent to their ‘money.’ It would reside in an account where it would be controlled, managed, exchanged and accounted for just like personal banking services operate today.”

Given that practically everything we do now produces a digital record, this model would make all of human life part of one vast, automated dataveillance system. “Think of personal data as the digital record of ‘everything a person makes and does online and in the world,’ ” the WEF says. The pervasiveness of such a system will only increase with the continued development and adoption of the “Internet of things”—Internet-connected, sensor-rich devices, from clothing to appliances to security cameras to transportation infrastructure. No social or behavioral act would be immune from the long arms of neoliberal capitalism. Because everything would be tracked, everything you do would be part of some economic exchange, benefiting a powerful corporation far more than you. This isn’t emancipation through technology. It’s the subordination of life, culture, and society to the cruel demands of the market, and it makes economic freedom unavailable to average people, because just walking down the street produces data about them.

Signs of this future have emerged. A company called Datacoup has offered people $8 per month to share their personal information, which Datacoup says it strips of identifying details before selling it on to interested companies. Through its ScreenWise program, Google has offered customers the ability to receive $5 Amazon gift cards in exchange for installing what is essentially spy software. The piece of software, a browser extension, was available only for Google’s own Chrome browser, which surely didn’t hurt its market share. Setting its sights on a full range of a family’s data, Google also offered a more expansive version of the program. Working with the market research firm GfK, Google began sending brochures presenting the opportunity to “be part of an exciting and very important new research study.” Under this program, Google would give you a new router and install monitoring software on anything with a Wi-Fi connection, even Blu-ray players. The more you added, the more money you’d make. But it doesn’t add up to much. One recipient estimated that he could earn $945 from connecting all of his family’s devices for one year. A small sum for surrendering a year of privacy, for accepting a year of total surveillance. It is, however, more than most of us are getting now for much the same treatment.

Making consumers complicit—yet far from partners—in the data trade would only increase these inequities. We would experience data-based surveillance in every aspect of our lives (a dream for intelligence agencies). Such a scheme would also be open to manipulation or to the kind of desperate micro-labor and data accumulation exemplified by Mechanical Turk and other online labor marketplaces. You’d wonder if your friend’s frequent Facebook posts actually represented incidents from his life or whether he was simply trying to be a good, profitable data producer. If we were to lose our jobs, we might not go on welfare, if it’s even still available; we’d accept a dole in the form of more free devices and software from big corporations to harvest our data, or more acutely personal data, at very low rates. A gray market of bots and data-generating services would appear, allowing you to pay for automated data producers or poorly paid, occasionally blocked ones working, like World of Warcraft gold miners, in crowded apartments in some Chinese industrial center.

Those already better off, better educated, or technically adept would have the most time, knowledge, and resources to leverage this system. In another paper on the data trade, the WEF (this time working with the Boston Consulting Group) spoke of the opportunity for services to develop to manage users’ relations with data collectors and brokers—something on the order of financial advisors, real estate agents, and insurers. Risk management and insurance professionals should thrive in such a market, as people begin to take out reputational and data-related insurance; both fields depend as much on the perception of insecurity as the reality. Banks could financialize data and data-insurance policies, creating complex derivatives, securities, and other baroque financial instruments. The poor, and those without the ability to use the tools that’d make them good data producers, would lose out. Income inequality, already driven in part by the collapse of industrial economies and the rise of postindustrial, post-employment information economies, would increase ever more.

This situation won’t be completely remedied by more aggressive regulation, consumer protections, and eliminating tax breaks. Increasing automation, fueled by this boom in data collection and mining, may lead to systemic unemployment of a kind we’ve never seen. Those contingent workers laboring for tech companies through Elance or Mechanical Turk will soon enough be replaced by automated systems. It’s clear that, except for an elite class of managers, engineers, and executives, human labor is seen as a problem that technology can solve. In the meantime, those whose sweat this industry still relies upon find themselves submitting to exploitative conditions, whether as a Foxconn worker in Shenzhen or a Postmates courier in San Francisco. As one Uber driver complained to a reporter: “We have a real person performing a function, not a Google automatic car. We have become the functional end of the app.” It might not be long before he is traded in for a self-driving car. They don’t need breaks, they don’t worry about safety conditions or unions, they don’t complain about wages. Compared to a human being, automatic cars are perfectly efficient.

And who will employ him then? Who will be interested in someone who’s spent a few years bouncing between gray-market transportation facilitation services, distributed labor markets, and other hazy digital makework? He will have no experience, no connections, and little accrued knowledge. He will have lapsed from subsistence farming in the data fields to something worse and more desultory—a superfluous machine.

Automation, then, should ensure that power, data, and money continue to accrue in tandem. Through ubiquitous surveillance and data collection, now stretching from computers to cell phones to thermostats to cars to public spaces, a handful of large companies have successfully socialized our data production on their behalf. We need some redistribution of resources, which ultimately means a redistribution of power and authority. A universal basic income, paid for in part by taxes and fees levied on the companies making fabulous profits out of the quotidian materials of our lives, would help to reintroduce some fairness into our technologized economy. It’s an idea that’s had support from diverse corners—liberals and leftists often cite it as a pragmatic response to widespread inequality, while some conservatives and libertarians see it as an improvement over an imperfect welfare system. As the number of long-term unemployed, contingent, and gig workers increases, a universal basic income would restore some equity to the system. It would also make the supposed freedom of those TaskRabbit jobs actually mean something, for the laborer would know that even if the company cares little for his welfare or ability to make a living, someone else does and is providing the resources to make sure that economic precarity doesn’t turn into something more dire.

These kinds of policies would help us to begin to wean ourselves off of our informational appetite. They would also foment a necessary cultural transformation, one in which our collective attitude toward our technological infrastructure would shift away from blind wonder and toward a learned skepticism. I’m optimistic that we can get there. But to accomplish this, we’ll have to start supporting and promoting the few souls who are already doing something about our corrupted informational economy. Through art, code, activism, or simply a prankish disregard for convention, the rebellion against persistent surveillance, dehumanizing data collection, and a society administered by black-box algorithms has already begun.

Excerpted from “Terms of Service: Social Media and the Price of Constant Connection” by Jacob Silverman. Published by Harper, a division of HarperCollins. Copyright 2015 by Jacob Silverman. Reprinted with permission of the author. All rights reserved.

Originally posted via “Big Data’s big libertarian lie: Facebook, Google and the Silicon Valley ethical overhaul we need”


Sep 28, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)




– Microsoft Azure to Feature New Big Data Analytics Platform (MeriTalk), under Big Data Analytics

– Digital Guardian Declares a New Dawn for Data Loss Prevention (insideBIGDATA), under Big Data Security

– The DCIM tool and its place in the modern data center (TechTarget), under Data Center



Statistical Thinking and Data Analysis


This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and n… more


How to Create a Mind: The Secret of Human Thought Revealed


Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more


Data Analytics Success Starts with Empowerment
Being data driven is not so much a tech challenge as an adoption challenge, and adoption has its roots in the cultural DNA of an organization. Great data-driven organizations build the data-driven culture into their corporate DNA. A culture of connection, interaction, sharing, and collaboration is what it takes to be data driven. It’s about being empowered more than it’s about being educated.


Q: What is lift, KPI, robustness, model fitting, design of experiments, the 80/20 rule?
A: Lift:
Lift is a measure of the performance of a targeting model (or rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random-choice targeting model. Lift is simply: target response / average response.

Suppose a population has an average response rate of 5% (mailing for instance). A certain model (or rule) has identified a segment with a response rate of 20%, then lift=20/5=4

Typically, the modeler divides the population into quantiles and ranks the quantiles by lift. They can then consider each quantile and, by weighing the predicted response rate against the cost, decide whether to market to that quantile. A typical conclusion might read: "if we use the probability scores on customers, we can get 60% of the total responders we'd get mailing randomly by only mailing the top 30% of the scored customers."
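The arithmetic is simple enough to sketch directly; the numbers below are the hypothetical mailing figures from the example above:

```python
# Illustrative lift computation, using the hypothetical mailing example.
def lift(segment_rate, baseline_rate):
    """Lift = segment response rate / overall (baseline) response rate."""
    return segment_rate / baseline_rate

overall = 0.05   # 5% average response across the whole population
segment = 0.20   # 20% response in the model-selected segment
print(lift(segment, overall))  # -> 4.0
```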

KPI:
– Key performance indicator
– A type of performance measurement
– Examples: 0 defects, 10/10 customer satisfaction
– Relies upon a good understanding of what is important to the organization

More examples:

Marketing & Sales:
– New customers acquisition
– Customer attrition
– Revenue (turnover) generated by segments of the customer population
– Often done with a data management platform

IT operations:
– Mean time between failure
– Mean time to repair

Robustness:
– Statistics with good performance even if the underlying distribution is not normal
– Statistics that are not affected by outliers
– A learning algorithm that can reduce the chance of fitting noise is called robust
– Median is a robust measure of central tendency, while mean is not
– Median absolute deviation is also more robust than the standard deviation
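A quick sketch (with invented data) makes the contrast concrete: a single gross outlier moves the mean and standard deviation a lot, while the median and MAD barely budge:

```python
import statistics

data = [10, 11, 9, 10, 12, 10, 11]
contaminated = data + [1000]   # one gross outlier

# The mean shifts dramatically; the median barely moves.
print(statistics.mean(data), statistics.mean(contaminated))
print(statistics.median(data), statistics.median(contaminated))

# Median absolute deviation (MAD): a robust estimate of spread.
def mad(xs):
    m = statistics.median(xs)
    return statistics.median(abs(x - m) for x in xs)

print(statistics.stdev(contaminated))  # inflated by the outlier
print(mad(contaminated))               # stays close to the clean-data value
```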

Model fitting:
– How well a statistical model fits a set of observations
– Examples: AIC, R², Kolmogorov–Smirnov test, Chi-squared, deviance (glm)
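As a sketch of one of these fit measures, R² can be computed by hand from observed values and (hypothetical) model predictions:

```python
# Sketch: R-squared as a simple goodness-of-fit measure.
observed  = [3.0, 5.0, 7.0, 9.0]
predicted = [2.8, 5.1, 7.2, 8.9]   # hypothetical model output

mean_obs = sum(observed) / len(observed)
ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual sum of squares
ss_tot = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 4))
```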

Design of experiments:
The design of any task that aims to describe or explain the variation of information under conditions that are hypothesized to reflect the variation.
In its simplest form, an experiment aims at predicting the outcome by changing the preconditions, the predictors.
– Selection of the suitable predictors and outcomes
– Delivery of the experiment under statistically optimal conditions
– Randomization
– Blocking: an experiment may be conducted with the same equipment to avoid any unwanted variations in the input
– Replication: performing the same combination run more than once, in order to get an estimate for the amount of random error that could be part of the process
– Interaction: when an experiment has 3 or more variables, the situation in which the interaction of two variables on a third is not additive
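Randomization, the first of these ingredients, can be sketched in a few lines; the subject labels here are placeholders:

```python
import random

# Sketch of randomization in a simple two-arm experiment: shuffle subjects,
# then split into treatment and control so that assignment is independent
# of any subject characteristic.
random.seed(42)  # only to make this sketch reproducible
subjects = [f"s{i}" for i in range(8)]
random.shuffle(subjects)
treatment, control = subjects[:4], subjects[4:]
print(len(treatment), len(control))  # 4 4
```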

80/20 rule:
– Pareto principle
– 80% of the effects come from 20% of the causes
– 80% of your sales come from 20% of your clients
– 80% of a company's complaints come from 20% of its customers
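The principle is easy to check against your own numbers; the per-customer revenues below are invented for illustration:

```python
# Sketch: what share of revenue do the top 20% of customers generate?
revenues = [1200, 900, 60, 50, 45, 40, 35, 30, 25, 15]  # hypothetical, per customer
revenues.sort(reverse=True)
top_n = max(1, len(revenues) // 5)          # top 20% of customers
share = sum(revenues[:top_n]) / sum(revenues)
print(f"Top 20% of customers -> {share:.0%} of revenue")
```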



Data-As-A-Service (#DAAS) to enable compliance reporting



Everybody gets so much information all day long that they lose their common sense. – Gertrude Stein


#DataScience Approach to Reducing #Employee #Attrition





We are seeing massive growth in video and photo data: every minute, up to 300 hours of video are uploaded to YouTube alone.

Sourced from: Analytics.CLUB #WEB Newsletter


Geeks Vs Nerds [Infographics]

I came across this infographic in a friend's post on Facebook and got hooked. It is certainly a nicely compiled infographic covering geeks vis-à-vis nerds.

Some Factoids:

  • 17% of Americans identify as geeks
  • 65% of video game designers identify as geeks
  • 50% of technology engineers identify as geeks
  • 37% of bloggers identify as geeks
  • 87% of people prefer the word “geek” over “nerd”
  • 66% of millennials think “geek” is a compliment
  • 45% of people believe geeks are early adopters
  • 31% of people believe geeks have a higher chance of being successful
  • On average, self-identified geeks have a better view of themselves than others have of geeks
  • 41% of people would be comfortable called a geek while only 24% would be comfortable called a nerd
  • A geek would rather be called a geek over a hipster (23% are OK with being called hipster while 41% are OK with being called a geek)

So what do you guys think? Do you agree with the "facts" listed above? And according to the infographic, are you a geek or a nerd?

Geeks vs Nerds

Originally Posted at: Geeks Vs Nerds [Infographics]

The missing element of GDPR: Reciprocity

GDPR day has come and gone, and the world is still turning, just about. Some remarked that it was like a second Y2K day; but whereas Y2K's impact was a somewhat damp squib, GDPR has caused more of a kerfuffle: however much the authorities might say, "It's not about you," it has turned out that it is about just about everyone in a job, for better or worse.

I like the thinking behind GDPR. The notion that your data was something that could be harvested, processed, bought and sold without you having a say in the matter was imbalanced, to say the least. Data monetisers have been good at following the letter of the law whilst ignoring its spirit, which is why the law's newly expressed spirit, one of non-ambiguous clarity and agreement, is so powerful.

Meanwhile, I don’t really have a problem with the principle of advertising. A cocktail menu in a bar could be seen as context-driven, targeted marketing, and rightly so as the chances are the people in the bar are going to be on the look-out for a cocktail. The old adage of 50% of advertising being wasted (but nobody knows which 50%) helps no-one so, sure, let’s work together on improving its accuracy.

The challenge, however, comes from the nature of our regulatory processes. GDPR has been created across a long period of time, by a set of international committees with all of our best interests at heart. The resulting process is not only slow but also and inevitably, a compromise based on past use of technology. Note that even as the Cambridge Analytica scandal still looms, Facebook’s position remains that it acted within the law.

Even now, our beloved corporations are looking to how they can work within the law and yet continue to follow the prevailing mantra of the day, which is how to monetise data. This notion has taken a bit of a hit, largely as now businesses need to be much clearer about what they are doing with it. “We will be selling your information” doesn’t quite have the same innocuous ring as “We share data with partners.”

To achieve this, most attention is on what GDPR doesn't cover, notably around personal identifiable information (PII). In layperson's terms, if I cannot tell who the specific person is that I am marketing to, then I am in the clear. I might still know that the ‘target' is a left-leaning white male, aged 45-55, living in the UK, with a propensity for jazz, an iPhone 6 and a short political fuse, and all manner of other details. But nope, no name and email address, no pack-drill.

Or indeed, I might be able to exchange obfuscated details about a person with another provider (such as Facebook again), which happen to match similarly obfuscated details — a mechanism known as hashing. As long as I am not exchanging PII, again, I am not in breach of GDPR. Which is all well and good apart from the fact that it just shows how advertisers don’t need to know who I am in order to personalise their promotions to me specifically.
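A minimal sketch of that matching idea (not any particular provider's actual protocol) is for both parties to hash a normalized identifier and compare digests, so records can be joined without exchanging the raw email address:

```python
import hashlib

# Sketch of hashed matching: both sides normalize the identifier the same
# way, hash it, and compare digests. No raw PII crosses the wire.
def hashed_id(email: str) -> str:
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# The same underlying identifier yields the same digest on both sides.
print(hashed_id("Jane.Doe@example.com") == hashed_id("jane.doe@example.com "))  # True
```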

As I say, I don't really have a problem with advertising done right (I doubt many people do): indeed, the day when sloppy retargeting can be consigned to the past (offering travel insurance once one has returned home, for example) cannot come too soon. However, I do have a concern that the regulation we are all finding so onerous is not actually achieving one of its central goals.

What can be done about this? I think the answer lies in renewing the contractual relationship between supplier and consumer, not in terms of non-ambiguity over corporate use of data, but to recognise the role of consumer as a data supplier. Essentially, if you want to market to me, then you can pay for it — and if you do, I’m prepared to help you focus on what I actually want.

We are already seeing these conversations start to emerge. Consider the recent story about a man selling his Facebook data on eBay; meanwhile at a recent startup event I attended, an organisation was asked about how a customer could choose to reveal certain aspects of their lifestyle, to achieve lower insurance premiums.

And let’s not forget AI. I’d personally love to be represented by a bot that could assess my data privately, compare it to what was available publicly, then perhaps do some outreach on my behalf. Remind me that I needed travel insurance, find the best deal and print off a contract without me having to fall on the goodwill of the corporate masses.

What all of this needs is the idea that individuals are not simply hapless pawns to be protected (from where comes the whole notion of privacy), but active participants in an increasingly algorithmic game. Sure, we need legislation against the hucksters and tricksters, plus continued enforcement of the balance between provider and consumer which is still tipped strongly towards “network economy” companies.

But without a recognition that individuals are data creators, whose interests extend beyond simple privacy rights, regulation will only become more onerous for all sides, without necessarily delivering the benefits it was set up to achieve.

P.S. Cocktail, anyone? Mine’s a John Collins.

Follow @jonno on Twitter.


BOB in The Netherlands

I recently gave the keynote address at the Custon Customer Congress in The Netherlands. The presentation highlighted how B2B companies can improve their customer loyalty. A copy of the presentation can be found here. For the Dutch reader, you can find a summary of my presentation online at ITCommercie (note regarding my picture for the article: I was totally sober for the presentation).


Why On-Premises App Analytics Is the Way to Go

The app market is thriving, and that is opening up more opportunities for app development. However, despite a significant uptick in app downloads, this doesn’t mean users actually interact with your product. According to Statista, about 24% of apps downloaded are used only once. The problem lies in user engagement. It turns out that creating a great app is just part of the game.

That’s why marketers turn to app analytics. It lets them measure app usage and optimize the user experience to boost customer acquisition and retention. And if you want to acquire an accurate picture of your app engagement while mapping customers’ journeys across web and mobile, you need the right data.

Considering the growing concern about data protection and privacy restrictions in today’s digital ecosystem, this means you’re heading down a bumpy road. The good news is that you can buckle up and drive smoothly if you choose a reliable and comprehensible solution. That solution is self-hosted app analytics.

If you’re wondering how it does the trick, then we have a simple answer: security and control. You maintain ownership of data and know exactly where it’s stored and who can access it. But that’s just the beginning.

In this post we will walk you through all the twists and turns of on-premises app analytics to demonstrate how it can benefit your organization. Of course, we understand that your organization is like no other and has specific requirements, but some functionalities are bread and butter for every company. So, off we go!

1. Data is in your hands

Maintaining high safety and privacy standards is no easy task. Your organization most probably operates within strict protocols and policies that determine who can access certain data and information you collect. By choosing self-hosting deployment, you make your job easier as you can store, process, and archive data on your own servers or other ones of your choice. This way, all processing takes place under your watchful eye.

What's more, developing software in your stack and introducing new technologies to an organization usually brings additional security issues, because the responsibility for handling data is shared with someone else. However, with app analytics hosted on your own infrastructure there's far less need to worry. Of course you should still be careful, but you keep 100% control and ownership of your data. It's entirely up to you how the data is managed, so you can be sure no data is shared with third parties, and you can establish robust security standards and policies.


2. Stand guard over your data

If you’re a part of a digital ecosystem, you realize that data security is a valuable commodity. It should be a cardinal principle of your company. The problem of data safety is especially acute with organizations operating in the banking, finance, health, and governmental sectors. Not only do they handle a wealth of personal data, but also sensitive data – and that comes at a higher risk.

Bear in mind that any data breach or leak could have a tremendous impact on the whole organization. It’s not only a matter of your company’s reputation, but also development of your software and services.

70% of financial institutions regard “security concerns” as one of the biggest impediments to mobile banking adoption, as revealed in a survey taken in five of the Federal Reserve Bank districts in the United States.

Safety issues concerning data haunt many fields of business. In this area, prevention seems to be the best cure. Though data security can be approached in various ways, it’s vital to ensure that access to your app analytics data is highly restricted.

If you choose a reliable vendor for your app analytics software, you can rest assured that your data is thoroughly protected. Wondering how you can accomplish that? Have a look at the Piwik PRO approach.

First of all, as we’ve already said, you can store all the data on your own servers. So you know where it’s located and who has access to it. What’s more, as you store data on-premises you can properly adjust your app analytics setup to internal data security policies and procedures.

To be more precise, you gain full access control. It means that you're in charge of your infrastructure configuration and can adjust it to comply with your internal security rules. For instance, you decide whether the settings of a certain site's or mobile app's analytics can be changed.

Then, you can apply Single Sign On. With this functionality you can manage all your users in a central database, and employ SSO (Single Sign On) for logging into your app analytics module using LDAP (Lightweight Directory Access Protocol), SAML (Security Assertion Markup Language), or other enterprise standards.

Finally, you can take advantage of Audit Logs. These provide you with detailed logs of all activities happening on your platform. They let you closely monitor and review:

  • login attempts
  • modification of instance settings
  • password updates
  • any reporting API requests

Keep in mind that the safety of your customers’ data lies not only in protection and setting access rights to it. You also need to consider what kind of data you aggregate and how you manage it. Once you know all the requirements, you can apply further security measures like encryption, pseudonymization, and anonymization.

These methods are your allies in the struggle to achieve legal compliance. However, you don’t have to do the work by yourself. Some vendors, like Piwik PRO, provide automatic app analytics data anonymization and support implementation of all your security procedures.

All in all, with a holistic approach and an array of functionalities you can sleep soundly, knowing that you're achieving top-level security.

3. Make regulators and privacy-concerned clients happy

For organizations operating within a digital ecosystem, data privacy legislation and security are of the utmost importance. But at the same time, they create serious challenges. Privacy laws vary across countries, and beside local ones there are international rules, like GDPR.

That’s why you need to consider solutions that take the burden off your shoulders and help you follow the letter of the law. On-premises app analytics is a reliable partner in your compliance drive. Multinational enterprises go for this deployment because of the privacy regulations in force in countries they operate in.

For instance, China and Russia only allow for storage of their citizens’ personally identifiable information within their respective borders. The same rule applies to the European Economic Area (including EU Member States plus Iceland, Liechtenstein, and Norway).

It means that if you want to collect and process the data of these countries’ residents, you have to have a data center in each of these countries. In the case of certain European Union countries, it may be enough to have a server located within the EU’s borders.

Finally and importantly, compliance with various privacy laws and regulations requires you to maintain control over data and its storage. That’s why it’s vital for you to find the right app analytics vendor. In the case of Piwik PRO, you have full ownership of data by design and know exactly where it’s located.

4. Wield analytics reports with raw data

Analytics demands precision. As your app analytics orchestrates your marketing strategy, you need to be sure it has all the instruments in line; in other words, that you acquire accurate data. That is the key to good reporting. You need data that you can trust, otherwise you simply won't get any benefit from it.

Experts in various IT fields stress that among the numerous dimensions that impact data quality, the most prominent are accuracy and completeness. You need to be sure that the data you collect represents exactly what happens on your mobile app or website. Also, you should be able to tap into precise information from across your whole digital business so you can better measure, analyze, and grasp it.

When we talk about data accuracy we mean raw, unsampled data. Unfortunately, not all vendors offer this.

However, you can find partners on the market, like Piwik PRO, that offer on-premises app analytics to deliver full data sets to you, not just samples. Access to raw data gives you precise and in-depth reports, a prerequisite for a sound strategy and decision-making process.


To sum up: on-premises app analytics should be a key tool in your marketing arsenal. But maybe you want to know more about how to optimize the customer journey across web and mobile – because of the complexity of the issue, we’ve only provided an introduction here. So if you have any questions, don’t hesitate to drop us a line. We’ll be more than happy to fill you in on all the details.

Contact us

The post Why On-Premises App Analytics Is the Way to Go appeared first on Piwik PRO.

Source: Why On-Premises App Analytics Is the Way to Go by analyticsweek

How Predictive Analytics Is Fueling Subscription-Based Businesses

These days, you can’t go 10 minutes without hearing about a subscription service. We have video streaming services (Netflix, Hulu), meal delivery services (Blue Apron, Hello Fresh), movie passes (MoviePass), and subscription “boxes” for everything from beauty items (Birchbox) to pet supplies (BarkBox). Subscription-based business models are everywhere—and they’re tapping into an increasing stream of data. In today’s digital world, information on people’s buying habits is readily available. And thanks to predictive analytics, that data can be used to grow a business.

>> Related: Predictive Analytics 101 <<

Today’s most successful subscription network businesses are those that keep user acquisition costs low and quickly scale in order to negotiate premiums with retail establishments. As an example, let’s look at MoviePass, a $9.95-a-month subscription service that lets users watch up to three movies per month in a variety of theaters.

Leveraging Predictive Data

MoviePass collects demographics on its subscribers during signup, and then collects even more information when a subscriber uses the app to buy a ticket (time, location, theater, movie, etc.). Once it has scaled to a sizable number of subscribers, MoviePass can employ predictive analytics to answer questions such as:

  • What kind of movie does this person typically see?
  • Will this person see this movie this Friday at 7:00 pm?
  • How many people will see this movie this Friday at 7:00 pm at this specific location?

Armed with this predictive information, MoviePass has the power to steer users to a specific theater, and therefore can more easily negotiate lucrative deals. For example, if there are three theaters within a five-mile radius, the theater that gives the best deal to MoviePass could be the lucky one at the top of MoviePass subscribers' lists to see a particular movie at 7:00 pm.
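As an illustrative sketch of the first question, even a naive frequency model over a subscriber's ticket history yields a usable genre estimate (the history and genre labels below are invented; a real system would use far richer features):

```python
from collections import Counter

# Toy sketch: estimate the chance a subscriber picks each genre from their
# past tickets, as a stand-in for "what kind of movie does this person
# typically see?"
history = ["action", "action", "comedy", "action", "drama"]

def genre_probabilities(tickets):
    counts = Counter(tickets)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}

probs = genre_probabilities(history)
print(max(probs, key=probs.get))  # most likely genre for this subscriber
```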

Expanding Offerings

Why stop at movies? Many moviegoers may also go to a local restaurant for dinner. MoviePass could again use the power of numbers in their ecosystem to negotiate deals with retail businesses and rapidly expand its offerings. Predictive data can answer questions such as:

  • Which subscribers go to a movie and have dinner at a nearby restaurant?
  • Which subscribers travel to a movie using Uber?

MoviePass then essentially becomes “RetailPass,” where any related retail service—dinner, transportation, etc.—can be packaged as part of the subscription for an additional one-time or monthly cost. In kind, these transactions would provide even more data that could be used to tweak the offers that make the most sense for consumers.

We might see more subscription businesses cropping up soon—maybe an UberPass, DinnerPass, LunchPass, and so on. Every service industry is a target for the subscription model as companies use the power of predictive analytics to drive user adoption and future business.

See how Logi can help with your predictive analytics needs. Sign up for a free demo of Logi Predict today >


Originally Posted at: How Predictive Analytics Is Fueling Subscription-Based Businesses by analyticsweek

True Test of Loyalty – Article in Quality Progress

Read the study by Bob E. Hayes, Ph.D. in the June 2008 edition of Quality Progress magazine titled The True Test of Loyalty. This Quality Progress article discusses the measurement of customer loyalty. Despite its importance in increasing profitability, customer loyalty measurement hasn’t kept pace with its technology. Using advocacy, purchasing and retention indexes to manage loyalty is statistically superior to using any single question alone. These indexes helped predict the growth potential of wireless service providers and PC manufacturers. You can download the article here.

Originally Posted at: True Test of Loyalty – Article in Quality Progress

QlikSense Set Analysis – Creating and Using Variables

Patrick McCaffrey
Practice Director, Business Analytics
John Daniel Associates, Inc. 
Patrick’s Profile

The majority of Qlik application developers will tell you that a large portion of their time is spent doing rework for customers who consistently change the scope of work. This can be very frustrating, especially when, after hours of work have been spent placing business logic in numerous complex expressions, the customer comes to you and asks for something as simple as the addition of a “Product Type.” To the customer this may not seem like a lot of work, but developers, even at a beginner level, know that the addition of a Product Type may require hours of additional work depending on how many expressions need to be updated to reflect this change. The truth of the matter is that the customer is correct – this should not be a lot of work! Throughout the remainder of this blog I’d like to give you an experienced developer tip that can save you tremendous amounts of time and headaches while creating or updating your set analysis expressions.

An experienced developer should plan ahead and predict what variables may frequently change throughout the development process based on their previous experiences during customer engagements. They use these past experiences to take a proactive approach when developing their set analysis expressions, which they do by implementing variables. The use of variables in set analysis expressions diminishes the need to constantly go back and modify each and every expression every time a change is requested.

Imagine that your customer, a sporting goods store named ‘XYZ Sports’, has just told you that they want an application that shows the sales of their current outdoor sports inventory, but they only want to see the sales that relate to basketball, football, soccer, and baseball. The image below shows a general set analysis expression you would use in this situation. In the image you can see that we are summing sales of ‘Basketball’, ‘Football’, ‘Soccer’, and ‘Baseball’ for the month of ‘July’ and the year ‘2016’.
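The original post shows this expression as a screenshot; in Qlik set-analysis syntax it would look roughly like the following sketch (the exact field names are assumptions based on the text):

```
Sum({<[Product Type] = {'Basketball','Football','Soccer','Baseball'},
      [Sales Month]  = {'July'},
      [Sales Year]   = {2016}>} Sales)
```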

XYZ Sports then tells you that they want to see the sales of these same Product Types, but they would like to see the sales over the entire year of 2015, the entire year of 2016, and each quarter of 2016, all of which need to be in a different chart. You would now have a total of 7 expressions that include the Product Types field in your application. A few examples of these expressions are below.

A week after you have the application built and all expressions working for these Product Types, the Project Manager at XYZ Sports comes to you and explains that they are now considering ‘Hiking’, ‘Fishing’, and ‘Camping’ as outdoor sports and this inventory will also need to be counted towards the sales in the charts you have just created. The first thought from many novice developers is to go back into each expression and make the requested changes to the ‘Product Type’ field as shown below.

Updating all 7 charts with the new “Product Type” fields as shown above is a solution, but it is not the most effective solution. However, the use of variables will allow for this change to be made in one central location, which would apply the changes to all expressions containing that variable. In order to do this you would need to follow the steps below:

Step 1:

Open the Variable Overview and create a Product Type variable.

  • In QlikView press Ctrl+Alt+V
  • In Qlik Sense, click Edit, then click the variables icon at the bottom left of the screen

Step 2:

In the variable, place the entire string that is expected between your two Product Type curly brackets, or braces.

Product Type = {  }

Do not include an ‘=’ in front of the variable, as it will cause issues with the format in the next step (Step 3). Depending on your variable format, it may look as if you have spelling errors due to the red underscores, but you can ignore this warning.

Step 3: 

Place the newly created variable in your set analysis expressions in the format shown below. This variable will input the entire string you created in step 2 above.

Now each time you need to change, add, or remove a ‘Product Type’ you simply have to go to the Variable Overview screen and make the modification in that one central location. Once the change is made in the Variable Overview screen all expressions containing that variable will be updated. You no longer have to go into each expression and make the changes one at a time! This same approach can be taken for ‘Sales Month’, ‘Sales Year’, or any other field that you feel will be consistently changing throughout the front end development process.
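As a sketch (the variable and field names are assumptions, since the original screenshots are not reproduced here), the variable holds the value list and the expression references it with dollar-sign expansion:

```
// Variable vProductTypes (defined without a leading '='):
//   'Basketball','Football','Soccer','Baseball','Hiking','Fishing','Camping'

Sum({<[Product Type] = {$(vProductTypes)}>} Sales)
```

Editing vProductTypes in the Variable Overview then updates every expression that expands it.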

  • It’s important to note that variable usage is much more practical when you have numerous expressions using similar fields and values throughout your application. Because of this you should evaluate your set analysis approach on a case-by-case basis.
  • Sometimes the need for a variable is evident from the onset of the application build which makes it easier to plan ahead and use variable implementation from the start. Other times you may not see the need for a variable until you are well into the development process. Identifying these scenarios will come with practice and experience.

I can already hear some of you say, “But, Master Measures in Qlik Sense….” Yes, Qlik Sense has helped to alleviate some of this pressure when utilizing the same expression over multiple charts with the introduction of Master Measures, but many Master Measures may still have some Set Analysis in common. The use of variables in those cases will accomplish the same thing. If that Set Analysis ever changes, you can simply update a variable for all, rather than adjust each one individually.

Front end application development can be tricky, cumbersome, and time consuming, and while variable usage can be a tremendous help in remedying these issues you will learn that there are many other tricks and tips that will help you become a more advanced and efficient developer. Developers should consistently be looking for ways to sharpen their skills and make their own work more efficient. No matter what level developer you are, I suggest that you do the same. Taking past and current experiences, learning from them, and building upon your current skill set is a great way to increase your efficiency in future endeavors. Knowing and understanding how customers operate can keep you a step ahead of the game while saving invaluable amounts of time, as evidenced above by the use of a simple variable. Process improvement skills, such as effective variable use, are highly valuable in the business intelligence field and will be acquired over time as you gain more experience.



Patrick is our Business Analytics Practice Director, a valued member of our John Daniel Associates Leadership Team in Pittsburgh, PA.


The post QlikSense Set Analysis – Creating and Using Variables appeared first on John Daniel Associates, Inc..

Source: QlikSense Set Analysis – Creating and Using Variables by analyticsweek

Setting Metric Targets in UX Benchmark Studies


In Benchmarking the User Experience, I write about the importance of a regular plan for quantifying the user experience of your websites, apps, or devices.

This involves collecting metrics, usually at both task and study levels.

But the point of benchmarking isn’t just to collect metrics to put on a dashboard, it’s to ultimately improve them.

A common question we receive when conducting benchmark studies is what to set the metric targets to. That is, what values should organizations aim for in their next benchmark? While it depends a bit on the context and consequences of the experience (and the metric itself), here are the target options I discuss that provide reasonable goals.

Above Average

A logical target to start with is having all metrics be at least above average. Let’s call this the Lake Wobegon target. We’ve collected data for the most common UX metrics to provide context and define what “average” is, at least for a broad set of contexts and products. Here are five common averages to use:

  • Completion rate average: 78%
  • Single Ease Question average: ~5.1
  • SUS average: 68
  • SUPR-Q average: 50%
  • Net Promoter Score for consumer software: 21%

Above Industry Average

To get more specific with using an average, narrow in on a relevant industry. Industry averages can be found from external reports or your own benchmarking analysis. We report many industry averages for websites and software. Here are a few averages by industry for SUPR-Q and SUS scores:

  • Hotel average SUPR-Q: 76%
  • Airline & aggregator website SUPR-Q: 83%
  • Retail websites SUPR-Q: 78%
  • Consumer software SUS: 75
  • Business software SUS: 66

Above a Competitor

If your website or product has a clear competitor, you should strive to at least meet or exceed it on key metrics (like the metrics listed above). Competitor data can be collected from conducting your own competitive benchmark or from published reports or databases. For example, Dropbox has a SUS score of 78 in our consumer software report and Netflix has a SUPR-Q score at the 95th percentile. A competitor, more generally speaking, can also be an old or existing product and any new version should at least meet or exceed the benchmark metrics of the legacy experience.

Above a Percentile Rank

For some measures that have enough data points, raw scores can be converted into percentile ranks. Percentiles tell you where a score falls relative to all the scores in a database (often after a transformation to make the data normally distributed). This is a key characteristic of the SUPR-Q (hence the percentile rank in its name). A SUPR-Q percentile rank of 75% means the score is higher than 75% of the websites in the database.
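Under the normality assumption mentioned above, a raw score can be converted to a percentile rank with the standard normal CDF. A small sketch, using the SUS mean of 68 from this article; the standard deviation of 12.5 is an illustrative assumption, not a figure from the text:

```python
import math

def percentile_rank(score, mean, sd):
    """Percentile of `score` under a normal distribution with the given
    mean and standard deviation, computed via the standard normal CDF."""
    z = (score - mean) / sd
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Mean of 68 is the published SUS average; SD of 12.5 is an assumed,
# illustrative value for this sketch.
print(round(percentile_rank(80, 68, 12.5)))  # a SUS of 80 lands near the 83rd percentile
```

A score equal to the mean always maps to the 50th percentile, which matches the article's note that average performance sits at the 50th percentile rank.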

For high-traffic retail websites, a target above the 90th percentile makes sense because it's above average for the retail website industry, and a superior user experience there is more closely associated with revenue (and ROI), meaning you want to aim high. Even if you aren't a retail website, it makes sense to target at least the 50th percentile (average) for any metric.

Above a Grade

Percentile ranks can themselves be translated into letter grades (like the ones you got in school), which may aid interpretation and provide additional target thresholds. Jim Lewis and I did this with the System Usability Scale, as shown in Table 1. To achieve an "A" grade, you need a SUS score at or above the 90th percentile, which translates to a raw SUS score of at least 81. For at least a passing grade, target a minimum SUS score of 52. But like a "D" in school, it's hardly anything to be proud of!

  Grade   SUS range    Percentile range
  A+      84.1-100     96-100
  A       80.8-84.0    90-95
  A-      78.9-80.7    85-89
  B+      77.2-78.8    80-84
  B       74.1-77.1    70-79
  B-      72.6-74.0    65-69
  C+      71.1-72.5    60-64
  C       65.0-71.0    41-59
  C-      62.7-64.9    35-40
  D       51.7-62.6    15-34
  F       0-51.6       0-14

Table 1: SUS scores, grades, and percentile ranks.
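The grade lookup in Table 1 can be sketched as a simple threshold scan; the cut-offs below are the lower end of each grade's SUS range from the table, and the function name is my own:

```python
# Lower SUS bound for each grade, taken from Table 1, highest first.
GRADE_CUTOFFS = [
    (84.1, "A+"), (80.8, "A"), (78.9, "A-"), (77.2, "B+"),
    (74.1, "B"), (72.6, "B-"), (71.1, "C+"), (65.0, "C"),
    (62.7, "C-"), (51.7, "D"),
]

def sus_grade(score):
    """Map a raw SUS score (0-100) to its letter grade per Table 1."""
    for cutoff, grade in GRADE_CUTOFFS:
        if score >= cutoff:
            return grade
    return "F"  # anything below 51.7 fails

print(sus_grade(81))  # "A" -- the 90th-percentile threshold noted above
print(sus_grade(52))  # "D" -- just past the passing line
```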

Use Norms, Context, and Competitors Together to Set Targets

Averages, competitors, percentiles, and grades all provide comparison points for setting targets, but the target your organization settles on usually combines them and often starts with the context of the interface. For example, certain business-to-business software applications, like accounting software, are inherently more complicated than, say, a consumer website. We see this difference between our consumer and business software benchmarks too: the average SUS score is 66 for B2B software products (a "C" grade) versus 75 for B2C software products (a "B" grade). While it would be good to set a target at the 90th percentile for the SUS (an "A" grade), that may be unachievable for enterprise accounting software compared to a search engine (although the users would appreciate it!).

The importance of product type and context is also supported by research from Kortum and Bangor, who conducted a large retrospective benchmark of everyday products (for example, microwaves, search engines, and Excel spreadsheets) using the SUS. The SUS score for Excel in their dataset was 56 (a "D") while web browsers (for example, Google Chrome) scored 88 (an "A+"). While their Excel SUS was much lower than ours (likely because they used a within- rather than between-subjects approach and their users had less experience with the product), it does illustrate the different standards between classes of products. Excel is by many measures a very commercially successful product even though it scores much lower than Google Chrome.

When setting a target for your benchmark, aim high (higher than average, higher than your competitors and previous versions) but not so high that the target becomes unrealistic.

Originally Posted at: Setting Metric Targets in UX Benchmark Studies by analyticsweek