Dec 28, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Insights (Source)

[ AnalyticsWeek BYTES]

>> Why Entrepreneurship Should Be Compulsory In Schools by v1shal

>> Big Data Hyped, but Not Deployed by Businesses by analyticsweekpick

>> Hacking journalism: Data science in the newsroom by analyticsweekpick

Wanna write? Click Here

[ NEWS BYTES]

>> How Adobe Analytics Cloud Is Helping Brands Grow – Forbes Under Analytics

>> 10 Questions Executives Should Be Asking Before Hiring A Data Scientist – Forbes Under Data Scientist

>> 3 Retirement Statistics That Should Scare You – The Motley Fool Under Statistics

More NEWS? Click Here

[ FEATURED COURSE]

Applied Data Science: An Introduction


As the world’s data grow exponentially, organizations across all sectors, including government and not-for-profit, need to understand, manage and use big, complex data sets—known as big data…. more

[ FEATURED READ]

Thinking, Fast and Slow


Drawing on decades of research in psychology that resulted in a Nobel Prize in Economic Sciences, Daniel Kahneman takes readers on an exploration of what influences thought example by example, sometimes with unlikely wor… more

[ TIPS & TRICKS OF THE WEEK]

Finding success in your data science career? Find a mentor
Yes, most of us don't feel the need, but most of us really could use one. Because many data science professionals work in isolation, getting an unbiased perspective is not easy. It is also often hard to see how a data science career should progress. A network of mentors addresses these issues: it gives data professionals an outside perspective and an unbiased ally. Successful data science professionals should build a mentor network and draw on it throughout their careers.

[ DATA SCIENCE Q&A]

Q: Explain what a local optimum is and why it is important in a specific context, such as K-means clustering. What are specific ways of determining if you have a local optimum problem? What can be done to avoid local optima?

A: * A solution that is optimal within a neighboring set of candidate solutions
* In contrast with global optimum: the optimal solution among all others

* K-means clustering context:
It’s proven that the objective cost function will always decrease until a local optimum is reached.
Results will depend on the initial random cluster assignment

* Determining if you have a local optimum problem:
Tendency of premature convergence
Different initialization induces different optima

* Avoid local optima in a K-means context: repeat K-means and take the solution that has the lowest cost
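
As an illustration of this restart strategy, here is a minimal sketch on synthetic data using scikit-learn's KMeans (in practice the n_init parameter performs these restarts internally):

```python
# Sketch: restart K-means several times and keep the lowest-cost run.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)

best_model, best_cost = None, np.inf
for seed in range(10):
    # n_init=1 forces a single random initialization per run
    km = KMeans(n_clusters=4, init="random", n_init=1, random_state=seed).fit(X)
    if km.inertia_ < best_cost:  # inertia_ = within-cluster sum of squares (the cost)
        best_model, best_cost = km, km.inertia_

print(f"Lowest cost over 10 restarts: {best_cost:.1f}")
```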

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with  John Young, @Epsilonmktg


Subscribe on YouTube

[ QUOTE OF THE WEEK]

Numbers have an important story to tell. They rely on you to give them a voice. – Stephen Few

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Nathaniel Lin (@analytics123), @NFPA


Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

14.9 percent of marketers polled in Crain’s BtoB Magazine are still wondering ‘What is Big Data?’

Sourced from: Analytics.CLUB #WEB Newsletter

The biggest names in the world of big data are set to help New Orleans crunch numbers


 

Some of the sharpest minds in public policy will soon be helping New Orleans bureaucrats harness the power of big data to find new ways of making government more effective and efficient, thanks to a partnership with Bloomberg Philanthropies announced Wednesday (Aug. 5).

The city is one of eight selected to participate in the first round of Bloomberg’s program, dubbed What Works Cities.

Big business has been mining data for years, scouring enormous data sets in search of patterns that could help them shave a few seconds off of delivery times or improve their sales pitches, anything to give them a leg up on the competition.

What Works will draw on some of the biggest names in public policy innovation to help 100 mid-sized cities apply similar techniques to garbage pickup, bill collection, permit processing and any other government task that can be improved by a data-driven approach.

There are high-profile examples of such data mining yielding huge savings at little cost to the taxpayer. Andy Kopplin, the city’s chief administrative officer, cited a famous example from England. In an effort to improve tax collection, researchers there experimented with inserting different sentences into letters to deadbeats. Sentences that implied social responsibility, such as reminders that “Nine out of ten people in Britain pay their tax on time,” helped increase the government’s clearance rate by 29 percent in one year.

Mayor Mitch Landrieu has made improved use of data a priority, and the city has already used analytics to address things like blight reduction, Kopplin said, but there is still room for improvement. “We are not nearly as advanced as we could be. We are leading the way in a lot of ways, but compared to Amazon or Google or Netflix, there is a lot more that we could do,” he said.

Kopplin rattled off possible applications:

  • Looking at emergency medical dispatches to better map out where ambulances spend the 30 percent of the time they aren’t in service.
  • Comparing crime maps to the addresses of liquor-license holders to see which businesses are generating crime.
  • Finding which of the city’s 80,000 drainage catch basins generate the most flooding, allowing the city’s limited cleaning resources to make the biggest impact.

Bloomberg has committed $42 million to fund the program. Much of that money will be used to provide grants to think tanks that will serve as technical advisors, Kopplin said. For 18 months, the city will have access to institutions such as the Government Performance Lab at Harvard, the Center for Government Excellence at Johns Hopkins University, Results for America, the Sunlight Foundation, and The Behavioral Insights Team, a partnership between the British government and philanthropy.

Sharman Stein, a spokeswoman for the Bloomberg program, praised New Orleans’ history of using data to address problems. “With its strong commitment to the use of data and evidence, New Orleans is a leader in the field. Its commitment and desire to participate helped to speed its application along,” she said.

This is the second major partnership between the Landrieu administration and Bloomberg Philanthropies. Bloomberg previously gave the city $3.7 million to fund an innovation team focused on attacking the city’s murder rate.

The city has also partnered with national foundations to fund a racial reconciliation program and improved resilience strategies.

Note: This article originally appeared in NOLA. Click for link here.

Originally Posted at: The biggest names in the world of big data are set to help New Orleans crunch numbers

July 24, 2017 Health and Biotech analytics news roundup

UK Biobank partners with the EGA: Biobank has collected extensive genome and other data from 500,000 people, and the European Genome-phenome Archive will use its existing infrastructure to distribute these data quickly and securely to researchers who wish to study them.

Digital SAGE assessment accurate for early dementia: A tablet-based version of a current paper-based test performs just as well, and can potentially save resources currently used for dementia diagnosis.

Alzheimer’s Proteomics Treasure Trove? Nick Seyfried and Allan Levey developed techniques to find protein levels for over 10,000 genes in patients with Alzheimer’s and controls. They found ‘suites of proteins’ whose levels correlated with the disease.

Pharmacogenomics Is Ushering in a New Era of Personalized Prescriptions: Two examples of abnormal metabolism causing serious health problems illustrate the need for personalized genetic testing. Costs and privacy concerns are still holding the technology back, however.

Originally Posted at: July 24, 2017 Health and Biotech analytics news roundup

Dec 21, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Human resource (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> The Value of Opinion versus Data in Customer Experience Management by bobehayes

>> Firing Up Innovation in Big Enterprises by d3eksha

>> 3 Big Data Stocks Worth Considering by analyticsweekpick

Wanna write? Click Here

[ NEWS BYTES]

>> REVEALED: How your choice of HOBBY is tied to the size of your income (and it looks like we all need to take up golf!) – Daily Mail Under Statistics

>> Transition to value-based care supported by data analytics – TechTarget Under Health Analytics

>> InsightXM Launches Event Data Analytics Package Tailored to Event Planners – TSNN Trade Show News Under Analytics

More NEWS? Click Here

[ FEATURED COURSE]

R, ggplot, and Simple Linear Regression


Begin to use R and ggplot while learning the basics of linear regression… more

[ FEATURED READ]

Rise of the Robots: Technology and the Threat of a Jobless Future


What are the jobs of the future? How many will there be? And who will have them? As technology continues to accelerate and machines begin taking care of themselves, fewer people will be necessary. Artificial intelligence… more

[ TIPS & TRICKS OF THE WEEK]

Analytics Strategy that is Startup Compliant
With the right tools, capturing data is easy, but not being able to handle it can lead to chaos. One of the most reliable startup strategies for adopting data analytics is TUM, or The Ultimate Metric: the metric that matters most to your startup. Some advantages of TUM: it answers the most important business question, it cleans up your goals, it inspires innovation, and it helps you understand the entire quantified business.

[ DATA SCIENCE Q&A]

Q: How do you test whether a new credit risk scoring model works?
A: * Test on a holdout set
* Kolmogorov-Smirnov test

Kolmogorov-Smirnov test:
– Non-parametric test
– Compare a sample with a reference probability distribution or compare two samples
– Quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution
– Or between the empirical distribution functions of two samples
– Null hypothesis (two-samples test): samples are drawn from the same distribution
– Can be modified as a goodness of fit test
– In our case: compare the cumulative percentage of good accounts with the cumulative percentage of bad accounts across score bands
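
A minimal sketch of both checks on a hypothetical holdout set, using scipy's two-sample KS test on synthetic scores (the score distributions and the 0.3 rule of thumb are illustrative assumptions, not part of the original answer):

```python
# Sketch: a good scoring model separates the score distributions of "good" and
# "bad" accounts; the two-sample KS statistic measures the maximum gap between
# their empirical CDFs.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
scores_good = rng.normal(0.70, 0.12, size=2000)  # holdout scores of good accounts
scores_bad = rng.normal(0.45, 0.12, size=300)    # holdout scores of bad accounts

ks_stat, p_value = ks_2samp(scores_good, scores_bad)
print(f"KS statistic: {ks_stat:.3f}  (p-value: {p_value:.2e})")
# A KS statistic near 0 means the model barely separates good from bad accounts;
# values above roughly 0.3 are often treated as acceptable in credit scoring
# practice (an illustrative rule of thumb).
```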

Source

[ VIDEO OF THE WEEK]

@SidProbstein / @AIFoundry on Leading #DataDriven Technology Transformation #FutureOfData #Podcast


Subscribe on YouTube

[ QUOTE OF THE WEEK]

Torture the data, and it will confess to anything. – Ronald Coase

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with Joe DeCosmo, @Enova


Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

Brands and organizations on Facebook receive 34,722 Likes every minute of the day.

Sourced from: Analytics.CLUB #WEB Newsletter

Patient Experience Differences Between Acute Care and Critical Access Hospitals

Hospitals are focusing on improving the patient experience as well as clinical outcomes measures.  The Centers for Medicare & Medicaid Services (CMS) will be using patient feedback about their care as part of their reimbursement plan for Acute Care Hospitals. Under the Hospital Value-Based Purchasing Program (beginning in FY 2013 for discharges occurring on or after October 1, 2012), CMS will make value-based incentive payments to acute care hospitals, based either on how well the hospitals perform on certain quality measures or on how much the hospitals’ performance improves on certain quality measures from their performance during a baseline period.  The higher the score/the greater the improvement, the higher the hospital’s incentive payment for that fiscal year.

Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS)

The patients’ feedback will be collected by a survey known as HCAHPS (Hospital Consumer Assessment of Healthcare Providers and Systems). HCAHPS (pronounced “H-caps“) is a national, standardized survey of hospital patients and was developed by a partnership of public and private organizations.

The development of HCAHPS was funded by the Federal government, specifically the Centers for Medicare & Medicaid Services (CMS) and the Agency for Healthcare Research and Quality (AHRQ). HCAHPS was created to publicly report the patient’s perspective of hospital care.

The survey asks a random sample of recently discharged patients about important aspects of their hospital experience. The data set includes patient survey results for over 3800 US hospitals on ten measures of patients’ perspectives of care. The 10 measures are:

  1. Nurses communicate well
  2. Doctors communicate well
  3. Received help as soon as they wanted (Responsive)
  4. Pain well controlled
  5. Staff explain medicines before giving to patients
  6. Room and bathroom are clean
  7. Area around room is quiet at night
  8. Given information about what to do during recovery at home
  9. Overall hospital rating
  10. Recommend hospital to friends and family (Recommend)

For questions 1 through 7, respondents were asked to provide frequency ratings about the occurrence of each attribute (Never, Sometimes, Usually, Always). For question 8, respondents were provided a Y/N option. For question 9, respondents were asked to provide an overall rating of the hospital on a scale from 0 (Worst hospital possible) to 10 (Best hospital possible). For question 10, respondents were asked to provide their likelihood of recommending the hospital (Definitely no, Probably no, Probably yes, Definitely yes). For the current analysis, I created a patient loyalty metric by averaging questions 9 (Overall hospital quality rating) and 10 (Recommend hospital). In my prior analysis, this Patient Advocacy Index (PAI) had a reliability estimate (Cronbach’s alpha) of .95.

The HCAHPS data set is available to anybody who wants to download it. This data set contains a variety of metrics (patient feedback, health outcome measures, safety measures) for over 4800 US hospitals. I will focus on the patient experience metrics. Top box scores are used as the metric of patient experience. Top box scores for the respective rating scales are defined as: 1) Percent of patients who reported “Always”; 2) Percent of patients who reported “Yes”; 3) Percent of patients who gave a rating of 9 or 10; 4) Percent of patients who said “Definitely yes.” Top box scores provide an easy-to-understand way of communicating the survey results for different types of scales. Even though there are four different rating scales for the survey questions, using a top box reporting method puts all metrics on the same numeric scale. Across all 10 metrics, hospital scores can range from 0 (bad) to 100 (good).
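
To make the metric definitions concrete, here is a small hypothetical sketch of how top-box scores and the Patient Advocacy Index could be computed from raw responses (column names and answers are invented; the published HCAHPS file already contains the top-box percentages):

```python
# Sketch: compute top-box scores and the Patient Advocacy Index (PAI)
# from raw per-respondent answers for one hospital. Columns are illustrative.
import pandas as pd

responses = pd.DataFrame({
    "nurse_comm":   ["Always", "Usually", "Always", "Always"],
    "overall_0_10": [9, 7, 10, 8],
    "recommend":    ["Definitely yes", "Probably yes", "Definitely yes", "Definitely yes"],
})

# Top box = percent of respondents giving the most favorable answer on each scale
top_box = {
    "nurse_comm": (responses["nurse_comm"] == "Always").mean() * 100,
    "overall":    (responses["overall_0_10"] >= 9).mean() * 100,
    "recommend":  (responses["recommend"] == "Definitely yes").mean() * 100,
}

# Patient Advocacy Index = average of the two loyalty questions' top-box scores
top_box["PAI"] = (top_box["overall"] + top_box["recommend"]) / 2
print(top_box)
```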

Based on the HCAHPS reporting schedule, it appears the current survey data were collected from Q3 2010 through Q2 2011 and represent the latest publicly available patient survey data.

Comparing Acute Care Hospitals and Critical Access Hospitals

The HCAHPS survey is administered to two types of hospitals, acute care and critical access. Acute Care Hospitals (ACHs) are hospitals that provide short-term patient care. For Medicare payment purposes, they are generally defined as having an average inpatient length of stay of less than 25 days. Critical Access Hospitals (CAHs) are small facilities that provide limited outpatient and inpatient hospital services to people in rural areas and that receive cost-based reimbursement.

Acute Care vs. Critical Access
Table 1. Difference between Acute Care Hospitals and Critical Access Hospitals

Results

I first looked at the mean differences across the two hospital types. The descriptive statistics for each type of hospital are presented in Table 1.

 

  • Critical Access Hospitals receive higher patient ratings than Acute Care Hospitals on all of the metrics. The differences between the two groups are statistically significant and substantial; the largest differences were for “Responsive” and “Room and bathroom are clean.” (A sketch of how such a comparison can be run appears below.)
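
A sketch of such a group comparison on hospital-level top-box scores (synthetic numbers; a Welch t-test is used here as one reasonable choice, not necessarily the test behind Table 1):

```python
# Sketch: compare mean top-box scores of Acute Care vs. Critical Access
# hospitals on one metric with an independent-samples (Welch) t-test.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
ach_clean = rng.normal(70, 8, size=3000)  # stand-in "room is clean" scores, ACHs
cah_clean = rng.normal(78, 8, size=1300)  # stand-in scores for CAHs

t_stat, p_value = ttest_ind(cah_clean, ach_clean, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.2f}, p = {p_value:.1e}, "
      f"mean difference = {cah_clean.mean() - ach_clean.mean():.1f} points")
```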

 

Acute Care Hospitals and Critical Access Hospitals both have much room for improvement in delivering a better patient experience (average top-box scores for the patient experience metrics are 70 and 76, respectively). Improving the patient experience should have a positive impact on patient advocacy/loyalty toward the hospital. Determining which patient experience areas to improve is a function of their importance in driving patient advocacy; some patient experience areas are more important than others. Next, I conducted a loyalty driver analysis for each hospital type.

Figure 1. Driver Matrix is a Business Intelligence Solution

Loyalty driver analysis is a business intelligence solution that helps companies understand and improve customer loyalty, an indicator of future business growth. In loyalty driver analysis, we look at two key pieces of information about each business area: 1) the performance of the business area and 2) the impact of that business area on patient loyalty. The results are typically presented in a driver matrix (see Figure 1).
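
Here is a minimal sketch of how the two axes of a driver matrix can be computed from a hospital-level table (synthetic data and illustrative column names; the figures below were produced from the actual HCAHPS metrics):

```python
# Sketch: performance (mean top-box score) and impact (correlation with the
# Patient Advocacy Index) for each patient experience metric.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.uniform(50, 90, size=(500, 4)),
                  columns=["nurse_comm", "responsive", "pain_control", "PAI"])

metrics = ["nurse_comm", "responsive", "pain_control"]
drivers = pd.DataFrame({
    "performance": df[metrics].mean(),              # x-axis of the driver matrix
    "impact": df[metrics].corrwith(df["PAI"]),      # y-axis of the driver matrix
})

# Key drivers: high impact on loyalty but below-median performance (upper-left quadrant)
key = drivers[(drivers["impact"] > drivers["impact"].median()) &
              (drivers["performance"] < drivers["performance"].median())]
print(drivers.round(2), "\nKey drivers:", key.index.tolist())
```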

Figure 2. Loyalty Driver Matrix for Acute Care Hospitals

The driver matrices using the patient survey data appear in Figures 2 (Acute Care) and 3 (Critical Access). We see that there are clear differences between these two driver matrices.

Figure 3. Loyalty Driver Matrix for Critical Access Hospitals
  • The patient experience is more important for driving patient advocacy for acute care hospitals compared to critical access hospitals.  For acute care hospitals, the average correlation of patient experience metrics with patient advocacy  is .62. For critical access hospitals, the average correlation of patient experience metrics with patient advocacy is .50.
  • Key drivers for each type of hospital are different. For acute care hospitals, three key drivers of patient advocacy are: 1) Staff explains meds, 2) Responsiveness and 3) Pain well controlled. For critical access hospitals, we see a very different picture: only two patient experience attributes appear to be key drivers of patient advocacy, 1) Pain well controlled and 2) Responsiveness. Also, for critical access hospitals, these two areas (pain well controlled and responsiveness) are less important to patient advocacy (r = .51) than they are for acute care hospitals (r = .67). These areas appear in the upper-left quadrant, which suggests that they are important to patient advocacy and have much room for improvement.
  • For both hospital types, the biggest driver of patient advocacy is the patients’ perception of nurses’ communication effectiveness. Because nurses are likely involved in most of the day-to-day dealings with patient care, their performance impacts many different facets of the patient experience (e.g., Responsiveness, Staff explains meds).

Summary and Conclusions

There are a few important points we can conclude based on the analyses.

  1. Acute Care Hospitals (ACH) receive lower patient experience ratings compared to Critical Access Hospitals (CAHs) across all patient experience metrics. Also, ACHs receive lower patient loyalty/advocacy ratings compared to CAHs. Identifying the causes for these differences across hospital types might shed light on how ACHs can improve/explain their lower patient experience ratings.
  2. While the patient experience is related to patient loyalty/advocacy across both hospital types, it appears that patient loyalty/advocacy is more closely associated with the patient experience for ACHs than for CAHs.
  3. The key areas for improving patient loyalty/advocacy differ across hospital types. ACHs need to focus on 1) Staff explains meds, 2) Responsiveness and 3) Pain management. CAHs need to focus on 1) Pain management and 2) Responsiveness.
  4. To improve patient loyalty toward ACHs, hospitals might consider focusing on three areas: 1) Pain management, 2) Responsiveness and 3) Staff explaining meds to patients. As an industry, these three patient experience areas appear as key drivers of patient loyalty; that is, each area has much room for improvement and has a relatively big impact on patient loyalty/advocacy.

Improving patient loyalty/advocacy starts with understanding the patient experience factors that impact it. The present analysis showed how two different types of hospitals might approach this problem. Patient experience management is the process of understanding and managing your patients’ interactions with and attitudes about your hospital. The concept of customer experience management (and all its tools, processes and practices) can be a resource to help hospitals tackle the problem of poor patient experience. Individual hospitals can use the methodology presented here (driver analysis, correlational analysis) at the patient level of analysis to understand patient experience improvements for their hospital’s specific needs.

Originally Posted at: Patient Experience Differences Between Acute Care and Critical Access Hospitals by bobehayes

What is data security For Organizations?

What is Data Security in the Cloud?

The past two decades have seen rapid progress in technology. While the internet revolution has connected businesses around the world, cloud computing technologies have optimized resources. The Internet of Things (IoT) brings a versatile range of devices into the network. Gone are the days when communication was only possible between computers. The IoT revolution makes it possible to transmit data across a range of devices. Unfortunately, the advances in technology are accompanied by data security threats.

According to the Cisco Visual Networking Index, global IP network traffic is more than 1 zettabyte per year, or 91.3 exabytes per month. This value is expected to reach 1.6 zettabytes per year by 2018, equivalent to 45 million DVDs per hour. With such huge volumes of data traveling on the network, hackers have the incentive to develop scripts to capture data. The Identity Theft Resource Center (ITRC) reports that 666 data breach cases were identified between January 1, 2014 and November 12, 2014, the medical and health-care segments being worst affected. Whether big or small, data breaches can severely affect the revenues of a company. Until recently, 80% of data loss was caused by company insiders. However, this situation is changing. With the ever-evolving internet trends, data security threats are increasing exponentially. Data security must be addressed in many dimensions.

What is Data Security Within Organizations?

Within an organization, the entire network must be securely deployed, so that unauthorized users cannot gain access. Moreover, it is important to hire reliable personnel to manage databases and system administration. When managing data, it is advisable to streamline procedures, so that different privileges are assigned to different users based on their job roles. Data management has to be augmented with efficient technology that enforces system policies properly for secure access to data, and its storage, retrieval or manipulation.
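
As a toy illustration of the role-based privilege idea (hypothetical roles and actions, not any specific product's access model):

```python
# Sketch: map job roles to data privileges and enforce the check
# before any operation is executed.
ROLE_PRIVILEGES = {
    "dba":       {"read", "write", "backup", "restore"},
    "analyst":   {"read"},
    "app_admin": {"read", "write"},
}

def authorize(role: str, action: str) -> bool:
    """Return True only if the user's role grants the requested data operation."""
    return action in ROLE_PRIVILEGES.get(role, set())

assert authorize("analyst", "read")
assert not authorize("analyst", "write")  # analysts cannot modify data
```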

What is Data Security Outside Organizations?

The internet revolution has created integrated business systems whereby employees, clients and customers can access corporate information from anywhere, at any time. While this flexibility creates more opportunities, data security is at risk. Data traveling between networks may be subjected to tampering, eavesdropping, identity theft, and unauthorized access. Network encryption and access controls that are augmented with a higher level of authentication are required to securely transmit data.

What is Data Security in the Private and Public Cloud?

Today, everything resides in the cloud. In 2012, Gartner predicted the transition of offline PC systems to the cloud by 2014. The prediction was accurate. The majority of enterprises use at least one model of cloud computing technologies to carry out business procedures. However, increased agility and economic benefits come at a price. With the cloud and virtualization technologies, businesses have logical control over the data, but the actual data reside on servers managed by third party providers. When multi-tenants share the infrastructure, data integrity is compromised. Moreover, data compliance issues may arise when data reside away from company premises. Customer privacy needs to be maintained. Data segregation techniques matter. Without clear visibility into operational intelligence, companies have to rely on third parties’ security solutions. In case of data disaster, businesses should be able to retrieve data and services. If a cloud provider is acquired, data and services should still be securely maintained.

The traditional network-centric security solutions, such as intrusion detection systems and firewalls, cannot protect your data from hacking by privileged users and advanced persistent threats (APTs). There are other methods, such as security information and event management (SIEM) and database audit and protection (DAP), for event correlation. With stringent data regulations in place and increased data breaches, businesses have to move from network-centric solutions to data-centric solutions by integrating data security intelligence and data firewalls to create a veritable firewall around the data. Strong access controls, key management and encryption that are augmented with security intelligence are required, because once you move everything into the cloud, you only have a web browser as an interface.
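
As a minimal illustration of the data-centric approach (encrypting records before they ever reach the cloud, with the key kept by the data owner), here is a sketch using the Python cryptography library; it is not meant to represent any particular product mentioned in this article:

```python
# Sketch: encrypt a record client-side so the cloud provider only ever stores
# ciphertext; the key stays with the data owner (key management).
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # in practice, held in a key-management service
cipher = Fernet(key)

record = b'{"customer_id": 42, "balance": 1250.75}'
ciphertext = cipher.encrypt(record)  # this is what gets uploaded to the cloud

# Only a holder of the key can recover the plaintext
assert cipher.decrypt(ciphertext) == record
```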

What are Data Security Law and Policy?

The Data Protection Act 1998 is a British law that regulates the processing of data on identifiable living people. It controls how organizations, businesses and the government use the personal information of users. While businesses have to cope with rapidly exploding big data, they have to work in compliance with data protection laws, which are more stringent when sensitive information such as ethnic background, religious beliefs and criminal records are involved. As opposed to Britain and the European Union, the United States does not yet have a consolidated data protection law, instead adopting privacy legislation on an ad hoc basis. The Video Privacy Protection Act of 1988 and the Massachusetts Data Privacy Regulations of 2010 are a couple of examples.

When it comes to the cloud, there are no borders. A company located in one country might use CRM solutions offered by another company that is based in a different country. In such cases, it is not easy to know where the data are stored, how they are processed and what data protection laws govern them. Businesses that are moving into the cloud should enquire about data management by the cloud provider.

What is Data Security in a Private Cloud Solution?

While resource allocation and data security are the prime aspects of concern in the public cloud, deployment of a private cloud is a totally different ball game. In a private cloud, data are stored within your company’s perimeter, behind a dedicated firewall, and are securely accessed through encrypted connections. Data are always stored on your server, and remote users only get projections of data on their devices. Moreover, a private cloud provides greater control over redundancy, because you address your redundancy requirements when designing your data center environment. With the hardware being on-site, businesses have more control over data monitoring and management. Data compliance is effectively met. While businesses can enjoy the scalability, agility and mobility offered by the cloud, security and business continuity are maintained at the highest level. Applications hosted in the private cloud require less administrative overhead and reduced customer support, while ensuring that only the latest versions of applications are used. However, higher costs, capacity ceiling, and on-site maintenance are a few aspects that should be considered. The key is to choose the right tool that delivers a secure cloud environment.

2X Remote Application Server (2X RAS) is a leading software solution that allows companies to manage and deliver virtual applications and desktops from a private cloud. The flexibility of the product allows companies to leverage different hypervisors, such as Hyper-V, VMware and Citrix. With 2X RAS, organizations can guarantee secure access to corporate applications and data from any device. The SSL encryption secures transmission of data between the device and the server farm.

The wide range of compatible devices makes 2X RAS one of the most effective solutions available. 2X RDP Clients and Apps for 2X RAS are available for Windows, Mac, Linux, Android, iOS, Windows Phone and HTML5. Click here to read  more about 2X RAS.

Originally posted via “What is data security For Organizations?”

Source: What is data security For Organizations?

Dec 14, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Insights (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ NEWS BYTES]

>> 3 machine learning success stories: An inside look – CIO Under Machine Learning

>> Your data science career starts here: a $15 online course – Mashable Under Data Science

>> 1.8 million workforce shortage in cyber security by 2022: Study – ETCIO.com Under Cyber Security

More NEWS? Click Here

[ FEATURED COURSE]

Probability & Statistics


This course introduces students to the basic concepts and logic of statistical reasoning and gives the students introductory-level practical ability to choose, generate, and properly interpret appropriate descriptive and… more

[ FEATURED READ]

How to Create a Mind: The Secret of Human Thought Revealed


Ray Kurzweil is arguably today’s most influential—and often controversial—futurist. In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse… more

[ TIPS & TRICKS OF THE WEEK]

Keeping Biases Checked during the last mile of decision making
Today, data-driven leaders, data scientists, and other data experts are constantly put to the test, helping their teams solve problems with their skills and expertise. Believe it or not, part of that decision process is driven by intuition, which introduces bias and taints the resulting recommendations. Most skilled professionals understand and handle these biases well, but occasionally we fall into small traps that impair our judgement. So it is important to keep intuition bias in check when working on a data problem.

[ DATA SCIENCE Q&A]

Q: What do you think about the idea of injecting noise in your data set to test the sensitivity of your models?
A: * Effect would be similar to regularization: avoid overfitting
* Used to increase robustness
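
A minimal sketch of this sensitivity check on synthetic data (the model and noise levels are arbitrary choices for illustration):

```python
# Sketch: add Gaussian noise to the features and watch how the model's
# holdout accuracy degrades as the noise level grows.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

rng = np.random.default_rng(0)
for sigma in [0.0, 0.1, 0.5, 1.0]:
    noisy = X_test + rng.normal(0, sigma, size=X_test.shape)
    print(f"noise sigma={sigma:.1f}  accuracy={model.score(noisy, y_test):.3f}")
```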

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @ScottZoldi, @FICO


Subscribe on YouTube

[ QUOTE OF THE WEEK]

Numbers have an important story to tell. They rely on you to give them a voice. – Stephen Few

[ PODCAST OF THE WEEK]

Using Analytics to build A #BigData #Workforce


Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

Estimates suggest that by better integrating big data, healthcare could save as much as $300 billion a year — that’s equal to reducing costs by $1000 a year for every man, woman, and child.

Sourced from: Analytics.CLUB #WEB Newsletter

Intel bets on real-time analytics with high-end processor refresh

The chip company launches a new generation of its Xeon E7 processors, which it hopes will spur companies to crunch data as it is collected.

Intel today refreshed its Xeon E7 family of processors with a line-up aimed at helping businesses carry out real-time analytics on big datasets.

The chip giant launched the third generation of its high-end server chips, based on its Haswell-EX microarchitecture.

Intel sees a use for these new processors in carrying out analytics on large datasets as they are collected by enterprises.

Scott Pendrey, Intel server product manager for EMEA, said exponential growth in data being collected by firms will drive renewal of analytics infrastructure.

“We continue to see larger and larger volumes of data every year. By 2020 we’re forecasting something like 50 billion devices and 44 zettabytes of data – huge amounts of information.”

Intel is betting that firms will have an appetite for carrying out real-time analytics on that data – necessitating machines with the compute and memory capacity of E7-based systems.

“It’s the real-time shift into analytics that is the big business case that we see continuing to grow.”

To boost performance for these types of analytic workloads, the new E7 range includes a feature called TSX (Transactional Synchronisation Extensions).

In tests, TSX has helped the SAP HANA in-memory analytics platform achieve data transaction times six times faster than with previous-generation processors, according to Pendrey.

On average, Intel claims the new generation of processors could deliver a 40 percent improvement in performance over their predecessors when handling “mainstream workloads”.

Other optimisations in the processor family can boost code written to run in parallel; Pendrey said such code could execute 70 percent faster on the top-of-the-range offerings in the new E7 line-up, thanks to the higher core count, larger processor cache and support for AVX2 extensions.

Intel has upped the maximum number of processor cores in the E7 line, from 15 to 18, and increased cache per socket from 37.5MB to 45MB. Also new is support for DDR4 alongside DDR3 memory. DDR4 allows for a greater memory density with a lower power consumption than DDR3, as well as operating at a higher speed. However, DDR4 is also about 20 percent more expensive than the equivalent amount of DDR3 memory.

Each Xeon E7 processor has native support for up to eight sockets per system, with each socket supporting up to 1.5TB, which takes the total memory per eight-way system up to 12TB.

To help satisfy Intel’s goal of the E7 being used for mission-critical workloads, the new processors also offer 40 reliability, availability and serviceability (RAS) features – including memory sparing to support back-up memory modules and various provisions for error-checking.

Like the E5, the E7 also includes the AES-NI instruction set to accelerate data encryption.

Intel and its partners – such as Dell, HP and Fujitsu – will launch about 15 systems based on the new processors today, with that number rising to 40 within 30 days.

The cost of the processors will match the previous generation E7 but systems may be more expensive due to the use of DDR4 memory.

Originally posted via “Intel bets on real-time analytics with high-end processor refresh”

Originally Posted at: Intel bets on real-time analytics with high-end processor refresh by analyticsweekpick

Data Science and Big Data: Two very Different Beasts

It is difficult to overstate the importance of data in today’s economy.  The tools we use and the actions we take consume and generate a digital version of our world, all captured, waiting to be used. Data have become a real resource of interest across most industries and are rightly considered the gateway to competitive advantage and disruptive strategy.

Along with the rise of data has come two distinct efforts concerned with harnessing its potential. One is called Data Science and the other Big Data. These terms are often used interchangeably despite having fundamentally different roles to play in bringing the potential of data to the doorstep of an organization.

Although some would argue there is still confusion over the terms Data Science and Big Data, this has more to do with marketing interests than an honest look at what these terms have come to mean on real-world projects.  Data Science looks to create models that capture the underlying patterns of complex systems, and codify those models into working applications. Big Data looks to collect and manage large amounts of varied data to serve large-scale web applications and vast sensor networks.

Although both offer the potential to produce value from data, the fundamental difference between Data Science and Big Data can be summarized in one statement:

Collecting Does Not Mean Discovering

Despite this declaration being obvious, its truth is often overlooked in the rush to fit a company’s arsenal with data-savvy technologies.  Value is too often framed as something that increases solely by the collection of more data. This means investments in data-focused activities center around tools instead of approaches. The engineering cart gets put before the scientific horse, leaving an organization with a big set of tools, and a small amount of knowledge on how to convert data into something useful.

Bringing Ore to an Empty Workshop

Since the onset of the Iron Age, Blacksmiths have used their skills and expertise to turn raw extracted material into a variety of valuable products. Using domain specific tools, the Blacksmith forges, draws, bends, punches and welds the raw material into objects of great utility.  Through years of research, trial and error the Blacksmith learned to use choice gases, specific temperatures, controlled atmospheres and varied ore sources to yield a tailored product bespoke to its unique application.
Forging Iron

With the Industrial Revolution came the ability to convert raw material into valuable products more efficiently and at scale.  But the focus on scaling wasn’t the acquisition of more material. It was on building tools that scaled and mechanized the expertise in converting. With this mechanization came an even greater need to understand the craft since to effectively operate, maintain and innovate at scale one had to deeply understand the process of converting raw material into products that answered to the always-changing demands of the market.

In the world of data this expertise in converting is called Data Science. The reason it takes a science to convert a raw resource into something of value is because what is extracted from the ‘ground’ is never in a useful form. ‘Data in the raw’ is littered with useless noise, irrelevant information, and misleading patterns. To convert this into that precious thing we are after requires a study of its properties and the discovery of a working model that captures the behavior we are interested in. Being in possession of a model despite the noise means an organization now owns the beginnings of further discovery and innovation. Something unique to their business that has given them the knowledge of what to look for, and the codified descriptions of a world that can now be mechanized and scaled.

Conversion should Scale before Collection

No industry would invest in the extraction of a resource without the expertise in place to turn that resource into value. In any industry, that would be considered a bad venture.  Loading the truck with ore only to have it arrive at an empty workshop adds little strategic benefit.

An unfortunate aspect of Big Data is that we look to the largest companies to see what solutions they have engineered to compete in their markets. But these companies hardly represent the challenges faced by most organizations. Their dominance often means they face very different competition and their engineering is done predominantly to serve large-scale applications.  This engineering is critical for daily operations, and answering to the demands of high throughput and fault-tolerant architectures. But it says very little about the ability to discover and convert what is collected into valuable models that capture the driving forces behind how their markets operate. The ability to explain and predict an organization’s dynamic environment is what it means to compete using data.

Understanding the distinction between Data Science and Big Data is critical to investing in a sound data strategy. For organizations looking to utilize their data as a competitive asset, the initial investment should be focused on converting data into value. The focus should be on the Data Science needed to build models that move data from raw to relevant. With time, Big Data approaches can work in concert with Data Science. The increased variety of data extracted can help make new discoveries or improve an existing model’s ability to predict or classify.

Fill the workshop with the skills and expertise needed to convert data into something useful. The ore brought here will become the products that define a business.

To read the original article on KD nuggets, click here.

Source

Dec 07, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)

[  COVER OF THE WEEK ]

Complex data (Source)

[ LOCAL EVENTS & SESSIONS]

More WEB events? Click Here

[ AnalyticsWeek BYTES]

>> Reimagining the role of data in government by v1shal

>> #FutureOfData with @CharlieDataMine, @Oracle discussing running analytics in an enterprise by v1shal

>> Dipping Customer Satisfaction? 5 Lessons from a cab driver by v1shal

Wanna write? Click Here

[ NEWS BYTES]

>> BlueTalon to Provide Unified Data Access Control Across Dell EMC Elastic Data Platform – Markets Insider Under Data Security

>> Ten Things Everyone Should Know About Machine Learning – Forbes Under Machine Learning

>> The Value Of Real-Time Data Analytics – Forbes Under Analytics

More NEWS? Click Here

[ FEATURED COURSE]

CS109 Data Science


Learning from data in order to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data managem… more

[ FEATURED READ]

The Signal and the Noise: Why So Many Predictions Fail–but Some Don’t


People love statistics. Statistics, however, do not always love them back. The Signal and the Noise, Nate Silver’s brilliant and elegant tour of the modern science-slash-art of forecasting, shows what happens when Big Da… more

[ TIPS & TRICKS OF THE WEEK]

Fix the Culture, spread awareness to get awareness
Adoption of analytics tools and capabilities has not yet caught up to industry standards. Talent has always been the bottleneck to achieving comparable enterprise adoption, and one of the primary reasons is a lack of understanding and knowledge among stakeholders. To facilitate wider adoption, data analytics leaders, users, and community members need to step up and create awareness within the organization. An aware organization goes a long way toward quick buy-in and better funding, which ultimately leads to faster adoption. So be the voice that you want to hear from leadership.

[ DATA SCIENCE Q&A]

Q: How do you test whether a new credit risk scoring model works?
A: * Test on a holdout set
* Kolmogorov-Smirnov test

Kolmogorov-Smirnov test:
– Non-parametric test
– Compare a sample with a reference probability distribution or compare two samples
– Quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution
– Or between the empirical distribution functions of two samples
– Null hypothesis (two-samples test): samples are drawn from the same distribution
– Can be modified as a goodness of fit test
– In our case: compare the cumulative percentage of good accounts with the cumulative percentage of bad accounts across score bands

Source

[ VIDEO OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @ScottZoldi, @FICO


Subscribe on YouTube

[ QUOTE OF THE WEEK]

Data are becoming the new raw material of business. – Craig Mundie

[ PODCAST OF THE WEEK]

#BigData @AnalyticsWeek #FutureOfData #Podcast with @MPFlowersNYC, @enigma_data


Subscribe 

iTunes | Google Play

[ FACT OF THE WEEK]

This year, over 1.4 billion smart phones will be shipped – all packed with sensors capable of collecting all kinds of data, not to mention the data the users create themselves.

Sourced from: Analytics.CLUB #WEB Newsletter