Smart Data Modeling: From Integration to Analytics

There are numerous reasons why smart data modeling, predicated on semantic technologies and open standards, is one of the most effective approaches to everything from integration to analytics in data management.

  • Business-Friendly—Smart data models are innately understood by business users. These models describe entities and their relationships to one another in terms that business users are familiar with, which serves to empower this class of users in myriad data-driven applications.
  • Queryable—Semantic data models are able to be queried, which provides a virtually unparalleled means of determining provenance, source integration, and other facets of regulatory compliance.
  • Agile—Ontological models readily evolve to include additional business requirements, data sources, and even other models. Thus, modelers are not responsible for defining all requirements upfront, and can easily modify them at the pace of business demands.

According to Cambridge Semantics Vice President of Financial Services Marty Loughlin, the most frequently cited benefits of this approach to data modeling are operational: “There are two examples of the power of semantic modeling of data. One is being able to bring the data together to ask questions that you haven’t anticipated. The other is using those models to describe the data in your environment to give you better visibility into things like data provenance.”

Implicit in those advantages is an operational efficacy that pervades most aspects of the data sphere.

Smart Data Modeling
The operational applicability of smart data modeling hinges on its flexibility. Semantic models, also known as ontologies, exist independently of infrastructure, vendor requirements, data structure, or any other characteristic related to IT systems. As such, they can incorporate attributes from all systems or data types in a way that is aligned with business processes or specific use cases. “This is a model that makes sense to a business person,” Loughlin revealed. “It uses terms that they’re familiar with in their daily jobs, and is also how data is represented in the systems.” Even better, semantic models do not require all modeling requirements to be defined prior to implementation. “You don’t have to build the final model on day one,” Loughlin mentioned. “You can build a model that’s useful for the application that you’re trying to address, and evolve that model over time.” That evolution can include other facets of conceptual models, industry-specific models (such as FIBO), and aspects of new tools and infrastructure. The combination of smart data modeling’s business-first approach, adaptable nature, and relatively rapid implementation speed contrasts sharply with typically rigid relational approaches.
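The evolvable, triple-based structure Loughlin describes can be sketched in a few lines. The following is an illustrative pure-Python stand-in for an RDF/OWL toolchain, not Cambridge Semantics’ actual tooling; all entity and predicate names are invented:

```python
# Minimal sketch of a semantic model as subject-predicate-object triples.
# Pure-Python stand-in for RDF; entity and predicate names are illustrative.
triples = {
    ("Customer", "is_a", "Party"),
    ("Account", "owned_by", "Customer"),
    ("Trade", "booked_to", "Account"),
}

def describe(entity, model):
    """Return every statement about an entity, in business terms."""
    return sorted(t for t in model if entity in (t[0], t[2]))

# The model evolves by simply adding statements -- no day-one final schema,
# no migration; here a FIBO-style industry concept is layered in later.
triples.add(("Trade", "governed_by", "FIBO:Contract"))

for s, p, o in describe("Trade", triples):
    print(f"{s} {p} {o}")
```

The point of the sketch is the last step: extending the model is an additive operation, which is what lets it grow at the pace of business demands rather than on a fixed upfront design.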

Smart Data Integration and Governance
Perhaps the most cogent application of smart data modeling is its deployment as a smart layer between any variety of IT systems. By utilizing platforms reliant upon semantic models as a staging layer for existing infrastructure, organizations can simplify data integration while adding value to their existing systems. The key to integration frequently depends on mapping. When mapping from source to target systems, organizations have traditionally relied upon experts from each of those systems to create what Loughlin called “a source to target document” for transformation, which is given to developers to facilitate ETL. “That process can take many weeks, if not months, to complete,” Loughlin remarked. “The moment you’re done, if you need to make a change to it, it can take several more weeks to cycle through that iteration.”

However, since smart data modeling involves common models for all systems, integration merely includes mapping source and target systems to that common model. “Using common conceptual models to drive existing ETL tools, we can provide high quality, governed data integration,” Loughlin said. The ability of integration platforms based on semantic modeling to automatically generate the code for ETL jobs not only reduces time to action, but also increases data quality while reducing cost. Additional benefits include the relative ease in which systems and infrastructure are added to this process, the tendency for deploying smart models as a catalog for data mart extraction, and the means to avoid vendor lock-in from any particular ETL vendor.
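The mapping economics described above can be made concrete: instead of a point-to-point document for every source/target pair, each system is mapped once to the common model, and any source-to-target mapping falls out by composition. This is an illustrative sketch with invented field names, not a representation of any vendor’s ETL generator:

```python
# Each system is mapped once to the common model; a source-to-target field
# mapping is then derived by composition. All field names are illustrative.
source_to_common = {"cust_nm": "Customer.name", "acct_no": "Account.id"}
target_to_common = {"CUSTOMER_NAME": "Customer.name", "ACCOUNT_ID": "Account.id"}

def derive_mapping(src, tgt):
    """Compose src->common and common->tgt into a src->tgt field map."""
    common_to_tgt = {v: k for k, v in tgt.items()}
    return {s: common_to_tgt[c] for s, c in src.items() if c in common_to_tgt}

mapping = derive_mapping(source_to_common, target_to_common)
print(mapping)
```

Adding a new system means writing one mapping to the common model rather than one per existing system, which is why integration effort grows linearly instead of combinatorially.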

Smart Data Analytics—System of Record
The components of data quality and governance facilitated by deploying semantic models as the basis for integration efforts also extend to analytics. Since the underlying smart data models are able to be queried, organizations can readily determine provenance and audit data through all aspects of integration—from source systems to their impact on analytics results. “Because you’ve now modeled your data and captured the mapping in a semantic approach, that model is queryable,” Loughlin said. “We can go in and ask the model where data came from, what it means, and what conversion happened to that data.” Smart data modeling provides a system of record that is superior to many others because of the nature of analytics involved. As Loughlin explained, “You’re bringing the data together from various sources, combining it together in a database using the domain model the way you described your data, and then doing analytics on that combined data set.”
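Because the mapping itself is captured as data, lineage questions become lookups rather than archaeology. A minimal sketch of that idea, with invented system and field names (real semantic platforms would answer this with a SPARQL query over the model):

```python
# Lineage captured as data: each target field records its source and the
# transformation applied to it. System and field names are illustrative.
lineage = {
    "CUSTOMER_NAME": {"source": "crm.cust_nm", "transform": "trim+titlecase"},
    "ACCOUNT_ID":    {"source": "core.acct_no", "transform": "none"},
}

def provenance(field):
    """Answer: where did this field come from, and what happened to it?"""
    rec = lineage[field]
    return f"{field} <- {rec['source']} via {rec['transform']}"

print(provenance("CUSTOMER_NAME"))
```

This is the regulatory-compliance angle mentioned earlier: an auditor’s “where did this number come from?” is answered by querying the model, not by re-reading ETL scripts.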

Smart Data Graphs
By leveraging these models on a semantic graph, users are able to reap a host of analytics benefits that they otherwise couldn’t, because such graphs are focused on the relationships between nodes. “You can take two entities in your domain and say, ‘find me all the relationships between these two entities’,” Loughlin commented about solutions that leverage smart data modeling in RDF graph environments. Consequently, users are able to determine relationships that they did not know existed. Furthermore, they can ask more questions based on those relationships than they otherwise would be able to ask. The result is richer analytics based on the overarching context of those relationships, which is largely attributable to the underlying smart data models. The nature and number of questions asked, as well as the sources incorporated for such queries, are virtually limitless. “Semantic graph databases, from day one have been concerned with ontologies…descriptions of schema so you can link data together,” explained Franz CEO Jans Aasman. “You have descriptions of the object and also metadata about every property and attribute on the object.”
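Loughlin’s “find me all the relationships between these two entities” is, mechanically, a path search over the graph. In SPARQL this would be a property-path query; the sketch below does the same thing with a breadth-first search over a toy edge list, with invented entities:

```python
from collections import deque

# "Find the relationships between these two entities": breadth-first search
# over an edge list standing in for an RDF graph. All data is illustrative.
edges = [
    ("Alice", "works_at", "AcmeBank"),
    ("AcmeBank", "audited_by", "BigFour"),
    ("Bob", "works_at", "BigFour"),
]

def shortest_path(start, goal, edges):
    """Return the chain of (entity, predicate, entity) links from start to goal."""
    adj = {}
    for s, p, o in edges:          # treat edges as undirected for discovery
        adj.setdefault(s, []).append((p, o))
        adj.setdefault(o, []).append((p, s))
    queue, seen = deque([(start, [])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for p, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, p, nxt)]))
    return None   # no connection found

print(shortest_path("Alice", "Bob", edges))
```

The returned chain (Alice works at a bank that is audited by Bob’s employer) is exactly the kind of previously unknown relationship the text describes surfacing.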

Modeling Models
When one considers the different facets of modeling that smart data modeling includes—business models, logical models, conceptual models, and many others—it becomes apparent that the true utility in this approach is an intrinsic modeling flexibility upon which other approaches simply can’t improve. “What we’re actually doing is using a model to capture models,” Cambridge Semantics Chief Technology Officer Sean Martin observed. “Anyone who has some form of a model, it’s probably pretty easy for us to capture it and incorporate it into ours.” The standards-based approach of smart data modeling provides the sort of uniform consistency required at an enterprise level, which functions as means to make data integration, data governance, data quality metrics, and analytics inherently smarter.

Source by jelaniharper

Jul 06, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)



[ AnalyticsWeek BYTES]

>> The First and Only – Big Data Search Engine Powered by Apache® Spark™ by analyticsweekpick

>> New Mob4Hire Report “The Impact of Mobile User Experience on Network Operator Customer Loyalty” Ranks Performance Of Global Wireless Industry by bobehayes

>> How big data can improve manufacturing by anum



Pattern Discovery in Data Mining


Learn the general concepts of data mining along with basic methodologies and applications. Then dive into one subfield in data mining: pattern discovery. Learn in-depth concepts, methods, and applications of pattern disc… more


Superintelligence: Paths, Dangers, Strategies


The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but … more


Save yourself from a zombie apocalypse of unscalable models
One living, breathing zombie in today’s analytical models is the absence of error bars. Not every model is scalable or holds up as data grows. The error bars attached to almost every model should be duly calibrated: as business models rake in more data, error bars keep them sensible and in check. If error bars are not accounted for, our models become susceptible to failures—a Halloween we never want to see.
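One practical way to attach error bars to any model metric is bootstrap resampling. This is a generic sketch on synthetic data, not a prescription for any particular model:

```python
import random
import statistics

# Attach error bars to a metric by percentile bootstrap: resample the data
# with replacement many times and take quantiles of the recomputed statistic.
random.seed(42)
data = [random.gauss(100, 15) for _ in range(500)]   # synthetic metric values

def bootstrap_ci(sample, stat=statistics.mean, n_boot=1000, alpha=0.05):
    """Percentile-bootstrap confidence interval for a statistic."""
    boots = sorted(
        stat(random.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    lo = boots[int(n_boot * alpha / 2)]
    hi = boots[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

lo, hi = bootstrap_ci(data)
print(f"mean = {statistics.mean(data):.1f}, 95% CI = ({lo:.1f}, {hi:.1f})")
```

Recomputing the interval as new data arrives is what keeps the model honest: a point estimate whose error bars balloon is a model telling you it no longer scales.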


Q: Do you know / have you used data reduction techniques other than PCA? What do you think of step-wise regression? What kind of step-wise techniques are you familiar with?
A: Data reduction techniques other than PCA:
Partial least squares: like PCR (principal component regression), but chooses the principal components in a supervised way, giving higher weights to variables that are most strongly related to the response.

Step-wise regression:
– the choice of predictive variables is carried out using a systematic procedure
– usually takes the form of a sequence of F-tests, t-tests, adjusted R-squared, AIC, or BIC comparisons
– at any given step, the model is fit using unconstrained least squares
– can get stuck in local optima
– a better alternative: the Lasso

Step-wise techniques:
– Forward selection: begin with no variables, adding them when they improve a chosen model-comparison criterion
– Backward selection: begin with all the variables, removing them when doing so improves a chosen model-comparison criterion

When full data is better than reduced data:
Example 1: if all the components have high variance, which components can be discarded with a guarantee of no significant loss of information?
Example 2 (classification):
– there are 2 classes; the within-class variance is very high compared to the between-class variance
– PCA might discard the very information that separates the two classes

When reduction is better than a sample:
– when the number of variables is high relative to the number of observations


#BigData @AnalyticsWeek #FutureOfData #Podcast with  John Young, @Epsilonmktg




Getting information off the Internet is like taking a drink from a firehose. – Mitchell Kapor




According to Twitter’s own research in early 2012, it sees roughly 175 million tweets every day, and has more than 465 million accounts.

Sourced from: Analytics.CLUB #WEB Newsletter

Social Media Haiku: Everything you Need to Know about Social Media

I don’t have a lot to say today. Like most Americans, I am honoring Labor Day, a day that represents the labor movement,  by not working. Besides, I found it very difficult to blog with my hands focused on shoving food into my pie/hotdog/beer hole.

To leave you with something, anything, I thought I would, at least, share what I have learned so far about social media. To keep it simple, I have summarized my findings in Haiku form. My deep insights are based on my personal experience on such social media platforms as Facebook, Twitter and LinkedIn… and I’m confident that my conclusions apply to most social media platforms.


Haiku on Social Media

Noise noise noise noise noise
Noise noise noise signal noise noise
Noise noise noise noise noise


This blog post is an example of noise.

Source by bobehayes

Jun 29, 17: #AnalyticsClub #Newsletter (Events, Tips, News & more..)



[ AnalyticsWeek BYTES]

>> Unraveling the Mystery of Big Data by v1shal

>> How could Watson and Big data help pick a better US president by v1shal

>> Is Big Data The Most Hyped Technology Ever? by bobehayes



>> Hortonworks and IBM double down on Hadoop – CIO New Zealand (Under: Hadoop)

>> VisualVault Release New Business Analytics Functionality – Read IT Quik (Under: Business Analytics)

>> Intel Corporation’s Internet of Things Business Delivers Strong … – Motley Fool (Under: Internet Of Things)



Learning from data: Machine learning course


This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applicati… more


Introduction to Graph Theory (Dover Books on Mathematics)


A stimulating excursion into pure mathematics aimed at “the mathematically traumatized,” but great fun for mathematical hobbyists and serious mathematicians as well. Requiring only high school algebra as mathematical bac… more


Data aids, not replaces, judgement
Data is a tool and a means to help build consensus and facilitate human decision-making, not replace it. Analysis converts data into information; information, via context, leads to insight. Insights lead to decisions, which ultimately lead to outcomes that bring value. So data is just the start; context and intuition also play a role.


Q: What is a POC (proof of concept)?
A: * A realization of a certain method to demonstrate its feasibility
* In engineering: a rough prototype of a new idea is often constructed as a proof of concept



Advanced #Analytics in #Hadoop




I’m sure the highest-capacity storage device will not be enough to record all our stories; because every time with you is very valuable da


#FutureOfData Podcast: Peter Morgan, CEO, Deep Learning Partnership





29 percent report that their marketing departments have ‘too little or no customer/consumer data.’ When data is collected by marketers, it is often not suited to real-time decision making.


Could Big Data Be the New Gender Equality Tool?

Even in the age of big data, some numbers pertaining to women are glaringly missing: numbers on global maternal mortality rates are incomplete, statistics regarding women and unpaid work are flawed and conflict-related gender-based violence figures are lacking.

To begin tackling this problem from the bottom up, Data2X—a joint project of the Clinton Foundation, the United Nations Foundation, the William and Flora Hewlett Foundation and the Bill & Melinda Gates Foundation that was first introduced in 2012—announced new regional and topical partnerships at a press event in New York City Monday. By partnering with a multitude of organizations, the Data2X platform says it hopes to start a “gender data revolution,” which will allow policymakers to recognize problems more clearly and better create informed policy.

“I have been championing the rights of women and girls around the world and here at home for many years,” former Secretary of State Hillary Clinton said at the event, “and I got tired of seeing…foreign leaders, business executives, even senior officials in our own government…smile and nod when I raised these issues… ‘Oh right, I knew she was going to raise women and girls, I will just sit here and smile, it will pass, and then we’ll talk about really important things.’”

Clinton said this scenario played out countless times over the years and inspired her involvement in the Data2X project. “You can’t understand what the problem is if you don’t have a good grasp of what the facts and figures are,” she said.

The new partnerships will focus on six categories of data:

Civil Registration and Vital Statistics (CRVS)

Civil registration, which is the continuous recording of vital life events like birth, marriage and death, is incomplete around the world. To address missing records, Data2X is teaming up with organizations in Africa and Asia, like the United Nations Economic Commission for Africa and U.N. Economic and Social Commission for Asia and the Pacific.

Though the births of girls and boys are almost equally registered globally, ensuring women an individual legal identity by maintaining a quality CRVS system could provide a better account of early and forced marriages and help women retain their share of assets in the event of a divorce.

Women’s Work and Employment

In partnership with the International Labour Organization, Data2X is redefining what is considered work to include unpaid work, including work for the home. Expanding the work framework will give the global community a better idea of how women contribute to the economy.

For example, Clinton said that data collection and analysis in India showed that women spend an average of six hours per day doing unpaid labor. If these women were to participate in the formal workforce at the same rate as men, she contended, India’s gross domestic product would increase by $1.7 trillion.

Supply Side Data on Financial Services

Women face both financial service access and service gaps globally—they have trouble proving their business case and struggle to get the resources they need. But data pertaining to these experiences with financial services is not kept. “If you can’t measure it, you can’t manage it,” former New York City mayor Michael Bloomberg told the audience.

Now, the Global Banking Alliance for Women and the Inter-American Development Bank are teaming up to incentivize collection of this data as a first step to close the gender gap in financial services. Proposals for how to support this venture will be presented at the Global Data Symposium in September 2015.

Women’s Subjective Well-Being and Poverty

The lack of gender-segregated data has clumped women’s poverty with household poverty, meaning the exact poverty numbers for women and girls is unknown. Data2X is joining with the Government of Mexico’s National Institute of Statistics and Geography (INEGI), which already has a comprehensive approach to discovering gender differences in well-being. One of INEGI’s tactics to understanding women’s subjective well-being is analyzing Twitter feeds to determine who and where posts with positive sentiments are coming from.

Big Data and Gender

Data2X is collaborating with U.N. Global Pulse, U.N. Women and academic researchers to analyze cell phone data usage patterns to infer women’s socioeconomic welfare, mobility patterns and financial activity. The project also plans to use remote sensors to reveal epidemiological trends and provide information on women’s access to services.

Improved Gender Data on U.S. Foreign Assistance Programs

Not only does Data2X want more complete data to be included when setting the global agenda, but the project aims to inform U.S. development policy and investment. In conjunction with the U.S. President’s Emergency Plan for AIDS Relief and Millennium Challenge Corporation, sex and age specific data will be released to ensure those most in need of aid are reached and to assess the impact of current U.S. assistance programs.

Data collection, Clinton said, could “build a case strong enough to convince the skeptics, based on hard data and clear-eyed analysis, that creating opportunities for women and girls across the globe directly supports everyone’s security and prosperity, and therefore should be an enduring part of our diplomacy and development work.”

Lauren Walker

Originally posted via “Could Big Data Be the New Gender Equality Tool?”


Fortune 100 CEOs And Their Path To Success


Ever cared to know what it takes to be a Fortune 100 CEO? What it took, and how their career paths progressed? If so, thank N2Growth for an amazing infographic. It sheds some light on how these CEOs grew up, what differentiates them, and what places them in similar silos among themselves. Take a peek at the infographic to see what it would require to become a CEO, and how you could set your own path toward such progression. Surely a good place to start if you are not already running one.

Fortune 100 CEOs | Demographics, Education, and Career Path



Originally Posted at: Fortune 100 CEOs And Their Path To Success by v1shal

Map of US Hospitals and their Patient Experience Ratings

Hospitals are focusing on improving the patient experience. The Centers for Medicare & Medicaid Services (CMS) will be using patient feedback about their care as part of their reimbursement plan for Acute Care Hospitals. Under the Hospital Value-Based Purchasing Program (beginning in FY 2013 for discharges occurring on or after October 1, 2012), CMS will make value-based incentive payments to acute care hospitals, based either on how well the hospitals perform on certain quality measures or how much the hospitals’ performance improves on certain quality measures from their performance during a baseline period. The higher the score/greater the improvement, the higher the hospital’s incentive payment for that fiscal year.

Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS)

Patient feedback is being collected using a survey known as HCAHPS (Hospital Consumer Assessment of Healthcare Providers and Systems). HCAHPS (pronounced “H-caps”) is a national, standardized survey of hospital patients and was developed by a partnership of public and private organizations. I recently wrote about HCAHPS in a prior post. The survey asks a random sample of recently discharged patients about important aspects of their hospital experience. The data set includes patient survey results for over 3800 US hospitals on ten measures of patients’ perspectives of care.

The site indicates that the data files were updated in 5/30/2012. Based on HCAHPS reporting schedule, it appears the current survey data were collected from Q3 2010 through Q2 2011 and represent the latest publicly available patient survey data.

Map of US Hospitals and their Patient Experience Ratings

As consumers of healthcare, you need to understand how well hospitals are delivering a good patient experience. Using the HCAHPS data, I developed a map to help you easily identify and understand how your hospital ranks in patient experience. In the map below (see Figure 1), the colors are based on the patient advocacy index I created (the average of top box scores for two questions: Overall hospital quality rating and Recommend hospital). In my prior analysis, this Patient Advocacy Index (PAI) had a reliability estimate (Cronbach’s alpha) of .95, suggesting that it is a reliable measure.

The colors for each hospital are based on their PAI (red = 0 – 20; purple = 21-40; yellow = 41-60; blue = 61-80; green = 81-100). If you click on one of the buttons, you will see detailed information about the patient experience metrics (if survey data were collected for that hospital) as well as response rates, sample sizes and other notes (if available). NOTE: Some hospitals do not have any ratings (those are typically red).
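For readers curious about the reliability estimate cited above, Cronbach’s alpha for a two-item index like the PAI is straightforward to compute. The scores below are made up for illustration; they are not HCAHPS data:

```python
import statistics

# Cronbach's alpha for a multi-item index: k/(k-1) * (1 - sum of item
# variances / variance of the total score). Scores here are hypothetical.
def cronbach_alpha(items):
    """items: one list of scores per survey question, same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(statistics.pvariance(i) for i in items)
    return (k / (k - 1)) * (1 - item_var / statistics.pvariance(totals))

overall   = [80, 72, 91, 65, 88, 70]   # hypothetical top-box scores
recommend = [78, 70, 93, 60, 85, 74]
print(round(cronbach_alpha([overall, recommend]), 2))
```

Values near 1 indicate the two questions move together across hospitals, which is what justifies averaging them into a single index.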

Figure 1. Map of US Hospitals and their Patient Experience Ratings

Originally Posted at: Map of US Hospitals and their Patient Experience Ratings by bobehayes

Big data offers telcos new revenue streams

Telcos need to find ways to process and analyse large amounts of customer data quickly and in real-time in order to create new revenue streams and engage with customers effectively.

So said Rams Srinivasan, telco industry value engineer at SAP Middle East and North Africa, speaking yesterday at the SAP Forum in Sandton.

According to Srinivasan, telecommunications companies are struggling to break through and engage with their customers because of legacy systems that make it difficult to have a complete view of customers and deliver efficient service across multiple channels.

He explained that telcos must adapt in order to attract and retain a new generation of customers who are highly informed, socially connected and very mobile.

Telcos can track conversations on social media to understand what customers are saying about their products and services, and take proactive measures to defend their brand image and reputation in real-time using analytics, said Srinivasan.

Operators are looking for new ways to increase revenues and profits – but few have shown the know-how needed to make the most of new digital channels, said Srinivasan.

According to a Frost & Sullivan report, with an increasingly saturated market, and declining traditional mobile voice and SMS revenues, operators are exploring new business models as part of their growth strategy.

Also, the declining cost of devices is driving the ecosystem to explore new possibilities through connected devices, adds Frost & Sullivan.

By analysing varied and unformatted digital data from digital channels like social media and through mobile phones, telcos can reveal new sources of business economic value and provide fresh insights into customer behaviour, said Srinivasan.

He pointed out operators need a strategy to accurately mine and analyse both structured and unstructured data. This will give them an opportunity to get deeper insights into customer behaviour, their service usage patterns and preferences in real-time.

“To be able to remain relevant and competitive in a world where technology giants like Google, Facebook and Apple are causing disruption, telcos need to engage with the customer in a better way by improving operational efficiency through data analytics.”

Today’s lifestyle demands continuous connectivity and excellence of service, leading to a highly competitive and diverse market, said Srinivasan. There are more digital connected devices than human beings and the numbers are growing.

Another new source of revenue for telcos is selling insights about customers to third parties, said Sherif Hamoudah, industry head, telecom, at SAP Africa, Middle East and Pakistan.

By leveraging the data stream, telecommunications companies can tailor their marketing campaigns to individual customers using location-based and social networking technologies, he said. Also, big data offers event-based marketing campaigns that use geolocation and social media, allowing differentiated response.

Big data offers telecom operators a real opportunity to gain a much more complete picture of their operations and customers, and to further their innovation efforts, he said.

Originally posted via “Big data offers telcos new revenue streams”
