The deployment of deep learning is frequently accompanied by a singular paradox that has traditionally proved difficult to resolve. Its evolving algorithms are intelligent enough to solve business problems, but using those algorithms depends on data science particulars that business users don't necessarily understand.
The paucity of data scientists exacerbates this situation, which traditionally results in one of two outcomes. Either deep learning is limited in the number of use cases for which it's deployed throughout the enterprise, or its effectiveness is compromised. Both of these situations fail to realize the full potential of deep learning or data science.
According to Mitesh Shah, MapR Senior Technologist, Industry Solutions: "The promise of AI is about injecting intelligence into operations so you are actively making customer engagement more intelligent." Doing so productively requires business users to be involved with these technologies.
In response to this realization, a number of solutions have arisen to provide self-service data science, so that lay business users can create deep learning models, monitor and adjust them accordingly, and even explain their results while solving some of their more intractable domain problems.
Most convincingly, there is a plethora of use cases in which deep learning delivers these benefits for "folks who are not data scientists by education or training, but work with data throughout their day and want to extract more value from data," noted indico CEO Tom Wilde.
Labeled Training Data
The training data required for building deep learning's predictive models pose two major difficulties for data science. They require labeled output data and massive data quantities to suitably train models for useful levels of accuracy. Typically, the first of these issues was addressed when "the data scientists would say to the subject matter experts or the business line, give us example data labeled in a way you hope the outcome will be predicted," Wilde maintained. "And the SME [would] say I don't know what you mean; what are you even asking for?" Labeled output data is necessary for models to use as targets or goals for their predictions. Today, self-service platforms for AI make this data science requisite easy by enabling users to leverage intuitive means of labeling training data for this very purpose. With simple browser-based interfaces "you can use something you're familiar with, like Microsoft Word or Google Docs," Wilde said. "The training example pops up in your screen, you underline a few sentences, and you click on a tag that represents the classification you're trying to do with that clause."
For instance, when ensuring contracts are compliant with the General Data Protection Regulation, users can highlight clauses for personally identifiable data with examples that both adhere to, and fail to adhere to, this regulation. "You do about a few dozen of each of those, and once you've done it you've built your model," Wilde mentioned. The efficiency of this process is indicative of the effect of directly involving business users with AI. According to Shah, such involvement makes "production more efficient to reduce costs. This requires not only AI but the surrounding data logistics and availability to enable this...in a time-frame that enables the business impact."
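The highlight-and-tag workflow described above amounts to converting a selected span of text plus a clicked tag into one labeled training example. A minimal sketch of that conversion follows; the `LabeledExample` type, the `label_highlight` helper, the sample contract text, and the tag names are all hypothetical illustrations, not the interface of any platform mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    """One training example: the highlighted text and the tag the user clicked."""
    text: str
    label: str


def label_highlight(document: str, start: int, end: int, label: str) -> LabeledExample:
    """Turn a highlighted character span and a clicked tag into a training example."""
    if not 0 <= start < end <= len(document):
        raise ValueError("highlight is outside the document")
    return LabeledExample(text=document[start:end].strip(), label=label)


# Hypothetical contract text with one clause of each kind.
contract = (
    "The processor shall delete personal data on request. "
    "The vendor may retain customer records indefinitely."
)

# Simulate two highlights: the first sentence and the second sentence.
first_end = contract.index(". ") + 1
examples = [
    label_highlight(contract, 0, first_end, "compliant"),
    label_highlight(contract, first_end, len(contract), "non_compliant"),
]

for example in examples:
    print(f"{example.label}: {example.text}")
```

Repeating this for a few dozen clauses of each kind would yield the labeled output data that the model uses as its prediction targets.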
Feature Engineering and Transfer Learning
In the foregoing GDPR example, users labeled output training data to build what Wilde referred to as a "customized model" for their particular use case. They are only able to do so this quickly, however, by leveraging a general model and the power of transfer learning to focus the former's relevant attributes for the business user's task, which ultimately affects the model's feature detection and accuracy. As previously indicated, a common data science problem for advanced machine learning is the inordinate amount of training data required. Wilde commented that a large part of this data is required for "featurization: that's generally why with deep learning you need so much training data, because until you get to this critical mass of featurization, it doesn't perform very robustly." However, users can build accurate custom models with only small amounts of training data because of transfer learning. Certain solutions facilitate this process with "a massive generalized model with half a billion labeled records in it, which in turn created hundreds and hundreds of millions of features and vectors that basically creates a vectorization of language," Wilde remarked. Even better, such generalized models are constructed "across hundreds of domains, hundreds of verticals, and hundreds of use cases," Wilde said, which is why they are readily applicable to the custom models of self-service business needs via transfer learning. This approach allows the business to quickly implement process automation for use cases with unstructured data such as reviewing contracts, dealing with customer support tickets, or evaluating resumes.
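The division of labor in transfer learning can be illustrated with a toy sketch: a frozen, general-purpose featurizer is reused as-is, and only a small task-specific classifier is fitted on a handful of labeled examples. Everything below is an assumption made for illustration. The `PRETRAINED_VECTORS` table stands in for a large general model (its values are invented), and the nearest-centroid classifier is a deliberately simple substitute for whatever custom model a real platform would train.

```python
from collections import defaultdict

# Stand-in for a large pretrained language model: a frozen word -> vector table.
# The numbers are made up; a real general model would supply these features.
PRETRAINED_VECTORS = {
    "delete": (1.0, 0.0),
    "erase": (0.9, 0.1),
    "consent": (0.8, 0.2),
    "retain": (0.0, 1.0),
    "indefinitely": (0.1, 0.9),
    "sell": (0.2, 0.8),
}


def embed(text):
    """Average the frozen vectors of the known words; this part is never retrained."""
    vectors = [PRETRAINED_VECTORS[w] for w in text.lower().split() if w in PRETRAINED_VECTORS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def fit_centroids(examples):
    """Fit the small custom model: one mean feature vector per label."""
    grouped = defaultdict(list)
    for text, label in examples:
        grouped[label].append(embed(text))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(centroids, text):
    """Assign the label whose centroid is closest to the text's features."""
    features = embed(text)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Only four labeled clauses are needed because the features already exist.
training = [
    ("we delete data on request", "compliant"),
    ("users must consent first", "compliant"),
    ("we retain records indefinitely", "non_compliant"),
    ("we may sell data", "non_compliant"),
]
centroids = fit_centroids(training)

# "erase" never appears in the training clauses, but the frozen features
# place it near "delete", so the custom model can still classify it.
print(predict(centroids, "we erase records promptly"))
```

The point of the sketch is the split: the expensive featurization is inherited from the general model, so the custom step needs far less labeled data than training from scratch would.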
Another common data science issue circumscribing deep learning deployments is the notion of explainability, which can even hinder the aforementioned process automation use cases. As Shah observed, "AI automates tasks that normally require human intelligence, but does not remove the need for humans entirely. Business users in particular are still an integral part of the AI revolution." This statement applies to explainability in particular, since it's critical for people to understand and explain the results of deep learning models in order to gauge their effectiveness. The concept of explainability alludes to the fact that most machine learning models simply generate a numerical output, usually a score, indicative of how likely specific input data is to achieve the model's desired output. With deep learning models in particular, those scores can be confounding because deep learning often does its own feature detection. Thus, it's difficult for users to understand how models create their particular scores for specific data.
Self-service AI options, however, address this dilemma in two ways. Firstly, they incorporate interactive dashboards so users can monitor the performance of their models with numerical data. Additionally, by clicking on various metrics reflected on the dashboard "it opens up the examples used to make that prediction," Wilde explained. "So, you actually can track back and see what precisely was used as the training data for that particular prediction. So now you've opened up the black box and get to see what's inside the black box [and] what it's relying on to make your prediction, not just the number."
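One simple way to make a prediction traceable in the sense Wilde describes is to return the supporting training examples together with the score. The sketch below does this with a nearest-neighbor lookup over word overlap (Jaccard similarity). It is an assumption chosen for illustration, not a description of how any particular dashboard works, and the `explain` helper and the sample clauses are hypothetical.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Share of words two token sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def explain(training_examples, text, top_k=2):
    """Return the predicted label, its score, and the training examples behind it."""
    query = tokens(text)
    ranked = sorted(
        training_examples,
        key=lambda example: jaccard(query, tokens(example[0])),
        reverse=True,
    )
    evidence = ranked[:top_k]

    # The score is the share of the nearest examples that carry the winning label.
    votes = {}
    for _, label in evidence:
        votes[label] = votes.get(label, 0) + 1
    label = max(votes, key=votes.get)
    return label, votes[label] / len(evidence), evidence


# Hypothetical labeled clauses.
TRAINING = [
    ("we delete personal data on request", "compliant"),
    ("personal data is deleted after thirty days", "compliant"),
    ("we retain personal data indefinitely", "non_compliant"),
    ("customer records may be sold to partners", "non_compliant"),
]

label, score, evidence = explain(TRAINING, "we delete personal data after thirty days")
print(label, score)
for clause, clause_label in evidence:
    print(f"  relied on: {clause!r} ({clause_label})")
```

Showing the evidence list next to the number lets a business user check whether the model relied on sensible examples, rather than having to trust the score alone.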
Business Accessible Data Science
Explainability, feature engineering, transfer learning, and labeled output data are crucial data science prerequisites for deploying deep learning. The fact that there are contemporary options for business users to facilitate all of these intricacies suggests how essential the acceptance, and possibly even mastery, of this technology is for the enterprise today. It's no longer sufficient for a few scarce data scientists to leverage deep learning; its greater virtue is in its democratization for all users, both technical and business ones. This trend is reinforced by training designed to educate users, business and otherwise, about fundamental aspects of analytics. "The MapR Academy on-demand Essentials category offers use case-driven, short, non-lab courses that provide technical topic introductions as well as business context," Shah added. "These courses are intended to provide insight for a wide variety of learners, and to function as stepping off points to further reading and exploration."
Ideally, options for self-service data science targeting business users could actually bridge the divide between the technically proficient and those who are less so. "There are two types of people in the market right now," Wilde said. "You have one persona that is very familiar with AI, deep learning and machine learning, and has a very technical understanding of how do we attack this problem. But then there's another set of folks for whom their first thought is not how does AI work; their first thought is I have a business problem, how can I solve it?"
Increasingly, the answers to those inquiries will involve self-service data science.
Source by jelaniharper