Artificial intelligence (AI) is powering more and more of the services and devices we use every day, such as personal voice assistants, movie recommendation services, and driving assistance systems. And while AI has become far more sophisticated, we all know those moments when we wonder: “Why did I get this weird recommendation?” or “Why did the assistant do that?” Often, after a restart and some trial and error, we get our AI systems back on track, but we never completely and blindly trust our AI-powered future.
One of the reasons for this distrust is that most current AI systems operate as a ‘black box’, with limited interaction capabilities, little understanding of human context, and few explanations. These limitations have inspired the call for a new phase of AI, one that creates a more collaborative partnership between humans and machines. Dubbed Contextual AI, this new technology is already attracting multi-billion dollar investments. Contextual AI is technology that is embedded in and understands human context and is capable of interacting with humans. In this article, I’ll explore how Contextual AI works, how it compares to previous phases of AI, the challenges we need to overcome, and the progress we’re making at Adobe.
Contextual Artificial Intelligence: The building blocks of a successful relationship between humans and AI
Contextual AI does not refer to a specific algorithm or machine learning method – instead, it takes a human-centric view of and approach to AI. At its core is a set of requirements that enable a symbiotic relationship between AI and humans: Contextual AI needs to be intelligible, adaptive, customizable and controllable, and context-aware. Here’s what that looks like in the real world:
Intelligibility in AI refers to the requirement that a system needs to be able to explain itself, to represent to its users what it knows, how it knows it, and what it is doing about it. Intelligibility is required for trust in AI systems.
Adaptivity refers to the ability of an AI system that was trained or conceived for a specific situation or environment to adapt well enough that it can run in a different situation or environment and still meet the user’s expectations. For example, a smart home assistant that controls my house knows my preferences, but will it be able to translate those to my mom’s home when I visit?
An AI system must also be adaptable and customizable by the user, and the user must be able to gain and maintain control over all functions of the system. Obviously, this goes hand in hand with intelligibility, because the user needs to understand the basis of the system’s decisions.
Finally, context-awareness is a core requirement referring to the capacity of the system to ‘see’ at the same level as a human does, i.e. that it has sufficient perception of the user’s environment, situation, and context to reason properly. A smart home assistant cannot learn my behavior and preferences, and control my house with only one camera on the front porch as input.
While true Contextual AI doesn’t exist yet, we are getting closer to it. Self-driving cars are a good example: they are a first attempt to understand more of the human context (in this case the road, the state of passengers, or dangerous situations). However, their current understanding is still very limited and narrow. In the 1980s TV series Knight Rider, by contrast, the car (KITT) demonstrated the principles of true Contextual AI: it was able to interact seamlessly with the driver, understand everything that was going on (and even beyond), and help in dangerous situations. Obviously, it was far-fetched and fictional, but the essence is that Contextual AI needs a deeper understanding of a human’s situation and must be able to interact and explain itself.
What differentiates Contextual AI from previous phases of AI?
Contextual AI addresses many of the shortcomings of previous AI developments or phases. Historically, AI started with handcrafted knowledge. This rule-based AI had no learning capability and was mostly designed by engineers. Think of chess computers (remember when Deep Blue beat Garry Kasparov?) or expert systems, which had their first successful applications from the 1980s to the early 2000s. However, because a machine does not perceive the world the way a human does, this approach fell short whenever the rules could not be clearly specified, particularly for raw sensor input such as audio and video.
Statistical learning, particularly deep learning, addressed some of these shortcomings by inferring statistical patterns (that a human might not see or know) from very large datasets and raw signals. This led to the recent success of AI in image recognition, voice, conversational interfaces and many more applications. However, large-scale statistical training has downsides as well. For one, statistical models such as deep learning models can be easily attacked or confused. Adversarial examples can be generated and tuned to fool even a production-grade machine learning system: minor changes to the pixels in an input image, barely visible to the human eye, can yield very different recognition results. You can even generate your own adversarial examples to fool an algorithm. Additionally, as most AI approaches rely on large-scale data, unconscious bias can creep into AI algorithms based on the (positive and negative) examples with which they’ve been trained.
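To make the adversarial-example point concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one common way such perturbations are generated. The pretrained ResNet-18 model, the epsilon value, and the random stand-in image are illustrative assumptions for this sketch, not a description of any production system.

```python
# Minimal FGSM (fast gradient sign method) sketch in PyTorch: a tiny, nearly
# invisible perturbation of the input pixels can change a classifier's prediction.
# The model, epsilon, and stand-in image below are illustrative assumptions.
import torch
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_attack(image, label, epsilon=0.01):
    """Return an adversarially perturbed copy of `image` (shape [1, 3, 224, 224])."""
    image = image.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(image), label)
    loss.backward()
    # Nudge every pixel slightly in the direction that increases the loss.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()

# Usage: a random stand-in image; in practice this would be a real photo.
x = torch.rand(1, 3, 224, 224)
y_clean = model(x).argmax(dim=1)        # prediction on the unmodified image
x_adv = fgsm_attack(x, y_clean)
y_adv = model(x_adv).argmax(dim=1)      # often differs from y_clean
```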
While the hype around AI is still powered by statistical learning, leading researchers have started questioning the real “intelligence” of the industry’s current AI approaches. While statistical algorithms helped with the context-awareness and adaptivity that is needed for a Contextual AI system, they do fall short on the requirements for humans to understand what is going on, and to customize and control it. A ‘black box’ algorithm cannot be trusted in critical situations. It’s unclear what structures the statistical AI algorithms really learn and whether the algorithms just separate data examples or have a true understanding of the content.
As an AI architect at Adobe, I’m working on initiatives that will bring Contextual AI to our customer experiences. Here are some of the things we’re working on:
Innovating AI with Adobe Sensei
One of the focus areas for Adobe Sensei, our AI and machine learning technology, is Creative Intelligence, defined as the augmentation of creators’ skills and capabilities using AI. Here, the creative human will interact and form a team with the AI, which needs to have a deep understanding of the creative’s intent, background, behavior and needs, and even be able to explain to the human what it does and why. Creative Intelligence is the application of Contextual AI to the creative world.
As stated above, intelligibility and explanation are important aspects of Creative Intelligence as well, which means the AI needs to be able to represent and explain (in layman’s terms) what it has learned. Technically, it needs to rely much more on knowledge representations and ontologies that capture what is learned. Here are some examples of projects in development at Adobe:
1. Deep learning content understanding
Adobe Sensei’s deep learning technology for content understanding goes beyond just image tagging, and instead aligns with how a human would perceive an image. Looking at the example below, simple image tagging would just recognize three faces, the ocean and the beach in this image. However, a richer taxonomy and representation enables Adobe Sensei to capture human-level concepts such as “Entertainment” and “Family Life” that aren’t as explicit.
Image courtesy of Samarth Gulati.
Emotions such as “Happiness” that used to be entirely in the human realm are now partially decoded by AI algorithms. This makes the retrieval of specific images much more understandable and customizable for the human. It also enables a richer, more contextual customer experience around image content and search on Adobe Stock, the company’s collection of millions of royalty-free images. As a result, image search yields stronger results in less time.
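As a rough illustration of the idea (and emphatically not Adobe Sensei’s actual representation), the sketch below lifts low-level image tags to human-level concepts through a small hand-written taxonomy; the tag list and concept map are made-up assumptions.

```python
# A minimal sketch of lifting low-level image tags to human-level concepts via a
# hand-written taxonomy. The tags and the concept map are illustrative assumptions;
# a real knowledge representation/ontology would be far richer.
from typing import Dict, List, Set

# Hypothetical mapping from human-level concepts to low-level tags that evidence them.
CONCEPT_TAXONOMY: Dict[str, Set[str]] = {
    "Family Life":   {"child", "parent", "beach", "holding hands"},
    "Entertainment": {"beach", "ocean", "ball", "playing"},
    "Happiness":     {"smile", "laughing", "celebration"},
}

def infer_concepts(detected_tags: List[str], min_overlap: int = 2) -> List[str]:
    """Return human-level concepts supported by at least `min_overlap` detected tags."""
    tags = set(detected_tags)
    return [concept for concept, evidence in CONCEPT_TAXONOMY.items()
            if len(tags & evidence) >= min_overlap]

# Usage with tags a simple tagger might emit for the beach photo described above.
print(infer_concepts(["child", "parent", "beach", "ocean", "smile", "playing"]))
# -> ['Family Life', 'Entertainment']
```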
2. Image search using voice commands
Another project in development at Adobe goes beyond faceted search and illustrates the natural-language refinement of image search using voice commands. In the example below, leveraging images from Adobe Stock, the user casually interacts with the “search algorithm,” contextually adding and removing search criteria as well as referring to broad human-level concepts such as “authenticity” and “diversity.”
Image courtesy of Brett Butterfield.
The voice assistant tracks the current state of the search and allows various human-understandable refinements, including backtracking through search results. Adobe Sensei understands the context – specifically, what the user refers to and might be looking for – and evolves the search accordingly.
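To sketch what tracking the state of a conversational search might look like under the hood, here is a minimal, hypothetical search-state object that supports spoken add/remove refinements and backtracking; it is an assumption for illustration, not the Adobe Sensei implementation.

```python
# A minimal, hypothetical sketch of conversational search state: criteria can be
# added or removed as the user speaks, and earlier states can be restored by
# backtracking. This is an illustration, not Adobe Sensei's implementation.
from typing import List, Set

class ConversationalSearch:
    def __init__(self) -> None:
        self.criteria: Set[str] = set()
        self.history: List[Set[str]] = []   # snapshots used for backtracking

    def _snapshot(self) -> None:
        self.history.append(set(self.criteria))

    def add(self, criterion: str) -> None:
        """e.g. 'beach', 'diversity', 'authentic' spoken by the user."""
        self._snapshot()
        self.criteria.add(criterion)

    def remove(self, criterion: str) -> None:
        self._snapshot()
        self.criteria.discard(criterion)

    def backtrack(self) -> None:
        """Undo the most recent refinement ('go back to the previous results')."""
        if self.history:
            self.criteria = self.history.pop()

    def query(self) -> str:
        # A real system would call a search backend; here we just render the query.
        return " AND ".join(sorted(self.criteria)) or "<empty search>"

# Usage mirroring the spoken refinements described above.
search = ConversationalSearch()
search.add("beach"); search.add("family"); search.add("authentic")
search.remove("family")
search.backtrack()              # restores 'family'
print(search.query())           # authentic AND beach AND family
```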
Achieving a deeper understanding of human-machine interactions
We’ve come a long way in the journey towards true Contextual AI. We now understand human-level concepts in images, and AI can more naturally interact with the human using these concepts. However, we still require a deeper understanding of language as well as new human-computer interaction paradigms. How should an AI system and humans interact in the future, for example? Through voice, gestures, or even implants?
More importantly, the representation and recognition of what humans think and do is still very limited. For example, millions of creatives use Adobe’s products every day and while we’re familiar with how they’re using our tools as part of their work, we are still working toward fully representing “creative intent.” What does the creative user want to do? What are the steps in the process? And what may he or she require for success? And how could a user even teach an AI system what his or her creative intent is?
Some future technical directions that are currently being explored are explainable AI models and common-sense reasoning. How could we teach the common sense of a five-year-old to an AI system? And how could we further make it explain itself and become fully contextual? At Adobe, we believe that AI enhances human creativity and intelligence when it comes to designing, optimizing and delivering digital experiences (AI doesn’t replace it). Therefore, it is important to leverage the power of Contextual AI to help move the industry forward and harness its power to continually innovate.
These are some of the challenges we’re tackling at Adobe, and to give you a sneak peek of what we are working on, here is a proof-of-concept demo of our contextual intelligent assistant — powered by Adobe Sensei — that enables natural interaction using voice and gestures. Pretty cool, right?
•Employee Skills Management and Development Planning Tool
•Context Aware TV Program Recommender System for Android TV
•Context Aware Service Recommendation Engine for Mobile Phone
•Predicting User Intention for a given user Context
•Real-time object recognition and context-aware service recommendation app for self-driving cars
•App to identify music genres and the benefits of a given piece of classical music (data analytics over a minimum of 1,000 tracks)
•App for Predicting Energy Consumption of SDMIT College and Hostel Buildings
•Performance Evaluation of Student and Teacher of Engineering College
•Agriculture Data Analytics of Dakshina Kannada
•Health Care Data Analytics of Dakshina Kannada
•Natural Resources Data Analytics of Dakshina Kannada
•Waste Data Analytics of specific Location
•App for Object Classification and Analytics in Retail Stores
•Event Detection and Notification App for CCTV mounted in Specific Location.
•Cloud based customer data analytics for the success of a given business (Customer behavior, satisfaction, etc)
•Chatbot for Campus Interview
•Chatbot for Tourism
•Chatbot for Customer Service
•Teaching Assistant Chatbot
•Sentiment Analysis of Colleges in Dakshina Kannada through Social Media Data
•Prediction and diagnosis of any specific disease using multimodal data (scans, sensor data, audio, etc.)
•App for Handwritten and Character based user Recognition
•Customer-focused e-commerce site with AI bot for retail shopping (any specific type of shopping) using multimodal data (mobile, social media, location, context, etc.)
•App to predict the most suitable skilled candidate for a job through CV analysis for a company
•App for Data Analytics of Medicinal Plants in Dakshina Kannada
•Prediction and diagnosis of agricultural crop disease (any one crop)
•User Context based Search topic recommender for search engine
•Waste Management in College Campus
•Personal Intelligent Virtual Assistants
•Context Aware Intruder/attacker/Hacker detecting system
•Sports Data Analytics (fitness and sports skills assessment for identifying the next talent) – for any specific game and country/state
•Financial Advice for Public
•Hacker detecting Tool
•Fraudulent Activity (Anomaly) Recognition in Shopping, Business and Banking Transactions
•Customized crop-growing techniques specific to individual plot characteristics and relevant real-time data (price optimization, yield prediction, new high-value crop prediction, product demand forecasting, product optimization, etc.)
•Energy Data Analytics and Predicting Energy Demand Trends in given location using multimodal data
•Evaluating the quality of hospitals and their performance data in a specific region
•Predicting the risk of illicit activity or terrorism using historical crime data, intelligence data and other available sources for a given location
•Shopping Centers Data Analytics of Dakshina Kannada
•Simulation model of the operation of the limit order book underlying a virtual currency exchange
•IoT based Home Automation
•IoT based Smart Irrigation System
•AI based Robot (For any suitable Case Study)
•AI based Self Driving Toy Car
HEALTH CARE
•Drug discovery using Neural Networks
•Tumor detection from Brain MRI images
•Detection and Classification of cancer cells in MRI Images
•Organ Segmentation and Labelling in MRI Images
•Cancer cell detection and segmentation
•Blood flow detection and monitoring using Sensory data
•Diabetic Retinopathy Detection and Segmentation from Retinal Fundus Images
•Personalized Treatment based on Patient History
•AI System for Prediction and Recommendation of Diabetes
•Recommendation of doctors and medicines using review mining
•Disease Prediction using patient treatment history and health data
•Real-time health monitoring using wearable devices
•Prediction of epidemic outbreaks using Social Media Data
•Design and Implementation of prediction for Medical Insurance
•Activity monitoring and unusual activity detection for elderly homes
•Recognizing exercises for physiotherapy videos
•Detecting Genes responsible for cancer development
•DNA/Gene classification using RNN Sequential analysis
AGRICULTURE
•Plant disease identification using leaf images
•Plants Recognition using Convolutional Neural Networks
•Fruits counting for automatic inventory management
•Weed plant detection from agricultural field images
•Predicting yield, soil moisture and weather using image processing
•Plant Gene Classification and Functionality Prediction
•Automated quality assessment of crops
Space Research & Satellite imagery
•Change detection for deforestation, water reserves from Satellite images
•Detection of Unmanned Vehicles (UMV) and Drones
•Segmenting Satellite Images for detection of road, buildings, natural resources etc.
•Target recognition in SAR images
•Scene segmentation in rural and urban regions from Remote Sensing Data
•Detection of Anomaly in SAR images
•Classification of Terrain from Satellite Imagery
•3D reconstruction from multimodal satellite data
•Classification of galaxies
•Simulation of Galaxies for Real World Scenario
•Detection and Segmentation of different structures on planet surface images
Cyber Security
•Intrusion detection in networks and servers
•Malware identification using deep learning
•Anomaly detection in network activities
•Virus/Malicious file detection in a shared environment
•Spam SMS filtering Using Machine Learning
•Advertisement Click Fraud Detection
•Webpage classification for safer browsing
Education
•Predicting Student Performance using Regression analysis
•Automatic scientific article summarization
•Feature based opinion mining on student feedback
•Face recognition based attendance system
Video Processing
•Real-time generic object detection & tracking
•Pedestrian Detection from low-resolution videos
•Detection and classification of vehicles
•Vehicle detection and speed tracking
•Detection of signals, and lane for self-driving cars
•Road Crack Detection and Segmentation For Autonomous Driving
•Unusual Activity & Anomaly detection in surveillance
•Human activity detection for surveillance video Compression
•Gesture recognition for Human Computer Interaction
•Real-time video to text transcription for visually challenged person
•Real-time speech recognition for regional languages
•Real-time OCR for Regional Languages
•Face Recognition & expression recognition Mobile App for Visually impaired person
•Action recognition for controlling electronic appliances in homes
•Place recognition app for visually impaired person
•Text to video generation for News Stories
•Salient region detection for targeted advertisements placement in videos
•Video quality Enhancement using super-resolution
Business
•Comparative sales analysis of different stores, customers, demographics
•Customer Classification Based on The Historical Purchase Data
•Personalized marketing and targeted advertising
•Predicting housing prices for real estate companies using Machine Learning or Deep Learning
•Predicting Product Development Time and Cost Using Production Data
•Question answering system for automated customer relationship management
•Sales prediction using Regression Analysis
•Text to Speech Generation for Regional Languages
Insurance
•Fraud/abuse detection for insurance companies
•Predicting risk for new Insurance using customer information
Banking
•Credit card fraud detection using historical transaction data
•Loan Risk Prediction using User transaction information
Crime
•Crime pattern detection using historical crime data
•Geographical crime rate prediction
•Criminal behavior analysis and segmentation
•Crowd counting and monitoring for surveillance videos
Social Media Analytics
•Product opinion mining for competitive market analysis
•Customer requirement analysis using User Generated Content
•Consumer behavior analysis using User Generated Content
•Rumor detection from Social Media
•Political opinion mining for popularity prediction
•Terrorism detection from social media
•Stock prediction using Twitter sentimental analysis
•Restaurant Review Classification And Recommender System
•Fake news detection in online social media
•Detection of violent and abusive content in social media
Miscellaneous 1
•Automated Machine Translation for Regional Languages
•Virtual Personal Assistant Apps
•Developing a Chatbot using sequence modelling
•Travel route suggestion based on pattern of travel and difficulties
Miscellaneous 2
Detecting incidents of cyber bullying
•Input: text feed from social media conversations
•Output: cyber bullying victim and bully identified
Characterizing mental stress and suicidal tendencies
•Input: text feed from online profiles and conversations
•Output: people suspected to have stress or suicidal tendencies are flagged
Detecting click-fraud in online advertising
•Input: click data from online advertisement
•Output: fraudulent clicks and click patterns detected
Autonomous drone navigation to target locations
•Input: Real-time video feed from the drone camera
•Output: Navigation actions and real time physical motion to the target locations
UMV or Drone detection system for border security
•Input: Real time video feed from HQ surveillance cameras
•Output: detections of UMV or drones and their locations
Location recognition apps from Airplanes
•Input: Photos of land taken from airplanes
•Output: Recognized places and their information
Drone based security system
•Input: Real time video stream from drone cameras
•Output: Detected objects, people, animals, activities, accidents, intruders etc.
3D reconstruction of a building or land using Drone cameras
•Input: Aerial video taken from drone camera
•Output: 3D model of the location/buildings
Business
Chatbot development for regional languages
•Input: Chat commands written in regional languages
•Output: Automated responses in regional language
Robust face recognition system for loan/insurance fraud prediction
•Input: Face photos and related information of a loan applicant
•Output: Detection whether specific applicant has committed loan/insurance fraud
Question answering system for automated customer relationship management
•Input: Questions from customers spoken/written in regional languages
•Output: Answers (spoken/written) from the automated system in regional language
Face emotion detection for customer relationship management
•Input: Real-time video of customer in a services/customer care place
•Output: Detection of user emotions such as stress or happiness, for guidance to the service provider or customer care responder
Salient region detection for targeted advertisements placement
•Input: Image or streaming video of sports/movie etc.
•Output: Location inside video frame where ad will be posted
Customer emotion detection for telephony customer care
•Input: Real-time audio of the customer care call
•Output: Detection of user emotions such as stress or happiness, for guidance to the customer care responder
Product requirement analysis from social media
•Input: comments, reviews from users of particular topic or need
•Output: Detection of whether a particular feature or product is currently needed by customers
Scheduling and planning apps for sales person
•Input: Schedule, target, location of the sales person
•Output: Reminders, route recommendations, plans for sales execution
Mobile app for quick prediction of production time and cost
•Input: Requested number of quantity and specification of a product
•Output: Production time and cost to make the specific number of products
Work monitoring system for surveillance videos in production environment
•Input: Real time video feed from surveillance cameras in a product production environment
•Output: Detection of events, accidents, people's activities, loitering, etc.
Journalism
News article summarization app
•Input: News article in text format
•Output: Summary of news as a short text
News text to video generation app
•Input: News article with text and images
•Output: Interestingly presented video with news elements, animations and attractive audio
Fake news alert app
•Input: News article with text and images
•Output: Detection results indicating whether the article is fake or from a trustworthy source
Provocative article detection for safe surfing
•Input: News article with text and images
•Output: Detection results indicating whether the article contains controversial or violent content against religious views or national integrity that could incite violence or riots
Finding famous and relevant Tweets of news articles
•Input: News article with text and images
•Output: Neatly presented famous tweets from celebrities/active twitter users on specific issues that news article deals with
Personalized News Recommendation App
•Input: News articles, previous history of user, ratings etc
•Output: News articles matching interest and history of the user
Multisource news summarization for summarizing news on same topic
•Input: Multiple news articles dealing with same news
•Output: Summary of news content as a short text
User emotion detection for news article impact analysis
•Input: News article, face images of the user while reading news, history of articles read by user
•Output: Prediction of emotions of a user for different articles
News popularity detection in social media
•Input: News article and its relevant social media feed
•Output: Popularity level of a news story
News generation from tweets of certain topic
•Input: Twitter feed related to certain event or topic
•Output: Generated news story related to the famous tweets