Microsoft AI-900 Practice Test Questions

Stop wondering if you're ready. Our AI-900 practice test is designed to identify your exact knowledge gaps. Validate your skills with questions that mirror the real exam's format and difficulty, and build a personalized study plan based on your performance on these AI-900 practice questions, focusing your effort where it matters most.

Targeted practice like this helps candidates feel significantly more prepared for Microsoft AI-900 exam day.

22,620 candidates already prepared
Updated on: 15-Dec-2025
262 Questions
Rating: 4.9/5.0


Topic 5: Describe features of conversational AI workloads on Azure

Which Azure service can use the prebuilt receipt model in Azure AI Document Intelligence?

A. Azure AI Computer Vision

B. Azure Machine Learning

C. Azure AI Services

D. Azure AI Custom Vision

C.   Azure AI Services

Summary:
This question tests your knowledge of the specific Azure AI service that offers prebuilt models for document processing. While several services handle different types of AI, only one is a unified service specifically designed to extract text, key-value pairs, and tables from documents using pre-trained models for common document types like receipts, invoices, and business cards.

Correct Option:

C. Azure AI Services
This is the correct answer. Azure AI Services is an umbrella portfolio that includes various cognitive services. Within this portfolio, Azure AI Document Intelligence (formerly Form Recognizer) is the specific service that provides prebuilt models, including the prebuilt receipt model. This model is designed to extract key information from receipts, such as merchant name, transaction date, tax, and total amount, without any need for custom training.
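As a rough illustration of how the prebuilt receipt model is consumed (not part of the exam question), here is a minimal Python sketch using the azure-ai-formrecognizer SDK; the endpoint, key, and receipt URL are placeholder assumptions.

```python
# Hedged sketch: analyzing a receipt with the prebuilt receipt model via the
# azure-ai-formrecognizer Python SDK. Endpoint, key, and image URL are placeholders.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-key>"                                                 # placeholder

client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))

# Start analysis with the prebuilt receipt model; no custom training is required.
poller = client.begin_analyze_document_from_url(
    "prebuilt-receipt",
    "https://example.com/sample-receipt.jpg",  # placeholder receipt image URL
)
result = poller.result()

for receipt in result.documents:
    merchant = receipt.fields.get("MerchantName")
    total = receipt.fields.get("Total")
    if merchant:
        print("Merchant:", merchant.value)
    if total:
        print("Total:", total.value)
```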

Incorrect Option:

A. Azure AI Computer Vision
Azure AI Computer Vision is a related but different service. It focuses on analyzing visual content within images, such as object detection, image categorization, and Optical Character Recognition (OCR) to read text. However, it does not contain the specialized, pre-trained models for understanding the structured semantic meaning of documents like receipts, which is the domain of Azure AI Document Intelligence.

B. Azure Machine Learning
Azure Machine Learning is a platform for data scientists to build, train, and deploy their own custom machine learning models. It is a tool for creating new models, not for using prebuilt, turn-key AI models like the receipt model. The prebuilt receipt model is a consumable service, not a platform for custom development.

D. Azure AI Custom Vision
Azure AI Custom Vision is a service for building and deploying custom image classification and object detection models. It is used to train a model to recognize specific objects or tags relevant to your business. It is not used for document analysis and does not include any prebuilt models for documents like receipts.

Reference:
What is Azure AI Document Intelligence? - Prebuilt Models

You need to implement a pre-built solution that will identify well-known brands in digital photographs. Which Azure AI service should you use?

A. Face

B. Custom Vision

C. Computer Vision

D. Form Recognizer

C.   Computer Vision

Summary:
This question asks for a pre-built solution to identify well-known brands in images. A pre-built solution means a service that works immediately without any need for you to collect data or train a model. The service should already contain a knowledge base of common brand logos and be able to detect them in provided photographs.

Correct Option:

C. Computer Vision
The Azure AI Computer Vision service includes a pre-built Brand Detection feature. This model is already trained to recognize thousands of well-known logos from commercial, cultural, and other domains. You simply send a digital photograph to the service's API, and it returns the names and locations of any detected brands, making it the correct "out-of-the-box" solution for this task.
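For context, a minimal Python sketch of calling the brand detection feature through the azure-cognitiveservices-vision-computervision SDK follows; the endpoint, key, and image URL are placeholder assumptions.

```python
# Hedged sketch: brand detection with the Computer Vision SDK.
# Endpoint, key, and image URL are placeholders.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient(
    "https://<your-resource>.cognitiveservices.azure.com/",  # placeholder
    CognitiveServicesCredentials("<your-key>"),              # placeholder
)

# Request only the Brands visual feature for the supplied photograph.
analysis = client.analyze_image(
    "https://example.com/photo-with-logos.jpg",  # placeholder image URL
    visual_features=[VisualFeatureTypes.brands],
)

for brand in analysis.brands:
    box = brand.rectangle
    print(f"{brand.name} ({brand.confidence:.2f}) at x={box.x}, y={box.y}")
```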

Incorrect Option:

A. Face
The Face service is specifically designed to detect, recognize, and analyze human faces. It is trained on facial features and attributes, not on corporate logos or brand imagery. It would be ineffective for identifying brands in photographs.

B. Custom Vision
Custom Vision is used to build and train your own custom image classification or object detection models. You would use this if you needed to identify brands that are not well-known or are specific to your business. Since the requirement is for a pre-built solution for well-known brands, Computer Vision's built-in brand detection is the appropriate choice.

D. Form Recognizer (now Azure AI Document Intelligence)
Form Recognizer is designed to extract text, key-value pairs, and tables from documents like forms, invoices, and receipts. It is specialized for document processing and structured data extraction, not for identifying visual brand logos in general photographs.

Reference:
Computer Vision - Brand detection

Select the answer that correctly completes the sentence.


Summary:
This question asks for the Azure service used to train a custom object detection model with your own images. Object detection involves drawing bounding boxes around multiple objects within an image. While some services offer pre-built models, the requirement to use your own data for training points to a service specifically designed for custom model development.

Correct Option:

Custom Vision
Azure Custom Vision is a dedicated service for building custom image classifiers and object detection models. It provides a user-friendly interface and API that allows you to upload your own labeled images, train a model to recognize and locate specific objects within those images, and then deploy the model for use in your applications. It is specifically designed for this exact task.
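To make the custom training workflow concrete, here is a hedged Python sketch based on the Custom Vision training SDK (azure-cognitiveservices-vision-customvision); the endpoint, key, project name, tag name, file path, and bounding-box values are all illustrative assumptions.

```python
# Hedged sketch: creating and training a custom object detection project.
# All names, keys, paths, and region coordinates below are placeholders.
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from azure.cognitiveservices.vision.customvision.training.models import (
    ImageFileCreateBatch, ImageFileCreateEntry, Region,
)
from msrest.authentication import ApiKeyCredentials

credentials = ApiKeyCredentials(in_headers={"Training-key": "<your-training-key>"})
trainer = CustomVisionTrainingClient(
    "https://<your-resource>.cognitiveservices.azure.com/", credentials
)

# Pick an object detection domain and create a project.
obj_detection_domain = next(
    d for d in trainer.get_domains() if d.type == "ObjectDetection" and d.name == "General"
)
project = trainer.create_project("defect-detector", domain_id=obj_detection_domain.id)
tag = trainer.create_tag(project.id, "scratch")

# Upload one labeled image; regions use normalized (left, top, width, height) values.
with open("images/sample.jpg", "rb") as f:  # placeholder path
    entry = ImageFileCreateEntry(
        name="sample.jpg",
        contents=f.read(),
        regions=[Region(tag_id=tag.id, left=0.1, top=0.2, width=0.3, height=0.4)],
    )
trainer.create_images_from_files(project.id, ImageFileCreateBatch(images=[entry]))

# Train an iteration of the custom model.
iteration = trainer.train_project(project.id)
```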

Incorrect Option:

Computer Vision
Azure AI Computer Vision offers pre-built, general-purpose capabilities like brand detection, object tagging, and celebrity recognition. However, it is not a service that allows you to train a new object detection model using your own custom dataset. Its models are fixed and cannot be customized.

Form Recognizer (now Azure AI Document Intelligence)
Form Recognizer is specialized for extracting text, key-value pairs, and tables from documents like forms and invoices. While you can train a custom model with your own documents, its purpose is document understanding, not general object detection in photographs. It is not the right tool for detecting arbitrary objects in images.

Azure Video Analyzer for Media (now Azure AI Video Indexer)
This service is designed to analyze videos to extract insights, such as detecting faces, recognizing celebrities, and transcribing speech. It is focused on video content and uses pre-built models. It does not provide a platform for training a custom object detection model with your own static images.

Reference:
What is Custom Vision? - Azure AI services

For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.


Summary:
This question tests your understanding of the purpose and capabilities of QnA Maker, a service designed to create a conversational question-and-answer layer over existing information. It's crucial to know that QnA Maker works with static knowledge bases, not live databases, and is focused on finding the best answer from provided content rather than understanding complex user intents like a full-fledged bot framework.

Correct Option:
You can use QnA Maker to query an Azure SQL database.

Answer: No

Explanation:
This is false. QnA Maker is designed to extract question-and-answer pairs from semi-structured content like FAQs, product manuals, documents, and URLs to build a static knowledge base. It does not connect to or execute live queries against databases like Azure SQL. For dynamic data from a database, you would need to use a different approach, such as the Azure Bot Service with custom code.

You should use QnA Maker when you want a knowledge base to provide the same answer to different users who submit similar questions.

Answer: Yes

Explanation:
This is true and a core strength of QnA Maker. It uses natural language processing to match a user's question, even if phrased differently, to the most relevant question in its knowledge base. This ensures that different users asking semantically similar questions (e.g., "What are your hours?" and "When are you open?") will receive the same consistent answer.

The QnA Maker service can determine the intent of a user utterance.

Answer: No

Explanation:
This is false. While QnA Maker performs some natural language understanding to find the best answer, its primary function is answer retrieval, not intent classification. Determining the broader intent of a user's utterance (e.g., to book a flight, check a balance, or file a complaint) is the primary role of a service like Language Understanding (LUIS) within a bot framework. QnA Maker focuses on finding an answer, not classifying the user's goal.
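To illustrate the answer-retrieval behavior described above, here is a hedged sketch of querying a published QnA Maker knowledge base over REST; the runtime endpoint, knowledge base ID, and endpoint key are placeholder assumptions, and note that QnA Maker has since been superseded by question answering in Azure AI Language.

```python
# Hedged sketch: querying a published QnA Maker knowledge base.
# Runtime endpoint, KB ID, and endpoint key are placeholders.
import requests

runtime_endpoint = "https://<your-qnamaker-resource>.azurewebsites.net"  # placeholder
kb_id = "<your-knowledge-base-id>"                                       # placeholder
endpoint_key = "<your-endpoint-key>"                                     # placeholder

response = requests.post(
    f"{runtime_endpoint}/qnamaker/knowledgebases/{kb_id}/generateAnswer",
    headers={"Authorization": f"EndpointKey {endpoint_key}"},
    json={"question": "When are you open?"},
)
response.raise_for_status()

# Differently phrased questions that match the same KB entry return the same answer.
for answer in response.json().get("answers", []):
    print(answer["answer"], answer["score"])
```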

Incorrect Option:
You can use QnA Maker to query an Azure SQL database.

Selecting "Yes" would be incorrect because QnA Maker does not have the capability to connect to or query live relational databases.

You should use QnA Maker when you want a knowledge base to provide the same answer to different users who submit similar questions.

Selecting "No" would be incorrect, as providing consistent answers to variably phrased questions is the fundamental purpose of a QnA system.

The QnA Maker service can determine the intent of a user utterance.

Selecting "Yes" would be incorrect, as it confuses the role of QnA Maker with that of an intent-classification service like LUIS.

Reference:

What is QnA Maker?
Compare QnA Maker and Language Understanding (LUIS)

Select the answer that correctly completes the sentence.


Summary:
This question tests your knowledge of a specific computer vision capability focused on text extraction. The task described involves converting non-digital text, whether handwritten or printed, into machine-encoded, readable text. This process is distinct from other vision tasks that identify objects, categorize images, or analyze faces.

Correct Option:

Optical character recognition (OCR)
Optical Character Recognition (OCR) is the technology designed specifically to extract text from images and documents. It can read text from various sources, including scanned paper documents, PDFs, and photos, and convert it into editable and searchable data. Modern OCR services, like those in Azure AI Vision, are capable of reading both printed and handwritten text.
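As a practical illustration, a minimal Python sketch of OCR with the azure-ai-vision-imageanalysis SDK (Image Analysis 4.0) follows; the endpoint, key, and image URL are placeholder assumptions.

```python
# Hedged sketch: extracting printed or handwritten text with the Image Analysis SDK.
# Endpoint, key, and image URL are placeholders.
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

client = ImageAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                      # placeholder
)

# Ask the service to read (OCR) the text in the image.
result = client.analyze_from_url(
    image_url="https://example.com/scanned-note.jpg",  # placeholder image URL
    visual_features=[VisualFeatures.READ],
)

if result.read is not None:
    for block in result.read.blocks:
        for line in block.lines:
            print(line.text)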

Incorrect Option:
Object detection
Object detection is used to identify and locate multiple objects within an image by drawing bounding boxes around them (e.g., finding all the cars in a parking lot photo). It does not read or extract the textual content from within those objects or the image as a whole.

Facial recognition
Facial recognition is a specialized technology for identifying or verifying human faces. It detects facial features and compares them to a database. It is not used for processing or interpreting handwritten text.

Image classification
Image classification assigns a single label or category to an entire image (e.g., classifying an image as a "dog" or a "landscape"). It does not perform the detailed pixel-level analysis required to decipher and extract individual characters and words from text within the image.

Reference:
What is Optical Character Recognition? - Azure AI Vision

To complete the sentence, select the appropriate option in the answer area.


Summary:
This question hinges on the distinction between simply finding a face in an image versus analyzing its attributes. The scenario describes an AI that assesses qualitative aspects of a face—exposure (lighting), noise (image grain), and occlusion (blocking)—to provide photographic feedback. This goes beyond mere detection and requires a deeper analysis of the face's condition and appearance.

Correct Option:

facial analysis.
Facial analysis is the correct term. This technology goes beyond simple detection to examine the attributes of a detected face. It can identify features like head pose, emotional state, blur, exposure, and the presence of occlusion (e.g., hair or glasses covering the face). Providing feedback on exposure, noise, and occlusion directly aligns with this analytical capability.
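For reference, here is a hedged sketch of requesting quality-related face attributes from the Face detect REST API; the endpoint, key, and image URL are placeholder assumptions, and attribute availability has changed over time under Microsoft's Limited Access policy for Face.

```python
# Hedged sketch: requesting blur, exposure, noise, and occlusion attributes
# from the Face detect REST API. Endpoint, key, and image URL are placeholders;
# these attributes were historically supported with the detection_01 model.
import requests

endpoint = "https://<your-face-resource>.cognitiveservices.azure.com"  # placeholder
key = "<your-key>"                                                     # placeholder

response = requests.post(
    f"{endpoint}/face/v1.0/detect",
    params={
        "returnFaceAttributes": "blur,exposure,noise,occlusion",
        "detectionModel": "detection_01",
    },
    headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
    json={"url": "https://example.com/portrait.jpg"},  # placeholder image URL
)
response.raise_for_status()

for face in response.json():
    attrs = face.get("faceAttributes", {})
    print(attrs.get("exposure"), attrs.get("noise"), attrs.get("occlusion"))
```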

Incorrect Option:

facial detection.
Facial detection is the process of finding and locating human faces in an image or video. It answers the question "Is there a face, and where is it?" It does not, however, analyze the qualitative aspects of that face, such as its lighting or image quality, which is required for the described feedback system.

facial recognition.
Facial recognition is a specific application of facial analysis used to identify or verify a person's identity by comparing their face to a database of known individuals. The photographer's tool is not trying to identify who is in the portrait, but rather to analyze the quality of the portrait itself, making recognition the wrong capability.

Reference:
Face analysis concepts - Azure AI services

Select the answer that correctly completes the sentence.


Summary:
This question tests your understanding of fundamental machine learning problem types. The scenario involves predicting a categorical outcome—specifically, one of two possible states ("repaid" or "not repaid"). This type of prediction, where the answer is a discrete label or category, falls under a specific and common branch of supervised learning.

Correct Option:

classification
This is a classic example of a classification problem. In machine learning, classification is used to predict a categorical label. Here, the model is predicting one of two distinct classes: the loan will be "repaid" (a positive class) or it will "default" (a negative class). This is a specific type of classification known as binary classification, which is a core task in supervised learning.
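To make the distinction tangible, here is a tiny, self-contained scikit-learn sketch of a binary classifier for loan repayment; the feature columns and synthetic values are assumptions for illustration only.

```python
# Hedged sketch: a toy binary classification model for loan repayment.
# The feature columns and data are made up for illustration.
from sklearn.linear_model import LogisticRegression

# Features: [annual_income (thousands), loan_amount (thousands), years_employed]
X = [
    [85, 10, 6],
    [30, 25, 1],
    [60, 15, 4],
    [25, 30, 0],
    [95, 12, 8],
    [40, 28, 2],
]
# Labels: 1 = repaid, 0 = defaulted (a categorical outcome, hence classification)
y = [1, 0, 1, 0, 1, 0]

model = LogisticRegression().fit(X, y)

# Predict the class for a new applicant.
print(model.predict([[55, 20, 3]]))        # e.g. [1] -> predicted to repay
print(model.predict_proba([[55, 20, 3]]))  # class probabilities
```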

Incorrect Option:

clustering
Clustering is an unsupervised learning technique used to group similar data points together based on their features, without using pre-defined labels. The loan prediction task is a supervised problem because it relies on historical data with known outcomes (whether past loans were repaid or not) to train the model. There is a clear target label to predict.

regression
Regression is used to predict a continuous numerical value, not a category. For example, predicting the exact dollar amount of a loan repayment or the interest rate would be a regression task. However, predicting a simple yes/no or true/false outcome like "repaid" is a categorical prediction, which is the domain of classification.

Reference:
Machine learning algorithm cheat sheet - Classification

Match the types of computer vision workloads to the appropriate scenarios.
To answer, drag the appropriate workload type from the column on the left to its scenario on the right. Each workload type may be used once, more than once, or not at all.
NOTE: Each correct match is worth one point.


Summary:
This question tests the ability to distinguish between common computer vision workloads. The key is to identify whether the task requires categorizing an entire image, finding and locating specific objects, or reading text. Each scenario has a primary technology designed to solve it effectively.

Correct Matches:
Generate captions for images.

Workload: Image classification

Explanation:
While advanced services can generate descriptive sentences (a task sometimes called "dense captioning"), the fundamental technology required to understand the content of an image well enough to generate a caption is based on image classification. The service must first identify and classify the main objects, scenes, and activities within the image to describe it. For the purpose of the AI-900 exam, associating caption generation with the foundational task of understanding image content (classification) is appropriate.

Extract movie title names from movie poster images.

Workload: Optical character recognition (OCR)

Explanation:
This is the primary use case for OCR. The task is to detect and read the text—the movie title—that is visually present within the image of the movie poster. OCR is specifically designed to convert text in images into machine-readable strings.

Locate vehicles in images.

Workload: Object detection

Explanation:
This is a classic object detection scenario. The goal is not just to say "this is a traffic image" (classification) but to identify where each vehicle is located within the image by drawing bounding boxes around them. Object detection is designed for this precise task of locating multiple object instances.
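As a hedged illustration of locating objects such as vehicles, here is a minimal Python sketch using the azure-ai-vision-imageanalysis SDK's object detection feature; the endpoint, key, and image URL are placeholder assumptions.

```python
# Hedged sketch: locating objects (e.g. vehicles) with the Image Analysis SDK.
# Endpoint, key, and image URL are placeholders.
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

client = ImageAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                      # placeholder
)

result = client.analyze_from_url(
    image_url="https://example.com/traffic-scene.jpg",  # placeholder image URL
    visual_features=[VisualFeatures.OBJECTS],
)

# Each detected object comes back with a bounding box locating it in the image.
if result.objects is not None:
    for obj in result.objects.list:
        tag = obj.tags[0]
        box = obj.bounding_box
        print(f"{tag.name} ({tag.confidence:.2f}) at {box.x},{box.y} {box.width}x{box.height}")
```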

Reference:

What is computer vision? - Image Classification

What is computer vision? - Object Detection

What is Optical Character Recognition?

Select the answer that correctly completes the sentence.


Summary:
In machine learning, the input data used to make predictions are distinct from the output you're trying to predict. These inputs are the measurable characteristics or attributes that describe each instance in your dataset. They are the variables the model uses to learn patterns and relationships.

Correct Option:

features
In a machine learning model, the input data are called features. These are the independent, measurable variables or characteristics that describe each instance (row) in the dataset. For example, in a model predicting house prices, features would include square footage, number of bedrooms, and year built. The model learns the relationship between these features and the target variable.
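To ground the terminology, here is a small pandas sketch separating features from the label; the column names and values are made up purely for illustration.

```python
# Hedged sketch: features vs. label on a tiny, made-up house-price dataset.
import pandas as pd

data = pd.DataFrame({
    "square_footage": [1400, 2100, 950, 1800],
    "bedrooms":       [3, 4, 2, 3],
    "year_built":     [1995, 2010, 1980, 2005],
    "price":          [250_000, 410_000, 160_000, 330_000],
})

# Features: the input columns the model learns from.
features = data[["square_footage", "bedrooms", "year_built"]]

# Label: the output column the model is trained to predict.
label = data["price"]

print(features.head())
print(label.head())
```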

Incorrect Option:

labels
Labels are the outputs or the answers in a supervised learning scenario. They are the dependent variable that the model is trained to predict. For instance, in a disease prediction model, the "disease" or "no disease" outcome is the label. The inputs used to predict that label are the features.

instances
An instance (or sample) refers to a single row or individual example within the entire dataset. An instance is comprised of a set of features and, in supervised learning, an associated label. The term "instance" refers to the whole data point, not specifically to the input data within it.

functions
Functions are blocks of code that perform a specific task. While machine learning models themselves can be thought of as complex functions (mapping inputs to outputs), the input data passed into the model are not called "functions."

Reference:
What is machine learning? - Key terminology

You need to build an app that will identify celebrities in images.
Which service should you use?

A. Azure OpenAI Service

B. Azure Machine Learning

C. conversational language understanding (CLU)

D. Azure AI Vision

D.   Azure AI Vision

Summary:
This question asks for a specific, pre-built AI service capable of identifying well-known people in images. The requirement is for an out-of-the-box solution that already contains a knowledge base of famous individuals, eliminating the need to collect data or train a custom model. This functionality is a dedicated feature of a particular computer vision service.

Correct Option:

D. Azure AI Vision
Azure AI Vision (formerly Computer Vision) is the correct service. It includes a pre-built Celebrity Recognition feature. This model is already trained to identify thousands of well-known celebrities from the worlds of business, politics, sports, and entertainment. You simply send an image to the service's API, and it returns the names and locations of any detected celebrities, making it the perfect "out-of-the-box" solution for this task.
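As a hedged historical illustration, the older Computer Vision SDK exposed a "celebrities" domain-specific model, sketched below; the endpoint, key, and image URL are placeholder assumptions, and access to celebrity recognition is now gated by Microsoft's Limited Access policy.

```python
# Hedged sketch: analyzing an image against the prebuilt "celebrities" domain model
# in the older Computer Vision SDK. Endpoint, key, and image URL are placeholders;
# this capability is subject to Limited Access restrictions.
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

client = ComputerVisionClient(
    "https://<your-resource>.cognitiveservices.azure.com/",  # placeholder
    CognitiveServicesCredentials("<your-key>"),              # placeholder
)

analysis = client.analyze_image_by_domain(
    "celebrities",
    "https://example.com/red-carpet.jpg",  # placeholder image URL
)

for celeb in analysis.result.get("celebrities", []):
    print(celeb["name"], celeb["confidence"])
```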

Incorrect Option:

A. Azure OpenAI Service
Azure OpenAI Service provides access to powerful large language models (LLMs) like GPT for text generation, summarization, and coding. It also includes models like DALL-E for image generation. However, it does not contain a pre-built, dedicated model for identifying celebrities in existing images.

B. Azure Machine Learning
Azure Machine Learning is a platform for data scientists to build, train, and deploy their own custom machine learning models. You could use it to create a celebrity identification model, but that would require you to gather a massive dataset of celebrity images and handle the training yourself. Since a pre-built solution exists, this is not the most efficient or correct choice.

C. Conversational Language Understanding (CLU)
CLU is a service for building custom natural language understanding models to recognize intents and entities in text or speech. It is designed for processing language, not for analyzing visual content or identifying people in images.

Reference:
Azure AI Vision - Celebrity Recognition


Microsoft AI-900 Exam Details


Exam Code: AI-900
Exam Name: Microsoft Azure AI Fundamentals
Certification Name: Microsoft Certified: Azure AI Fundamentals
Certification Provider: Microsoft

Are You Truly Prepared?

Don't risk your exam fee on uncertainty. Take this definitive practice test to validate your readiness for the Microsoft AI-900 exam.