Microsoft DP-600 Practice Test Questions

Stop wondering if you're ready. Our DP-600 practice test is designed to identify your exact knowledge gaps. Validate your skills with questions that mirror the real exam's format and difficulty. Build a personalized study plan based on your performance on these DP-600 exam questions, focusing your effort where it matters most.

Targeted practice like this helps candidates feel significantly more prepared for Microsoft DP-600 exam day.

2500 already prepared
Updated On: 15-Dec-2025
50 Questions
4.9/5.0

Page 1 out of 5 Pages

Litware, Inc. Case Study

   

Overview

Litware, Inc. is a manufacturing company that has offices throughout North America. The analytics team at Litware contains data engineers, analytics engineers, data analysts, and data scientists.

Existing Environment
Litware has been using a Microsoft Power BI tenant for three years. Litware has NOT enabled any Fabric capacities or features.

Fabric Environment
Litware has data that must be analyzed as shown in the following table.



The Product data contains a single table and the following columns.



The customer satisfaction data contains the following tables:

• Survey

• Question

• Response

For each survey submitted, the following occurs:

• One row is added to the Survey table.

• One row is added to the Response table for each question in the survey.

The Question table contains the text of each survey question. The third question in each survey response is an overall satisfaction score. Customers can submit a survey after each purchase.

User Problems
The analytics team has large volumes of data, some of which is semi-structured. The team wants to use Fabric to create a new data store.

Product data is often classified into three pricing groups: high, medium, and low. This logic is implemented in several databases and semantic models, but the logic does NOT always match across implementations.

Planned Changes
Litware plans to enable Fabric features in the existing tenant. The analytics team will create a new data store as a proof of concept (PoC). The remaining Litware users will get access to the Fabric features only after the PoC is complete. The PoC will be completed by using a Fabric trial capacity.

The following three workspaces will be created:

• AnalyticsPOC: Will contain the data store, semantic models, reports, pipelines, dataflows, and notebooks used to populate the data store

• DataEngPOC: Will contain all the pipelines, dataflows, and notebooks used to populate OneLake

• DataSciPOC: Will contain all the notebooks and reports created by the data scientists

The following will be created in the AnalyticsPOC workspace:

• A data store (type to be decided)

• A custom semantic model

• A default semantic model

• Interactive reports

The data engineers will create data pipelines to load data to OneLake either hourly or daily, depending on the data source. The analytics engineers will create processes to ingest, transform, and load the data to the data store in the AnalyticsPOC workspace daily.

Whenever possible, the data engineers will use low-code tools for data ingestion. The choice of which data cleansing and transformation tools to use will be at the data engineers' discretion.

All the semantic models and reports in the AnalyticsPOC workspace will use the data store as the sole data source.

Technical Requirements

The data store must support the following:

• Read access by using T-SQL or Python

• Semi-structured and unstructured data

• Row-level security (RLS) for users executing T-SQL queries

Files loaded by the data engineers to OneLake will be stored in the Parquet format and will meet Delta Lake specifications.

Data will be loaded without transformation in one area of the AnalyticsPOC data store. The data will then be cleansed, merged, and transformed into a dimensional model.

The data load process must ensure that the raw and cleansed data is updated completely before populating the dimensional model.

The dimensional model must contain a date dimension. There is no existing data source for the date dimension. The Litware fiscal year matches the calendar year. The date dimension must always contain dates from 2010 through the end of the current year.

The product pricing group logic must be maintained by the analytics engineers in a single location. The pricing group data must be made available in the data store for T-SQL queries and in the default semantic model. The following logic must be used:

• List prices that are less than or equal to 50 are in the low pricing group.

• List prices that are greater than 50 and less than or equal to 1,000 are in the medium pricing group.

• List prices that are greater than 1,000 are in the high pricing group.

Security Requirements

Only Fabric administrators and the analytics team must be able to see the Fabric items created as part of the PoC. Litware identifies the following security requirements for the Fabric items in the AnalyticsPOC workspace:

• Fabric administrators will be the workspace administrators.

• The data engineers must be able to read from and write to the data store. No access must be granted to datasets or reports.

• The analytics engineers must be able to read from, write to, and create schemas in the data store. They also must be able to create and share semantic models with the data analysts and view and modify all reports in the workspace.

• The data scientists must be able to read from the data store, but not write to it. They will access the data by using a Spark notebook.

• The data analysts must have read access to only the dimensional model objects in the data store. They also must have access to create Power BI reports by using the semantic models created by the analytics engineers.

• The date dimension must be available to all users of the data store.

• The principle of least privilege must be followed.

Both the default and custom semantic models must include only tables or views from the dimensional model in the data store. Litware already has the following Microsoft Entra security groups:

• FabricAdmins: Fabric administrators

• AnalyticsTeam: All the members of the analytics team

• DataAnalysts: The data analysts on the analytics team

• DataScientists: The data scientists on the analytics team

• DataEngineers: The data engineers on the analytics team

• AnalyticsEngineers: The analytics engineers on the analytics team

Report Requirements

The data analysts must create a customer satisfaction report that meets the following requirements:

• Enables a user to select a product to filter customer survey responses to only those who have purchased that product

• Displays the average overall satisfaction score of all the surveys submitted during the last 12 months up to a selected date

• Shows data as soon as the data is updated in the data store

• Ensures that the report and the semantic model only contain data from the current and previous year

• Ensures that the report respects any table-level security specified in the source data store

• Minimizes the execution time of report queries

You need to recommend a solution to prepare the tenant for the PoC.

Which two actions should you recommend performing from the Fabric Admin portal? Each correct answer presents part of the solution.

NOTE: Each correct answer is worth one point.

A. Enable the Users can try Microsoft Fabric paid features option for specific security groups.

B. Enable the Allow Azure Active Directory guest users to access Microsoft Fabric option for specific security groups.

C. Enable the Users can create Fabric items option and exclude specific security groups.

D. Enable the Users can try Microsoft Fabric paid features option for the entire organization.

E. Enable the Users can create Fabric items option for specific security groups.

A.   Enable the Users can try Microsoft Fabric paid features option for specific security groups.
E.   Enable the Users can create Fabric items option for specific security groups.

Explanation:
For a Proof of Concept (PoC), you need to enable a controlled group of users to explore Fabric's full capabilities and create items, without opening it to the entire tenant. The solution must be targeted, not organization-wide, to limit scope and manage costs during the evaluation phase.

Correct Option:

A. Enable the Users can try Microsoft Fabric paid features option for specific security groups:
This grants the selected PoC users access to advanced, capacity-backed features (like Data Warehouse, Direct Lake mode) essential for a proper evaluation, without enabling it for all users.

E. Enable the Users can create Fabric items option for specific security groups:
This allows the designated PoC participants to create workspaces, reports, and other Fabric artifacts. This is a core prerequisite for them to actively build and test solutions during the PoC.

Incorrect Option:

B. Enable the Allow Azure Active Directory guest users to access Microsoft Fabric option for specific security groups:
This is unnecessary for a standard internal PoC. It's only relevant if you need to include external guest users (B2B) in the evaluation, which is not stated in the requirement.

C. Enable the Users can create Fabric items option and exclude specific security groups:
This is the inverse of what's needed. The "enable and exclude" model would open creation rights to the entire organization except a few groups. For a PoC, you want to enable creation only for specific groups, not for everyone by default.

D. Enable the Users can try Microsoft Fabric paid features option for the entire organization:
This is too broad and poses a licensing and governance risk. A PoC should be a controlled rollout, not an organization-wide enablement of paid features.

Reference:
Microsoft Learn - "Configure Microsoft Fabric admin settings" details how to use tenant-level admin controls to manage feature availability by security group, which is the recommended practice for phased rollouts and PoCs.

Which type of data store should you recommend in the AnalyticsPOC workspace?

A. a data lake

B. a warehouse

C. a lakehouse

D. an external Hive metaStore

C.   a lakehouse

Explanation:
The Lakehouse architecture in Microsoft Fabric is designed to be the versatile, foundational data store that combines the best features of a data lake and a data warehouse. This makes it the ideal choice for an initial Proof of Concept (PoC) or a workspace intended for general analytics. It supports structured, semi-structured, and unstructured data (unlike a pure warehouse), leverages the Delta Lake format on OneLake, and natively integrates with both Spark (for data engineering and data science) and a SQL analytics endpoint (for BI and T-SQL querying). Its flexibility supports a broad range of data professional skillsets and analytic requirements often present in a PoC.

Correct Option:

C. a lakehouse
Versatile Data Support: A Lakehouse natively supports all data types, including the semi-structured and unstructured data that is typically involved in a modern analytics PoC, whereas a traditional warehouse is optimized primarily for structured data.

Unified Platform: In Microsoft Fabric, a Lakehouse is the primary workspace item for data engineering and data science, offering a unified storage layer (Delta Lake in OneLake) that can be accessed by both Spark notebooks (e.g., Python/PySpark) and the automatically generated SQL Analytics Endpoint (e.g., T-SQL).

DP-600 Context: The DP-600 exam focuses on implementing analytic solutions in Microsoft Fabric, where the Lakehouse is central to modern data engineering and analytics workloads that require handling raw, diverse data before transformation into a curated model.
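
For context, a minimal sketch of the dual access paths described above, assuming a Fabric notebook with the PoC lakehouse attached as its default lakehouse and a hypothetical managed Delta table named dim_product in its Tables area:

    # `spark` is pre-created in a Fabric notebook session; the explicit builder call
    # below simply makes the snippet self-contained elsewhere.
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.getOrCreate()

    # Spark path (data engineering / data science): read the managed Delta table directly.
    df = spark.read.table("dim_product")   # hypothetical table name
    df.printSchema()
    df.show(5)

    # The same table is also exposed through the lakehouse's SQL analytics endpoint,
    # where analysts query it with T-SQL, e.g. SELECT TOP (5) * FROM dim_product;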

Incorrect Option:

A. a data lake
A pure data lake is primarily a storage location for raw data and lacks the integrated relational management, metadata layer, and transactional support (ACID properties) that a Lakehouse or Warehouse provides. While the Lakehouse uses a data lake for its storage (OneLake), the Lakehouse item itself provides the necessary structure and capabilities for analytics and BI that a raw data lake does not.

B. a warehouse
A Warehouse is optimized for relational, structured data and Business Intelligence (BI) workloads, offering high performance for T-SQL querying and full multi-table ACID transactions. However, if the PoC needs to handle semi-structured or unstructured data (a common scenario in modern analytics), a pure Warehouse is less suitable than a Lakehouse, which can handle diverse data first, then curate it into a structure that can still be queried via its SQL endpoint.

D. an external Hive metaStore
An external Hive metaStore is an outdated or external component used to manage metadata for tables stored in a data lake, primarily in older Apache Hadoop or Spark environments. Microsoft Fabric's Lakehouse manages metadata internally through OneLake and Delta Lake, eliminating the need for an external Hive metastore. This option represents a legacy architecture not central to the native Fabric experience.

Reference:
Microsoft Learn: Microsoft Fabric decision guide: choose a data store; Microsoft Learn: What is a Lakehouse?

You need to design a semantic model for the customer satisfaction report.

Which data source authentication method and mode should you use? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.




Explanation:
The question focuses on designing a semantic model in Microsoft Fabric for a customer satisfaction report. In DP-600 scenarios, semantic models are typically built on Lakehouse or Warehouse data and must support secure, scalable, and performant access. The choice depends on aligning with Fabric best practices, enterprise security, and real-time or near–real-time analytics. Authentication should support unattended access, while the mode should minimize data movement and optimize query performance.

Correct Option:

Authentication method: Service principal authentication
Service principal authentication is recommended for enterprise-grade semantic models in Microsoft Fabric. It enables secure, non-interactive authentication between services, supports automation, and aligns with least-privilege access control using Azure AD. This method is ideal for production workloads, CI/CD pipelines, and scheduled refresh scenarios, which are common in customer satisfaction reporting solutions.

Mode: Direct Lake
Direct Lake mode is optimized for Fabric Lakehouse data and allows the semantic model to query Delta tables directly in OneLake without importing or duplicating data. It provides near real-time performance with lower latency than Import mode and simpler architecture than DirectQuery. This makes it the preferred choice for analytical reports like customer satisfaction dashboards.
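
As a rough illustration only (the actual data source credentials are configured in the semantic model settings, not in notebook code), the non-interactive nature of service principal authentication can be shown with the azure-identity package; the tenant ID, client ID, and secret below are placeholders:

    from azure.identity import ClientSecretCredential

    # Service principal (app registration) credentials -- placeholders, not real values.
    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<app-client-id>",
        client_secret="<app-client-secret>",
    )

    # Acquire a token for the Power BI / Fabric REST scope without any user interaction,
    # which is what makes scheduled refresh and CI/CD automation possible.
    token = credential.get_token("https://analysis.windows.net/powerbi/api/.default")
    print(token.expires_on)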

Incorrect Option:

Basic authentication
Basic authentication relies on usernames and passwords and is not recommended for modern Fabric or Azure-based solutions. It poses security risks, lacks support for conditional access, and is unsuitable for automated or enterprise-scale reporting scenarios. Microsoft strongly discourages its use in favor of Azure AD–based authentication methods.

Single sign-on (SSO) authentication
While SSO is useful for end-user access to reports, it is not ideal for backend semantic model authentication. SSO depends on user context and does not support unattended service-to-service access, making it unsuitable for scheduled refreshes and automated workloads.

DirectQuery
DirectQuery sends queries directly to the source system at runtime, which can introduce latency and performance bottlenecks. In Fabric, Direct Lake is preferred over DirectQuery for Lakehouse data because it provides better performance and simpler management while still avoiding data duplication.

Import
Import mode copies data into the semantic model, increasing storage usage and requiring scheduled refreshes. This can lead to data latency and higher maintenance overhead. For Fabric Lakehouse scenarios, Direct Lake is a more efficient and scalable alternative.

Reference:
Microsoft Learn – Semantic models in Microsoft Fabric

Microsoft Learn – Direct Lake vs DirectQuery vs Import modes

Microsoft Learn – Service principal authentication in Microsoft Fabric

You need to implement the date dimension in the data store. The solution must meet the technical requirements.

What are two ways to achieve the goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. Populate the date dimension table by using a dataflow.

B. Populate the date dimension table by using a Stored procedure activity in a pipeline.

C. Populate the date dimension view by using T-SQL.

D. Populate the date dimension table by using a Copy activity in a pipeline.

A.   Populate the date dimension table by using a dataflow.
B.   Populate the date dimension table by using a Stored procedure activity in a pipeline.

Explanation:
The requirement is to implement a date dimension table in the data store. A date dimension is a static, pre-calculated table generated via logic, not copied from a source. The solution must populate this table, implying an initial load and potentially a refresh mechanism. The key is to use a tool that can execute the generation logic and write to the destination table.

Correct Option:

A. Populate the date dimension table by using a dataflow:
Dataflow Gen2 can generate date ranges using Power Query logic, apply complex transformations, and load the result directly into a table within a Fabric Lakehouse or Warehouse. This is a low-code, managed ETL solution.

B. Populate the date dimension table by using a Stored procedure activity in a pipeline:
This is a code-first approach. You can author a T-SQL script within a Stored Procedure in the Warehouse to generate the date dimension, and then execute this SP using a Pipeline activity. This leverages the data store's native compute.
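
Either option ultimately runs simple date-generation logic. As an illustration of that logic only (the two answers place it in a Dataflow Gen2 or a warehouse stored procedure, not in a notebook), here is a PySpark sketch that builds the 2010-through-current-year range and writes it to a hypothetical dim_date table:

    from datetime import date
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()   # already available as `spark` in a Fabric notebook

    start = "2010-01-01"
    end = f"{date.today().year}-12-31"           # requirement: 2010 through the end of the current year

    dim_date = (
        spark.sql(f"SELECT explode(sequence(to_date('{start}'), to_date('{end}'), interval 1 day)) AS DateKey")
        .withColumn("Year", F.year("DateKey"))
        .withColumn("Month", F.month("DateKey"))
        .withColumn("Quarter", F.quarter("DateKey"))
    )

    # Overwrite so a yearly rerun extends the calendar through the new current year.
    dim_date.write.mode("overwrite").format("delta").saveAsTable("dim_date")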

Incorrect Option:

C. Populate the date dimension view by using T-SQL:
This does not meet the requirement to implement a table. A view is a logical layer, not a physical data store object that materializes the data. While you can create a view with T-SQL logic, the requirement is for a table in the data store.

D. Populate the date dimension table by using a Copy activity in a pipeline:
A Copy activity is designed to move data from a source to a destination. It cannot generate or transform data on its own. Since a date dimension is generated from logic, not copied from an existing source, this activity is unsuitable.

Reference:
Microsoft Learn - "Dataflows Gen2 in Microsoft Fabric" and "Pipeline activities in Data Factory" detail the transformation capabilities of Dataflows and the execution logic of the Stored Procedure activity, both of which can be used to generate and populate dimension tables.

You need to ensure the data loading activities in the AnalyticsPOC workspace are executed in the appropriate sequence. The solution must meet the technical requirements.

What should you do?

A. Create a pipeline that has dependencies between activities and schedule the pipeline.

B. Create and schedule a Spark job definition.

C. Create a dataflow that has multiple steps and schedule the dataflow.

D. Create and schedule a Spark notebook.

A.   Create a pipeline that has dependencies between activities and schedule the pipeline.

Explanation:
The core requirement is to ensure sequential execution and schedule the activities. In Microsoft Fabric, a Pipeline (similar to Azure Data Factory or Synapse Pipelines) is the dedicated tool for orchestrating and automating data movement and transformation workflows. A pipeline allows you to define a series of activities (like notebook execution, data copying, or stored procedure calls) and enforce a strict execution order using dependencies (e.g., Activity B runs only if Activity A succeeds). This ensures the data loading steps, such as ingestion followed by transformation, occur in the precise sequence required by the technical specification.

Correct Option:

A. Create a pipeline that has dependencies between activities and schedule the pipeline.
Orchestration and Sequencing: Pipelines are the dedicated component in Fabric for orchestration. Defining dependencies (Success, Failure, Completion) between activities explicitly enforces the required sequential order for your data loading steps.

Scheduling: The pipeline object itself contains scheduling properties, allowing you to define recurrence (e.g., daily, hourly) without needing external triggers.

Activity Diversity: A pipeline can orchestrate different types of activities needed for data loading, such as Copy Data, Dataflow, Notebook, or Stored Procedure activities, making it highly versatile for complex workflows.

Incorrect Option:

B. Create and schedule a Spark job definition.
A Spark job definition is used to submit a single, self-contained set of Spark commands or a compiled application to a Spark compute cluster. While it executes code, it is not designed to orchestrate a sequence of distinct activities or manage dependencies between different types of data loading tasks. It handles the logic within one step of the sequence, not the sequence itself.

C. Create a dataflow that has multiple steps and schedule the dataflow.
A Dataflow (Gen2) is used for data transformation and preparation using a low-code/no-code interface (Power Query). While a dataflow can have multiple query steps, these steps are executed as part of a single transformation job. It is designed for ETL/ELT logic, not for orchestrating the overall flow of multiple separate loading activities (like Notebooks, Copy activities, and further transformations) that must run strictly one after the other.

D. Create and schedule a Spark notebook.
A Spark notebook is used to write and execute data transformation code using languages like Python (PySpark), Scala, or SQL. A single notebook can contain a sequence of cells, but it is typically used for a single, large logical step. It is not the standard Fabric tool for orchestrating and scheduling an end-to-end workflow consisting of different types of activities with defined dependencies across multiple processes.

Reference:
Microsoft Learn: Orchestrate data movement and transformation in Microsoft Fabric with pipelines; Microsoft Learn: How to create a pipeline in Microsoft Fabric.

You need to create a DAX measure to calculate the average overall satisfaction score.
How should you complete the DAX code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.






Explanation:
The question tests the ability to write a DAX measure for a rolling 12-month average of survey satisfaction scores, specifically for the "Overall Satisfaction" question. The measure must dynamically filter the last 12 months using DATESINPERIOD, apply a filter for the relevant question, and correctly average response values. In survey scenarios with multiple responses per customer, averaging per customer first avoids bias toward customers with more submissions.

Correct Option:

AVERAGEX(VALUES('Survey'[Customer Key]), CALCULATE(AVERAGE('Survey'[Response Value])))
This option ensures a per-customer average satisfaction score before overall averaging, preventing over-weighting customers with multiple survey responses. VALUES('Survey'[Customer Key]) iterates over unique customers in the period, and the inner CALCULATE(AVERAGE) computes each customer's mean response, resulting in a fair rolling average.

Period (in the CALCULATE modifiers)
The Period variable, defined via DATESINPERIOD, correctly restricts the calculation to the trailing 12 months ending at the last date in context, enabling the rolling aspect.

'Survey Question'[Question Title] = "Overall Satisfaction" (as a filter in CALCULATE)
This filter expression ensures only responses to the "Overall Satisfaction" question are included, accurately targeting the required metric while ignoring other questions.
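
The weighting point is easier to see with numbers. A small PySpark illustration (hypothetical values, not the exam's DAX) in which customer A submits three surveys and customer B submits one:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    responses = spark.createDataFrame(
        [("A", 2), ("A", 2), ("A", 2), ("B", 5)],
        ["CustomerKey", "ResponseValue"],
    )

    # Plain average over every response: (2 + 2 + 2 + 5) / 4 = 2.75 -- customer A dominates.
    responses.agg(F.avg("ResponseValue").alias("naive_avg")).show()

    # Average of per-customer averages: (2 + 5) / 2 = 3.5 -- mirrors AVERAGEX over VALUES of the customer key.
    per_customer = responses.groupBy("CustomerKey").agg(F.avg("ResponseValue").alias("cust_avg"))
    per_customer.agg(F.avg("cust_avg").alias("avg_of_customer_avgs")).show()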

Incorrect Option:

AVERAGE('Survey'[Response Value])
This directly averages all individual responses without considering customer-level grouping, leading to bias where customers submitting more surveys disproportionately influence the result.

AVERAGEA('Survey'[Question Text])
This is invalid as 'Question Text' is likely text data; AVERAGEA would treat non-numeric values incorrectly, and it does not target response values.

AVERAGEX(VALUES('Survey'[Customer Key]))
This is incomplete and syntactically incorrect, lacking the expression to average within the iteration.

NumberOfMonths, LastCurrentDate, Period (as direct parts of Result)
These are intermediate variables and cannot be used directly in the calculation; selecting them would break the measure logic.

Reference:
Microsoft Learn: DAX time intelligence functions (DATESINPERIOD, CALCULATE modifiers); SQLBI article on Rolling 12 Months Average in DAX (sqlbi.com/articles/rolling-12-months-average-in-dax/); Exam discussion reference from ExamTopics DP-600 Topic 1 Question 6.

What should you recommend using to ingest the customer data into the data store in the AnalyticsPOC workspace?

A. a stored procedure

B. a pipeline that contains a KQL activity

C. a Spark notebook

D. a dataflow

D.   a dataflow

Explanation:
The scenario involves ingesting source data into a data store, typically a foundational ETL task. The question implies transforming and loading customer data. When the requirement centers on data transformation and loading within Fabric's analytics workspace, the primary low-code, user-friendly tool designed for this exact pattern is Dataflow Gen2. It is optimized for visually building repeatable data preparation logic.

Correct Option:

D. a dataflow:
Dataflow Gen2 is the core recommendation for ingesting and transforming data into a Fabric Lakehouse or Warehouse. It provides a visual, low-code interface using Power Query, supports a wide range of data sources, handles complex transformations, and can load data directly into tables in the workspace's data store, aligning perfectly with standard ingestion patterns.

Incorrect Option:

A. a stored procedure:
Stored procedures are for executing predefined T-SQL logic within a SQL database or Warehouse. They are not an ingestion tool for bringing external data into the platform. They operate on data already present in the store.

B. a pipeline that contains a KQL activity:
A KQL (Kusto Query Language) activity is specific to Real-Time Analytics in Fabric for querying or ingesting data into KQL databases. It is not the general-purpose tool for ingesting relational or file-based customer data into a standard Lakehouse or Warehouse data store.

C. a Spark notebook:
While a Spark notebook can technically perform ingestion and is code-based, it is not the primary recommended tool for a straightforward ETL ingestion task. Notebooks are better suited for data exploration, advanced analytics, and complex programmatic logic, whereas Dataflow is the dedicated, optimized service for managed data ingestion and transformation.

Reference:
Microsoft Learn - "Data ingestion with Dataflows Gen2 in Microsoft Fabric" establishes Dataflow as the primary service for visually ingesting, transforming, and loading data from various sources into OneLake and Fabric data stores.

You need to resolve the issue with the pricing group classification.

How should you complete the T-SQL statement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.






Explanation:
The question focuses on correcting a T-SQL statement to create a view that properly classifies products into pricing groups (low, medium, high) based on ListPrice from the dbo.Products table. The classification requires: ≤50 as 'low', >50 and ≤1000 as 'medium', >1000 as 'high'. The view ensures dynamic updates when underlying product data changes, unlike a static table.

Correct Option:

VIEW
Creating a view (CREATE VIEW) provides a virtual table that always reflects the latest data in dbo.Products, including any changes to ListPrice, ensuring real-time accurate pricing group classification without data duplication.

CASE
The CASE expression is the standard T-SQL construct for conditional logic, allowing multiple WHEN-THEN conditions to assign 'low', 'medium', or 'high' based on ListPrice ranges in a readable and efficient manner.

WHEN ListPrice > 50 AND ListPrice <= 1000 THEN 'medium'
This condition precisely captures the medium range (>50 to ≤1000), avoiding overlap with the low category (≤50). CASE evaluates conditions sequentially, so this ensures correct classification without misassigning boundary values like 50 or 1000.
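
The view itself is created with T-SQL in the warehouse, as selected above. Purely to exercise the same CASE boundaries, the Spark SQL sketch below runs the identical conditions from a notebook over a few hypothetical list prices and confirms that 50 falls in 'low' and 1,000 in 'medium':

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical products covering the boundary values 50 and 1,000.
    spark.createDataFrame(
        [(1, 49.99), (2, 50.00), (3, 50.01), (4, 1000.00), (5, 1000.01)],
        ["ProductKey", "ListPrice"],
    ).createOrReplaceTempView("Products")

    spark.sql("""
        SELECT ProductKey,
               ListPrice,
               CASE
                   WHEN ListPrice <= 50 THEN 'low'
                   WHEN ListPrice > 50 AND ListPrice <= 1000 THEN 'medium'
                   WHEN ListPrice > 1000 THEN 'high'
               END AS PricingGroup
        FROM Products
    """).show()
    # Expected: 49.99 and 50.00 -> low; 50.01 and 1000.00 -> medium; 1000.01 -> high.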

Incorrect Option:

TABLE
CREATE TABLE would materialize data statically (e.g., via CTAS), requiring manual refreshes when dbo.Products changes, leading to outdated pricing groups and failing dynamic requirements.

SELECT
SELECT alone cannot define a persistent object like a view; it only queries data temporarily and does not complete the CREATE statement syntax.

COALESCE, IIF, SET
These are not suitable for multi-condition branching: COALESCE returns the first non-NULL, IIF handles only two outcomes, and SET assigns variables, none fitting the required tiered classification logic.

WHEN ListPrice <= 50 THEN 'low'
This is part of the correct logic but incomplete alone; selecting only this ignores medium and high categories.

WHEN ListPrice BETWEEN 50 AND 1000 THEN 'medium'
Because BETWEEN is inclusive at both ends, ListPrice = 50 satisfies this condition. Unless an earlier WHEN has already assigned 'low', the boundary value 50 risks being classified as 'medium', conflicting with the requirement that prices of 50 or less are 'low'. The explicit > 50 AND <= 1000 condition removes this ambiguity.

You need to assign permissions for the data store in the AnalyticsPOC workspace. The solution must meet the security requirements.

Which additional permissions should you assign when you share the data store? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.




Explanation:
The question evaluates knowledge of Microsoft Fabric lakehouse item-level sharing permissions in a workspace. When sharing a lakehouse (the data store), additional granular permissions can be granted beyond basic access. The goal is to apply least-privilege security: enabling DataEngineers to manage Spark jobs, DataAnalysts to build Power BI reports on semantic models, and DataScientists to query via SQL endpoint without unnecessary elevated access.

Correct Option:

DataEngineers: Read All Apache Spark
This permission allows DataEngineers to read and execute Spark notebooks/jobs on the lakehouse files, supporting data ingestion, transformation, and curation tasks in Apache Spark while adhering to least privilege.

DataAnalysts: Build Reports on the default dataset
This grants DataAnalysts the ability to create and publish Power BI reports directly on the default semantic model of the lakehouse, enabling visualization and analysis without access to underlying Spark or raw files.

DataScientists: Read All SQL analytics endpoint data
This provides DataScientists read-only access to query the lakehouse data via the SQL analytics endpoint, ideal for experimentation, modeling, and analysis using T-SQL without allowing modifications or Spark access.

Incorrect Option:

DataEngineers: Build Reports on the default dataset / Read All SQL analytics endpoint data
These are insufficient or mismatched; engineers need Spark access for data management, not just reporting or SQL read, which would prevent them from performing transformation tasks.

DataAnalysts: Read All Apache Spark / Read All SQL analytics endpoint data
Spark access is unnecessary for analysts focused on reporting; SQL endpoint alone lacks the "Build Reports" capability tied to the default dataset for seamless Power BI integration.

DataScientists: Read All Apache Spark / Build Reports on the default dataset
Spark access exceeds needs for scientists who primarily query via SQL; report building is analyst-focused and not required for data science workflows.

Reference:
Microsoft Learn: Lakehouse sharing and permission management; ExamTopics DP-600 Topic 1 discussions on lakehouse permissions; Microsoft Fabric documentation on item-level permissions for lakehouses.

Which syntax should you use in a notebook to access the Research division data for Productline1?







A. Option A

B. Option B

C. Option C

D. Option D

B.   Option B

Explanation:
The question assesses knowledge of accessing external Research division data in a Microsoft Fabric notebook. The data for Productline1 is in an ADLS Gen2 account, made available via a shortcut named "ResearchProduct" in the lakehouse's Files section. Shortcuts to external storage appear under Files, and Delta-formatted data can be read using Spark's load method pointing to the shortcut path, enabling seamless access without ingestion.

Correct Option:

B. spark.read.format("delta").load("Files/ResearchProduct")
This syntax correctly loads Delta data from the external location via the shortcut in the Files section. Shortcuts to folders in external storage are placed under Files, and Spark treats the path "Files/" as the root for reading Delta tables, providing efficient access to ResearchProduct data without duplicating it in the lakehouse.
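
Assuming Lakehouse1 is attached as the notebook's default lakehouse, the correct option runs directly; `spark` is pre-created in the Fabric notebook session:

    # Read the Delta data exposed by the "ResearchProduct" shortcut under the Files section.
    df = spark.read.format("delta").load("Files/ResearchProduct")
    df.printSchema()
    df.show(10)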

Incorrect Option:

A. spark.sql("SELECT * FROM Lakehouse1.ResearchProduct")
This assumes ResearchProduct is a managed table in the Tables section with lakehouse-qualified naming, but as a Files shortcut to external Delta data, it is not registered in the metastore as a table, causing a table not found error.

C. spark.sql("SELECT * FROM Lakehouse1.productline1.ResearchProduct")
This uses incorrect naming (productline1 as schema) and assumes a managed table structure; shortcuts do not create schemas or tables in the catalog, making this syntax invalid for accessing external shortcut data.

D. external_table('Tables/ResearchProduct')
This appears to be a non-existent or invalid function in Fabric Spark notebooks; standard Delta access uses spark.read.format("delta").load or table registration, and Tables is reserved for managed tables, not shortcuts.

Reference:
Microsoft Learn: OneLake shortcuts; Fabric Lakehouse explorer behavior for shortcuts; DP-600 exam discussions on accessing external data via shortcuts in notebooks.


Are You Truly Prepared?

Don't risk your exam fee on uncertainty. Take this definitive practice test to validate your readiness for the Microsoft DP-600 exam.