How to Pass the Salesforce Data Architecture and Management Designer Exam

In this post we will talk about how to pass the Salesforce Data Architecture and Management Designer exam. A data architect is responsible for finding solutions to data management problems. Large amounts of data can become a big problem if proper decisions are not made about how that data is modeled and managed. In this blog we will share some important pointers and tips for the Data Architecture and Management Designer exam.

This post will help you to evaluate your readiness to successfully complete the Salesforce Certified Data Architecture and Management Designer certification exam.

What is the Salesforce Data Architect Exam?

A Salesforce Certified Data Architecture and Management Designer assesses the architecture environment and requirements and designs sound, scalable, and performant solutions on the Lightning Platform as it pertains to enterprise data management.

Who is the Ideal Candidate for the Data Architect Exam?

The candidate has experience assessing customers’ requirements with regard to data quality and creating solutions that ensure high-quality data (e.g., no duplicates, correct data), and can recommend organizational changes to ensure proper data stewardship. The candidate also has experience communicating solutions and design trade-offs to business stakeholders.

The Salesforce Certified Data Architecture and Management Designer has the following background:

  • One to two years in Salesforce technology
  • Five to eight years of experience supporting or implementing data-centric initiatives

Salesforce Certified Data Architect Exam Outline

The Salesforce Certified Data Architecture and Management Designer exam has the following characteristics:

  • Content: 60 multiple-choice/multiple-select questions
  • Time allotted to complete the exam: 105 minutes (time allows for unscored questions)
  • Passing Score: 58%
  • Registration fee: USD 400, plus applicable taxes as required per local law
  • Prerequisite: None

Always check the official Salesforce exam guide for the latest information.

Salesforce Certified Data Architect Exam Key Topics

Data Modeling / Database Design: 25%

  • Compare and contrast various techniques and considerations for designing a data model for the Customer 360 platform. (e.g. objects, fields & relationships, object features).
  • Given a scenario, recommend approaches and techniques to design a scalable data model that obeys the current security and sharing model.
  • Compare and contrast various techniques, approaches and considerations for capturing and managing business and technical metadata (e.g. business dictionary, data lineage, taxonomy, data classification).
  • Compare and contrast the different reasons for implementing Big Objects vs Standard/Custom objects within a production instance, alongside the unique pros and cons of utilizing Big Objects in a Salesforce data model.
  • Given a customer scenario, recommend approaches and techniques to avoid data skew (record locking, sharing calculation issues, and excessive child to parent relationships).

Master Data Management: 5%

  • Compare and contrast the various techniques, approaches and considerations for implementing Master Data Management Solutions (e.g. MDM implementation styles, harmonizing & consolidating data from multiple sources, establishing data survivorship rules, thresholds & weights, leveraging external reference data for enrichment, Canonical modeling techniques, hierarchy management.)
  • Given a customer scenario, recommend and use techniques for establishing a “golden record” or “system of truth” for the customer domain in a Single Org
  • Given a customer scenario, recommend approaches and techniques for consolidating data attributes from multiple sources. Discuss criteria and methodology for picking the winning attributes.
  • Given a customer scenario, recommend appropriate approaches and techniques to capture and maintain customer reference & metadata to preserve traceability and establish a common context for business rules

Salesforce Data Management: 25%

  • Given a customer scenario, recommend appropriate combination of Salesforce license types to effectively leverage standard and custom objects to meet business needs.
  • Given a customer scenario, recommend techniques to ensure data is persisted in a consistent manner
  • Given a scenario with multiple systems of interaction, describe techniques to represent a single view of the customer on the Salesforce platform.
  • Given a customer scenario, recommend a design to effectively consolidate and/or leverage data from multiple Salesforce instances

Data Governance: 10%

  • Given a customer scenario, recommend an approach for designing a GDPR compliant data model. Discuss the various options to identify, classify and protect personal and sensitive information. 
  • Compare and contrast various approaches and considerations for designing and implementing an enterprise data governance program.

Large Data Volume considerations: 20%

  • Given a customer scenario, design a data model that scales considering large data volume and solution performance.
  • Given a customer scenario, recommend a data archiving and purging plan that is optimal for customer’s data storage management needs.
  • Given a customer scenario, decide when to use virtualised data and describe virtualised data options.

Data Migration: 15%

  • Given a customer scenario, recommend appropriate techniques and methods for ensuring high data quality at load time. 
  • Compare and contrast various techniques for improving performance when migrating large data volumes into Salesforce.
  • Compare and contrast various techniques and considerations for exporting data from Salesforce

Learning Materials for the Salesforce Data Architecture and Management Designer Exam Key Topics

Let's start with the learning material for the exam's key topics.

1. Data Modeling / Database Design

Ownership Skew

Ownership skew occurs when a single user owns more than 10,000 records of a single object.

Why does this cause problems?

  1. Share table calculations: When you move a user in the role hierarchy, sharing calculations must run over large volumes of data to grant and revoke access.
  2. Role hierarchy changes: Moving a user around the hierarchy causes the sharing rules to be recalculated both for that user and for every user above them in the role hierarchy.

How can we avoid this?

  1. Data migration: Work with the client to divide records across multiple real end users.
  2. Integration user: Avoid making the integration user the record owner.
  3. Leverage Lead and Case assignment rules.
  4. If unavoidable: Assign the records to a user in an isolated role at the top of the role hierarchy.
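
Before deciding how to redistribute records, it helps to know which owners are already over the limit. The following is a minimal sketch, assuming you have exported the OwnerId column for one object into a Python list; the function name and the sample IDs are hypothetical and only illustrate the 10,000-record rule of thumb described above.

```python
from collections import Counter

SKEW_THRESHOLD = 10_000  # per-owner record count used in this post


def find_skewed_owners(owner_ids, threshold=SKEW_THRESHOLD):
    """Return {owner_id: record_count} for owners with more than `threshold` records.

    `owner_ids` is any iterable holding one OwnerId value per record,
    for example the OwnerId column of an export for a single object.
    """
    counts = Counter(owner_ids)
    return {owner: n for owner, n in counts.items() if n > threshold}


if __name__ == "__main__":
    # Hypothetical export in which an integration user owns most records.
    export = ["005INTEGRATION"] * 12_000 + ["005ALICE"] * 300 + ["005BOB"] * 450
    print(find_skewed_owners(export))
```

Any owner this reports is a candidate for the mitigation steps listed above.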

Parenting Skew

Parenting skew occurs when more than 10,000 records of a single object sit underneath the same parent record.

Why does this cause problems?

  1. Data migration: The Bulk API batch size is 10,000 records. Records in parallel batches that are linked to the same parent force that parent to be updated, which can cause record locking.
  2. Implicit sharing: Access to a parent record can be driven by access to its children. If a user loses access to a child record, Salesforce must check every other child record to determine whether the user keeps access to the parent.

How can we avoid this?

  1. Avoid having more than 10,000 records of a single object linked to the same parent record.
  2. When you have free-hanging contacts that need to be associated with accounts, distribute them across multiple accounts.
  3. Use a picklist field: When there would be only a small number of lookup records, use a picklist field instead of a lookup.
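
The second point, distributing free-hanging contacts across multiple accounts, can be sketched as a simple round-robin assignment. This is only an illustration under stated assumptions: the bucket account IDs are hypothetical placeholders that you would create yourself, and the function name is not a Salesforce API.

```python
import math

MAX_CHILDREN_PER_PARENT = 10_000  # the parenting-skew threshold from this post


def assign_to_bucket_accounts(contact_ids, bucket_account_ids,
                              limit=MAX_CHILDREN_PER_PARENT):
    """Spread contacts round-robin over bucket accounts.

    Raises ValueError when there are too few bucket accounts to keep every
    account at or below `limit` children.
    """
    contact_ids = list(contact_ids)
    needed = math.ceil(len(contact_ids) / limit)
    if needed > len(bucket_account_ids):
        raise ValueError(
            f"need at least {needed} bucket accounts, got {len(bucket_account_ids)}"
        )
    assignment = {}
    for i, contact_id in enumerate(contact_ids):
        # Round-robin keeps the accounts evenly filled.
        assignment[contact_id] = bucket_account_ids[i % len(bucket_account_ids)]
    return assignment
```

Round-robin is used here because it keeps every bucket evenly filled, so no single account approaches the limit before the others.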

2. Large Data Volume considerations

Let's start with large volumes of data. Data is one of the key elements of any application. Users constantly create data. All day long. Every day. So. Much. Data. Suddenly your org has accumulated millions of records, thousands of users, and several gigabytes of data storage.

These large data volumes (LDV) can lead to sluggish performance, including slower queries, slower search and list views, and slower sandbox refreshing. You can avoid this predicament if you plan for accommodating LDV up front, designing your data model to build scalability in from the get-go.

1. Avoid data skew

A key to managing large data volumes for peak performance is carefully architecting record ownership to avoid data skew. Data skew happens when more than 10,000 child records are associated with the same parent record within an org. There are several types of data skew: ownership skew, lookup skew, and account data skew.

2. Use external objects

Another strategy for LDV is using external objects, which means there’s no need to bring the data into Salesforce. With a data-tiering strategy that spreads data across multiple objects and brings it in on demand from another object or external store, you avoid both storing large amounts of data in your org and the performance issues associated with LDV.

3. Create efficient queries

Create efficient queries that take advantage of indexed fields. The Salesforce query optimizer can only use an index when the filter is selective, so the less data your query returns, the better. Use indexed fields in the WHERE clause of the query; you can ask the Salesforce Support team to create custom indexes. Avoid queries that require a full table scan, for example:

  • Querying for null rows: Queries that look for records in which the field is empty or null. For example: SELECT Id, Name FROM Account WHERE Custom_Field__c = null
  • Negative filter operators: Using operators such as !=, NOT LIKE, or EXCLUDES in your queries. For example: SELECT CaseNumber FROM Case WHERE Status != 'New'
  • Leading wildcards: Queries that use a leading wildcard, such as this: SELECT Id, LastName, FirstName FROM Contact WHERE LastName LIKE '%smi%'
  • Text fields with comparison operators: Using comparison operators, such as >, <, >=, or <=, with text-based fields. For example: SELECT AccountId, Amount FROM Opportunity WHERE Order_Number__c > 10

The Query Plan tool in the Developer Console shows the cost of a query and can suggest indexes.
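
To make the patterns above easier to spot during a code review, here is a rough sketch of a text-based check. It is an assumption-laden heuristic, not a SOQL parser and not a Salesforce tool: the function name is hypothetical, it only inspects the WHERE clause with regular expressions, and it cannot detect comparison operators on text fields because that requires knowing the field types.

```python
import re

# Each entry pairs a label with a regex for one pattern from the list above.
ANTI_PATTERNS = [
    ("null filter", re.compile(r"(?:=|!=)\s*null\b", re.IGNORECASE)),
    ("negative operator",
     re.compile(r"!=|\bNOT\s+LIKE\b|\bEXCLUDES\b", re.IGNORECASE)),
    ("leading wildcard", re.compile(r"\bLIKE\s+'%", re.IGNORECASE)),
]


def flag_full_scan_risks(soql):
    """Return labels of the risky patterns found in the query's WHERE clause."""
    parts = re.split(r"\bWHERE\b", soql, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) < 2:
        # No filter at all means every row is read.
        return ["no WHERE clause"]
    clause = parts[1]
    return [label for label, pattern in ANTI_PATTERNS if pattern.search(clause)]


if __name__ == "__main__":
    for query in (
        "SELECT Id, Name FROM Account WHERE Custom_Field__c = null",
        "SELECT CaseNumber FROM Case WHERE Status != 'New'",
        "SELECT Id FROM Contact WHERE LastName LIKE '%smi%'",
    ):
        print(flag_full_scan_risks(query), "<-", query)
```

A flagged query is only a candidate for review; the query plan remains the authoritative check on whether an index is actually used.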

4. Use batch Apex to query data

In general, the best way to query and process large data sets in the Force.com platform is to do it asynchronously in batches. You can query and process up to 50 million records using Batch Apex.

5. Use skinny tables

Use skinny tables if performance is still not good enough after adding custom indexes. A skinny table is a custom table in the Force.com platform that contains a subset of fields from a standard or custom base Salesforce object. Force.com can have multiple skinny tables if needed; it maintains them and keeps them completely transparent to you.

What are skinny tables? What makes them fast?

  • They avoid resource-intensive joins.
  • They are kept in sync with their source tables when the source tables are modified.
  • They do not include soft-deleted records.

Skinny tables help improve report and query performance in the following ways:

  • They combine frequently used standard and custom fields of one object in a single table, which avoids joins. Note that a skinny table cannot contain fields from other objects.
  • They are kept in sync with changes to data in the source tables.

6. Use PK Chunking

PK Chunking is a supported feature of the Salesforce Bulk API. Now you can get the performance benefits of PK Chunking without doing all the work of splitting the queries into manageable chunks. You can simply enter a few parameters on your Bulk API job, and the platform will automatically split the query into separate chunks, execute a query for each chunk and return the data.

Primary Key Chunking

PK Chunking is a very important topic for this exam. It is used to extract large data sets from Salesforce by splitting a query into manageable chunks.

Some of the larger enterprise customers have recently been using a strategy we call PK Chunking to handle large data set extracts. PK stands for Primary Key — the object’s record ID — which is always indexed. With this method, customers first query the target table to identify a number of chunks of records with sequential IDs. They then submit separate queries to extract the data in each chunk, and finally combine the results.

Since the Spring ’15 release, PK Chunking has been available natively in Salesforce. You configure it by adding a few parameters to your Bulk API job, and the platform automatically splits the query into separate chunks, executes a query for each chunk, and returns the data.

Links to learn more about Primary Key Chunking
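
The manual strategy described above (find chunk boundaries, then run one range query per chunk) can be sketched as follows. Treat this as an illustration only: the helper names are hypothetical, the sample IDs are shortened placeholders, and the sketch assumes the record IDs are already sorted in the order Salesforce returns them with ORDER BY Id. The last helper builds the Sforce-Enable-PKChunking request header that the Bulk API documentation linked in this post describes; check that documentation for the chunk sizes it allows.

```python
def chunk_boundaries(sorted_ids, chunk_size):
    """Return the first ID of every chunk, given IDs sorted ascending."""
    return [sorted_ids[i] for i in range(0, len(sorted_ids), chunk_size)]


def build_chunk_queries(base_query, boundaries):
    """Build one range-filtered query per chunk.

    Each query covers IDs from its boundary up to (not including) the next
    boundary; the last query is open-ended. `base_query` must not already
    contain a WHERE clause.
    """
    queries = []
    for i, start in enumerate(boundaries):
        condition = f"Id >= '{start}'"
        if i + 1 < len(boundaries):
            condition += f" AND Id < '{boundaries[i + 1]}'"
        queries.append(f"{base_query} WHERE {condition}")
    return queries


def pk_chunking_header(chunk_size):
    """Header that asks the Bulk API to do the chunking on the server side."""
    return {"Sforce-Enable-PKChunking": f"chunkSize={chunk_size}"}


if __name__ == "__main__":
    ids = ["001A", "001B", "001C", "001D", "001E"]  # placeholder IDs
    for q in build_chunk_queries("SELECT Id, Name FROM Account",
                                 chunk_boundaries(ids, 2)):
        print(q)
    print(pk_chunking_header(50000))
```

With the built-in feature only the header is needed; the first two helpers show what the platform does on your behalf.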

  • https://developer.salesforce.com/blogs/engineering/2015/03/use-pk-chunking-extract-large-data-sets-salesforce.html
  • https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm

7. Understand report performance parameters

The performance of a report depends on the following:

  • The number of joins used in the report query
  • The number of records returned by the report query
  • The filters used in the report, which should be on indexed fields as far as possible

Where required, use the newer reports based on Analytics Cloud.

Check out our Large Data Volume session recording. This session is highly recommended for those who are studying for the Data Architecture & Management Designer certification exam and those on the #journey2cta.


Best Practices When Importing Large Amounts of Data

  • The defer sharing calculations feature lets you suspend sharing rule calculations while migrating large volumes of data. Consider deferring sharing calculations before performing massive updates to sharing rules, then resume and recalculate afterwards. When sharing is recalculated, Salesforce also runs all Apex sharing recalculations.
  • Remove all duplicates before importing data.
  • Use the right strategy: Bulk API 1.0 or Bulk API 2.0.
  • Deleting data: The Salesforce data deletion mechanism can have a profound effect on the performance of large data volumes. Salesforce uses a Recycle Bin metaphor for data that users delete. Instead of removing the data, Salesforce flags the data as deleted and makes it visible through the Recycle Bin. This process is called soft deletion. While the data is soft deleted, it still affects database performance because the data is still resident, and deleted records have to be excluded from any queries.

3. Bulk API

Bulk API is based on REST principles and is optimized for working with large sets of data. You can use it to insert, update, upsert, or delete many records asynchronously, meaning that you submit a request and come back for the results later while Salesforce processes the request in the background. Enabling the Bulk API in Data Loader allows you to load or delete a large number of records faster than with the default SOAP-based API.

Difference between Bulk API 1.0 and Bulk API 2.0

Understanding the differences between Bulk API 1.0 and Bulk API 2.0 is very important.

Bulk API 1.0                              | Bulk API 2.0
Supports Create, Update, Delete and Query | Supports Create, Update and Delete
Must prepare in batches                   | No concept of batches
Built on a custom REST framework          | Built on a standard REST framework
Supports serial and parallel processing   | Supports parallel processing
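
Because Bulk API 1.0 expects the client to prepare the batches, a loader has to slice the data itself. The sketch below shows one way to do that, assuming each record is a Python dict with a hypothetical AccountId parent field. Sorting by parent first is a suggestion rather than something the API requires: it keeps children of the same parent in adjacent rows, so fewer parallel batches touch the same parent record, which relates to the record-locking problem described under parenting skew.

```python
BULK_V1_MAX_BATCH = 10_000  # batch size cited earlier in this post


def make_batches(records, parent_key="AccountId", batch_size=BULK_V1_MAX_BATCH):
    """Sort records by parent ID, then slice them into batches.

    Records without a parent sort first (empty string).
    """
    ordered = sorted(records, key=lambda r: r.get(parent_key) or "")
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


if __name__ == "__main__":
    # Hypothetical contacts alternating between two parent accounts.
    contacts = [
        {"LastName": f"Contact {i}", "AccountId": "001A" if i % 2 else "001B"}
        for i in range(6)
    ]
    for batch in make_batches(contacts, batch_size=3):
        print([c["AccountId"] for c in batch])
```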

4. Data quality

Causes of bad data:

  • Missing Records
  • Duplicate Records
  • No Data Standards
  • Incomplete Records
  • Stale Data
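
Several of these causes can be measured directly on an export. The following is a minimal sketch, assuming each record is a Python dict whose LastModifiedDate is a datetime.date; the function name, the choice of key field for duplicate detection, and the staleness cutoff are all hypothetical and would be set by your own data standards.

```python
from datetime import date


def assess_quality(records, required_fields, key_field, stale_before):
    """Count duplicate, incomplete, and stale records.

    A record is a duplicate when its normalized `key_field` value was already
    seen, incomplete when any required field is empty or missing, and stale
    when it was last modified before `stale_before`.
    """
    seen = set()
    report = {"duplicate": 0, "incomplete": 0, "stale": 0}
    for rec in records:
        key = (rec.get(key_field) or "").strip().lower()
        if key:
            if key in seen:
                report["duplicate"] += 1
            seen.add(key)
        if any(not rec.get(field) for field in required_fields):
            report["incomplete"] += 1
        if rec["LastModifiedDate"] < stale_before:
            report["stale"] += 1
    return report


if __name__ == "__main__":
    contacts = [
        {"Email": "a@example.com", "Phone": "1", "LastModifiedDate": date(2024, 1, 1)},
        {"Email": "A@example.com ", "Phone": "", "LastModifiedDate": date(2020, 1, 1)},
    ]
    print(assess_quality(contacts, ["Email", "Phone"], "Email", date(2023, 1, 1)))
```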

Inaccurate or incomplete data can lead to 20% stalled productivity, which is one day of work each week. The average company loses 12% of its revenue as a result of inaccurate data. Forty percent of all business initiatives fail to achieve their targeted benefits because of poor data quality.

To assess the data quality of your org, you can use AppExchange apps such as the Data Quality Analysis Dashboards app.

Measures to ensure good data quality

Workflow Rules

Workflow rules are the magic wand in your Salesforce implementation act. Workflow rules let you automate standard internal procedures and processes to save time across your company. You set up workflow rules so that leads are routed to the nearest rep. You do the same to assign service requests, too. Now Gelato’s reps can focus their time on growing business—not assigning records.

Page Layouts

Some records have a zillion fields that you know your reps aren’t using. Ditch ’em! That’s right, you remove them from the page layout for your reps. In fact, you create customized page layouts for different kinds of reps and managers across Gelato, to give them the fields they need when they need them. While you’re at it, you put the most important, required fields at the top.

Dashboards

Why make your reps and managers wade through the swamp of reports and records? Instead, create simple dashboards to support business objectives. For Gelato, you create a series of dashboards for managers across Gelato to show things like lead assignment and missing campaign data.

Data Enrichment Tools

Data is obsolete almost as soon as it’s entered. That’s why it’s important to regularly match your data against a trusted source. A number of products in Data Apps on AppExchange help you with this task.

Duplicate Management

Duplicate records are the bane of any rep’s existence! Which record is the right record? You make sure there’s one account record for each Gelato customer. Then you use Duplicate Management, Salesforce’s built-in duplicate management tools, to prevent duplicates from now on.

Custom Field Types

You know the format your company wants to use for dates and currency, so you employ field types on custom fields. You make sure to assign all custom date fields to Type = Date and all custom currency fields to Type = Currency. For fields that have a standard list of values, you use Type = Picklist. And, speaking of picklists, you set up State and Country Picklists. That way, your reps enter addresses by choosing from a standardized list of states and countries.

Install and configure Data.com Clean to monitor

Data.com Clean compares your account, contact, and lead records with records from Data.com and creates a link between your records and matching Data.com records. Clean also provides clean status information for accounts, contacts, and leads.

The basic differences between lookup and master-detail relationships, and between custom settings and custom metadata, may be asked about in the exam. We are not covering these topics in this blog, as they are easy to find on the internet.

5. Data Archiving Strategy

Data archiving is the practice of moving data that’s no longer being used to a separate storage device. A data backup expert and senior consultant with Long View Systems Inc. defines a data archive as “a single or a collection of historical records specifically selected for long-term retention and future reference.” In addition, data archives consist of older data that is still important and necessary for future reference, as well as data that must be retained for regulatory compliance. Data archives are also indexed and have search capabilities so that files and parts of files can be easily located and retrieved.

Salesforce data archiving strategies available in the market today

On the Salesforce platform: Let's see which patterns are available for data archiving and backup on the platform.

  • Pattern 1: Custom Storage Objects
  • Pattern 2: Salesforce Big Object

Off the Salesforce platform: Let's see which patterns are available for data archiving and backup off the platform.

  • Pattern 3: On Prem DataStore
  • Pattern 4: 3rd Party Vendor product
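
Whichever pattern you choose, the first step is deciding which records are no longer in active use. Here is a minimal sketch of that selection step, assuming each record is a dict with a hypothetical ClosedDate field and that the retention period is a business decision you supply; moving the selected records to the archive store and purging them from Salesforce are outside this sketch.

```python
from datetime import date, timedelta


def split_for_archive(records, today, retention_days):
    """Partition records into (keep, archive) by the age of ClosedDate.

    Records closed more than `retention_days` before `today` go to the
    archive list; everything else stays.
    """
    cutoff = today - timedelta(days=retention_days)
    keep, archive = [], []
    for rec in records:
        (archive if rec["ClosedDate"] < cutoff else keep).append(rec)
    return keep, archive


if __name__ == "__main__":
    cases = [
        {"CaseNumber": "0001", "ClosedDate": date(2019, 5, 1)},
        {"CaseNumber": "0002", "ClosedDate": date(2024, 5, 1)},
    ]
    keep, archive = split_for_archive(cases, today=date(2024, 6, 1), retention_days=730)
    print("keep:", [c["CaseNumber"] for c in keep])
    print("archive:", [c["CaseNumber"] for c in archive])
```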

Amit Chaudhary

Amit Chaudhary is a Salesforce Application & System Architect and has been working on the Salesforce Platform since 2010. He has been a Salesforce MVP since 2017 and holds 17 Salesforce certifications.

He is an active blogger and the founder of Apex Hours.
