Google Cloud Certified – Professional Data Engineer – Practice Exam (Question 41)
Question 1
You work for a shipping company that has distribution centers where packages move on delivery lines to route them properly.
The company wants to add cameras to the delivery lines to detect and track any visual damage to the packages in transit. You need to create a way to automate the detection of damaged packages and flag them for human review in real time while the packages are in transit.
Which solution should you choose?
- A. Use BigQuery ML to train the model at scale, so you can analyze the packages in batches.
- B. Train an AutoML model on your corpus of images, and build an API around that model to integrate with the package tracking applications.
- C. Use the Google Cloud Vision API to detect damage, and raise an alert through Google Cloud Functions. Integrate the package tracking applications with this function.
- D. Use TensorFlow to create a model that is trained on your corpus of images. Create a Python notebook in Google Cloud Datalab that uses this model to analyze packages for damage.
Correct Answer: B — BigQuery ML cannot train on image data, and batch analysis does not meet the real-time requirement, while the pretrained Vision API cannot recognize your company's specific damage patterns. Training an AutoML model on your own image corpus and serving it behind an API enables real-time detection integrated with the package tracking applications.
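The integration that option B describes can be sketched as follows. This is a minimal, hypothetical post-processing step around a damage-classification model: the call to the deployed AutoML prediction endpoint is elided (it requires the `google-cloud-automl` client and credentials), and the predictions are passed in as `(label, score)` pairs. The label `"damaged"` and the 0.7 cutoff are assumptions, not values from the question.

```python
# Hypothetical glue logic between a deployed damage-classification model
# and the package tracking application (option B). In production, the
# (label, score) pairs would come from the AutoML prediction endpoint.

DAMAGE_THRESHOLD = 0.7  # assumed confidence cutoff for human review


def flag_for_review(predictions, threshold=DAMAGE_THRESHOLD):
    """Return True if the model reports 'damaged' at or above the threshold.

    predictions: list of (label, score) tuples, e.g. [("damaged", 0.92)].
    """
    return any(
        label == "damaged" and score >= threshold
        for label, score in predictions
    )


print(flag_for_review([("damaged", 0.92), ("intact", 0.08)]))  # True
print(flag_for_review([("intact", 0.97), ("damaged", 0.03)]))  # False
```

A package flagged `True` would be routed to a human-review queue by the tracking application while still in transit.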
Question 2
You work for a shipping company that uses handheld scanners to read shipping labels.
Your company has strict data privacy standards that prohibit transmitting recipients’ personally identifiable information (PII) to analytics systems, but the scanners currently transmit it, which violates these rules. You want to quickly build a scalable solution using cloud-native managed services to prevent exposure of PII to the analytics systems.
What should you do?
- A. Create an authorized view in Google BigQuery to restrict access to tables with sensitive data.
- B. Install a third-party data validation tool on Google Compute Engine virtual machines to check the incoming data for sensitive information.
- C. Use Stackdriver Logging to analyze the data passing through the entire pipeline to identify transactions that may contain sensitive information.
- D. Build a Cloud Function that reads the messages from the topics and calls the Google Cloud Data Loss Prevention (DLP) API. Use the tagging and confidence levels to either pass the data along or quarantine it in a bucket for review.
Correct Answer: D — the Cloud DLP API is the managed, cloud-native service for detecting PII, and a Cloud Function subscribed to the topics can inspect each message and pass or quarantine it before it reaches the analytics systems. An authorized view (A) only restricts access after the PII has already landed in BigQuery, and options B and C are neither managed nor preventive.
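A minimal sketch of the pass-or-quarantine routing that option D describes. The DLP inspection call itself is elided (it requires the `google-cloud-dlp` client and credentials); here the findings are stood in as `(info_type, likelihood)` pairs using DLP's likelihood scale, and `route_message` is a hypothetical helper, not part of any Google API.

```python
# Sketch of the routing decision inside the Cloud Function (option D).
# In production, `findings` would be extracted from the DLP API's
# InspectContent response; here they are plain tuples (assumption).

# Cloud DLP reports match likelihood on an ordered scale.
LIKELIHOOD_ORDER = [
    "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]


def route_message(findings, threshold="LIKELY"):
    """Return 'quarantine' if any finding meets the threshold, else 'pass'.

    findings: list of (info_type, likelihood) tuples, e.g.
              [("EMAIL_ADDRESS", "VERY_LIKELY")].
    """
    cutoff = LIKELIHOOD_ORDER.index(threshold)
    for _info_type, likelihood in findings:
        if LIKELIHOOD_ORDER.index(likelihood) >= cutoff:
            return "quarantine"
    return "pass"


print(route_message([("PERSON_NAME", "POSSIBLE")]))       # pass
print(route_message([("EMAIL_ADDRESS", "VERY_LIKELY")]))  # quarantine
```

Messages routed to `"quarantine"` would be written to a Cloud Storage bucket for human review; the rest flow on to the analytics systems.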
Question 3
You work for an advertising company, and you’ve developed a Spark ML model to predict click-through rates at advertisement blocks.
You’ve been developing everything at your on-premises data center, and now your company is migrating to Google Cloud. Your data center will be closing soon, so a rapid lift-and-shift migration is necessary. However, the data you’ve been using will be migrated to Google BigQuery. You periodically retrain your Spark ML models, so you need to migrate existing training pipelines to Google Cloud.
What should you do?
- A. Use Cloud ML Engine for training existing Spark ML models.
- B. Rewrite your models in TensorFlow, and start using Cloud ML Engine.
- C. Use Google Cloud Dataproc for training existing Spark ML models, but start reading data directly from Google BigQuery.
- D. Spin up a Spark cluster on Google Compute Engine, and train Spark ML models on the data exported from Google BigQuery.
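The lift-and-shift that option C describes can be sketched with two `gcloud` commands. The cluster name, region, bucket, and script name below are hypothetical; the connector jar path is the public one Google hosts in `gs://spark-lib`. Inside the unmodified PySpark script, the training data would be read with `spark.read.format("bigquery")` instead of the old on-premises source.

```shell
# Create a Dataproc cluster to host the existing Spark ML training jobs
# (cluster name, region, and worker count are assumptions).
gcloud dataproc clusters create ctr-training-cluster \
    --region=us-central1 \
    --num-workers=4

# Submit the existing Spark ML training job; the spark-bigquery connector
# jar lets it read training data directly from BigQuery.
gcloud dataproc jobs submit pyspark gs://my-gcs-bucket/train_ctr_model.py \
    --cluster=ctr-training-cluster \
    --region=us-central1 \
    --jars=gs://spark-lib/bigquery/spark-bigquery-latest_2.12.jar
```

This keeps the Spark ML code unchanged (meeting the rapid-migration constraint) while reading directly from BigQuery, which is why managed Dataproc is preferable to rewriting in TensorFlow or self-managing Spark on Compute Engine.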