A data engineer is configuring an AWS Glue job to read data from an Amazon S3 bucket. The data engineer has set up the necessary AWS Glue connection details and an associated IAM role. However, when the data engineer attempts to run the AWS Glue job, the data engineer receives an error message that indicates that there are problems with the Amazon S3 VPC gateway endpoint. The data engineer must resolve the error and connect the AWS Glue job to the S3 bucket. Which solution will meet this requirement?
[ ] A. Update the AWS Glue security group to allow inbound traffic from the Amazon S3 VPC gateway endpoint.
[ ] B. Configure an S3 bucket policy to explicitly grant the AWS Glue job permissions to access the S3 bucket.
[ ] C. Review the AWS Glue job code to ensure that the AWS Glue connection details include a fully qualified domain name.
[x] D. Verify that the VPC's route table includes inbound and outbound routes for the Amazon S3 VPC gateway endpoint.
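Description
A data engineer is configuring an AWS Glue job to read data stored in S3.
The AWS Glue connection and IAM role are already set up, but running the Glue job fails with an error saying "there are problems with the Amazon S3 VPC gateway endpoint".
Analysis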
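First, you need to know three terms:
AWS Glue (a serverless service that makes it easier for data scientists to consolidate and integrate data)
IAM role (an IAM role is used to configure permissions for operating on AWS services). In this question's scenario, the goal is to allow the Glue service to access the data in the S3 bucket, so a role must be configured that grants Glue permission to touch the S3 bucket.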
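To make the route table check in option D concrete, here is a minimal sketch, assuming a route table dict in the shape returned by the EC2 `DescribeRouteTables` API. A gateway endpoint route normally appears with a prefix list (`pl-...`) as its destination and the endpoint (`vpce-...`) as its target. The helper name `has_gateway_endpoint_route` and all sample IDs are hypothetical, and the check does not distinguish an S3 prefix list from a DynamoDB one.

```python
def has_gateway_endpoint_route(route_table):
    """Return True if the route table holds an active gateway endpoint route.

    Assumption: `route_table` follows the EC2 DescribeRouteTables shape,
    where a gateway endpoint route has a 'pl-' destination prefix list
    and a 'vpce-' gateway target.
    """
    for route in route_table.get("Routes", []):
        destination = route.get("DestinationPrefixListId", "")
        target = route.get("GatewayId", "")
        if (
            destination.startswith("pl-")
            and target.startswith("vpce-")
            and route.get("State") == "active"
        ):
            return True
    return False


# Hypothetical sample data, not real resource IDs.
LOCAL_ROUTE = {
    "DestinationCidrBlock": "10.0.0.0/16",
    "GatewayId": "local",
    "State": "active",
}
ENDPOINT_ROUTE = {
    "DestinationPrefixListId": "pl-0example",
    "GatewayId": "vpce-0example",
    "State": "active",
}

table_with_endpoint = {
    "RouteTableId": "rtb-0example",
    "Routes": [LOCAL_ROUTE, ENDPOINT_ROUTE],
}
table_without_endpoint = {
    "RouteTableId": "rtb-1example",
    "Routes": [LOCAL_ROUTE],
}
```

In practice the dicts would come from the `RouteTables` list of a `describe_route_tables` call for the subnet that the Glue connection uses. A table that fails this check is a likely cause of the error in the question.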
A retail company has a customer data hub in an Amazon S3 bucket. Employees from many countries use the data hub to support company-wide analytics. A governance team must ensure that the company's data analysts can access data only for customers who are within the same country as the analysts. Which solution will meet these requirements with the LEAST operational effort?
[ ] A. Create a separate table for each country's customer data. Provide access to each analyst based on the country that the analyst serves.
[x] B. Register the S3 bucket as a data lake location in AWS Lake Formation. Use the Lake Formation row-level security features to enforce the company's access policies.
[ ] C. Move the data to AWS Regions that are close to the countries where the customers are. Provide access to each analyst based on the country that the analyst serves.
[ ] D. Load the data into Amazon Redshift. Create a view for each country. Create separate IAM roles for each country to provide access to data from each country. Assign the appropriate roles to the analysts.
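The row-level security in option B is usually expressed in Lake Formation as a data cells filter with a row filter expression. The following is a minimal sketch, assuming the customer table has a `country` column holding two-letter codes. The helper name `build_country_filter` and the catalog, database, and table names are hypothetical. The returned dict is intended as the `TableData` argument of the boto3 `lakeformation` client's `create_data_cells_filter` call.

```python
def build_country_filter(catalog_id, database, table, country_code):
    """Build a Lake Formation data cells filter limited to one country.

    Assumption: the table has a `country` column with two-letter codes.
    """
    # Accept only two ASCII letters so the value is safe to embed in the expression.
    if not (
        len(country_code) == 2
        and country_code.isascii()
        and country_code.isalpha()
    ):
        raise ValueError(f"unexpected country code: {country_code!r}")

    code = country_code.upper()
    return {
        "TableCatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Name": f"customers_{code.lower()}",
        # Only rows matching this expression are visible through the filter.
        "RowFilter": {"FilterExpression": f"country = '{code}'"},
        # An empty wildcard keeps every column visible.
        "ColumnWildcard": {},
    }


# Hypothetical identifiers for illustration only.
german_filter = build_country_filter("111122223333", "data_hub", "customers", "de")
```

With boto3 installed and credentials configured, the dict could be passed as `create_data_cells_filter(TableData=german_filter)`, after which `SELECT` would be granted on the filter to the analysts for that country. One filter per country replaces the per-country tables, Regions, or views of the other options.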
A media company wants to improve a system that recommends media content to customers based on user behavior and preferences. To improve the recommendation system, the company needs to incorporate insights from third-party datasets into the company's existing analytics platform. The company wants to minimize the effort and time required to incorporate third-party datasets. Which solution will meet these requirements with the LEAST operational overhead?
[x] A. Use API calls to access and integrate third-party datasets from AWS Data Exchange.
[ ] B. Use API calls to access and integrate third-party datasets from AWS DataSync.
[ ] C. Use Amazon Kinesis Data Streams to access and integrate third-party datasets from AWS CodeCommit repositories.
[ ] D. Use Amazon Kinesis Data Streams to access and integrate third-party datasets from Amazon Elastic Container Registry (Amazon ECR).
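Description
The data comes from third-party sources outside the company's own AWS platform.
Do not over-engineer it; keep the solution as simple as possible.
Analysis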
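This question tests two terms: AWS Data Exchange is a service for finding, subscribing to, and using third-party datasets; AWS DataSync is a data transfer service for moving or replicating data between storage systems (for example, between on-premises storage and AWS). Only Data Exchange supplies third-party data, which is why option A is correct.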
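As a rough illustration of the API access in option A, the datasets an account has subscribed to can be listed through the AWS Data Exchange `ListDataSets` operation. This sketch assumes a boto3-style `dataexchange` client is passed in. The helper name `list_entitled_data_sets` is hypothetical, and the client is injected so the function can be exercised without AWS credentials.

```python
def list_entitled_data_sets(client):
    """Collect (Id, Name) pairs for data sets the account is entitled to.

    Assumption: `client` behaves like boto3.client("dataexchange"), whose
    list_data_sets call returns a 'DataSets' list and an optional 'NextToken'.
    """
    results = []
    kwargs = {"Origin": "ENTITLED"}  # subscribed data sets, not owned ones
    while True:
        page = client.list_data_sets(**kwargs)
        for data_set in page.get("DataSets", []):
            results.append((data_set["Id"], data_set["Name"]))
        token = page.get("NextToken")
        if not token:
            return results
        # Request the next page of results.
        kwargs["NextToken"] = token
```

With boto3 available, a call would look like `list_entitled_data_sets(boto3.client("dataexchange"))`. From there, the chosen data set's revisions can be exported to S3 and picked up by the existing analytics platform.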