Project 4: Serverless Data Engineering Pipeline

Project Objectives

Reproduce the architecture of the example serverless data engineering project or perform something similar using only serverless technologies
Enhance the project by extending the functionality of the NLP analysis: adding entity extraction, key phrase extraction, or some other NLP feature or doing Applied Computer Vision.

Introduction

In this project, I will learn to build a serverless data engineering pipeline.

Structure Diagram

Step

1. Create a table in DynamoDB

DynamoDB: A fast and flexible NoSQL database service for any scale
Table: lambda-dynamodb-stream
partition key: id(String)

2. Create lambda functions and add trigger

AWS lambda
Function1: ProcessDynamoDBRecords
- add trigger, remember to add the DynamoDB permission for this IAM role
Function2: updateTable.

3. Setup an Amazon EventBridge schedule*

choose API: AWS Lambda and invoke
to invoke the updateTable lambda function every 60 seconds

4. Result

The EventBridge will invoke the updateTable function every minute, and it will insert a new item in the table in the DynamoDB;
The ProcessDynamoDBRecords will detect the change of DynamoDB database, and it will write the result to S3 bucket.

Week 9 progress

1.AWS lambda

https://github.com/noahgift/awslambda

Week 10 progress

CloudWatch Timer: schedule a 30-second job to code the lambda function;
Lambda function: update data from the database;
Log Lambda function: detect the changes from the database, then create some logs to S3;
1.AWS S3 2.lambda function

Revisit project 3.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
ProcessDynamoDBRecords.js		ProcessDynamoDBRecords.js
README.md		README.md
updateTable.js		updateTable.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Project 4: Serverless Data Engineering Pipeline

Project Objectives

Introduction

Structure Diagram

Step

1. Create a table in DynamoDB

2. Create lambda functions and add trigger

3. Setup an Amazon EventBridge schedule*

4. Result

Week 9 progress

Week 10 progress

About

Releases

Packages

Languages

LiuSuen/serverless-data-engineering-pipeline

Folders and files

Latest commit

History

Repository files navigation

Project 4: Serverless Data Engineering Pipeline

Project Objectives

Introduction

Structure Diagram

Step

1. Create a table in DynamoDB

2. Create lambda functions and add trigger

3. Setup an Amazon EventBridge schedule*

4. Result

Week 9 progress

Week 10 progress

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages