diff --git a/docs/source/advanced_usage.rst b/docs/source/advanced_usage.rst index f087d346..9c7926db 100644 --- a/docs/source/advanced_usage.rst +++ b/docs/source/advanced_usage.rst @@ -6,22 +6,43 @@ Define a shell-script for each invocation of the Lambda function Instead of packaging the script to be used inside the container image and having to modify the image each time you want to modify the script, you can specify a shell-script when initializing the Lambda function to trigger its execution inside the container on each invocation of the Lambda function. For example:: - cat >> init-script.yaml << EOF + cat >> cow.sh << EOF + #!/bin/bash + /usr/games/cowsay "Executing init script !!" + EOF + + cat >> cow.yaml << EOF functions: - scar-cowsay: - image: grycap/cowsay - init_script: src/test/test-cowsay.sh + aws: + - lambda: + name: scar-cowsay + init_script: cow.sh + container: + image: grycap/cowsay EOF - scar init -f init-script.yaml + scar init -f cow.yaml or using CLI parameters:: - scar init -s src/test/test-cowsay.sh -n scar-cowsay -i grycap/cowsay + scar init -s cow.sh -n scar-cowsay -i grycap/cowsay Now whenever this Lambda function is executed, the script will be run in the container:: - scar run -f init-script.yaml + scar run -f cow.yaml + + Request Id: fb925bfa-bc65-47d5-beed-077f0de471e2 + Log Group Name: /aws/lambda/scar-cowsay + Log Stream Name: 2019/12/19/[$LATEST]0eb088e8a18d4599a572b7bf9f0ed321 + __________________________ + < Executing init script !! > + -------------------------- + \ ^__^ + \ (oo)\_______ + (__)\ )\/\ + ||----w | + || || + As explained in next section, this can be overridden by speciying a different shell-script when running the Lambda function. @@ -31,24 +52,69 @@ Executing an user-defined shell-script You can execute the Lambda function and specify a shell-script locally available in your machine to be executed within the container:: - cat >> run-script.yaml << EOF + cat >> runcow.sh << EOF + #!/bin/bash + /usr/games/cowsay "Executing run script !!" + EOF + + cat >> cow.yaml << EOF functions: - scar-cowsay: - image: grycap/cowsay - run_script: src/test/test-cowsay.sh + aws: + - lambda: + name: scar-cowsay + run_script: runcow.sh + container: + image: grycap/cowsay EOF - scar run -f run-script.yaml + scar init -f cow.yaml -or using CLI parameters:: +Now if you execute the function without passing more parameters, the entrypoint of the container is executed:: + + scar run -n scar-cowsay + + Request Id: 97492a12-ca84-4539-be80-45696501ee4a + Log Group Name: /aws/lambda/scar-cowsay + Log Stream Name: 2019/12/19/[$LATEST]d5cc7a9db9b44e529873130f6d005fe1 + ____________________________________ + / No matter where I go, the place is \ + \ always called "here". / + ------------------------------------ + \ ^__^ + \ (oo)\_______ + (__)\ )\/\ + ||----w | + || || + +But, when you use the configuration file with the ``run_script`` property:: + + scar run -f cow.yaml - scar run -s src/test/test-cowsay.sh -n scar-cowsay +or use CLI parameters:: -or a combination of both (to avoid editing the .yaml file):: + scar run -n scar-cowsay -s runcow.sh - scar run -f run-script.yaml -s /tmp/test-cowsay.sh +or a combination of both (to avoid editing the initial .yaml file):: -Have in mind that the script used in combination with the run command is no saved anywhere. It is uploaded and executed inside the container, but the container image is not updated. The shell-script needs to be specified and can be changed in each different execution of the Lambda function. 
+ scar run -f cow.yaml -s runcow.sh + +the passed script is executed:: + + Request Id: db3ff40e-ab51-4f90-95ad-7473751fb9c7 + Log Group Name: /aws/lambda/scar-cowsay + Log Stream Name: 2019/12/19/[$LATEST]d5cc7a9db9b44e529873130f6d005fe1 + _________________________ + < Executing run script !! > + ------------------------- + \ ^__^ + \ (oo)\_______ + (__)\ )\/\ + ||----w | + || || + +Keep in mind that the script used in combination with the run command is not saved anywhere. +It is uploaded and executed inside the container, but the container image is not updated. +The shell-script needs to be specified and can be changed in each different execution of the Lambda function. Passing environment variables @@ -57,67 +123,46 @@ Passing environment variables You can specify environment variables to the init command which will be in turn passed to the executed Docker container and made available to your shell-script. Using a configuration file:: - cat >> env-var.yaml << EOF - functions: - scar-cowsay: - image: grycap/cowsay - init_script: src/test/test-global-vars.sh - environment: - TEST1: 45 - TEST2: 69 + cat >> cow.sh << EOF + #!/bin/bash + env | /usr/games/cowsay EOF - scar init -f env-var.yaml - -or using CLI parameters:: - - scar init -e TEST1=45 -e TEST2=69 -s src/test/test-global-vars.sh -n scar-cowsay - -You can also update the environment variables by changing the configuration file and then using the update command:: - - cat >> env-var.yaml << EOF + cat >> cow-env.yaml << EOF functions: - scar-cowsay: - image: grycap/cowsay - init_script: src/test/test-global-vars.sh - environment: - TEST1: 145 - TEST2: i69 - TEST3: 42 + aws: + - lambda: + name: scar-cowsay + init_script: cow.sh + container: + image: grycap/cowsay + environment: + Variables: + TEST1: 45 + TEST2: 69 EOF - scar update -f env-var.yaml + scar init -f cow-env.yaml -or:: - - scar update -e EST1: 145 -e TEST2: i69 -e TEST2: 42 -n scar-cowsay - -In addition, the following environment variables are automatically made available to the underlying Docker container: - -* AWS_ACCESS_KEY_ID -* AWS_SECRET_ACCESS_KEY -* AWS_SESSION_TOKEN -* AWS_SECURITY_TOKEN - -This allows a script running in the Docker container to access other AWS services. As an example, see how the AWS CLI runs on AWS Lambda in the `examples/aws-cli `_ folder. - - -Executing cli commands ---------------------- - -To run commands inside the docker image you can specify the command to be executed at the end of the command line:: +or using CLI parameters:: - scar run -f basic-cow.yaml ls + scar init -n scar-cowsay -i grycap/cowsay -e TEST1=45 -e TEST2=69 -s cow.sh -Passing arguments ^^^^^^^^^^^^^^^^^ +Executing custom commands and arguments --------------------------------------- -You can also supply arguments which will be passed to the command executed in the Docker container:: +To run commands inside the docker image you can specify the command to be executed at the end of the command line. +This command overrides any ``init`` or ``run`` script defined:: - scar run -f basic-cow.yaml /usr/bin/perl /usr/games/cowsay Hello World + scar run -f cow.yaml df -h -Note that since cowsay is a Perl script you will have to prepend it with the location of the Perl interpreter (in the Docker container). 
+ Request Id: 39e6fc0d-6831-48d4-aa03-8614307cf8b7 + Log Group Name: /aws/lambda/scar-cowsay + Log Stream Name: 2019/12/19/[$LATEST]9764af5bf6854244a1c9469d8cb84484 + Filesystem Size Used Avail Use% Mounted on + /dev/root 526M 206M 309M 41% / + /dev/vdb 1.5G 21M 1.4G 2% /dev Obtaining a JSON Output @@ -125,63 +170,64 @@ Obtaining a JSON Output For easier scripting, a JSON output can be obtained by including the `-j` or the `-v` (even more verbose output) flags:: - scar run -f basic-cow.yaml -j + scar run -f cow.yaml -j -Upload docker images using an S3 bucket ---------------------------------------- - -If you want to save some space inside the lambda function you can deploy a lambda function using an S3 bucket by issuing the following command:: - - cat >> s3-bucket.yaml << EOF - functions: - scar-cowsay: - image: grycap/cowsay - s3: - deployment_bucket: scar-cowsay - EOF + { "LambdaOutput": + { + "StatusCode": 200, + "Payload": " _________________________________________\n/ \"I always avoid prophesying beforehand \\\n| because it is much better |\n| |\n| to prophesy after the event has already |\n| taken place. \" - Winston |\n| |\n\\ Churchill /\n -----------------------------------------\n \\ ^__^\n \\ (oo)\\_______\n (__)\\ )\\/\\\n ||----w |\n || ||\n", + "LogGroupName": "/aws/lambda/scar-cowsay", + "LogStreamName": "2019/12/19/[$LATEST]a4ba02914fd14ab4825d6c6635a1dfd6", + "RequestId": "fcc4e24c-1fe3-4ca9-9f00-b15ec18c1676" + } + } - scar init -f s3-bucket.yaml - -or using the CLI:: - - scar init -db scar-cowsay -n scar-cowsay -i grycap/cowsay - -The maximum deployment package size allowed by AWS is an unzipped file of 250MB. With this restriction in mind, SCAR downloads the docker image to a temporal folder and creates the udocker file structure needed. -* If the image information and the container filesystem fit in the 250MB SCAR will upload everything and the lambda function will not need to download or create a container structure thus improving the execution time of the function. This option gives the user the full 500MB of ``/tmp/`` storage. -* If the container filesystem doesn't fit in the deployment package SCAR will only upload the image information, that is, the layers. Also the lambda function execution time is improved because it doesn't need to dowload the container. In this case udocker needs to create the container filesystem so the first function invocation can be delayed for a few of seconds. This option usually duplicates the available space in the ``/tmp/`` folder with respect to the SCAR standard initialization. Upload docker image files using an S3 bucket -------------------------------------------- -SCAR also allows to upload a saved docker image:: +SCAR allows to upload a saved docker image. +We created the image file with the command ``docker save grycap/cowsay > cowsay.tar.gz``:: - cat >> s3-bucket.yaml << EOF + cat >> cow.yaml << EOF functions: - scar-cowsay: - image_file: slim_cow.tar.gz - s3: - deployment_bucket: scar-cowsay + aws: + - lambda: + name: scar-cowsay + container: + image_file: cowsay.tar.gz + deployment: + bucket: scar-test EOF - scar init -f s3-bucket.yaml + scar init -f cow.yaml -and for the CLI fans:: +or for the CLI fans:: - scar init -db scar-cowsay -n scar-cowsay -if slim_cow.tar.gz + scar init -db scar-cowsay -n scar-cowsay -if cowsay.tar.gz -The behavior of SCAR is the same as in the case above (when uploading an image from docker hub). 
The image file is unpacked in a temporal folder and the udocker layers and container filesystem are created. Depending on the size of the layers and the filesystem, SCAR will try to upload everything or only the image layers. +Keep in mind that the maximum deployment package size allowed by AWS is an unzipped file of 250MB. +The image file is unpacked in a temporary folder and the udocker layers are created. +Depending on the size of the layers, SCAR will try to upload them or will show the user an error. Upload 'slim' docker image files in the payload ----------------------------------------------- -Finally, if the image is small enough, SCAR allows to upload it in the function payload. Due to the SCAR libraries weighting ~10MB, the maximum size of the image uploaded using this method should not be bigger than ~40MB:: +Finally, if the image is small enough, SCAR allows to upload it in the function payload, which is limited to ~50MB:: + + docker save grycap/minicow > minicow.tar.gz - cat >> slim-image.yaml << EOF + cat >> minicow.yaml << EOF functions: - scar-cowsay: - image_file: slimcow.tar.gz + aws: + - lambda: + name: scar-cowsay + container: + image_file: minicow.tar.gz EOF - scar init -f slim-image.yaml + scar init -f minicow.yaml -To help with the creation of slim images, you can use `minicon `_. Minicon is a general tool to analyze applications and executions of these applications to obtain a filesystem that contains all the dependencies that have been detected. By using minicon the size of the cowsay image was reduced from 170MB to 11MB. +To help with the creation of slim images, you can use `minicon `_. +Minicon is a general tool to analyze applications and executions of these applications to obtain a filesystem that contains all the dependencies that have been detected. +By using minicon the size of the cowsay image was reduced from 170MB to 11MB. 
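+If you are not sure whether a saved image will fit, a quick local check before deploying can help (a minimal sketch; the ``grycap/minicow`` image and the ~50MB payload limit are the ones mentioned above)::
+
+    # Save the image and compress it to reduce its size
+    docker save grycap/minicow | gzip > minicow.tar.gz
+    # Check that the resulting file stays below the ~50MB payload limit
+    ls -lh minicow.tar.gz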
\ No newline at end of file diff --git a/docs/source/api_gateway.rst b/docs/source/api_gateway.rst index 6f5d4ea3..cae4cab5 100644 --- a/docs/source/api_gateway.rst +++ b/docs/source/api_gateway.rst @@ -9,11 +9,14 @@ SCAR allows to transparently integrate an HTTP endpoint with a Lambda function v The following configuration file creates a generic api endpoint that redirects the http petitions to your lambda function:: cat >> api-cow.yaml << EOF - functions: - scar-cowsay: - image: grycap/cowsay - api_gateway: - name: cow-api + functions: + aws: + - lambda: + name: scar-api-cow + container: + image: grycap/cowsay + api_gateway: + name: api-cow EOF scar init -f api-cow.yaml @@ -24,79 +27,137 @@ After the function is created you can check the API URL with the command:: That shows the basic function properties:: - NAME MEMORY TIME IMAGE_ID API_URL - ------------------- -------- ------ ---------------- ------------------------------------------------------------------ - scar-cowsay 512 300 grycap/cowsay https://r8c55jbfz9.execute-api.us-east-1.amazonaws.com/scar/launch + NAME MEMORY TIME IMAGE_ID API_URL SUPERVISOR_VERSION + ---------------- -------- ------ ------------------ ------------------------------------------------------------------ -------------------- + scar-api-cow 512 300 grycap/cowsay https://r20bwcmdf9.execute-api.us-east-1.amazonaws.com/scar/launch 1.2.0 +CURL Invocation +--------------- +You can directly invoke the API Gateway endpoint with ``curl`` to obtain the output generated by the application:: + + curl -s https://r20bwcmdf9.execute-api.us-east-1.amazonaws.com/scar/launch | base64 --decode + + ________________________________________ + / Hildebrant's Principle: \ + | | + | If you don't know where you are going, | + \ any road will get you there. / + ---------------------------------------- + \ ^__^ + \ (oo)\_______ + (__)\ )\/\ + ||----w | + || || + +This way, you can easily provide an HTTP-based endpoint to trigger the execution of an application. + GET Request ----------- -SCAR also allows you to make an HTTP request, for that you can use the command `invoke` like this:: +SCAR also allows you to make an HTTP request, for that you can use the command ``invoke`` like this:: scar invoke -f api-cow.yaml + Request Id: e8cba9ee-5a60-4ff2-9e52-475e5fceb165 + Log Group Name: /aws/lambda/scar-api-cow + Log Stream Name: 2019/12/20/[$LATEST]8aa8bdecba0647edae61e2e45e99ff90 + _______________________________________ + / What if everything is an illusion and \ + | nothing exists? In that case, I | + | definitely overpaid for my carpet. | + | | + \ -- Woody Allen, "Without Feathers" / + --------------------------------------- + \ ^__^ + \ (oo)\_______ + (__)\ )\/\ + ||----w | + || || + This command automatically creates a `GET` request and passes the petition to the API endpoint defined previously. Bear in mind that the timeout for the API Gateway requests is 29s. Therefore, if the function takes more time to respond, the API will return an error message. -To launch asynchronous functions you only need to add the `-a` parameter to the call:: +To launch asynchronous functions you only need to add the ``-a`` parameter to the call:: scar invoke -f api-cow.yaml -a + Function 'scar-api-cow' launched successfully. + When you invoke an asynchronous function through the API Gateway there is no way to know if the function finishes successfully until you check the function invocation logs. 
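+For example, once the asynchronous invocation has been launched, a way to check its result afterwards is the ``log`` command used in the basic usage section (shown here only as a sketch)::
+
+    scar log -f api-cow.yaml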
POST Request ------------ -You can also pass files through the HTTP endpoint using the following command:: +You can also pass files through the HTTP endpoint. +For the next example we will pass an image to an image transformation system. +The following files were user to define the service:: - cat >> api-cow.yaml << EOF - functions: - scar-cowsay: - image: grycap/cowsay - data_binary: /tmp/img.jpg - api_gateway: - name: cow-api + cat >> grayify-image.sh << EOF + #! /bin/sh + FILE_NAME=`basename $INPUT_FILE_PATH` + OUTPUT_FILE=$TMP_OUTPUT_DIR/$FILE_NAME + convert $INPUT_FILE_PATH -type Grayscale $OUTPUT_FILE EOF - scar invoke -f api-cow.yaml - -or:: - - scar invoke -n scar-cowsay -db /tmp/img.jpg + cat >> image-parser.yaml << EOF + functions: + aws: + - lambda: + name: scar-imagemagick + init_script: grayify-image.sh + container: + image: grycap/imagemagick + output: + - storage_provider: s3 + path: scar-imagemagick/output + api_gateway: + name: image-api + EOF -The file specified after the parameter ``-db`` is codified and passed as the POST body. -Take into account that the file limitations for request response and asynchronous requests are 6MB and 128KB respectively, as specified in the `AWS Lambda documentation `_. + scar init -f image-parser.yaml -You can also submit a JSON as the body of the request to the HTTP endpoint with no other configuration, as long as `Content-Type` is `application/json`. If SCAR detects a JSON body, it will write this body to the file `/tmp/{REQUEST_ID}/api_event.json`. Otherwise, the body will be considered to be a file. +We are going to convert this `image `_. -This can invoked via the cli:: +.. image:: images/homer.png + :align: center - scar invoke -n scar-cowsay -jd '{"key1": "value1", "key2": "value3"}' - -Lastly, you can directly invoke the API Gateway endpoint with ``curl`` to obtain the output generated by the application:: +To launch the service through the api endpoint you can use the following command:: - curl -s https://r8c55jbfz9.execute-api.us-east-1.amazonaws.com/scar/launch | jq -r ".udocker_output" + scar invoke -f image-parser.yaml -db homer.png -This way, you can easily provide an HTTP-based endpoint to trigger the execution of an application. +The file specified after the parameter ``-db`` is codified and passed as the POST body. +The output generated will be stored in the output bucket specified in the configuration file. +Take into account that the file limitations for request response and asynchronous requests are 6MB and 128KB respectively, as specified in the `AWS Lambda documentation `_. -Passing parameters in the requests ----------------------------------- +The last option available is to store the output wihtout bucket intervention. +What we are going to do is pass the generated files to the output of the function and then store them in our machine. +For that we need to slightly modify the script and the configuration file:: -You can add parameters to the invocations adding the `parameters` section to the configuration described as follows:: + cat >> grayify-image.sh << EOF + #! 
/bin/sh + FILE_NAME=`basename $INPUT_FILE_PATH` + OUTPUT_FILE=$TMP_OUTPUT_DIR/$FILE_NAME + convert $INPUT_FILE_PATH -type Grayscale $OUTPUT_FILE + cat $OUTPUT_FILE + EOF - cat >> api-cow.yaml << EOF - functions: - scar-cowsay: - image: grycap/cowsay - api_gateway: - name: cow-api - parameters: - test1: 45 - test2: 69 + cat >> image-parser.yaml << EOF + functions: + aws: + - lambda: + name: scar-imagemagick + init_script: grayify-image.sh + container: + image: grycap/imagemagick + api_gateway: + name: image-api EOF - scar invoke -f api-cow.yaml + scar init -f image-parser.yaml + +This can be achieved with the command:: -or:: + scar invoke -f image-parser.yaml -db homer.png -o grey_homer.png - scar invoke -n scar-cowsay -p '{"key1": "value1", "key2": "value3"}' +.. image:: images/result.png + :align: center \ No newline at end of file diff --git a/docs/source/basic_usage.rst b/docs/source/basic_usage.rst index 20c133ca..f2fa3c7c 100644 --- a/docs/source/basic_usage.rst +++ b/docs/source/basic_usage.rst @@ -8,8 +8,11 @@ Using a configuration file (recommended) cat >> basic-cow.yaml << EOF functions: - scar-cowsay: - image: grycap/cowsay + aws: + - lambda: + name: scar-cowsay + container: + image: grycap/cowsay EOF Where you define the name of the function and under it the image that will run inside the function. @@ -26,17 +29,6 @@ Using a configuration file (recommended) scar log -f basic-cow.yaml - If you want to get an specific log stream or request id from the logs you can specify it either in the configuration file or in the command line, although due to the dinamic nature of those parameters its easier to specify them in the cli. - To retrieve an specific log stream or request id using the configuration file would be:: - - cat >> basic-cow.yaml << EOF - functions: - scar-cowsay: - image: grycap/cowsay - log_stream_name: 2018/07/10/[$LATEST]037b5bbf77a44a5basdfwerb92805303 - request_id: bc456798-841a-11e8-8z1b-49c89abc6ff1 - EOF - 5) Finally to delete the function:: scar rm -f basic-cow.yaml diff --git a/docs/source/batch.rst b/docs/source/batch.rst index c8038356..8116264c 100644 --- a/docs/source/batch.rst +++ b/docs/source/batch.rst @@ -3,7 +3,7 @@ AWS Batch Integration ======================= -AWS Batch allows to efficiently execute lots of batch computing jobs on AWS by dynamically provisioning the required underlying EC2 instances on which Docker-based jobs are executed. +AWS Batch allows to efficiently execute batch computing jobs on AWS by dynamically provisioning the required underlying EC2 instances on which Docker-based jobs are executed. SCAR allows to transparently integrate the execution of the jobs through `AWS Batch `_. 
Three execution modes are now available in SCAR: @@ -19,30 +19,49 @@ Set up your configuration file To be able to use `AWS Batch `_, first you need to set up your configuration file, located in `~/.scar/scar.cfg` -The new variables added to the SCAR config file are:: +The variables responsible for batch configuration are:: "batch": { + "boto_profile": "default", + "region": "us-east-1", + "vcpus": 1, + "memory": 1024, + "enable_gpu": false, "state": "ENABLED", "type": "MANAGED", - "security_group_ids": [""], - "comp_type": "EC2", - "desired_v_cpus": 0, - "min_v_cpus": 0, - "max_v_cpus": 2, - "subnets": [""], - "instance_types": ["m3.medium"] + "environment": { + "Variables": {} + }, + "compute_resources": { + "security_group_ids": [], + "type": "EC2", + "desired_v_cpus": 0, + "min_v_cpus": 0, + "max_v_cpus": 2, + "subnets": [], + "instance_types": [ + "m3.medium" + ], + "launch_template_name": "faas-supervisor", + "instance_role": "arn:aws:iam::{account_id}:instance-profile/ecsInstanceRole" + }, + "service_role": "arn:aws:iam::{account_id}:role/service-role/AWSBatchServiceRole" } -Since AWS Batch deploys Amazon EC2 instances, you have to fill the following variables: +Since AWS Batch deploys Amazon EC2 instances, the REQUIRED variables are: * `security_group_ids`: The EC2 security group that is associated with the instances launched in the compute environment. This allows to define the inbound and outbound network rules in order to allow or disallow TCP/UDP traffic generated from (or received by) the EC2 instance. You can choose the default VPC security group. * `subnets`: The VPC subnet(s) identifier(s) on which the EC2 instances will be deployed. This allows to use multiple Availability Zones for enhanced fault-tolerance. -More info about the variables and the different values that can be assigned can be found in the `AWS API Documentation `_. +The remaining variables have default values that should be enough to manage standard batch jobs. +The default `fdl file `_ explains briefly the remaining Batch variables and how are they used. + +Additional info about the variables and the different values that can be assigned can be found in the `AWS API Documentation `_. Set up your Batch IAM role -------------------------- -The default IAM role used in the creation of the EC2 for the Batch Compute Environment is **arn:aws:iam::$ACCOUNT_ID:instance-profile/**ecsInstanceRole****. Thus, if you want to provide S3 access to your Batch jobs you have to specify the corresponding policies in the aforementioned role. +The default IAM role used in the creation of the EC2 for the Batch Compute Environment is **arn:aws:iam::$ACCOUNT_ID:instance-profile/**ecsInstanceRole****. Thus, if you want to provide S3 access to your Batch jobs you have to specify the corresponding policies in the aforementioned role. +If you have a role aleredy configured, you can set it in the configuration file by changin the variable `batch.compute_resources.instance_role`. Define a job to be executed in batch @@ -50,27 +69,39 @@ Define a job to be executed in batch To enable this functionality you only need to set the execution mode of the Lambda function to one of the two available used to create batch jobs ('lambda-batch' or 'batch') and SCAR will take care of the integration process (before using this feature make sure you have the correct rights set in your AWS account). 
-As an example, the following configuration file defines a Lambda function that creates an AWS Batch job to execute the `MrBayes example `_ (the required script can be found in `mrbayes-sample-run.sh `_):: +As an example, the following configuration file defines a Lambda function that creates an AWS Batch job to execute the `plants classification example `_ (all the required scripts and example files used in this example can be found there):: - cat >> scar-mrbayes-batch.yaml << EOF + cat >> scar-plants.yaml << EOF functions: - scar-mrbayes-batch: - image: grycap/mrbayes - init_script: mrbayes-sample-run.sh - execution_mode: batch - s3: - input_bucket: scar-mrbayes - environment: - ITERATIONS: "10000" + aws: + - lambda: + name: scar-plants + init_script: bootstrap-plants.sh + memory: 1024 + execution_mode: batch + container: + image: deephdc/deep-oc-plant-classification-theano + input: + - storage_provider: s3 + path: scar-plants/input + output: + - storage_provider: s3 + path: scar-plants/output EOF You can then create the function:: - scar init -f scar-mrbayes-batch.yaml + scar init -f scar-plants.yaml + +Additionally, for this example to run, you have to upload the execution script to S3:: + + scar put -b scar-plants -p plant-classification-run.sh + +Once uploaded, you have to manually set its access to public so that it can be accessed from Batch. This has to be done to deal with the Batch limits, as explained in the next section. And trigger the execution of the function by uploading a file to be processed to the corresponding folder:: - scar put -b scar-ffmpeg -bf scar-mrbayes-batch/input -p cynmix.nex + scar put -b scar-plants/input -p daisy.jpg SCAR automatically creates the compute environment in AWS Batch and submits a job to be executed. Input and output data files are transparently managed as well according to the programming model. @@ -83,7 +114,7 @@ Combine AWS Lambda and AWS Batch executions As explained in the section :doc:`/prog_model`, if you define an output bucket as the input bucket of another function, a workflow can be created. By doing this, AWS Batch and AWS Lambda executions can be combined through S3 events. -An example of this execution can be found in the `video process example `_ and in the `plant classification example `_. +An example of this execution can be found in the `video process example `_. Limits ------ @@ -93,6 +124,6 @@ For example, the Batch Job definition size is limited to 24KB and the invocation To create the AWS Batch job, the Lambda function defines a Job with the payload content included, and sometimes (i.e. when the script passed as payload is greater than 24KB) the Batch Job definition can fail. -The payload limit can be avoided by redefining the script used and passing the large payload files using other service (e.g S3 or some bash command like 'wget' or 'curl' to download the information in execution time). +The payload limit can be avoided by redefining the script used and passing the large payload files using another service (e.g. S3, or some bash command like 'wget' or 'curl' to download the information at execution time), as we did with the plant classification example, where a `bootstrap script `_ was used to download the `executed script `_. Also, AWS Batch does not allow to override the container entrypoint so containers with an entrypoint defined can not execute a user script. 
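+As a reference, a minimal sketch of such a bootstrap script could look like the following (the URL is hypothetical and must point to a publicly readable object, like the plant classification script uploaded above)::
+
+    #!/bin/bash
+    # Keep the Batch job definition small: download the real execution
+    # script at run time instead of embedding it in the payload
+    curl -s -o /tmp/real-script.sh https://scar-plants.s3.amazonaws.com/plant-classification-run.sh
+    chmod +x /tmp/real-script.sh
+    /tmp/real-script.sh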
diff --git a/docs/source/configuration.rst b/docs/source/configuration.rst index ad17e8d4..d3ff0f73 100644 --- a/docs/source/configuration.rst +++ b/docs/source/configuration.rst @@ -45,28 +45,135 @@ Configuration file ^^^^^^^^^^^^^^^^^^ The first time you execute SCAR a default configuration file is created in the user location: ``$HOME/.scar/scar.cfg``. -As explained above, it is mandatory to set a value for the aws.iam.role property. The rest of the values can be customized to your preferences:: +As explained above, it is mandatory to set a value for the ``aws.iam.role`` property to use the Lambda service. +If you also want to use the Batch service you have to update the values of the ``aws.batch.compute_resources.security_group_ids``, and ``aws.batch.compute_resources.subnets``. There is more information about the Batch usage `here `_. +Additionally, an explanation of all the configurable properties can be found in the `example configuration file `_. +Below is the complete default configuration file :: - { "aws" : { - "iam" : {"role" : ""}, - "lambda" : { - "region" : "us-east-1", - "time" : 300, - "memory" : 512, - "description" : "Automatically generated lambda function", - "timeout_threshold" : 10 }, - "cloudwatch" : { "log_retention_policy_in_days" : 30 }} - } - - -The values represent: - -* **aws.iam.role**: The `ARN `_ of the IAM Role that you just created in the previous section. -* **aws.lambda.region**: The `AWS region `_ on which the AWS Lambda function will be created. -* **aws.lambda.time**: Default maximum execution time of the AWS Lambda function [1]_. -* **aws.lambda.memory**: Default maximum memory allocated to the AWS Lambda function [1]_. -* **aws.lambda.description**: Default description of the AWS Lambda function [1]_. -* **aws.lambda.timeout_threshold:** Default time used to postprocess the container output. Also used to avoid getting timeout error in case the execution of the container takes more time than the lambda_time [1]_. -* **aws.cloudwatch.log_retention_policy_in_days**: Default time (in days) used to store the logs in cloudwatch. Any log older than this parameter will be deleted. - -.. 
[1] These parameters can also be set or updated with the SCAR CLI \ No newline at end of file + { + "scar": { + "config_version": "1.0.9" + }, + "aws": { + "iam": { + "boto_profile": "default", + "role": "" + }, + "lambda": { + "boto_profile": "default", + "region": "us-east-1", + "execution_mode": "lambda", + "timeout": 300, + "memory": 512, + "description": "Automatically generated lambda function", + "runtime": "python3.7", + "layers": [], + "invocation_type": "RequestResponse", + "asynchronous": false, + "log_type": "Tail", + "log_level": "INFO", + "environment": { + "Variables": { + "UDOCKER_BIN": "/opt/udocker/bin/", + "UDOCKER_LIB": "/opt/udocker/lib/", + "UDOCKER_DIR": "/tmp/shared/udocker", + "UDOCKER_EXEC": "/opt/udocker/udocker.py" + } + }, + "deployment": { + "max_payload_size": 52428800, + "max_s3_payload_size": 262144000 + }, + "container": { + "environment": { + "Variables": {} + }, + "timeout_threshold": 10 + }, + "supervisor": { + "version": "1.2.0-rc4", + "layer_name": "faas-supervisor", + "license_info": "Apache 2.0" + } + }, + "s3": { + "boto_profile": "default", + "region": "us-east-1", + "event": { + "Records": [ + { + "eventSource": "aws:s3", + "s3": { + "bucket": { + "name": "{bucket_name}", + "arn": "arn:aws:s3:::{bucket_name}" + }, + "object": { + "key": "{file_key}" + } + } + } + ] + } + }, + "api_gateway": { + "boto_profile": "default", + "region": "us-east-1", + "endpoint": "https://{api_id}.execute-api.{api_region}.amazonaws.com/{stage_name}/launch", + "request_parameters": { + "integration.request.header.X-Amz-Invocation-Type": "method.request.header.X-Amz-Invocation-Type" + }, + "http_method": "ANY", + "method": { + "authorizationType": "NONE", + "requestParameters": { + "method.request.header.X-Amz-Invocation-Type": false + } + }, + "integration": { + "type": "AWS_PROXY", + "integrationHttpMethod": "POST", + "uri": "arn:aws:apigateway:{api_region}:lambda:path/2015-03-31/functions/arn:aws:lambda:{lambda_region}:{account_id}:function:{function_name}/invocations", + "requestParameters": { + "integration.request.header.X-Amz-Invocation-Type": "method.request.header.X-Amz-Invocation-Type" + } + }, + "path_part": "{proxy+}", + "stage_name": "scar", + "service_id": "apigateway.amazonaws.com", + "source_arn_testing": "arn:aws:execute-api:{api_region}:{account_id}:{api_id}/*", + "source_arn_invocation": "arn:aws:execute-api:{api_region}:{account_id}:{api_id}/{stage_name}/ANY" + }, + "cloudwatch": { + "boto_profile": "default", + "region": "us-east-1", + "log_retention_policy_in_days": 30 + }, + "batch": { + "boto_profile": "default", + "region": "us-east-1", + "vcpus": 1, + "memory": 1024, + "enable_gpu": false, + "state": "ENABLED", + "type": "MANAGED", + "environment": { + "Variables": {} + }, + "compute_resources": { + "security_group_ids": [], + "type": "EC2", + "desired_v_cpus": 0, + "min_v_cpus": 0, + "max_v_cpus": 2, + "subnets": [], + "instance_types": [ + "m3.medium" + ], + "launch_template_name": "faas-supervisor", + "instance_role": "arn:aws:iam::{account_id}:instance-profile/ecsInstanceRole" + }, + "service_role": "arn:aws:iam::{account_id}:role/service-role/AWSBatchServiceRole" + } + } + } \ No newline at end of file diff --git a/docs/source/images/homer.png b/docs/source/images/homer.png new file mode 100644 index 00000000..fae13c61 Binary files /dev/null and b/docs/source/images/homer.png differ diff --git a/docs/source/images/result.png b/docs/source/images/result.png new file mode 100644 index 00000000..25c3414c Binary files /dev/null and 
b/docs/source/images/result.png differ diff --git a/docs/source/prog_model.rst b/docs/source/prog_model.rst index 399ec166..2c35caf5 100644 --- a/docs/source/prog_model.rst +++ b/docs/source/prog_model.rst @@ -7,29 +7,32 @@ The following command:: cat >> darknet.yaml << EOF functions: - scar-darknet: - image: grycap/darknet - memory: 2048 - init_script: examples/darknet/yolo-sample-object-detection.sh - s3: - input_bucket: scar-test + aws: + - lambda: + name: scar-darknet-s3 + memory: 2048 + init_script: yolo.sh + container: + image: grycap/darknet + input: + - storage_provider: s3 + path: scar-darknet/input + output: + - storage_provider: s3 + path: scar-darknet/output EOF scar init -f darknet.yaml -or using the CLI parameters:: +Creates a Lambda function to execute the shell-script `yolo.sh `_ inside a Docker container created out of the ``grycap/darknet`` Docker image stored in Docker Hub. - scar init -n scar-darknet -s examples/darknet/yolo-sample-object-detection.sh -es scar-test -i grycap/darknet +The following workflow summarises the programming model: -Creates a Lambda function to execute the shell-script ``yolo-sample-object-detection.sh`` inside a Docker container created out of the ``grycap/darknet`` Docker image stored in Docker Hub. - -The following workflow summarises the programming model, which heavily uses the `convention over configuration `_ pattern: - -#) The Amazon S3 bucket ``scar-test`` will be created if it does not exist, and if you don't specify any input folder, a folder with the name of the function ``scar-darknet`` will be created with an ``input`` folder inside it. +#) The Amazon S3 bucket ``scar-darknet`` is created with an ``input`` folder inside it if it doesn't exist. #) The Lambda function is triggered upon uploading a file into the ``input`` folder created. -#) The Lambda function retrieves the file from the Amazon S3 bucket and makes it available for the shell-script running inside the container in the ``/tmp/$REQUEST_ID/input`` folder. The ``$INPUT_FILE_PATH`` environment variable will point to the location of the input file. -#) The shell-script processes the input file and produces the output (either one or multiple files) in the folder ``/tmp/$REQUEST_ID/output``. -#) The output files are automatically uploaded by the Lambda function into the ``output/$REQUEST_ID/`` folder created inside of the ``scar-test/scar-darknet`` path. +#) The Lambda function retrieves the file from the Amazon S3 bucket and makes it available for the shell-script running inside the container in the path ``$TMP_INPUT_DIR``. The ``$INPUT_FILE_PATH`` environment variable will point to the location of the input file. +#) The shell-script processes the input file and produces the output (either one or multiple files) in the folder specified by the ``$TMP_OUTPUT_DIR`` global variable. +#) The output files are automatically uploaded by the Lambda function into the ``output`` folder created inside of the ``scar-darknet`` bucket. Many instances of the Lambda function may run concurrently and independently, depending on the files to be processed in the S3 bucket. Initial executions of the Lambda may require retrieving the Docker image from Docker Hub but this will be cached for subsequent invocations, thus speeding up the execution process. 
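+As a reference, a minimal shell-script following this convention could be as simple as the sketch below (it only copies the input file to the output folder, whereas ``yolo.sh`` runs the actual darknet detection)::
+
+    #!/bin/bash
+    # The file downloaded from the input bucket is available at $INPUT_FILE_PATH
+    FILE_NAME=`basename "$INPUT_FILE_PATH"`
+    # Everything written to $TMP_OUTPUT_DIR is uploaded to the output bucket
+    cp "$INPUT_FILE_PATH" "$TMP_OUTPUT_DIR/$FILE_NAME"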
@@ -44,72 +47,66 @@ After creating a function with the configuration file defined in the previous se scar run -f darknet.yaml -This command lists the files in the ``scar-darknet/input`` folder of the ``scar-test`` bucket and sends the required events (one per file) to the lambda function. +This command lists the files in the ``input`` folder of the ``scar-darknet`` bucket and sends the required events (one per file) to the lambda function. -.. note:: The input path must be previously created and must contain some files in order to launch the functions. The bucket could be previously defined and you don't need to create it with SCAR. If you don't define an specific input folder, make sure the bucket that you use has the following structure: 'bucket/function-name/input'. +.. note:: The input path must be previously created and must contain some files in order to launch the functions. The bucket could be previously defined and you don't need to create it with SCAR. The following workflow summarises the programming model, the differences with the main programming model are in bold: -#) **The folder 'scar-darknet/input' inside the amazon S3 bucket 'scar-test' will be searched for files.** -#) **The Lambda function is triggered once for each file found in the 'input' folder. The first execution is of type 'request-response' and the rest are 'asynchronous' (this is done to ensure the caching and accelerate the execution).** -#) The Lambda function retrieves the file from the Amazon S3 bucket and makes it available for the shell-script running inside the container in the ``/tmp/$REQUEST_ID/input`` folder. The ``$INPUT_FILE_PATH`` environment variable will point to the location of the input file. -#) The shell-script processes the input file and produces the output (either one or multiple files) in the folder ``/tmp/$REQUEST_ID/output``. -#) The output files are automatically uploaded by the Lambda function into the ``output`` folder of ``bucket-name``. +#) **The folder 'input' inside the amazon S3 bucket 'scar-darknet' will be searched for files.** +#) **The Lambda function is triggered once for each file found in the folder. The first execution is of type 'request-response' and the rest are 'asynchronous' (this is done to ensure the caching and accelerate the subsequent executions).** +#) The Lambda function retrieves the file from the Amazon S3 bucket and makes it available for the shell-script running inside the container. The ``$INPUT_FILE_PATH`` environment variable will point to the location of the input file. +#) The shell-script processes the input file and produces the output (either one or multiple files) in the path specified by the ``$TMP_OUTPUT_DIR`` global variable. +#) The output files are automatically uploaded by the Lambda function into the ``output`` folder of ``scar-darknet`` bucket. .. 
image:: images/wait.png :align: center -Specific input/output folders ------------------------------ -If you don't like the default folder structure created by SCAR you can specify the input/ouput paths in the configuration file:: - cat >> darknet.yaml << EOF - functions: - scar-darknet: - image: grycap/darknet - memory: 2048 - init_script: examples/darknet/yolo-sample-object-detection.sh - s3: - input_bucket: scar-test-input - input_folder: my-input-folder - output_bucket: scar-test-output - output_folder: my-output-folder - EOF +Function Definition Language (FDL) +---------------------------------- - scar init -f darknet.yaml +In the last update of SCAR, the language used to define functions was improved and now several functions with their complete configurations can be defined in one configuration file. Additionally, different storage providers with different configurations can be used. -This configuration file is telling the lambda function to retrieve the input from the bucket ``scar-test-input`` and the folder ``my-input-folder`` and store the outputs in the bucket ``scar-test-output`` and the folder ``my-output-folder/$REQUEST_ID/``. None of this buckets or folders must be previously created for this to work. SCAR manages the creation of the required buckets/folders. +A complete working example of this functionality can be found `here `_. -This feature also allows us a workflow by setting the output folder of one function as the input folder of the next function that we want to execute. For example we could have a function that parses a video and stores a keyframe each 10 seconds and then have another function that takes that input and anlyzes it. The configuration files could be something like this:: +In this example two functions are created, one with Batch delegation to process videos and the other in Lambda to process the generated images. The functions are connected by their linked buckets, as can be seen in the configuration file:: - cat >> video-parser.yaml << EOF + cat >> scar-video-process.yaml << EOF functions: - scar-video: - image: grycap/ffmpeg - memory: 1024 - init_script: parse-video.sh - s3: - input_bucket: scar-input - output_folder: video-output + aws: + - lambda: + name: scar-batch-ffmpeg-split + init_script: split-video.sh + execution_mode: batch + container: + image: grycap/ffmpeg + input: + - storage_provider: s3 + path: scar-video/input + output: + - storage_provider: s3 + path: scar-video/split-images + - lambda: + name: scar-lambda-darknet + init_script: yolo-sample-object-detection.sh + memory: 3008 + container: + image: grycap/darknet + input: + - storage_provider: s3 + path: scar-video/split-images + output: + - storage_provider: s3 + path: scar-video/output EOF - scar init -f video-parser.yaml + scar init -f scar-video-process.yaml - cat >> image-parser.yaml << EOF - functions: - scar-darknet: - image: grycap/darknet - memory: 2048 - init_script: parse-images.sh - s3: - input_bucket: scar-input - input_folder: video-output - output_folder: image-output - EOF +Using the common folder ``split-images``, these functions can be connected to create a workflow. +None of these buckets or folders needs to be created beforehand for this to work; SCAR manages the creation of the required buckets/folders. 
- scar init -f image-parser.yaml +To launch this workflow you only need to upload a video to the folder ``input`` of the ``scar-video`` bucket, with the command:: -See how the functions are using the same bucket (although it's not neccesary) and the output folder of the first is the input folder of the second. + scar put -b scar-video/input -p seq1.avi -To launch the workflow you only need to upload a video to the folder ``scar-video/input`` of the ``scar-input`` bucket. \ No newline at end of file +This will launch first, the splitting function that will create 68 images (one per each second of the video), and second, the 68 Lambda functions that process the created images and analyze them. \ No newline at end of file diff --git a/docs/source/testing.rst b/docs/source/testing.rst index ebbcde17..9127c743 100644 --- a/docs/source/testing.rst +++ b/docs/source/testing.rst @@ -37,18 +37,4 @@ Procedure for testing: Further information is available in the udocker documentation:: - udocker help - -Testing of the Lambda functions with emulambda -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -For easier debugging of the Lambda functions, `emulambda `_ can be employed to locally execute them. - -#) Install emulambda - -#) Execute a sample local test:: - - sh test/emulambda/run-local-test.sh - - -This test locally executes the ubuntu:16.04 image in DockerHub via udocker executing a simple shell-script. + udocker help \ No newline at end of file diff --git a/examples/aws-cli/README.md b/examples/aws-cli/README.md index d5e46d88..84991425 100644 --- a/examples/aws-cli/README.md +++ b/examples/aws-cli/README.md @@ -4,12 +4,12 @@ Docker image for [AWS CLI](https://aws.amazon.com/cli/) based on the [alpine](ht ## Local Usage -Credentials can be passed through the following environment variables: +Credentials can be passed to the Docker container through the following environment variables: * `AWS_ACCESS_KEY_ID` * `AWS_SECRET_ACCESS_KEY` -Assuming that these variables are already populated on your machine, you would list all the EC2 instances by issuing the command: +Assuming that these variables are already populated on your machine, you would list all your defined lambda functions by issuing the command: ```sh docker run --rm -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY grycap/aws-cli lambda list-functions @@ -27,10 +27,8 @@ You can run AWS CLI in AWS Lambda via [SCAR](https://github.com/grycap/scar) usi scar init -f scar-aws-cli.yaml ``` -2. Execute the Lambda function +2. Invoke the Lambda function with the parameters that you want to execute in the aws-cli ```sh scar run -f scar-aws-cli.yaml lambda list-functions -``` - -You have the AWS CLI running on AWS Lambda. 
\ No newline at end of file +``` \ No newline at end of file diff --git a/examples/aws-cli/scar-aws-cli.yaml b/examples/aws-cli/scar-aws-cli.yaml index 55950767..4de43415 100644 --- a/examples/aws-cli/scar-aws-cli.yaml +++ b/examples/aws-cli/scar-aws-cli.yaml @@ -1,6 +1,10 @@ functions: - scar-aws-cli: - image: grycap/awscli - environment: - AWS_ACCESS_KEY_ID: XXXXX - AWS_SECRET_ACCESS_KEY: XXXXX + aws: + - lambda: + name: scar-aws-cli + container: + image: grycap/awscli + environment: + Variables: + AWS_ACCESS_KEY_ID: XXXXX + AWS_SECRET_ACCESS_KEY: XXXXX diff --git a/examples/cowsay/README.md b/examples/cowsay/README.md index 0742c144..8ca69b89 100644 --- a/examples/cowsay/README.md +++ b/examples/cowsay/README.md @@ -1,6 +1,6 @@ # Alpine-less Cowsay -Docker image for [Cowsay](https://en.wikipedia.org/wiki/Cowsay) and [Fortune](https://en.wikipedia.org/wiki/Fortune_(Unix)) based on the [ubuntu:16.04](https://hub.docker.com/r/library/ubuntu/tags/16.04/) Docker image. +Docker image for [Cowsay](https://en.wikipedia.org/wiki/Cowsay) and [Fortune](https://en.wikipedia.org/wiki/Fortune_(Unix) based on the [ubuntu:16.04](https://hub.docker.com/r/library/ubuntu/tags/16.04/) Docker image. ## Local Usage diff --git a/examples/cowsay/scar-cowsay.yaml b/examples/cowsay/scar-cowsay.yaml index 2c0abbc9..c269b565 100644 --- a/examples/cowsay/scar-cowsay.yaml +++ b/examples/cowsay/scar-cowsay.yaml @@ -1,3 +1,6 @@ functions: - scar-cowsay: - image: grycap/cowsay \ No newline at end of file + aws: + - lambda: + name: scar-cowsay + container: + image: grycap/cowsay diff --git a/examples/cowsay/scar-minicow.yaml b/examples/cowsay/scar-minicow.yaml index 562b9338..2206005b 100644 --- a/examples/cowsay/scar-minicow.yaml +++ b/examples/cowsay/scar-minicow.yaml @@ -1,3 +1,6 @@ functions: - scar-minicow: - image_file: minicow.tar.gz + aws: + - lambda: + name: scar-minicow + container: + image_file: minicow.tar.gz diff --git a/examples/darknet/README.md b/examples/darknet/README.md index 67e6a7d5..ad059ac5 100644 --- a/examples/darknet/README.md +++ b/examples/darknet/README.md @@ -22,10 +22,10 @@ Create the Lambda function using the `scar-darknet.yaml` configuration file: scar init -f scar-darknet.yaml ``` -Launch the Lambda function uploading a file to the `s3://scar-darknet/scar-darknet-s3/input` folder in S3. +Launch the Lambda function uploading a file to the `s3://scar-darknet/input` folder in S3. ```sh -scar put -b scar-darknet/scar-darknet-s3/input -p dog.jpg +scar put -b scar-darknet/input -p dog.jpg ``` Take into consideration than the first invocation will take considerably longer than the subsequent ones, where the container will be cached. @@ -64,33 +64,33 @@ scar invoke -f scar-darknet-api-s3.yaml -db dog.jpg -a When the execution of the function finishes, the script used produces two output files and SCAR copies them to the S3 bucket used. To check if the files are created and copied correctly you can use the command: ```sh -scar ls -b scar-darknet/scar-darknet-api/output +scar ls -b scar-darknet/output ``` Which outputs: ``` -scar-darknet-api/output/68f5c9d5-5826-44gr-basc-8f8b23f44cdf/image-result.png -scar-darknet-api/output/68f5c9d5-5826-44gr-basc-8f8b23f44cdf/result.out +output/dog.out +output/dog.png ``` -The files are created in the output folder following the `s3://$BUCKET_NAME/$FUNCTION_NAME/output/$REQUEST_ID/*.*` structure. +The files are created in the output folder following the `s3://$BUCKET_NAME/output/*.*` structure. To download the created files you can also use SCAR. 
Download a folder with: ```sh -scar get -b scar-darknet/scar-darknet-api/output -p /tmp/lambda/ +scar get -b scar-darknet/output -p /tmp/lambda/ ``` -This command creates the `scar-darknet-api/ouput` folder and all the required subfolders in the `/tmp/lambda/` folder +This command creates the `output` folder and all the required subfolders (if any) in the `/tmp/lambda/` folder. In our case the two output files are result.out: ```sh -/tmp/68f5c9d5-5826-44gr-basc-8f8b23f44cdf/input/dog.jpg: Predicted in 12.383388 seconds. -dog: 82% -truck: 64% -bicycle: 85% +/tmp/tmpzhmispbg/dog.jpg: Predicted in 28.073856 seconds. +dog: 80% +truck: 73% +bicycle: 81% ``` and image-result.png: @@ -104,21 +104,4 @@ scar rm -f scar-darknet-api-s3.yaml Have in mind that the bucket and the folders and files created are not deleted when the function is deleted. -If you want to delete the bucket you have to do it manually. - -### Processing the output locally - -Other option when invoking a synchronous function is to store the output in our machine. - -When using this option you have to make sure that the output generated by your script is the binary content that you want to save in your machine. Also due to the API Gateway limits your function has to finish in 30 seconds or less. - -The script `scar-darknet-api-bin.yaml` process the image using darknet, packages the image output and the darknet output, and then dumps the packaged file in the standard output. - -SCAR reads the output binary content and then creates the file in the specified file by the CLI command: - -```sh -scar init -f scar-darknet-api-bin.yaml -scar invoke -f scar-darknet-api-bin.yaml -db dog.jpg -o output.tar.gz -``` - -By using this functionality the user can process the function output without using S3 buckets. \ No newline at end of file +If you want to delete the bucket, you have to do it manually. 
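+
+One possible way to do it is with the AWS CLI (shown only as an example; note that it removes the bucket and **all** of its contents):
+
+```sh
+aws s3 rb s3://scar-darknet --force
+```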
\ No newline at end of file diff --git a/examples/darknet/scar-darknet-api-bin.yaml b/examples/darknet/scar-darknet-api-bin.yaml deleted file mode 100644 index 2b3798a1..00000000 --- a/examples/darknet/scar-darknet-api-bin.yaml +++ /dev/null @@ -1,7 +0,0 @@ -functions: - scar-darknet-api-bin: - image: grycap/darknet - memory: 2048 - init_script: yolo-bin.sh - api_gateway: - name: darknet-bin \ No newline at end of file diff --git a/examples/darknet/scar-darknet-api-s3.yaml b/examples/darknet/scar-darknet-api-s3.yaml index 86203c22..624c6801 100644 --- a/examples/darknet/scar-darknet-api-s3.yaml +++ b/examples/darknet/scar-darknet-api-s3.yaml @@ -1,9 +1,13 @@ functions: - scar-darknet-api: - image: grycap/darknet - memory: 2048 - init_script: yolo.sh + aws: + - lambda: + name: scar-darknet-api-s3 + memory: 2048 + init_script: yolo.sh + container: + image: grycap/darknet + output: + - storage_provider: s3 + path: scar-darknet/output api_gateway: - name: darknet - s3: - input_bucket: scar-darknet \ No newline at end of file + name: darknet diff --git a/examples/darknet/scar-darknet.yaml b/examples/darknet/scar-darknet.yaml index 5ffce794..70623aea 100644 --- a/examples/darknet/scar-darknet.yaml +++ b/examples/darknet/scar-darknet.yaml @@ -1,7 +1,15 @@ functions: - scar-darknet-s3: - image: grycap/darknet - memory: 2048 - init_script: yolo.sh - s3: - input_bucket: scar-darknet + aws: + - lambda: + name: scar-darknet-s3 + memory: 2048 + init_script: yolo.sh + container: + image: grycap/darknet + input: + - storage_provider: s3 + path: scar-darknet/input + output: + - storage_provider: s3 + path: scar-darknet/output + diff --git a/examples/darknet/yolo-bin.sh b/examples/darknet/yolo-bin.sh deleted file mode 100644 index 9b741729..00000000 --- a/examples/darknet/yolo-bin.sh +++ /dev/null @@ -1,10 +0,0 @@ -#!/bin/bash - -OUT_FOLDER="/tmp/output" -mkdir -p $OUT_FOLDER - -cd /opt/darknet -./darknet detect cfg/yolo.cfg yolo.weights $INPUT_FILE_PATH -out $OUT_FOLDER/image 2>/dev/null 1>$OUT_FOLDER/result - -tar -zcf $TMP_OUTPUT_DIR/result.tar.gz $OUT_FOLDER 2>/dev/null -cat $TMP_OUTPUT_DIR/result.tar.gz diff --git a/examples/darknet/yolo.sh b/examples/darknet/yolo.sh index 9db125ab..ac945cc2 100644 --- a/examples/darknet/yolo.sh +++ b/examples/darknet/yolo.sh @@ -1,9 +1,10 @@ #!/bin/bash -RESULT="$TMP_OUTPUT_DIR/result.out" -OUTPUT_IMAGE="$TMP_OUTPUT_DIR/image-result" +IMAGE_NAME=`basename "$INPUT_FILE_PATH" .jpg` +RESULT="$TMP_OUTPUT_DIR/$IMAGE_NAME.out" +OUTPUT_IMAGE="$TMP_OUTPUT_DIR/$IMAGE_NAME" echo "SCRIPT: Analyzing file '$INPUT_FILE_PATH', saving the result in '$RESULT' and the output image in '$OUTPUT_IMAGE.png'" cd /opt/darknet -./darknet detect cfg/yolo.cfg yolo.weights $INPUT_FILE_PATH -out $OUTPUT_IMAGE > $RESULT \ No newline at end of file +./darknet detect cfg/yolo.cfg yolo.weights $INPUT_FILE_PATH -out $OUTPUT_IMAGE > $RESULT diff --git a/examples/elixir/scar-elixir.yaml b/examples/elixir/scar-elixir.yaml index f588c491..c26bb1dc 100644 --- a/examples/elixir/scar-elixir.yaml +++ b/examples/elixir/scar-elixir.yaml @@ -1,3 +1,6 @@ functions: - scar-elixir: - image: grycap/elixir \ No newline at end of file + aws: + - lambda: + name: scar-elixir + container: + image: grycap/elixir diff --git a/examples/erlang/scar-erlang.yaml b/examples/erlang/scar-erlang.yaml index ea11d826..2159e3e6 100644 --- a/examples/erlang/scar-erlang.yaml +++ b/examples/erlang/scar-erlang.yaml @@ -1,3 +1,6 @@ functions: - scar-erlang: - image: grycap/erlang \ No newline at end of file + aws: + - lambda: + name: 
scar-erlang + container: + image: grycap/erlang diff --git a/examples/ffmpeg/README.md index de8b5e1f..9065c982 100644 --- a/examples/ffmpeg/README.md +++ b/examples/ffmpeg/README.md @@ -16,15 +16,15 @@ In this example, the goal is that videos uploaded to an Amazon S3 bucket are aut A sample script to be executed inside the Docker container running on AWS Lambda is shown in the file [grayify-video.sh](grayify-video.sh). This script is agnostic to the Lambda function and it assumes that: -1. The user will upload the video into the `scar-ffmpeg/input` folder of an Amazon S3 bucket. -2. The input video file will automatically be made available in `tmp/$REQUEST_ID/input`, as specified by the `$INPUT_FILE_PATH` environment variable. +1. The user will upload the video into the `input` folder of the `scar-ffmpeg` Amazon S3 bucket. +2. The input video file will automatically be made available in the path specified by the `$INPUT_FILE_PATH` environment variable. 3. The script will convert the video to grayscale. -4. The output video file will be saved in `/tmp/$REQUEST_ID/output` that is specified by the `$TMP_OUTPUT_DIR` environment variable. -5. The video file will be automatically uploaded to the `scar-ffmpeg/output/$REQUEST_ID` folder of the Amazon S3 bucket and deleted from the underlying storage. +4. The output video file will be saved in the path specified by the `$TMP_OUTPUT_DIR` environment variable. +5. The video file will be automatically uploaded to the `output` folder of the `scar-ffmpeg` Amazon S3 bucket and deleted from the underlying storage. ## Create the Lambda function -This example assumes that the Amazon S3 bucket is `scar-test`. Since there is a flat namespace, please change this name for your tests. +This example assumes that the Amazon S3 bucket is `scar-ffmpeg`; if the bucket doesn't exist, SCAR will create it. ```sh scar init -f scar-ffmpeg.yaml @@ -33,29 +33,34 @@ scar init -f scar-ffmpeg.yaml ## Test the Lambda function Upload a video to the S3 bucket. For these examples we are using sample videos from the [ICPR 2010 Contest on Semantic Description of Human Activities (SDHA 2010)](http://cvrc.ece.utexas.edu/SDHA2010/Human_Interaction.html). + ```sh -scar put -b scar-test/scar-ffmpeg/input -p seq1.avi +scar put -b scar-ffmpeg/input -p seq1.avi ``` To check the progress of the function invocation you can call the `log` command: + ```sh scar log -f scar-ffmpeg.yaml ``` -Whe the execution finishes, the converted video to grayscale will be available in `s3://scar-test/lambda-ffmpeg/output/$REQUEST_ID/seq1.avi`. Moreover you can list the files in the specified bucket with the command: +When the execution finishes, the video converted to grayscale will be available in `s3://scar-ffmpeg/output/seq1.avi`. Moreover, you can list the files in the specified bucket with the command: + ```sh -scar ls -b scar-test/scar-ffmpeg/output/ +scar ls -b scar-ffmpeg/output/ ``` After the function finishes you can download the generated output video using the following command: + ```sh -scar get -b scar-test/scar-ffmpeg/output -p /tmp/ +scar get -b scar-ffmpeg/output -p /tmp/ ``` This command will download the output folder of the S3 bucket to the /tmp/ folder of your computer. As an additional feature, you can also upload multiple videos to S3 using a folder instead of a specific file. 
+ ```sh -scar put -b scar-test/scar-ffmpeg/input -p /my-videos/ +scar put -b scar-ffmpeg/input -p /my-videos/ ``` Multiple concurrent Lambda invocations of the same function will process the video files in parallel. Notice that the first invocation(s) will take considerably longer until caching of the Docker container is performed. diff --git a/examples/ffmpeg/scar-ffmpeg.yaml b/examples/ffmpeg/scar-ffmpeg.yaml index aa245486..78550e0d 100644 --- a/examples/ffmpeg/scar-ffmpeg.yaml +++ b/examples/ffmpeg/scar-ffmpeg.yaml @@ -1,7 +1,14 @@ functions: - scar-ffmpeg: - image: sameersbn/ffmpeg - memory: 2048 - init_script: grayify-video.sh - s3: - input_bucket: scar-test \ No newline at end of file + aws: + - lambda: + name: scar-ffmpeg + memory: 2048 + init_script: grayify-video.sh + container: + image: sameersbn/ffmpeg + input: + - storage_provider: s3 + path: scar-ffmpeg/input + output: + - storage_provider: s3 + path: scar-ffmpeg/output diff --git a/examples/imagemagick/README.md b/examples/imagemagick/README.md index 3b83a85e..7fe65c8b 100644 --- a/examples/imagemagick/README.md +++ b/examples/imagemagick/README.md @@ -22,18 +22,18 @@ scar init -f scar-imagemagick.yaml 2. Upload a file to the S3 bucket ```sh -scar put -b scar-imagemagick/scar-imagemagick/input -p homer.png +scar put -b scar-imagemagick/input -p homer.png ``` -The converted image to grayscale will be available in `s3://scar-imagemagick/scar-imagemagick/output/$REQUEST_ID/homer.png` +The image converted to grayscale will be available in `s3://scar-imagemagick/output/homer.png` 3. Download a file from the S3 bucket ```sh -scar get -b scar-imagemagick/scar-imagemagick/output -p /tmp/ +scar get -b scar-imagemagick/output -p /tmp/ ``` -The image will be downloaded in the path `/tmp/scar-imagemagick/output/$REQUEST_ID/homer.png`. +The image will be downloaded to the path `/tmp/output/homer.png`. The first invocation will take considerably longer than most of the subsequent invocations since the container will be cached. diff --git a/examples/imagemagick/scar-imagemagick.yaml b/examples/imagemagick/scar-imagemagick.yaml index 74c899b2..16df4083 100644 --- a/examples/imagemagick/scar-imagemagick.yaml +++ b/examples/imagemagick/scar-imagemagick.yaml @@ -1,6 +1,13 @@ functions: - scar-imagemagick: - image: grycap/imagemagick - init_script: grayify-image.sh - s3: - input_bucket: scar-imagemagick \ No newline at end of file + aws: + - lambda: + name: scar-imagemagick + init_script: grayify-image.sh + container: + image: grycap/imagemagick + input: + - storage_provider: s3 + path: scar-imagemagick/input + output: + - storage_provider: s3 + path: scar-imagemagick/output diff --git a/examples/mrbayes/README.md b/examples/mrbayes/README.md index d8127059..9a4c8a3a 100644 --- a/examples/mrbayes/README.md +++ b/examples/mrbayes/README.md @@ -2,39 +2,29 @@ Docker image for [MrBayes](http://mrbayes.sourceforge.net/) based on the [ubuntu:14.04](https://hub.docker.com/r/library/ubuntu/tags/14.04/) Docker image. -## Building docker image - -```sh -docker build -t grycap/mrbayes -f Dockerfile binary/ -``` +## Usage in AWS Lambda via SCAR -## Local Usage +You can run this image in AWS Lambda via [SCAR](https://github.com/grycap/scar) using the following procedure: -Gaining shell access: +1. Create the Lambda function ```sh -docker run --rm -ti grycap/mrbayes /bin/bash +scar init -f scar-mrbayes.yaml ``` -A sample execution can be initiated with: +2. Execute the Lambda function by uploading a file to the linked bucket.
```sh -docker run --rm -ti grycap/mrbayes /tmp/mrbayes-sample-run.sh +scar put -b scar-mrbayes/input -p cynmix.nex ``` - -## Usage in AWS Lambda via SCAR - -You can run this image in AWS Lambda via [SCAR](https://github.com/grycap/scar) using the following procedure: - -1. Create the Lambda function +3. Check the function logs to see when the execution has finished. ```sh -scar init -f scar-mrbayes.yaml +scar ls -b scar-mrbayes ``` -2. Execute the Lambda function passing an execution script (in this case, specified on the configuration file) +4. Download the generated result file ```sh -scar run -f scar-mrbayes.yaml +scar get -b scar-mrbayes/output -p . ``` - diff --git a/examples/mrbayes/mrbayes-sample-run.sh b/examples/mrbayes/mrbayes-sample-run.sh index b21e34ff..e53fbaf5 100755 --- a/examples/mrbayes/mrbayes-sample-run.sh +++ b/examples/mrbayes/mrbayes-sample-run.sh @@ -26,4 +26,4 @@ mcmc file=${INPUT_FILE_PATH}3 quit EOF -mb < batch.txt +mb < batch.txt > $TMP_OUTPUT_FOLDER.out diff --git a/examples/mrbayes/scar-mrbayes-batch.yaml b/examples/mrbayes/scar-mrbayes-batch.yaml deleted file mode 100644 index 3f129c2d..00000000 --- a/examples/mrbayes/scar-mrbayes-batch.yaml +++ /dev/null @@ -1,10 +0,0 @@ -functions: - scar-mrbayes-batch: - image: grycap/mrbayes - init_script: mrbayes-sample-run.sh - execution_mode: batch - s3: - input_bucket: scar-mrbayes - environment: - ITERATIONS: "10000" - diff --git a/examples/mrbayes/scar-mrbayes-lambda-batch.yaml b/examples/mrbayes/scar-mrbayes-lambda-batch.yaml deleted file mode 100644 index a27a55ed..00000000 --- a/examples/mrbayes/scar-mrbayes-lambda-batch.yaml +++ /dev/null @@ -1,12 +0,0 @@ -functions: - scar-mrbayes-lambda-batch: - image: grycap/mrbayes - init_script: mrbayes-sample-run.sh - execution_mode: lambda-batch - time: 30 - log_level: DEBUG - environment: - ITERATIONS: "10000" - s3: - input_bucket: scar-mrbayes - \ No newline at end of file diff --git a/examples/mrbayes/scar-mrbayes.yaml b/examples/mrbayes/scar-mrbayes.yaml index 972dccab..506cfa3e 100644 --- a/examples/mrbayes/scar-mrbayes.yaml +++ b/examples/mrbayes/scar-mrbayes.yaml @@ -1,6 +1,13 @@ functions: - scar-mrbayes: - image: grycap/mrbayes - init_script: mrbayes-sample-run.sh - s3: - input_bucket: scar-mrbayes \ No newline at end of file + aws: + - lambda: + name: scar-mrbayes + init_script: mrbayes-sample-run.sh + container: + image: grycap/mrbayes + input: + - storage_provider: s3 + path: scar-mrbayes/input + output: + - storage_provider: s3 + path: scar-mrbayes/output diff --git a/examples/plant-classification/README.md b/examples/plant-classification/README.md index 3c52e884..df3fa898 100644 --- a/examples/plant-classification/README.md +++ b/examples/plant-classification/README.md @@ -20,25 +20,42 @@ In this case we will configure the Lambda function to automatically delegate to scar init -f scar-plant-classification.yaml ``` -2. Upload the processing script to Amazon S3 +2. Upload the processing script to Amazon S3. Take into account that SCAR doesn't make the uploaded files public by default; for this example to run, you have to change the properties of the uploaded file and make it public. ```sh -scar put -b scar-test/scar-plants -p plant-classification-run.sh +scar put -b scar-plants -p plant-classification-run.sh ``` 3.
Upload an image to the input folder of the bucket (to trigger the image processing) ```sh -scar put -b scar-test/scar-plants/input -p daisy.jpg +scar put -b scar-plants/input -p daisy.jpg ``` -If you upload several files, the corresponding jobs will be submitted to AWS Batch. Depending on the number of jobs it may decide to spawn additional EC2 instances in order to cope with the increased workload. Once the executions have finished (you can check the corresponding CloudWatch logs) ... +4. You can check the corresponding AWS Batch CloudWatch logs to see the job progress. To do this, you first need to know the request_id of the Batch job; to obtain it, check the Lambda logs and search for the following output. -4. Download the generated files from the S3 bucket +```sh +scar log -f scar-plant-classification.yaml +``` +Then search for this string: + +``` +[...] +Check batch logs with: + scar log -n scar-plants -ri $BATCH_JOB_ID +[...] +``` +Finally, check the job logs with the command shown: + +```sh +scar log -n scar-plants -ri $BATCH_JOB_ID +``` + +If you upload several files, the corresponding jobs will be submitted to AWS Batch. Depending on the number of jobs, AWS Batch may spawn additional EC2 instances to cope with the increased workload. Once the executions have finished, you can download the generated files from the S3 bucket: ```sh -scar get -b scar-test/scar-plants/output -p /tmp/plant-classification +scar get -b scar-plants/output -p /tmp/plant-classification ``` -The last command creates an `output/$REQUEST_ID` folder in the `/tmp/plant-classification` path with the files generated by each invocation. +The last command creates an `output` folder in the `/tmp/plant-classification` path with the files generated by each invocation. \ No newline at end of file diff --git a/examples/plant-classification/bootstrap-plants.sh b/examples/plant-classification/bootstrap-plants.sh index 8495c083..380a188f 100644 --- a/examples/plant-classification/bootstrap-plants.sh +++ b/examples/plant-classification/bootstrap-plants.sh @@ -1,2 +1,2 @@ #!
/bin/sh -curl https://s3.amazonaws.com/scar-test/scar-plants/plant-classification-run.sh | sh - +curl https://s3.amazonaws.com/scar-plants/plant-classification-run.sh | sh - diff --git a/examples/plant-classification/scar-plant-classification.yaml b/examples/plant-classification/scar-plant-classification.yaml index 3e982084..75f34fbf 100644 --- a/examples/plant-classification/scar-plant-classification.yaml +++ b/examples/plant-classification/scar-plant-classification.yaml @@ -1,8 +1,15 @@ functions: - scar-plants: - image: deephdc/deep-oc-plant-classification-theano - memory: 1024 - execution_mode: batch - s3: - input_bucket: scar-test - init_script: bootstrap-plants.sh + aws: + - lambda: + name: scar-plants + init_script: bootstrap-plants.sh + memory: 1024 + execution_mode: batch + container: + image: deephdc/deep-oc-plant-classification-theano + input: + - storage_provider: s3 + path: scar-plants/input + output: + - storage_provider: s3 + path: scar-plants/output diff --git a/examples/r/scar-r.yaml b/examples/r/scar-r.yaml index 44563b3d..5cfc8282 100644 --- a/examples/r/scar-r.yaml +++ b/examples/r/scar-r.yaml @@ -1,3 +1,6 @@ functions: - scar-r: - image: grycap/r-base-lambda \ No newline at end of file + aws: + - lambda: + name: scar-r + container: + image: grycap/r-base-lambda diff --git a/examples/ruby/scar-ruby.yaml b/examples/ruby/scar-ruby.yaml index cdb6e1e2..708e1d72 100644 --- a/examples/ruby/scar-ruby.yaml +++ b/examples/ruby/scar-ruby.yaml @@ -1,3 +1,6 @@ functions: - scar-ruby: - image: ruby:2.2.10-slim-jessie \ No newline at end of file + aws: + - lambda: + name: scar-ruby + container: + image: ruby:2.2.10-slim-jessie diff --git a/examples/spectacle/README.md b/examples/spectacle/README.md index 4fda129e..1eae71c1 100644 --- a/examples/spectacle/README.md +++ b/examples/spectacle/README.md @@ -15,12 +15,12 @@ scar init -f scar-spectacle.yaml 2. Upload a file to the S3 bucket to launch the function ```sh -scar put -b scar-test/scar-spectacle/input -p swagger.json +scar put -b scar-spectacle/input -p swagger.json ``` 3. Download the generated files from the S3 bucket ```sh -scar get -b scar-test/scar-spectacle/output -p /tmp/spectacle +scar get -b scar-spectacle/output -p /tmp/spectacle ``` -The last command creates an `output/$REQUEST_ID` folder in the `/tmp/spectacle` path with the files generated by the spectacle invocation. \ No newline at end of file +The last command creates an `output/` folder in the `/tmp/spectacle` path with the files generated by the spectacle invocation. 
\ No newline at end of file diff --git a/examples/spectacle/scar-spectacle.yaml b/examples/spectacle/scar-spectacle.yaml index c533f533..da28ca30 100644 --- a/examples/spectacle/scar-spectacle.yaml +++ b/examples/spectacle/scar-spectacle.yaml @@ -1,7 +1,14 @@ functions: - scar-spectacle: - image: sourcey/spectacle - memory: 1024 - init_script: generate-documentation.sh - s3: - input_bucket: scar-test \ No newline at end of file + aws: + - lambda: + name: scar-spectacle + init_script: generate-documentation.sh + memory: 1024 + container: + image: sourcey/spectacle + input: + - storage_provider: s3 + path: scar-spectacle/input + output: + - storage_provider: s3 + path: scar-spectacle/output diff --git a/examples/theano/scar-theano.yaml b/examples/theano/scar-theano.yaml index d706dd33..37ac0263 100644 --- a/examples/theano/scar-theano.yaml +++ b/examples/theano/scar-theano.yaml @@ -1,3 +1,6 @@ functions: - scar-theano: - image: grycap/theano \ No newline at end of file + aws: + - lambda: + name: scar-theano + container: + image: grycap/theano diff --git a/examples/video-process/README.md b/examples/video-process/README.md index 69e4b5a3..8ac99935 100644 --- a/examples/video-process/README.md +++ b/examples/video-process/README.md @@ -2,21 +2,16 @@ In this example we are going to process an input video by combining the benefits of the highly-scalable AWS Lambda service with the convenience of batch-based computing provided by AWS Batch. The video is going to be split into different images and then those images are going to be analyzed by a neural network. This is a clear example of a serverless workflow. -Two different Lambda functions are defined to do this work: first, a function that creates an AWS Batch job that splits the video in 1-second pictures and stores them in S3; second, a Lambda function that processes each image to perform object detection and stores the result also in Amazon S3. - -The two different configuration files can be found in this folder. The file 'scar-batch-ffmpeg-split.yaml' defines a function that creates an AWS Batch job and the file 'scar-lambda-darknet.yaml' defines a functions that analyzes the images created. +Two different Lambda functions are defined to do this work: first `scar-batch-ffmpeg-split`, a function that creates an AWS Batch job that splits the video into 1-second pictures and stores them in S3; second `scar-lambda-darknet`, a Lambda function that processes each image to perform object detection and also stores the result in Amazon S3. Both functions are defined in the configuration file `scar-video-process.yaml`. More information about the AWS Batch integration can be found in the [documentation](https://scar.readthedocs.io/en/latest/batch.html). ## Create the processing functions -To create the functions you only need to execute two commands: +To create the workflow you only need to execute one command: ```sh -scar init -f scar-batch-ffmpeg-split.yaml -``` -```sh -scar init -f scar-lambda-darknet.yaml +scar init -f scar-video-process.yaml ``` ## Launch the execution @@ -24,49 +19,47 @@ scar init -f scar-lambda-darknet.yaml In order to launch an execution you have to upload a file to the defined input bucket of the Lambda function that creates the AWS Batch job.
In this case, the following command will start the execution: ```sh -scar put -b scar-ffmpeg/scar-batch-ffmpeg-split/input -p ../ffmpeg/seq1.avi +scar put -b scar-video/input -p ../ffmpeg/seq1.avi ``` ## Process the output -When the execution of the function finishes, the script used produces two output files for each Lambda invocation. SCAR copies them to the S3 bucket specified as output. To check if the files are created and copied correctly you can use the command: +When the execution of the second function finishes, the script used produces two output files for each Lambda invocation. SCAR copies them to the S3 bucket specified as output. To check if the files are created and copied correctly you can use the command: ```sh -scar ls -b scar-ffmpeg/scar-batch-ffmpeg-split/image-output +scar ls -b scar-video/output ``` Which lists the following outputs: ``` -scar-batch-ffmpeg-split/image-output/c45433a2-e8e4-11e8-8c48-ab3c38d92053/image-result.png -scar-batch-ffmpeg-split/image-output/c45433a2-e8e4-11e8-8c48-ab3c38d92053/result.out +output/001.out +output/001.png +output/002.out +output/002.png ... -scar-batch-ffmpeg-split/image-output/c46aefe9-e8e4-11e8-97ef-8342661a6503/image-result.png -scar-batch-ffmpeg-split/image-output/c46aefe9-e8e4-11e8-97ef-8342661a6503/result.out -scar-batch-ffmpeg-split/image-output/c479475e-e8e4-11e8-995c-b14a6469fc4a/image-result.png -scar-batch-ffmpeg-split/image-output/c479475e-e8e4-11e8-995c-b14a6469fc4a/result.out +output/067.out +output/067.png +output/068.out +output/068.png ``` -The files are created in the output folder following the `s3://scar-ffmpeg/scar-batch-ffmpeg-split/image-output/$REQUEST_ID/*.*` structure. +The files are created in the output folder following the `s3://scar-video/output/*.*` structure. -To download the created files you can also use SCAR with the following command: +To download the generated files you can also use SCAR with the following command: ```sh -scar get -b scar-ffmpeg/scar-batch-ffmpeg-split/image-output -p /tmp/lambda/ +scar get -b scar-video/output -p /tmp/video/ ``` -This command creates and `image-output` folder and all the subfolders in the `/tmp/lambda/` folder +This command creates the `video/output` folder in the `/tmp` path. ## Delete the Lambda functions Do not forget to delete the functions when you finish your testing: ```sh -scar rm -f scar-batch-ffmpeg-split.yaml -``` - -```sh -scar rm -f scar-lambda-darknet.yaml +scar rm -f scar-video-process.yaml ``` Have in mind that the bucket, the folders and the files created are not deleted when the function is deleted. 
@@ -74,5 +67,5 @@ Have in mind that the bucket, the folders and the files created are not deleted If you want to delete the bucket you have to do it manually using, for example, AWS CLI:: ```sh - aws s3 rb s3://scar-ffmpeg --force + aws s3 rb s3://scar-video --force ``` \ No newline at end of file diff --git a/examples/video-process/scar-batch-ffmpeg-split.yaml b/examples/video-process/scar-batch-ffmpeg-split.yaml deleted file mode 100644 index 8b3587e8..00000000 --- a/examples/video-process/scar-batch-ffmpeg-split.yaml +++ /dev/null @@ -1,8 +0,0 @@ -functions: - scar-batch-ffmpeg-split: - image: grycap/ffmpeg - init_script: split-video.sh - execution_mode: batch - s3: - input_bucket: scar-ffmpeg - output_bucket: scar-ffmpeg/scar-batch-ffmpeg-split/video-output diff --git a/examples/video-process/scar-lambda-darknet.yaml b/examples/video-process/scar-lambda-darknet.yaml deleted file mode 100644 index 1881672e..00000000 --- a/examples/video-process/scar-lambda-darknet.yaml +++ /dev/null @@ -1,8 +0,0 @@ -functions: - scar-lambda-darknet: - image: grycap/darknet - memory: 3008 - init_script: yolo-sample-object-detection.sh - s3: - input_bucket: scar-ffmpeg/scar-batch-ffmpeg-split/video-output - output_bucket: scar-ffmpeg/scar-batch-ffmpeg-split/image-output \ No newline at end of file diff --git a/examples/video-process/scar-video-process.yaml b/examples/video-process/scar-video-process.yaml new file mode 100644 index 00000000..21b0672c --- /dev/null +++ b/examples/video-process/scar-video-process.yaml @@ -0,0 +1,26 @@ +functions: + aws: + - lambda: + name: scar-batch-ffmpeg-split + init_script: split-video.sh + execution_mode: batch + container: + image: grycap/ffmpeg + input: + - storage_provider: s3 + path: scar-video/input + output: + - storage_provider: s3 + path: scar-video/split-images + - lambda: + name: scar-lambda-darknet + init_script: yolo-sample-object-detection.sh + memory: 3008 + container: + image: grycap/darknet + input: + - storage_provider: s3 + path: scar-video/split-images + output: + - storage_provider: s3 + path: scar-video/output diff --git a/fdl-example.yaml b/fdl-example.yaml new file mode 100644 index 00000000..7a2dc457 --- /dev/null +++ b/fdl-example.yaml @@ -0,0 +1,190 @@ +# This file shows all the possible values that can be defined to configure the functions and their linked services +# Most of these values are already defined in the SCAR default configuration file +# The values defined in a configuration file are only applied to the function or functions being deployed +# To permanently override some of these values and apply them to all the deployed functions, please edit the SCAR default configuration file in ~/.scar/scar.cfg +# ---------------------------------------------------------------------------------------------------------------- +functions: + # Define different providers under this property. Only 'aws' is supported + aws: + # Define a list of functions under this property. + # You can define the function properties and all its related services + # Possible values are 'lambda', 'iam', 'api_gateway', 'cloudwatch', 'batch' + # REQUIRED 'lambda' + - lambda: + # Boto profile used for the lambda client + # Default 'default' + # Must match the profiles in the file ~/.aws/credentials + boto_profile: default + # Region of the function, can be any region supported by AWS + # Default 'us-east-1' + region: us-east-1 + # Function's name + # REQUIRED + name: function1 + # Memory of the function, in MB, min 128, max 3008.
Default '512' + memory: 1024 + # Maximum execution time in seconds, max 900. Default '300' + timeout: 300 + # Set job delegation or not + # Possible values 'lambda', 'lambda-batch', 'batch' + # Default 'lambda' + execution_mode: lambda + description: Automatically generated lambda function + # Supervisor log level + # Can be INFO, DEBUG, ERROR, WARNING + # Default 'INFO' + log_level: INFO + # Lambda function's layer ARNs (max 4). + # SCAR adds the supervisor layer automatically + layers: + - arn:.... + # Environment variables of the function + # These variables are used in the lambda's environment, not the container's environment + environment: + Variables: + KEY1: val1 + KEY2: val2 + # Script executed inside the function's container + init_script: ffmpeg-script.sh + # Define udocker container properties + container: + # Container image to use. REQUIRED + image: jrottenberg/ffmpeg:4.1-ubuntu + # Time used to post-process data generated by the container + # This time is subtracted from the total time set for the function + # If there are a lot of files to upload as output, this value may need to be increased + # Default '10' seconds + timeout_threshold: 10 + # Environment variables of the container + # These variables are passed to the container environment, that is, they can be accessed from the user's script + environment: + Variables: + KEY1: val1 + KEY2: val2 + # Define input storage providers linked with the function + input: + # Storage type + # Possible values 'minio', 's3', 'onedata' + - storage_provider: minio + # Complete path of the bucket with folders (if any) + path: my-bucket/test + # Define output storage providers linked with the function + output: + - storage_provider: s3 + path: my-bucket/test-output + # Define optional filters to upload the output files based on prefix or suffix + # Possible values 'prefix', 'suffix' + suffix: + # List of suffixes to filter (can be any string) + - wav + - srt + prefix: + # List of prefixes to filter (can be any string) + - result- + # Properties for the faas-supervisor used inside the lambda function + supervisor: + # Must be a GitHub tag or "latest". Default 'latest' + version: latest + + + # Set IAM properties + iam: + boto_profile: default + # The Amazon Resource Name (ARN) of the function's execution role. + # This value is usually set for all the functions in SCAR's default configuration file + # REQUIRED + role: "" + + + # Set API Gateway properties + # All these values are set by default + api_gateway: + boto_profile: default + region: us-east-1 + + + # Set CloudWatch properties + # All these values are set by default + cloudwatch: + boto_profile: default + region: us-east-1 + # Number of days that the function logs are stored + log_retention_policy_in_days: 30 + + + # Set AWS Batch properties. + # Only used when the execution mode in 'lambda' is set to 'lambda-batch' or 'batch' + batch: + boto_profile: default + region: us-east-1 + # The number of vCPUs reserved for the container + # Used in the job definition + # Default 1 + vcpus: 1 + # The hard limit (in MiB) of memory to present to the container + # Used in the job definition + # Default 1024 + memory: 1024 + # Request GPU resources for the launched container + # Default 'False'. Values 'False', 'True' + enable_gpu: False + # The full ARN of the IAM role that allows AWS Batch to make calls to other AWS services on your behalf. + service_role: "arn:..."
+ # Environment variables passed to the batch container + environment: + Variables: + KEY1: val1 + KEY2: val2 + compute_resources: + # List of the Amazon EC2 security groups associated with instances launched in the compute environment + # REQUIRED when using batch + security_group_ids: + - sg-12345678 + # The desired number of Amazon EC2 vCPUs in the compute environment + # Default 0 + desired_v_cpus: 0 + # The minimum number of Amazon EC2 vCPUs that an environment should maintain + # Default 0 + min_v_cpus: 0 + # The maximum number of Amazon EC2 vCPUs that an environment can reach + # Default 2 + max_v_cpus: 2 + # List of the VPC subnets into which the compute resources are launched. + # REQUIRED when using batch + subnets: + - subnet-12345 + - subnet-67891 + # The instance types that may be launched. + # You can specify instance families to launch any instance type within those families (for example, c5 or p3), or you can specify specific sizes within a family (such as c5.8xlarge). + # You can also choose optimal to pick instance types (from the C, M, and R instance families) on the fly that match the demand of your job queues. + # Default 'm3.medium' + instance_types: + - "m3.medium" + # The Amazon ECS instance profile applied to Amazon EC2 instances in a compute environment + instance_role: "arn:..." + + +# Define different storage provider connections. Supported 's3', 'minio', 'onedata' +# If you use default S3 storage with the default boto configuration, these properties are not needed. +storage_providers: + # Define S3 properties + # If used, REQUIRED properties are 'access_key', 'secret_key' + # The supervisor will try to create the boto3 client using the function permissions (in AWS Lambda environment) + s3: + access_key: awsuser + secret_key: awskey + region: us-east-1 + # Define minio properties + # If used, REQUIRED properties are 'access_key', 'secret_key' + minio: + endpoint: minio-endpoint + verify: True + region: us-east-1 + access_key: muser + secret_key: mpass + # Define onedata properties + # If used, REQUIRED properties are 'oneprovider_host', 'token', 'space' + onedata: + oneprovider_host: op-host + token: mytoken + space: onedata_space diff --git a/scar/cmdtemplate.py b/scar/cmdtemplate.py index 61f43695..a6009c2a 100644 --- a/scar/cmdtemplate.py +++ b/scar/cmdtemplate.py @@ -19,7 +19,6 @@ class CallType(Enum): INIT = "init" INVOKE = "invoke" RUN = "run" - UPDATE = "update" LS = "ls" RM = "rm" LOG = "log" @@ -42,10 +41,6 @@ def invoke(self): def run(self): pass - @abc.abstractmethod - def update(self): - pass - @abc.abstractmethod def ls(self): pass @@ -65,7 +60,3 @@ def put(self): @abc.abstractmethod def get(self): pass - - @abc.abstractmethod - def parse_arguments(self, args): - pass diff --git a/scar/exceptions.py b/scar/exceptions.py index db465cb2..e9fd04a1 100644 --- a/scar/exceptions.py +++ b/scar/exceptions.py @@ -103,6 +103,15 @@ class YamlFileNotFoundError(ScarError): fmt = "Unable to find the yaml file '{file_path}'" +class FdlFileNotFoundError(ScarError): + """ + The configuration file does not exist + + :ivar file_path: Path of the file + """ + fmt = "Unable to find the configuration file '{file_path}'" + + class ValidatorError(ScarError): """ An error occurred when validating a parameter @@ -156,6 +165,23 @@ class GitHubTagNotFoundError(ScarError): fmt = "The tag '{tag}' was not found in the GitHub repository."
+class StorageProviderNotSupportedError(ScarError): + """ + The storage provider parsed is not supported + + :ivar provider: Provider specified + """ + fmt = "The storage provider '{provider}' is not supported." + + +class AuthenticationVariableNotSupportedError(ScarError): + """ + The authentication variable parsed is not supported + + :ivar auth_var: Authentication variable specified + """ + fmt = "The authentication variable '{auth_var}' is not supported." + ################################################ # # LAMBDA EXCEPTIONS ## ################################################ @@ -253,6 +279,7 @@ class InvocationPayloadError(ScarError): "Check AWS Lambda invocation limits in : " "https://docs.aws.amazon.com/lambda/latest/dg/limits.html") + class NotExistentApiGatewayWarning(ScarError): """ The API with the id 'restApiId' was not found. @@ -261,6 +288,7 @@ class NotExistentApiGatewayWarning(ScarError): """ fmt = "The requested API '{restApiId}' does not exist." + ################################################ # # IAM EXCEPTIONS ## ################################################ diff --git a/scar/parser/cfgfile.py b/scar/parser/cfgfile.py index ced73ee2..99cffbc0 100644 --- a/scar/parser/cfgfile.py +++ b/scar/parser/cfgfile.py @@ -21,41 +21,119 @@ _DEFAULT_CFG = { "scar": { - # Must be a tag or "latest" - "supervisor_version": "latest", - "config_version": "1.0.0" + "config_version": "1.0.9" }, "aws": { - "boto_profile": "default", - "region": "us-east-1", - "execution_mode": "lambda", - "iam": {"role": ""}, + "iam": {"boto_profile": "default", + "role": ""}, "lambda": { - "time": 300, + "boto_profile": "default", + "region": "us-east-1", + "execution_mode": "lambda", + "timeout": 300, "memory": 512, "description": "Automatically generated lambda function", - "timeout_threshold": 10, - "runtime": "python3.6", - "max_payload_size": 52428800, - "max_s3_payload_size": 262144000, - "layers": [] + "runtime": "python3.7", + "layers": [], + "invocation_type": "RequestResponse", + "asynchronous": False, + "log_type": "Tail", + "log_level": "INFO", + "environment": { + "Variables": { + "UDOCKER_BIN" : "/opt/udocker/bin/", + "UDOCKER_LIB" : "/opt/udocker/lib/", + "UDOCKER_DIR" : "/tmp/shared/udocker", + "UDOCKER_EXEC": "/opt/udocker/udocker.py"}}, + "deployment": { + "max_payload_size": 52428800, + "max_s3_payload_size": 262144000 + }, + "container": { + "environment" : { + "Variables" : {}}, + "timeout_threshold": 10 + }, + # Must be a Github tag or "latest" + "supervisor": { + "version": "latest", + 'layer_name' : "faas-supervisor", + 'license_info' : 'Apache 2.0' + } + }, + "s3": { + "boto_profile": "default", + "region": "us-east-1", + "event" : { + "Records": [{ + "eventSource": "aws:s3", + "s3" : { + "bucket" : { + "name": "{bucket_name}", + "arn": "arn:aws:s3:::{bucket_name}" + }, + "object" : { + "key": "{file_key}" + } + } + }] + } + }, + "api_gateway": { + "boto_profile": "default", + "region": "us-east-1", + "endpoint": "https://{api_id}.execute-api.{api_region}.amazonaws.com/{stage_name}/launch", + 'request_parameters': {"integration.request.header.X-Amz-Invocation-Type": + "method.request.header.X-Amz-Invocation-Type"}, + # ANY, DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT + 'http_method': "ANY", + "method" : { + # NONE, AWS_IAM, CUSTOM, COGNITO_USER_POOLS + 'authorizationType': "NONE", + 'requestParameters': {'method.request.header.X-Amz-Invocation-Type' : False}, + }, + "integration": { + # 'HTTP'|'AWS'|'MOCK'|'HTTP_PROXY'|'AWS_PROXY' + 'type': "AWS_PROXY", + 
'integrationHttpMethod' : "POST", + 'uri' : "arn:aws:apigateway:{api_region}:lambda:path/2015-03-31/functions/arn:aws:lambda:{lambda_region}:{account_id}:function:{function_name}/invocations", + 'requestParameters' : {"integration.request.header.X-Amz-Invocation-Type": + "method.request.header.X-Amz-Invocation-Type"} + }, + 'path_part': "{proxy+}", + 'stage_name': "scar", + # Used to add invocation permissions to lambda + 'service_id': 'apigateway.amazonaws.com', + 'source_arn_testing': 'arn:aws:execute-api:{api_region}:{account_id}:{api_id}/*', + 'source_arn_invocation': 'arn:aws:execute-api:{api_region}:{account_id}:{api_id}/{stage_name}/ANY' + }, + "cloudwatch": { + "boto_profile": "default", + "region": "us-east-1", + "log_retention_policy_in_days": 30 }, - "cloudwatch": {"log_retention_policy_in_days": 30}, "batch": { + "boto_profile": "default", + "region": "us-east-1", "vcpus": 1, "memory": 1024, "enable_gpu": False, + "state": "ENABLED", + "type": "MANAGED", + "environment" : { + "Variables" : {}}, "compute_resources": { - "state": "ENABLED", - "type": "MANAGED", "security_group_ids": [], - "comp_type": "EC2", + "type": "EC2", "desired_v_cpus": 0, "min_v_cpus": 0, "max_v_cpus": 2, "subnets": [], - "instance_types": ["m3.medium"] - } + "instance_types": ["m3.medium"], + "launch_template_name": "faas-supervisor", + "instance_role": "arn:aws:iam::{account_id}:instance-profile/ecsInstanceRole" + }, + "service_role": "arn:aws:iam::{account_id}:role/service-role/AWSBatchServiceRole" } } } @@ -86,7 +164,7 @@ def _is_config_file_updated(self): if 'config_version' not in self.cfg_data['scar']: return False return StrUtils.compare_versions(self.cfg_data.get('scar', {}).get("config_version", ""), - _DEFAULT_CFG['scar']["config_version"]) <= 0 + _DEFAULT_CFG['scar']["config_version"]) >= 0 def get_properties(self): """Returns the configuration data of the configuration file.""" @@ -114,30 +192,3 @@ def _update_config_file(self): logger.info((f"New configuration file saved in '{self.config_file_path}'.\n" "Please fill your new configuration file with your account information.")) SysUtils.finish_scar_execution() - -# self._merge_files(self.cfg_data, _DEFAULT_CFG) -# self._delete_unused_data() -# with open(self.config_file_path, mode='w') as cfg_file: -# cfg_file.write(json.dumps(self.cfg_data, indent=2)) - -# def _add_missing_attributes(self): -# logger.info("Updating old scar config file '{0}'.\n".format(self.config_file_path)) -# FileUtils.copy_file(self.config_file_path, self.backup_file_path) -# logger.info("Old scar config file saved in '{0}'.\n".format(self.backup_file_path)) -# self._merge_files(self.cfg_data, _DEFAULT_CFG) -# self._delete_unused_data() -# with open(self.config_file_path, mode='w') as cfg_file: -# cfg_file.write(json.dumps(self.cfg_data, indent=2)) -# -# def _merge_files(self, cfg_data, default_data): -# for key, val in default_data.items(): -# if key not in cfg_data: -# cfg_data[key] = val -# elif isinstance(cfg_data[key], dict): -# self._merge_files(cfg_data[key], default_data[key]) -# -# def _delete_unused_data(self): -# if 'region' in self.cfg_data['aws']['lambda']: -# region = self.cfg_data['aws']['lambda'].pop('region', None) -# if region: -# self.cfg_data['aws']['region'] = region diff --git a/scar/parser/cli.py b/scar/parser/cli.py deleted file mode 100644 index c9f519b3..00000000 --- a/scar/parser/cli.py +++ /dev/null @@ -1,246 +0,0 @@ -# Copyright (C) GRyCAP - I3M - UPV -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this 
file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import argparse -import sys -import scar.exceptions as excp -import scar.logger as logger -from scar.utils import DataTypesUtils -import scar.version as version - - -class CommandParser(object): - - def __init__(self, scar_cli): - self.scar_cli = scar_cli - self.create_parser() - self.create_parent_parsers() - self.add_subparsers() - - def create_parser(self): - self.parser = argparse.ArgumentParser(prog="scar", - description="Deploy containers in serverless architectures", - epilog="Run 'scar COMMAND --help' for more information on a command.") - self.parser.add_argument('--version', help='Show SCAR version.', dest="version", action="store_true", default=False) - - def create_parent_parsers(self): - self.create_function_definition_parser() - self.create_exec_parser() - self.create_output_parser() - self.create_profile_parser() - self.create_storage_parser() - - def create_function_definition_parser(self): - self.function_definition_parser = argparse.ArgumentParser(add_help=False) - self.function_definition_parser.add_argument("-d", "--description", help="Lambda function description.") - self.function_definition_parser.add_argument("-e", "--environment", action='append', help="Pass environment variable to the container (VAR=val). Can be defined multiple times.") - self.function_definition_parser.add_argument("-le", "--lambda-environment", action='append', help="Pass environment variable to the lambda function (VAR=val). Can be defined multiple times.") - self.function_definition_parser.add_argument("-m", "--memory", type=int, help="Lambda function memory in megabytes. Range from 128 to 3008 in increments of 64") - self.function_definition_parser.add_argument("-t", "--time", type=int, help="Lambda function maximum execution time in seconds. Max 900.") - self.function_definition_parser.add_argument("-tt", "--timeout-threshold", type=int, help="Extra time used to postprocess the data. This time is extracted from the total time of the lambda function.") - self.function_definition_parser.add_argument("-ll", "--log-level", help="Set the log level of the lambda function. Accepted values are: 'CRITICAL','ERROR','WARNING','INFO','DEBUG'", default="INFO") - self.function_definition_parser.add_argument("-l", "--layers", action='append', help="Pass layers ARNs to the lambda function. Can be defined multiple times.") - self.function_definition_parser.add_argument("-ib", "--input-bucket", help="Bucket name where the input files will be stored.") - self.function_definition_parser.add_argument("-ob", "--output-bucket", help="Bucket name where the output files are saved.") - self.function_definition_parser.add_argument("-em", "--execution-mode", help="Specifies the execution mode of the job. It can be 'lambda', 'lambda-batch' or 'batch'") - self.function_definition_parser.add_argument("-r", "--iam-role", help="IAM role used in the management of the functions") - self.function_definition_parser.add_argument("-sv", "--supervisor-version", help="FaaS Supervisor version. 
Can be a tag or 'latest'.") - # Batch (job definition) options - self.function_definition_parser.add_argument("-bm", "--batch-memory", help="Batch job memory in megabytes") - self.function_definition_parser.add_argument("-bc", "--batch-vcpus", help="Number of vCPUs reserved for the Batch container") - self.function_definition_parser.add_argument("-g", "--enable-gpu", help="Reserve one physical GPU for the Batch container (if it's available in the compute environment)", action="store_true") - - def create_exec_parser(self): - self.exec_parser = argparse.ArgumentParser(add_help=False) - self.exec_parser.add_argument("-a", "--asynchronous", help="Launch an asynchronous function.", action="store_true") - self.exec_parser.add_argument("-o", "--output-file", help="Save output as a file") - - def create_output_parser(self): - self.output_parser = argparse.ArgumentParser(add_help=False) - self.output_parser.add_argument("-j", "--json", help="Return data in JSON format", action="store_true") - self.output_parser.add_argument("-v", "--verbose", help="Show the complete aws output in json format", action="store_true") - - def create_profile_parser(self): - self.profile_parser = argparse.ArgumentParser(add_help=False) - self.profile_parser.add_argument("-pf", "--profile", help="AWS profile to use") - - def create_storage_parser(self): - self.storage_parser = argparse.ArgumentParser(add_help=False) - self.storage_parser.add_argument("-b", "--bucket", help="Bucket to use as storage", required=True) - self.storage_parser.add_argument("-p", "--path", help="Path of the file or folder", required=True) - - def add_subparsers(self): - self.subparsers = self.parser.add_subparsers(title='Commands') - self.add_init_parser() - self.add_invoke_parser() - self.add_run_parser() - self.add_update_parser() - self.add_rm_parser() - self.add_ls_parser() - self.add_log_parser() - self.add_put_parser() - self.add_get_parser() - - def add_init_parser(self): - parser_init = self.subparsers.add_parser('init', parents=[self.function_definition_parser, self.output_parser, self.profile_parser], help="Create lambda function") - # Set default function - parser_init.set_defaults(func=self.scar_cli.init) - # Lambda conf - group = parser_init.add_mutually_exclusive_group(required=True) - group.add_argument("-i", "--image", help="Container image id (i.e. centos:7)") - group.add_argument("-if", "--image-file", help="Container image file created with 'docker save' (i.e. 
centos.tar.gz)") - group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") - parser_init.add_argument("-n", "--name", help="Lambda function name") - parser_init.add_argument("-s", "--init-script", help="Path to the input file passed to the function") - parser_init.add_argument("-ph", "--preheat", help="Preheats the function running it once and downloading the necessary container", action="store_true") - parser_init.add_argument("-ep", "--extra-payload", help="Folder containing files that are going to be added to the lambda function") - parser_init.add_argument("-db", "--deployment-bucket", help="Bucket where the deployment package is going to be uploaded.") - # API Gateway conf - parser_init.add_argument("-api", "--api-gateway-name", help="API Gateway name created to launch the lambda function") - - def add_invoke_parser(self): - parser_invoke = self.subparsers.add_parser('invoke', parents=[self.profile_parser, self.exec_parser], help="Call a lambda function using an HTTP request") - # Set default function - parser_invoke.set_defaults(func=self.scar_cli.invoke) - group = parser_invoke.add_mutually_exclusive_group(required=True) - group.add_argument("-n", "--name", help="Lambda function name") - group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") - parser_invoke.add_argument("-db", "--data-binary", help="File path of the HTTP data to POST.") - parser_invoke.add_argument("-jd", "--json-data", help="JSON Body to Post") - parser_invoke.add_argument("-p", "--parameters", help="In addition to passing the parameters in the URL, you can pass the parameters here (i.e. '{\"key1\": \"value1\", \"key2\": [\"value2\", \"value3\"]}').") - - def add_update_parser(self): - parser_update = self.subparsers.add_parser('update', parents=[self.function_definition_parser, self.output_parser, self.profile_parser], help="Update function properties") - parser_update.set_defaults(func=self.scar_cli.update) - group = parser_update.add_mutually_exclusive_group(required=True) - group.add_argument("-n", "--name", help="Lambda function name") - group.add_argument("-a", "--all", help="Update all lambda functions", action="store_true") - group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") - - def add_run_parser(self): - parser_run = self.subparsers.add_parser('run', parents=[self.output_parser, self.profile_parser, self.exec_parser], help="Deploy function") - parser_run.set_defaults(func=self.scar_cli.run) - group = parser_run.add_mutually_exclusive_group(required=True) - group.add_argument("-n", "--name", help="Lambda function name") - group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") - parser_run.add_argument("-s", "--run-script", help="Path to the input file passed to the function") - parser_run.add_argument('c_args', nargs=argparse.REMAINDER, help="Arguments passed to the container.") - - def add_rm_parser(self): - parser_rm = self.subparsers.add_parser('rm', parents=[self.output_parser, self.profile_parser], help="Delete function") - parser_rm.set_defaults(func=self.scar_cli.rm) - group = parser_rm.add_mutually_exclusive_group(required=True) - group.add_argument("-n", "--name", help="Lambda function name") - group.add_argument("-a", "--all", help="Delete all lambda functions", action="store_true") - group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") - - def add_log_parser(self): - parser_log = self.subparsers.add_parser('log', 
parents=[self.profile_parser], help="Show the logs for the lambda function") - parser_log.set_defaults(func=self.scar_cli.log) - group = parser_log.add_mutually_exclusive_group(required=True) - group.add_argument("-n", "--name", help="Lambda function name") - group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") - # CloudWatch args - parser_log.add_argument("-ls", "--log-stream-name", help="Return the output for the log stream specified.") - parser_log.add_argument("-ri", "--request-id", help="Return the output for the request id specified.") - - def add_ls_parser(self): - parser_ls = self.subparsers.add_parser('ls', parents=[self.output_parser, self.profile_parser], help="List lambda functions") - parser_ls.set_defaults(func=self.scar_cli.ls) - # S3 args - parser_ls.add_argument("-b", "--bucket", help="Show bucket files") - # Layer args - parser_ls.add_argument("-l", "--list-layers", help="Show lambda layers information", action="store_true") - - def add_put_parser(self): - parser_put = self.subparsers.add_parser('put', parents=[self.storage_parser, self.profile_parser], help="Upload file(s) to bucket") - parser_put.set_defaults(func=self.scar_cli.put) - - def add_get_parser(self): - parser_get = self.subparsers.add_parser('get', parents=[self.storage_parser, self.profile_parser], help="Download file(s) from bucket") - parser_get.set_defaults(func=self.scar_cli.get) - - @excp.exception(logger) - def parse_arguments(self): - """Command parsing and selection""" - try: - cmd_args = self.parser.parse_args() - if cmd_args.version: - print(f"SCAR {version.__version__}") - sys.exit(0) - - cmd_args = vars(cmd_args) - if 'func' not in cmd_args: - raise excp.MissingCommandError() - scar_args = self.parse_scar_args(cmd_args) - aws_args = self.parse_aws_args(cmd_args) - return DataTypesUtils.merge_dicts(scar_args, aws_args) - except AttributeError as aerr: - logger.error("Incorrect arguments: use scar -h to see the options available", - f"Error parsing arguments: {aerr}") - else: - raise - - def set_args(self, args, key, val): - if key and val: - args[key] = val - - def parse_aws_args(self, cmd_args): - aws_args = {} - other_args = [('profile', 'boto_profile'), 'region', 'execution_mode'] - self.set_args(aws_args, 'iam', self.parse_iam_args(cmd_args)) - self.set_args(aws_args, 'lambda', self.parse_lambda_args(cmd_args)) - self.set_args(aws_args, 'batch', self.parse_batch_args(cmd_args)) - self.set_args(aws_args, 'cloudwatch', self.parse_cloudwatchlogs_args(cmd_args)) - self.set_args(aws_args, 's3', self.parse_s3_args(cmd_args)) - self.set_args(aws_args, 'api_gateway', self.parse_api_gateway_args(cmd_args)) - aws_args.update(DataTypesUtils.parse_arg_list(other_args, cmd_args)) - return {'aws' : aws_args} - - def parse_scar_args(self, cmd_args): - scar_args = ['func', 'conf_file', 'json', - 'verbose', 'path', - 'preheat', 'execution_mode', - 'output_file', 'supervisor_version'] - return {'scar' : DataTypesUtils.parse_arg_list(scar_args, cmd_args)} - - def parse_lambda_args(self, cmd_args): - lambda_args = ['name', 'asynchronous', 'init_script', 'run_script', 'c_args', 'memory', 'time', - 'timeout_threshold', 'log_level', 'image', 'image_file', 'description', - 'lambda_role', 'extra_payload', ('environment', 'environment_variables'), - 'layers', 'lambda_environment', 'list_layers', 'all'] - return DataTypesUtils.parse_arg_list(lambda_args, cmd_args) - - def parse_batch_args(self, cmd_args): - batch_args = [('batch_vcpus', 'vcpus'), ('batch_memory', 'memory'), 
'enable_gpu'] - return DataTypesUtils.parse_arg_list(batch_args, cmd_args) - - def parse_iam_args(self, cmd_args): - iam_args = [('iam_role', 'role')] - return DataTypesUtils.parse_arg_list(iam_args, cmd_args) - - def parse_cloudwatchlogs_args(self, cmd_args): - cw_log_args = ['log_stream_name', 'request_id'] - return DataTypesUtils.parse_arg_list(cw_log_args, cmd_args) - - def parse_api_gateway_args(self, cmd_args): - api_gtw_args = [('api_gateway_name', 'name'), 'parameters', 'data_binary', 'json_data'] - return DataTypesUtils.parse_arg_list(api_gtw_args, cmd_args) - - def parse_s3_args(self, cmd_args): - s3_args = ['deployment_bucket', - 'input_bucket', - 'output_bucket', - ('bucket', 'input_bucket')] - return DataTypesUtils.parse_arg_list(s3_args, cmd_args) diff --git a/scar/parser/cli/__init__.py b/scar/parser/cli/__init__.py new file mode 100644 index 00000000..f6827290 --- /dev/null +++ b/scar/parser/cli/__init__.py @@ -0,0 +1,196 @@ +# Copyright (C) GRyCAP - I3M - UPV +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +"""Module with methods and classes in charge +of parsing the SCAR CLI commands.""" + +import argparse +import sys +from typing import Dict, List +from scar.parser.cli.subparsers import Subparsers +from scar.parser.cli.parents import * +from scar.utils import DataTypesUtils, StrUtils, FileUtils +from scar.cmdtemplate import CallType +import scar.exceptions as excp +import scar.logger as logger +import scar.version as version + + +def _parse_aws_args(cmd_args: Dict) -> Dict: + aws_args = {} + other_args = [('profile', 'boto_profile'), 'region', 'execution_mode'] + _set_args(aws_args, 'iam', _parse_iam_args(cmd_args)) + _set_args(aws_args, 'lambda', _parse_lambda_args(cmd_args)) + _set_args(aws_args, 'batch', _parse_batch_args(cmd_args)) + _set_args(aws_args, 'cloudwatch', _parse_cloudwatchlogs_args(cmd_args)) + _set_args(aws_args, 'api_gateway', _parse_api_gateway_args(cmd_args)) + aws_args.update(DataTypesUtils.parse_arg_list(other_args, cmd_args)) + storage = _parse_s3_args(aws_args, cmd_args) + result = {'functions': {'aws': [aws_args]}} + if storage: + result.update(storage) + return result + + +def _set_args(args: Dict, key: str, val: str) -> None: + if key and val: + args[key] = val + + +def _parse_scar_args(cmd_args: Dict) -> Dict: + scar_args = ['conf_file', 'json', 'verbose', 'path', 'execution_mode', + 'output_file', 'supervisor_version', 'all'] + return {'scar' : DataTypesUtils.parse_arg_list(scar_args, cmd_args)} + + +def _parse_iam_args(cmd_args: Dict) -> Dict: + iam_args = [('iam_role', 'role')] + return DataTypesUtils.parse_arg_list(iam_args, cmd_args) + + +def _parse_lambda_args(cmd_args: Dict) -> Dict: + lambda_arg_list = ['name', 'asynchronous', 'init_script', 'run_script', 'c_args', 'memory', + 'timeout', 'timeout_threshold', 'image', 'image_file', 'description', + 'lambda_role', 'extra_payload', ('environment', 'environment_variables'), + 'layers', 'lambda_environment', 'list_layers', 'log_level', 'preheat'] + lambda_args = 
DataTypesUtils.parse_arg_list(lambda_arg_list, cmd_args) + # Standardize log level if defined + if "log_level" in lambda_args: + lambda_args['log_level'] = lambda_args['log_level'].upper() + # Parse environment variables + lambda_args.update(_get_lambda_environment_variables(lambda_args)) + return lambda_args + + +def _get_lambda_environment_variables(lambda_args: Dict) -> None: + lambda_env_vars = {"environment": {"Variables": {}}, + "container": {'environment' : {"Variables": {}}}} + if "environment_variables" in lambda_args: + # These variables define the udocker container environment variables + for env_var in lambda_args["environment_variables"]: + key_val = env_var.split("=") + # Add an specific prefix to be able to find the container variables defined by the user + lambda_env_vars['container']['environment']['Variables'][f'{key_val[0]}'] = key_val[1] + del(lambda_args['environment_variables']) + if "extra_payload" in lambda_args: + lambda_env_vars['container']['extra_payload'] = f"/var/task" + if "init_script" in lambda_args: + lambda_env_vars['container']['init_script'] = f"/var/task/{FileUtils.get_file_name(lambda_args['init_script'])}" + if "image" in lambda_args: + lambda_env_vars['container']['image'] = lambda_args.get('image') + del(lambda_args['image']) + if "image_file" in lambda_args: + lambda_env_vars['container']['image_file'] = lambda_args.get('image_file') + del(lambda_args['image_file']) + + if "lambda_environment" in lambda_args: + # These variables define the lambda environment variables + for env_var in lambda_args["lambda_environment"]: + key_val = env_var.split("=") + lambda_env_vars['environment']['Variables'][f'{key_val[0]}'] = key_val[1] + del(lambda_args['lambda_environment']) + return lambda_env_vars + + +def _parse_batch_args(cmd_args: Dict) -> Dict: + batch_args = [('batch_vcpus', 'vcpus'), ('batch_memory', 'memory'), 'enable_gpu'] + return DataTypesUtils.parse_arg_list(batch_args, cmd_args) + + +def _parse_cloudwatchlogs_args(cmd_args: Dict) -> Dict: + cw_log_args = ['log_stream_name', 'request_id'] + return DataTypesUtils.parse_arg_list(cw_log_args, cmd_args) + + +def _parse_s3_args(aws_args: Dict, cmd_args: Dict) -> Dict: + s3_arg_list = ['deployment_bucket', + 'input_bucket', + 'output_bucket', + ('bucket', 'input_bucket')] + + s3_args = DataTypesUtils.parse_arg_list(s3_arg_list, cmd_args) + storage = {} + if s3_args: + if 'deployment_bucket' in s3_args: + aws_args['lambda']['deployment'] = {'bucket': s3_args['deployment_bucket']} + if 'input_bucket' in s3_args: + aws_args['lambda']['input'] = [{'storage_provider': 's3', 'path': s3_args['input_bucket']}] + if 'output_bucket' in s3_args: + aws_args['lambda']['output'] = [{'storage_provider': 's3', 'path': s3_args['output_bucket']}] + storage['storage_providers'] = {'s3': {}} + return storage + + +def _parse_api_gateway_args(cmd_args: Dict) -> Dict: + api_gtw_args = [('api_gateway_name', 'name'), 'parameters', 'data_binary', 'json_data'] + return DataTypesUtils.parse_arg_list(api_gtw_args, cmd_args) + + +def _create_main_parser(): + parser = argparse.ArgumentParser(prog="scar", + description=("Deploy containers " + "in serverless architectures"), + epilog=("Run 'scar COMMAND --help' " + "for more information on a command.")) + parser.add_argument('--version', + help='Show SCAR version.', + dest="version", + action="store_true", + default=False) + return parser + + +def _create_parent_parsers() -> Dict: + parsers = {} + parsers['function_definition_parser'] = create_function_definition_parser() + 
parsers['exec_parser'] = create_exec_parser() + parsers['output_parser'] = create_output_parser() + parsers['profile_parser'] = create_profile_parser() + parsers['storage_parser'] = create_storage_parser() + return parsers + + +class CommandParser(): + """Class to manage the SCAR CLI commands.""" + + def __init__(self): + self.parser = _create_main_parser() + self.parent_parsers = _create_parent_parsers() + self._add_subparsers() + + def _add_subparsers(self) -> None: + subparsers = Subparsers(self.parser.add_subparsers(title='Commands'), self.parent_parsers) + # We need to define a subparser for each command defined in the CallType class + for cmd in CallType: + subparsers.add_subparser(cmd.value) + + @excp.exception(logger) + def parse_arguments(self) -> Dict: + """Command parsing and selection""" + try: + cmd_args = self.parser.parse_args() + if cmd_args.version: + print(f"SCAR {version.__version__}") + sys.exit(0) + cmd_args = vars(cmd_args) + if 'func' not in cmd_args: + raise excp.MissingCommandError() + scar_args = _parse_scar_args(cmd_args) + aws_args = _parse_aws_args(cmd_args) + return cmd_args['func'], DataTypesUtils.merge_dicts_with_copy(scar_args, aws_args) + except AttributeError as aerr: + logger.error("Incorrect arguments: use scar -h to see the options available", + f"Error parsing arguments: {aerr}") + except: + print("Unexpected error:", sys.exc_info()[0]) + raise diff --git a/scar/parser/cli/parents.py b/scar/parser/cli/parents.py new file mode 100644 index 00000000..3639f2f1 --- /dev/null +++ b/scar/parser/cli/parents.py @@ -0,0 +1,116 @@ +# Copyright (C) GRyCAP - I3M - UPV +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import argparse + + +def create_function_definition_parser(): + function_definition_parser = argparse.ArgumentParser(add_help=False) + function_definition_parser.add_argument("-d", "--description", + help="Lambda function description.") + function_definition_parser.add_argument("-e", "--environment", + action='append', + help=("Pass environment variable to the container " + "(VAR=val). Can be defined multiple times.")) + function_definition_parser.add_argument("-le", "--lambda-environment", + action='append', + help=("Pass environment variable to the lambda " + "function (VAR=val). Can be defined multiple " + "times.")) + function_definition_parser.add_argument("-m", "--memory", + type=int, + help=("Lambda function memory in megabytes. " + "Range from 128 to 3008 in increments of 64")) + function_definition_parser.add_argument("-t", "--timeout", + type=int, + help=("Lambda function maximum execution " + "time in seconds. Max 900.")) + function_definition_parser.add_argument("-tt", "--timeout-threshold", + type=int, + help=("Extra time used to postprocess the data. " + "This time is extracted from the total " + "time of the lambda function.")) + function_definition_parser.add_argument("-ll", "--log-level", + help=("Set the log level of the lambda function. 
" + "Accepted values are: " + "'CRITICAL','ERROR','WARNING','INFO','DEBUG'")) + function_definition_parser.add_argument("-l", "--layers", + action='append', + help=("Pass layers ARNs to the lambda function. " + "Can be defined multiple times.")) + function_definition_parser.add_argument("-ib", "--input-bucket", + help=("Bucket name where the input files " + "will be stored.")) + function_definition_parser.add_argument("-ob", "--output-bucket", + help=("Bucket name where the output files are saved.")) + function_definition_parser.add_argument("-em", "--execution-mode", + help=("Specifies the execution mode of the job. " + "It can be 'lambda', 'lambda-batch' or 'batch'")) + function_definition_parser.add_argument("-r", "--iam-role", + help=("IAM role used in the management of " + "the functions")) + function_definition_parser.add_argument("-sv", "--supervisor-version", + help=("FaaS Supervisor version. " + "Can be a tag or 'latest'.")) + # Batch (job definition) options + function_definition_parser.add_argument("-bm", "--batch-memory", + help="Batch job memory in megabytes") + function_definition_parser.add_argument("-bc", "--batch-vcpus", + help=("Number of vCPUs reserved for the " + "Batch container")) + function_definition_parser.add_argument("-g", "--enable-gpu", + help=("Reserve one physical GPU for the Batch " + "container (if it's available in the " + "compute environment)"), + action="store_true") + return function_definition_parser + + +def create_exec_parser(): + exec_parser = argparse.ArgumentParser(add_help=False) + exec_parser.add_argument("-a", "--asynchronous", + help="Launch an asynchronous function.", + action="store_true") + exec_parser.add_argument("-o", "--output-file", + help="Save output as a file") + return exec_parser + + +def create_output_parser(): + output_parser = argparse.ArgumentParser(add_help=False) + output_parser.add_argument("-j", "--json", + help="Return data in JSON format", + action="store_true") + output_parser.add_argument("-v", "--verbose", + help="Show the complete aws output in json format", + action="store_true") + return output_parser + + +def create_profile_parser(): + profile_parser = argparse.ArgumentParser(add_help=False) + profile_parser.add_argument("-pf", "--profile", + help="AWS profile to use") + return profile_parser + + +def create_storage_parser(): + storage_parser = argparse.ArgumentParser(add_help=False) + storage_parser.add_argument("-b", "--bucket", + help="Bucket to use as storage", + required=True) + storage_parser.add_argument("-p", "--path", + help="Path of the file or folder", + required=True) + return storage_parser diff --git a/scar/parser/cli/subparsers.py b/scar/parser/cli/subparsers.py new file mode 100644 index 00000000..3578034f --- /dev/null +++ b/scar/parser/cli/subparsers.py @@ -0,0 +1,161 @@ +# Copyright (C) GRyCAP - I3M - UPV +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+import argparse + +FUNCTION_DEFINITION = "function_definition_parser" +OUTPUT = "output_parser" +PROFILE = "profile_parser" +EXEC = "exec_parser" +STORAGE = "storage_parser" + +INIT_PARENTS = [PROFILE, FUNCTION_DEFINITION, OUTPUT] +INVOKE_PARENTS = [PROFILE, EXEC] +RUN_PARENTS = [PROFILE, EXEC, OUTPUT] +RM_LS_PARENTS = [PROFILE, OUTPUT] +LOG_PARENTS = [PROFILE] +PUT_GET_PARENTS = [PROFILE, STORAGE] + + +class Subparsers(): + + def __init__(self, subparser, parents): + self.subparser = subparser + self.parent_parsers = parents + + def _get_parents(self, parent_sublist): + return [self.parent_parsers.get(parent, "") for parent in parent_sublist] + + def add_subparser(self, name): + getattr(self, f'_add_{name}_parser')() + + def _add_init_parser(self): + init = self.subparser.add_parser('init', + parents=self._get_parents(INIT_PARENTS), + help="Create lambda function") + # Set default function + init.set_defaults(func="init") + # Lambda conf + group = init.add_mutually_exclusive_group(required=True) + group.add_argument("-i", "--image", + help="Container image id (i.e. centos:7)") + group.add_argument("-if", "--image-file", + help=("Container image file created with " + "'docker save' (i.e. centos.tar.gz)")) + group.add_argument("-f", "--conf-file", + help="Yaml file with the function configuration") + init.add_argument("-n", "--name", help="Lambda function name") + init.add_argument("-s", "--init-script", help=("Path to the input file " + "passed to the function")) + init.add_argument("-ph", "--preheat", + help=("Invokes the function once and downloads the container"), + action="store_true") + init.add_argument("-ep", "--extra-payload", + help=("Folder containing files that are going to be " + "added to the lambda function")) + init.add_argument("-db", "--deployment-bucket", + help="Bucket where the deployment package is going to be uploaded.") + # API Gateway conf + init.add_argument("-api", "--api-gateway-name", + help="API Gateway name created to launch the lambda function") + + def _add_invoke_parser(self): + invoke = self.subparser.add_parser('invoke', + parents=self._get_parents(INVOKE_PARENTS), + help="Call a lambda function using an HTTP request") + # Set default function + invoke.set_defaults(func='invoke') + group = invoke.add_mutually_exclusive_group(required=True) + group.add_argument("-n", "--name", + help="Lambda function name") + group.add_argument("-f", "--conf-file", + help="Yaml file with the function configuration") + invoke.add_argument("-db", "--data-binary", + help="File path of the HTTP data to POST.") + invoke.add_argument("-jd", "--json-data", + help="JSON Body to Post") + invoke.add_argument("-p", "--parameters", + help=("In addition to passing the parameters in the URL, " + "you can pass the parameters here (i.e. 
'{\"key1\": " + "\"value1\", \"key2\": [\"value2\", \"value3\"]}').")) + + def _add_run_parser(self): + run = self.subparser.add_parser('run', + parents=self._get_parents(RUN_PARENTS), + help="Deploy function") + # Set default function + run.set_defaults(func='run') + group = run.add_mutually_exclusive_group(required=True) + group.add_argument("-n", "--name", help="Lambda function name") + group.add_argument("-f", "--conf-file", help="Yaml file with the function configuration") + run.add_argument("-s", "--run-script", help="Path to the script passed to the function") + run.add_argument("-ib", "--input-bucket", help=("Bucket name with files to launch the function.")) + run.add_argument('c_args', + nargs=argparse.REMAINDER, + help="Arguments passed to the container.") + + def _add_rm_parser(self): + rm = self.subparser.add_parser('rm', + parents=self._get_parents(RM_LS_PARENTS), + help="Delete function") + # Set default function + rm.set_defaults(func='rm') + group = rm.add_mutually_exclusive_group(required=True) + group.add_argument("-n", "--name", + help="Lambda function name") + group.add_argument("-a", "--all", + help="Delete all lambda functions", + action="store_true") + group.add_argument("-f", "--conf-file", + help="Yaml file with the function configuration") + + def _add_log_parser(self): + log = self.subparser.add_parser('log', + parents=self._get_parents(LOG_PARENTS), + help="Show the logs for the lambda function") + # Set default function + log.set_defaults(func='log') + group = log.add_mutually_exclusive_group(required=True) + group.add_argument("-n", "--name", + help="Lambda function name") + group.add_argument("-f", "--conf-file", + help="Yaml file with the function configuration") + # CloudWatch args + log.add_argument("-ls", "--log-stream-name", + help="Return the output for the log stream specified.") + log.add_argument("-ri", "--request-id", + help="Return the output for the request id specified.") + + def _add_ls_parser(self): + ls = self.subparser.add_parser('ls', + parents=self._get_parents(RM_LS_PARENTS), + help="List lambda functions") + # Set default function + ls.set_defaults(func='ls') + # S3 args + ls.add_argument("-b", "--bucket", help="Show bucket files") + + def _add_put_parser(self): + put = self.subparser.add_parser('put', + parents=self._get_parents(PUT_GET_PARENTS), + help="Upload file(s) to bucket") + # Set default function + put.set_defaults(func='put') + + def _add_get_parser(self): + get = self.subparser.add_parser('get', + parents=self._get_parents(PUT_GET_PARENTS), + help="Download file(s) from bucket") + # Set default function + get.set_defaults(func='get') + diff --git a/scar/parser/fdl.py b/scar/parser/fdl.py new file mode 100644 index 00000000..fa4b19c6 --- /dev/null +++ b/scar/parser/fdl.py @@ -0,0 +1,42 @@ +# Copyright (C) GRyCAP - I3M - UPV +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+from typing import Dict +from scar.utils import DataTypesUtils + +FAAS_PROVIDERS = ["aws", "openfaas"] + +def merge_conf(conf: Dict, yaml: Dict) -> Dict: + result = yaml.copy() + # We have to merge the default config with all the defined functions + for provider in FAAS_PROVIDERS: + for index, function in enumerate(result.get('functions', {}).get(provider, {})): + result['functions'][provider][index] = \ + DataTypesUtils.merge_dicts_with_copy(conf.get(provider,{}), function) + result['scar'] = DataTypesUtils.merge_dicts_with_copy(result.get('scar', {}), + conf.get('scar', {})) + return result + +def merge_cmd_yaml(cmd: Dict, yaml: Dict) -> Dict: + result = yaml.copy() + # We merge the cli commands with all the defined functions + # CLI only allows define AWS parameters + for cli_cmd in cmd.get('functions', {}).get("aws", {}): + for index, function in enumerate(result.get('functions', {}).get("aws", {})): + result['functions']['aws'][index] = \ + DataTypesUtils.merge_dicts_with_copy(function, cli_cmd) + result['scar'] = DataTypesUtils.merge_dicts_with_copy(result.get('scar', {}), + cmd.get('scar', {})) + result['storage_providers'] = DataTypesUtils.merge_dicts_with_copy(result.get('storage_providers', {}), + cmd.get('storage_providers', {})) + return result diff --git a/scar/parser/yaml.py b/scar/parser/yaml.py deleted file mode 100644 index bec1e665..00000000 --- a/scar/parser/yaml.py +++ /dev/null @@ -1,57 +0,0 @@ -# Copyright (C) GRyCAP - I3M - UPV -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import yaml -from scar.exceptions import YamlFileNotFoundError -from scar.utils import DataTypesUtils - - -class YamlParser(object): - - def __init__(self, scar_args): - file_path = scar_args['conf_file'] - if os.path.isfile(file_path): - with open(file_path) as cfg_file: - self.yaml_data = yaml.safe_load(cfg_file) - else: - raise YamlFileNotFoundError(file_path=file_path) - - def parse_arguments(self): - functions = [] - for function in self.yaml_data['functions']: - functions.append(self.parse_aws_function(function, self.yaml_data['functions'][function])) - return functions[0] - - def parse_aws_function(self, function_name, function_data): - aws_args = {} - scar_args = {} - # Get function name - aws_args['lambda'] = self.parse_lambda_args(function_data) - aws_args['lambda']['name'] = function_name - aws_services = ['iam', 'cloudwatch', 's3', 'api_gateway', 'batch'] - aws_args.update(DataTypesUtils.parse_arg_list(aws_services, function_data)) - other_args = [('profile', 'boto_profile'), 'region', 'execution_mode'] - aws_args.update(DataTypesUtils.parse_arg_list(other_args, function_data)) - scar_args.update(DataTypesUtils.parse_arg_list(['supervisor_version'], function_data)) - scar = {'scar': scar_args if scar_args else {}} - aws = {'aws': aws_args if aws_args else {}} - return DataTypesUtils.merge_dicts(scar, aws) - - def parse_lambda_args(self, cmd_args): - lambda_args = ['asynchronous', 'init_script', 'run_script', 'c_args', 'memory', 'time', - 'timeout_threshold', 'log_level', 'image', 'image_file', 'description', - 'lambda_role', 'extra_payload', ('environment', 'environment_variables'), - 'layers', 'lambda_environment'] - return DataTypesUtils.parse_arg_list(lambda_args, cmd_args) diff --git a/scar/providers/aws/__init__.py b/scar/providers/aws/__init__.py index 57c60c38..195c17af 100644 --- a/scar/providers/aws/__init__.py +++ b/scar/providers/aws/__init__.py @@ -39,16 +39,18 @@ class GenericClient(): 'S3': S3Client, 'LAUNCHTEMPLATES': EC2Client} - def __init__(self, aws_properties: Dict): - self.aws = aws_properties - - def _get_client_args(self) -> Dict: - return {'client': {'region_name': self.aws.region}, - 'session': {'profile_name': self.aws.boto_profile}} + def __init__(self, resource_info: Dict =None): + self.properties = {} + if resource_info: + region = resource_info.get('region') + if region: + self.properties['client'] = {'region_name': region} + session = resource_info.get('boto_profile') + if session: + self.properties['session'] = {'profile_name': session} @lazy_property def client(self): """Returns the required boto client based on the implementing class name.""" - client_name = self.__class__.__name__.upper() - client = self._CLIENTS[client_name](**self._get_client_args()) + client = self._CLIENTS[self.__class__.__name__.upper()](self.properties) return client diff --git a/scar/providers/aws/apigateway.py b/scar/providers/aws/apigateway.py index 5ff8a213..23910fd9 100644 --- a/scar/providers/aws/apigateway.py +++ b/scar/providers/aws/apigateway.py @@ -17,63 +17,39 @@ from scar.providers.aws import GenericClient import scar.logger as logger -# 'HTTP'|'AWS'|'MOCK'|'HTTP_PROXY'|'AWS_PROXY' -_DEFAULT_TYPE = "AWS_PROXY" -_DEFAULT_INTEGRATION_METHOD = "POST" -_DEFAULT_REQUEST_PARAMETERS = {"integration.request.header.X-Amz-Invocation-Type": - "method.request.header.X-Amz-Invocation-Type"} -# ANY, DELETE, GET, HEAD, OPTIONS, PATCH, POST, PUT -_DEFAULT_HTTP_METHOD = "ANY" -# NONE, AWS_IAM, CUSTOM, COGNITO_USER_POOLS -_DEFAULT_AUTHORIZATION_TYPE = "NONE" 
-_DEFAULT_PATH_PART = "{proxy+}" -_DEFAULT_STAGE_NAME = "scar" - class APIGateway(GenericClient): """Manage the calls to the ApiGateway client.""" - def __init__(self, aws_properties) -> None: - super().__init__(aws_properties) - # {0}: lambda function region, {1}: aws account id, {1}: lambda function name - self.lambda_uri = "arn:aws:lambda:{region}:{acc_id}:function:{lambdaf_name}/invocations" - # {0}: api_region, {1}: lambda_uri - self.uri = "arn:aws:apigateway:{region}:lambda:path/2015-03-31/functions/{lambda_uri}" - # {0}: api_id, {1}: api_region - self.endpoint = "https://{api_id}.execute-api.{region}.amazonaws.com/scar/launch" - - def _get_uri(self) -> str: - lambda_uri_kwargs = {'region': self.aws.region, - 'acc_id': self.aws.account_id, - 'lambdaf_name': self.aws.lambdaf.name} - uri_kwargs = {'region': self.aws.region, - 'lambda_uri': self.lambda_uri.format(**lambda_uri_kwargs)} - return self.uri.format(**uri_kwargs) + def __init__(self, resources_info: Dict): + super().__init__(resources_info.get('api_gateway', {})) + self.resources_info = resources_info + self.api = self.resources_info.get('api_gateway', {}) - def _get_common_args(self, resource_info: Dict) -> Dict: - return {'restApiId' : self.aws.api_gateway.id, - 'resourceId' : resource_info.get('id', ''), - 'httpMethod' : _DEFAULT_HTTP_METHOD} + def _get_common_args(self) -> Dict: + return {'restApiId' : self.api.get('id', ''), + 'resourceId' : self.api.get('resource_id', ''), + 'httpMethod' : self.api.get('http_method', '')} - def _get_method_args(self, resource_info: Dict) -> Dict: - args = {'authorizationType' : _DEFAULT_AUTHORIZATION_TYPE, - 'requestParameters' : {'method.request.header.X-Amz-Invocation-Type' : False}} - method_args = self._get_common_args(resource_info) - method_args.update(args) - return method_args + def _get_method_args(self) -> Dict: + args = self._get_common_args() + args.update(self.api.get('method', {})) + return args - def _get_integration_args(self, resource_info: Dict) -> Dict: - args = {'type' : _DEFAULT_TYPE, - 'integrationHttpMethod' : _DEFAULT_INTEGRATION_METHOD, - 'uri' : self._get_uri(), - 'requestParameters' : _DEFAULT_REQUEST_PARAMETERS} - integration_args = self._get_common_args(resource_info) - integration_args.update(args) - return integration_args + def _get_integration_args(self) -> Dict: + integration_args = self.api.get('integration', {}) + uri_args = {'api_region': self.api.get('region', ''), + 'lambda_region': self.resources_info.get('lambda', {}).get('region', ''), + 'account_id': self.resources_info.get('iam', {}).get('account_id', ''), + 'function_name': self.resources_info.get('lambda', {}).get('name', '')} + integration_args['uri'] = integration_args['uri'].format(**uri_args) + args = self._get_common_args() + args.update(integration_args) + return args def _get_resource_id(self) -> str: res_id = "" - resources_info = self.client.get_resources(self.aws.api_gateway.id) + resources_info = self.client.get_resources(self.api.get('id', '')) for resource in resources_info['items']: if resource['path'] == '/': res_id = resource['id'] @@ -81,24 +57,33 @@ def _get_resource_id(self) -> str: return res_id def _set_api_gateway_id(self, api_info: Dict) -> None: - self.aws.api_gateway.id = api_info.get('id', '') + self.api['id'] = api_info.get('id', '') + # We store the parameter in the lambda configuration that + # is going to be uploaded to the Lambda service + self.resources_info['lambda']['environment']['Variables']['API_GATEWAY_ID'] = api_info.get('id', '') + + def 
_set_resource_info_id(self, resource_info: Dict) -> None: + self.api['resource_id'] = resource_info.get('id', '') def _get_endpoint(self) -> str: - kwargs = {'api_id': self.aws.api_gateway.id, 'region': self.aws.region} - return self.endpoint.format(**kwargs) + endpoint_args = {'api_id': self.api.get('id', ''), + 'api_region': self.api.get('region', ''), + 'stage_name': self.api.get('stage_name', '')} + return self.api.get('endpoint', '').format(**endpoint_args) def create_api_gateway(self) -> None: """Creates an Api Gateway endpoint.""" - api_info = self.client.create_rest_api(self.aws.api_gateway.name) + api_info = self.client.create_rest_api(self.api.get('name', '')) self._set_api_gateway_id(api_info) - resource_info = self.client.create_resource(self.aws.api_gateway.id, + resource_info = self.client.create_resource(self.api.get('id', ''), self._get_resource_id(), - _DEFAULT_PATH_PART) - self.client.create_method(**self._get_method_args(resource_info)) - self.client.set_integration(**self._get_integration_args(resource_info)) - self.client.create_deployment(self.aws.api_gateway.id, _DEFAULT_STAGE_NAME) + self.api.get('path_part', '')) + self._set_resource_info_id(resource_info) + self.client.create_method(**self._get_method_args()) + self.client.set_integration(**self._get_integration_args()) + self.client.create_deployment(self.api.get('id', ''), self.api.get('stage_name', '')) logger.info(f'API Gateway endpoint: {self._get_endpoint()}') - def delete_api_gateway(self, api_gateway_id: str) -> None: + def delete_api_gateway(self) -> None: """Deletes an Api Gateway endpoint.""" - return self.client.delete_rest_api(api_gateway_id) + return self.client.delete_rest_api(self.resources_info['lambda']['environment']['Variables']['API_GATEWAY_ID']) diff --git a/scar/providers/aws/batchfunction.py b/scar/providers/aws/batchfunction.py index 960c0153..f9caaf1c 100644 --- a/scar/providers/aws/batchfunction.py +++ b/scar/providers/aws/batchfunction.py @@ -12,14 +12,13 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-import random from typing import Dict, List +import yaml from scar.providers.aws import GenericClient import scar.logger as logger from scar.providers.aws.launchtemplates import LaunchTemplates -from scar.utils import lazy_property, FileUtils, StrUtils - -_LAUNCH_TEMPLATE_NAME = 'faas-supervisor' +from scar.providers.aws.functioncode import create_function_config +from scar.utils import FileUtils, StrUtils def _get_job_definitions(jobs_info: Dict) -> List: @@ -29,170 +28,135 @@ def _get_job_definitions(jobs_info: Dict) -> List: class Batch(GenericClient): - @lazy_property - def launch_templates(self): - launch_templates = LaunchTemplates(self.aws, self.supervisor_version) - return launch_templates - - def __init__(self, aws_properties, supervisor_version): - super().__init__(aws_properties) - self.supervisor_version = supervisor_version - self.aws.batch.instance_role = (f"arn:aws:iam::{self.aws.account_id}:" - "instance-profile/ecsInstanceRole") - self.aws.batch.service_role = (f"arn:aws:iam::{self.aws.account_id}:" - "role/service-role/AWSBatchServiceRole") - self.aws.batch.env_vars = [] + def __init__(self, resources_info): + super().__init__(resources_info.get('batch')) + self.resources_info = resources_info + self.batch = resources_info.get('batch') + self.function_name = self.resources_info.get('lambda').get('name') - def _set_required_environment_variables(self): - self._set_batch_environment_variable('AWS_LAMBDA_FUNCTION_NAME', self.aws.lambdaf.name) + def _set_required_environment_variables(self) -> None: + self._set_batch_environment_variable('AWS_LAMBDA_FUNCTION_NAME', self.function_name) self._set_batch_environment_variable('SCRIPT', self._get_user_script()) - if (hasattr(self.aws.lambdaf, 'environment_variables') and - self.aws.lambdaf.environment_variables): - self._add_custom_environment_variables(self.aws.lambdaf.environment_variables) - if (hasattr(self.aws.lambdaf, 'lambda_environment') and - self.aws.lambdaf.lambda_environment): - self._add_custom_environment_variables(self.aws.lambdaf.lambda_environment) - if hasattr(self.aws, "s3"): - self._add_s3_environment_vars() - - def _add_custom_environment_variables(self, env_vars): - if isinstance(env_vars, dict): - for key, val in env_vars.items(): - self._set_batch_environment_variable(key, val) - else: - for env_var in env_vars: - self._set_batch_environment_variable(*env_var.split("=")) - - def _set_batch_environment_variable(self, key, value): - if key and value is not None: - self.aws.batch.env_vars.append({'name': key, 'value': value}) + self._set_batch_environment_variable('FUNCTION_CONFIG', self._get_config_file()) + if self.resources_info.get('lambda').get('container').get('environment').get('Variables', False): + for key, value in self.resources_info.get('lambda').get('container').get('environment').get('Variables').items(): + self._set_batch_environment_variable(key, value) - def _add_s3_environment_vars(self): - provider_id = random.randint(1, 1000001) - if hasattr(self.aws.s3, "input_bucket"): - self._set_batch_environment_variable(f'STORAGE_PATH_INPUT_{provider_id}', - self.aws.s3.storage_path_input) - if hasattr(self.aws.s3, "output_bucket"): - self._set_batch_environment_variable(f'STORAGE_PATH_OUTPUT_{provider_id}', - self.aws.s3.storage_path_output) - else: - self._set_batch_environment_variable(f'STORAGE_PATH_OUTPUT_{provider_id}', - self.aws.s3.storage_path_input) - self._set_batch_environment_variable(f'STORAGE_AUTH_S3_USER_{provider_id}', 'scar') + def _set_batch_environment_variable(self, key: str, 
value: str) -> None: + self.resources_info['batch']['environment']['Variables'].update({key: value}) - def _get_user_script(self): + def _get_user_script(self) -> str: script = '' - if hasattr(self.aws.lambdaf, "init_script"): - file_content = FileUtils.read_file(self.aws.lambdaf.init_script) + if self.resources_info.get('lambda').get('init_script', False): + file_content = FileUtils.read_file(self.resources_info.get('lambda').get('init_script')) script = StrUtils.utf8_to_base64_string(file_content) return script - def _delete_job_definitions(self, name): - job_definitions = [] - # Get IO definitions (if any) - kwargs = {"jobDefinitionName": '{0}-io'.format(name)} - io_job_info = self.client.describe_job_definitions(**kwargs) - job_definitions.extend(_get_job_definitions(io_job_info)) + def _get_config_file(self) -> str: + cfg_file = '' + config = create_function_config(self.resources_info) + yaml_str = yaml.safe_dump(config) + cfg_file = StrUtils.utf8_to_base64_string(yaml_str) + return cfg_file + + def _delete_job_definitions(self) -> None: # Get main job definition - kwargs = {"jobDefinitionName": name} + kwargs = {"jobDefinitionName": self.function_name} job_info = self.client.describe_job_definitions(**kwargs) - job_definitions.extend(_get_job_definitions(job_info)) - for job_def in job_definitions: + for job_def in _get_job_definitions(job_info): kwars = {"jobDefinition": job_def} self.client.deregister_job_definition(**kwars) - logger.info("Job definitions deleted") + logger.info("Job definitions successfully deleted.") - def _get_job_queue_info(self, name): - job_queue_info_args = {'jobQueues': [self._get_resource_name(name)]} + def _get_job_queue_info(self): + job_queue_info_args = {'jobQueues': [self.function_name]} return self.client.describe_job_queues(**job_queue_info_args) - def _delete_job_queue(self, name): - response = self._get_job_queue_info(name) + def _delete_job_queue(self): + response = self._get_job_queue_info() while response["jobQueues"]: state = response["jobQueues"][0]["state"] status = response["jobQueues"][0]["status"] if status == "VALID": - self._delete_valid_job_queue(state, name) - response = self._get_job_queue_info(name) + self._delete_valid_job_queue(state) + response = self._get_job_queue_info() - def _delete_valid_job_queue(self, state, name): + def _delete_valid_job_queue(self, state): if state == "ENABLED": - updating_args = {'jobQueue': self._get_resource_name(name), + updating_args = {'jobQueue': self.function_name, 'state': 'DISABLED'} self.client.update_job_queue(**updating_args) elif state == "DISABLED": - deleting_args = {'jobQueue': self._get_resource_name(name)} - logger.info("Job queue deleted") + deleting_args = {'jobQueue': self.function_name} self.client.delete_job_queue(**deleting_args) + logger.info("Job queue successfully deleted.") + + def _get_describe_compute_env_args(self): + return {'computeEnvironments': [self.function_name]} - def _get_compute_env_info(self, name): - creation_args = self._get_describe_compute_env_args(name_c=name) + def _get_compute_env_info(self): + creation_args = self._get_describe_compute_env_args() return self.client.describe_compute_environments(**creation_args) - def _delete_compute_env(self, name): - response = self._get_compute_env_info(name) + def _delete_compute_env(self): + response = self._get_compute_env_info() while response["computeEnvironments"]: state = response["computeEnvironments"][0]["state"] status = response["computeEnvironments"][0]["status"] if status == "VALID": - 
self._delete_valid_compute_environment(state, name) - response = self._get_compute_env_info(name) + self._delete_valid_compute_environment(state) + response = self._get_compute_env_info() - def _delete_valid_compute_environment(self, state, name): + def _delete_valid_compute_environment(self, state): if state == "ENABLED": - update_args = {'computeEnvironment': self._get_resource_name(name), + update_args = {'computeEnvironment': self.function_name, 'state': 'DISABLED'} self.client.update_compute_environment(**update_args) elif state == "DISABLED": - delete_args = {'computeEnvironment': self._get_resource_name(name)} - logger.info("Compute environment deleted") + delete_args = {'computeEnvironment': self.function_name} self.client.delete_compute_environment(**delete_args) + logger.info("Compute environment successfully deleted.") def _get_compute_env_args(self): + account_id = self.resources_info.get('iam').get('account_id') return { - 'computeEnvironmentName': self.aws.lambdaf.name, - 'serviceRole': self.aws.batch.service_role, - 'type': self.aws.batch.compute_resources['type'], - 'state': self.aws.batch.compute_resources['state'], + 'computeEnvironmentName': self.function_name, + 'serviceRole': self.batch.get('service_role').format(account_id=account_id), + 'type': self.batch.get('type'), + 'state': self.batch.get('state'), 'computeResources': { - 'type': self.aws.batch.compute_resources['comp_type'], - 'minvCpus': self.aws.batch.compute_resources['min_v_cpus'], - 'maxvCpus': self.aws.batch.compute_resources['max_v_cpus'], - 'desiredvCpus': self.aws.batch.compute_resources['desired_v_cpus'], - 'instanceTypes': self.aws.batch.compute_resources['instance_types'], - 'subnets': self.aws.batch.compute_resources['subnets'], - 'securityGroupIds': self.aws.batch.compute_resources['security_group_ids'], - 'instanceRole': self.aws.batch.instance_role, + 'type': self.batch.get('compute_resources').get('type'), + 'minvCpus': self.batch.get('compute_resources').get('min_v_cpus'), + 'maxvCpus': self.batch.get('compute_resources').get('max_v_cpus'), + 'desiredvCpus': self.batch.get('compute_resources').get('desired_v_cpus'), + 'instanceTypes': self.batch.get('compute_resources').get('instance_types'), + 'subnets': self.batch.get('compute_resources').get('subnets'), + 'securityGroupIds': self.batch.get('compute_resources').get('security_group_ids'), + 'instanceRole': self.batch.get('compute_resources').get('instance_role').format(account_id=account_id), 'launchTemplate': { - 'launchTemplateName': _LAUNCH_TEMPLATE_NAME, - 'version': str(self.launch_templates.get_launch_template_version()) + 'launchTemplateName': self.batch.get('compute_resources').get('launch_template_name'), + 'version': str(LaunchTemplates(self.resources_info).get_launch_template_version()) } } } def _get_creations_job_queue_args(self): return { - 'computeEnvironmentOrder': [{'computeEnvironment': self.aws.lambdaf.name, + 'computeEnvironmentOrder': [{'computeEnvironment': self.function_name, 'order': 1}, ], - 'jobQueueName': self.aws.lambdaf.name, + 'jobQueueName': self.function_name, 'priority': 1, - 'state': self.aws.batch.compute_resources['state'], + 'state': self.batch.get('state'), } - def _get_resource_name(self, name=None): - return name if name else self.aws.lambdaf.name - - def _get_describe_compute_env_args(self, name_c=None): - return {'computeEnvironments': [self._get_resource_name(name_c)]} - def _get_job_definition_args(self): job_def_args = { - 'jobDefinitionName': self.aws.lambdaf.name, + 'jobDefinitionName': 
self.function_name, 'type': 'container', 'containerProperties': { - 'image': self.aws.lambdaf.image, - 'memory': int(self.aws.batch.memory), - 'vcpus': int(self.aws.batch.vcpus), + 'image': self.resources_info.get('lambda').get('container').get('image'), + 'memory': int(self.batch.get('memory')), + 'vcpus': int(self.batch.get('vcpus')), 'command': [ '/bin/sh', '-c', @@ -206,7 +170,7 @@ def _get_job_definition_args(self): 'name': 'supervisor-bin' } ], - 'environment': self.aws.batch.env_vars, + 'environment': [{'name': key, 'value': value} for key, value in self.resources_info['batch']['environment']['Variables'].items()], 'mountPoints': [ { 'containerPath': '/opt/faas-supervisor/bin', @@ -215,7 +179,7 @@ def _get_job_definition_args(self): ] } } - if self.aws.batch.enable_gpu: + if self.batch.get('enable_gpu'): job_def_args['containerProperties']['resourceRequirements'] = [ { 'value': '1', @@ -224,8 +188,8 @@ def _get_job_definition_args(self): ] return job_def_args - def _get_state_and_status_of_compute_env(self, name=None): - creation_args = self._get_describe_compute_env_args(name_c=name) + def _get_state_and_status_of_compute_env(self): + creation_args = self._get_describe_compute_env_args() response = self.client.describe_compute_environments(**creation_args) return (response["computeEnvironments"][0]["state"], response["computeEnvironments"][0]["status"]) @@ -242,23 +206,23 @@ def create_batch_environment(self): self.client.create_job_queue(**creation_args) logger.info('Job queue successfully created.') creation_args = self._get_job_definition_args() - logger.info(f"Registering '{self.aws.lambdaf.name}' job definition.") + logger.info(f"Registering '{self.function_name}' job definition.") return self.client.register_job_definition(**creation_args) - def delete_compute_environment(self, name): - self._delete_job_definitions(name) - self._delete_job_queue(name) - self._delete_compute_env(name) + def delete_compute_environment(self): + self._delete_job_definitions() + self._delete_job_queue() + self._delete_compute_env() - def exist_compute_environments(self, name): - creation_args = self._get_describe_compute_env_args(name_c=name) + def exist_compute_environments(self): + creation_args = self._get_describe_compute_env_args() response = self.client.describe_compute_environments(**creation_args) return len(response["computeEnvironments"]) > 0 - def describe_jobs(self, job_id): - describe_args = {'jobs': [job_id]} + def get_jobs_with_request_id(self) -> Dict: + describe_args = {'jobs': [self.resources_info.get('cloudwatch').get('request_id')]} return self.client.describe_jobs(**describe_args) - def exist_job(self, job_id): - response = self.describe_jobs(job_id) - return len(response["jobs"]) != 0 +# def exist_job(self, job_id: str) -> bool: +# response = self.describe_jobs(job_id) +# return len(response["jobs"]) != 0 diff --git a/scar/providers/aws/clients/__init__.py b/scar/providers/aws/clients/__init__.py index 38d84a03..b583cc2f 100644 --- a/scar/providers/aws/clients/__init__.py +++ b/scar/providers/aws/clients/__init__.py @@ -26,19 +26,9 @@ class BotoClient(): _READ_TIMEOUT = 360 _BOTO_CLIENT_NAME = '' - def __init__(self, client: Dict = None, session: Dict = None): - """ - Default client args: - 'client' : {'region_name' : self.aws.region} - Default session args: - 'session' : {'profile_name' : self.aws.boto_profile} - """ - self.client_args = {} - if client: - self.client_args = client - self.session_args = {} - if session: - self.session_args = session + def __init__(self, 
client_args: Dict): + self.client_args = client_args.get('client', {}) + self.session_args = client_args.get('session', {}) @lazy_property def client(self): diff --git a/scar/providers/aws/clients/cloudwatchlogs.py b/scar/providers/aws/clients/cloudwatchlogs.py index 1de5d293..4c7b56a0 100644 --- a/scar/providers/aws/clients/cloudwatchlogs.py +++ b/scar/providers/aws/clients/cloudwatchlogs.py @@ -31,15 +31,6 @@ class CloudWatchLogsClient(BotoClient): @exception(logger) def get_log_events(self, **kwargs: Dict) -> List: """Lists log events from the specified log group.""" -# logs = [] -# kwargs = {} -# response = self.client.filter_log_events(**kwargs) -# logs.append(response) -# while 'nextToken' in response and (response['nextToken']): -# kwargs['nextToken'] = response['nextToken'] -# response = self.client.filter_log_events(**kwargs) -# logs.append(response) -# return logs log_events = [] logs_info = self.client.filter_log_events(**kwargs) log_events.extend(logs_info.get('events', [])) diff --git a/scar/providers/aws/clients/lambdafunction.py b/scar/providers/aws/clients/lambdafunction.py index d87dd3c3..6996541a 100644 --- a/scar/providers/aws/clients/lambdafunction.py +++ b/scar/providers/aws/clients/lambdafunction.py @@ -34,7 +34,7 @@ def create_function(self, **kwargs: Dict) -> Dict: logger.debug("Creating lambda function.") return self.client.create_function(**kwargs) - def get_function_info(self, function_name_or_arn: str) -> Dict: + def get_function_configuration(self, function_name_or_arn: str) -> Dict: """Returns the configuration information of the Lambda function.""" function_info = self.client.get_function_configuration(FunctionName=function_name_or_arn) @@ -42,6 +42,14 @@ def get_function_info(self, function_name_or_arn: str) -> Dict: function_info['SupervisorVersion'] = self.get_supervisor_version(function_info) return function_info + def get_function(self, function_name_or_arn: str) -> Dict: + """Returns the information of the Lambda function with a link to + download the deployment package that's valid for 10 minutes.""" + function_info = self.client.get_function(FunctionName=function_name_or_arn) + # Add supervisor version + function_info['SupervisorVersion'] = self.get_supervisor_version(function_info) + return function_info + @excp.exception(logger) def get_supervisor_version(self, function_info): version = '-' @@ -95,6 +103,21 @@ def list_layers(self, next_token: Optional[str]=None) -> List: layers.extend(self.list_layers(next_token=layers_info['NextMarker'])) return layers + @excp.exception(logger) + def list_layer_versions(self, layer_name: str, next_token: Optional[str]=None) -> List: + """Lists the versions of an AWS Lambda layer.""" + logger.debug(f'Listing versions of lambda layer "{layer_name}".') + versions = [] + kwargs = {'LayerName': layer_name} + if next_token: + kwargs['Marker'] = next_token + layer_versions_info = self.client.list_layer_versions(**kwargs) + if 'LayerVersions' in layer_versions_info and layer_versions_info['LayerVersions']: + versions.extend(layer_versions_info['LayerVersions']) + if 'NextMarker' in layer_versions_info: + versions.extend(self.list_layer_versions(layer_name, next_token=layer_versions_info['NextMarker'])) + return versions + @excp.exception(logger) + def delete_function(self, function_name: str) -> Dict: + """Deletes the specified Lambda diff --git a/scar/providers/aws/clients/s3.py b/scar/providers/aws/clients/s3.py index 0135f299..c0dbefbe 100644 --- a/scar/providers/aws/clients/s3.py +++ b/scar/providers/aws/clients/s3.py @@ 
-69,6 +69,21 @@ def download_file(self, **kwargs: Dict) -> Dict: """Download an object from S3 to a file-like object.""" return self.client.download_fileobj(**kwargs) + @exception(logger) + def is_folder(self, bucket: str, folder: str) -> bool: + """Checks if a folder with the specified key exists in the bucket.""" + try: + kwargs = {'Bucket' : bucket, + 'Key' : folder if folder.endswith('/') else folder + '/'} + # If this call succeeds, the folder exists + self.client.get_object(**kwargs) + return True + except ClientError as cerr: + # Folder not found + if cerr.response['Error']['Code'] == 'NoSuchKey': + return False + raise cerr + @exception(logger) def list_files(self, **kwargs: Dict) -> List: """Returns the keys of all the objects in a bucket. diff --git a/scar/providers/aws/cloudwatchlogs.py b/scar/providers/aws/cloudwatchlogs.py index a9205804..10db574f 100644 --- a/scar/providers/aws/cloudwatchlogs.py +++ b/scar/providers/aws/cloudwatchlogs.py @@ -14,9 +14,10 @@ """Module with classes and methods to manage the CloudWatch Log functionalities at high level.""" -from typing import List +from typing import List, Dict from botocore.exceptions import ClientError from scar.providers.aws import GenericClient +from scar.providers.aws.batchfunction import Batch import scar.logger as logger @@ -28,24 +29,29 @@ def _parse_events_in_message(log_events: List) -> str: class CloudWatchLogs(GenericClient): """Manages the AWS CloudWatch Logs functionality""" + + def __init__(self, resources_info: Dict): + super().__init__(resources_info.get('cloudwatch')) + self.resources_info = resources_info + self.cloudwatch = resources_info.get('cloudwatch') - def get_log_group_name(self, function_name=None): + def get_log_group_name(self, function_name: str=None) -> str: """Returns the log group matching the current lambda function being parsed.""" if function_name: return f'/aws/lambda/{function_name}' - return f'/aws/lambda/{self.aws.lambdaf.name}' + return f'/aws/lambda/{self.resources_info.get("lambda").get("name")}' - def _get_log_group_name_arg(self, function_name=None): + def _get_log_group_name_arg(self, function_name: str=None) -> Dict: return {'logGroupName' : self.get_log_group_name(function_name)} - def _is_end_line(self, line): - return line.startswith('REPORT') and self.aws.cloudwatch.request_id in line + def _is_end_line(self, line: str) -> bool: + return line.startswith('REPORT') and self.cloudwatch.get('request_id') in line - def _is_start_line(self, line): - return line.startswith('START') and self.aws.cloudwatch.request_id in line + def _is_start_line(self, line: str) -> bool: + return line.startswith('START') and self.cloudwatch.get('request_id') in line - def _parse_logs_with_requestid(self, function_logs): + def _parse_logs_with_requestid(self, function_logs: str) -> str: parsed_msg = "" if function_logs: in_req_id_logs = False @@ -60,36 +66,21 @@ def _parse_logs_with_requestid(self, function_logs): parsed_msg += f'{line}\n' return parsed_msg - def create_log_group(self): - """Creates a CloudWatch Log Group.""" - creation_args = self._get_log_group_name_arg() - creation_args['tags'] = self.aws.tags - response = self.client.create_log_group(**creation_args) - # Set retention policy into the log group - retention_args = self._get_log_group_name_arg() - retention_args['retentionInDays'] = self.aws.cloudwatch.log_retention_policy_in_days - self.client.set_log_retention_policy(**retention_args) - return response - - def delete_log_group(self, log_group_name): - """Deletes a CloudWatch Log Group.""" - return 
self.client.delete_log_group(log_group_name) - - def get_aws_log(self): + def _get_lambda_logs(self): """Returns Lambda logs for an specific lambda function.""" function_logs = "" try: kwargs = self._get_log_group_name_arg() - if hasattr(self.aws.cloudwatch, "log_stream_name"): - kwargs["logStreamNames"] = [self.aws.cloudwatch.log_stream_name] + if self.cloudwatch.get("log_stream_name", False): + kwargs["logStreamNames"] = [self.cloudwatch.get("log_stream_name")] function_logs = _parse_events_in_message(self.client.get_log_events(**kwargs)) - if hasattr(self.aws.cloudwatch, "request_id") and self.aws.cloudwatch.request_id: + if self.cloudwatch.get("request_id", False): function_logs = self._parse_logs_with_requestid(function_logs) except ClientError as cerr: logger.warning("Error getting the function logs: %s" % cerr) return function_logs - - def get_batch_job_log(self, jobs_info): + + def _get_batch_job_log(self, jobs_info: List) -> str: """Returns Batch logs for an specific job.""" batch_logs = "" if jobs_info: @@ -104,3 +95,27 @@ def get_batch_job_log(self, jobs_info): for event in response.get("events", {})] batch_logs += '\n'.join(msgs) return batch_logs + + def create_log_group(self) -> Dict: + """Creates a CloudWatch Log Group.""" + creation_args = self._get_log_group_name_arg() + creation_args['tags'] = self.resources_info.get('lambda').get('tags') + response = self.client.create_log_group(**creation_args) + # Set retention policy into the log group + retention_args = self._get_log_group_name_arg() + retention_args['retentionInDays'] = self.cloudwatch.get('log_retention_policy_in_days') + self.client.set_log_retention_policy(**retention_args) + return response + + def delete_log_group(self, log_group_name: str) -> Dict: + """Deletes a CloudWatch Log Group.""" + return self.client.delete_log_group(log_group_name) + + def get_aws_logs(self) -> str: + """Returns Cloudwatch logs for an specific lambda function and batch job (if any).""" + aws_logs = self._get_lambda_logs() + batch_logs = "" + if self.resources_info.get('cloudwatch').get('request_id', False): + batch_jobs = Batch(self.resources_info).get_jobs_with_request_id() + batch_logs = self._get_batch_job_log(batch_jobs["jobs"]) + return aws_logs + batch_logs if batch_logs else aws_logs diff --git a/scar/providers/aws/controller.py b/scar/providers/aws/controller.py index 38a2f6e5..294cf2ed 100644 --- a/scar/providers/aws/controller.py +++ b/scar/providers/aws/controller.py @@ -15,277 +15,387 @@ import os from typing import Dict +from copy import deepcopy from scar.cmdtemplate import Commands from scar.providers.aws.apigateway import APIGateway from scar.providers.aws.batchfunction import Batch from scar.providers.aws.cloudwatchlogs import CloudWatchLogs from scar.providers.aws.iam import IAM from scar.providers.aws.lambdafunction import Lambda -from scar.providers.aws.properties import AwsProperties, ScarProperties +# from scar.providers.aws.properties import AwsProperties, ScarProperties from scar.providers.aws.resourcegroups import ResourceGroups -from scar.providers.aws.s3 import S3 +from scar.providers.aws.s3 import S3, get_bucket_and_folders from scar.providers.aws.validators import AWSValidator -from scar.providers.aws.properties import ApiGatewayProperties import scar.exceptions as excp import scar.logger as logger import scar.providers.aws.response as response_parser -from scar.utils import lazy_property, StrUtils, FileUtils +from scar.utils import StrUtils, FileUtils, SupervisorUtils _ACCOUNT_ID_REGEX = r'\d{12}' -def 
_get_storage_provider_id(storage_provider: str, env_vars: Dict) -> str: - """Searches the storage provider id in the environment variables: - get_provider_id(S3, {'STORAGE_AUTH_S3_USER_41807' : 'scar'}) - returns -> 41807""" - res = "" - for env_key in env_vars.keys(): - if env_key.startswith(f'STORAGE_AUTH_{storage_provider}'): - res = env_key.split('_', 4)[-1] - break - return res +def _get_owner(resources_info: Dict): + return IAM(resources_info).get_user_name_or_id() + + +def _check_function_defined(resources_info: Dict): + if Lambda(resources_info).find_function(): + raise excp.FunctionExistsError(function_name=resources_info.get('lambda', {}).get('name', '')) + + +def _check_function_not_defined(resources_info: Dict): + if not Lambda(resources_info).find_function(): + raise excp.FunctionNotFoundError(function_name=resources_info.get('lambda', {}).get('name', '')) + + +def _choose_function(aws_resources: Dict) -> int: + function_names = [resources_info.get('lambda').get('name') for resources_info in aws_resources] + print("Please choose a function:") + print("0) Apply to all") + for idx, element in enumerate(function_names): + print(f"{idx+1}) {element}") + i = input("Enter number: ") + if 0 <= int(i) <= len(function_names): + return int(i) - 1 + return None + + +def _get_all_functions(resources_info: Dict): + arn_list = ResourceGroups(resources_info).get_resource_arn_list(IAM(resources_info).get_user_name_or_id()) + return Lambda(resources_info).get_all_functions(arn_list) + +def _check_preheat_function(resources_info: Dict): + if resources_info.get('lambda').get('preheat', False): + Lambda(resources_info).preheat_function() + +############################################ +### ADD EXTRA PROPERTIES ### +############################################ + + +def _add_extra_aws_properties(scar: Dict, aws_resources: Dict) -> None: + for resources_info in aws_resources: + _add_tags(resources_info) + _add_handler(resources_info) + _add_account_id(resources_info) + _add_output(scar) + _add_config_file_path(scar, resources_info) + + +def _add_tags(resources_info: Dict): + resources_info['lambda']['tags'] = {"createdby": "scar", "owner": _get_owner(resources_info)} + + +def _add_account_id(resources_info: Dict): + resources_info['iam']['account_id'] = StrUtils.find_expression(resources_info['iam']['role'], _ACCOUNT_ID_REGEX) + + +def _add_handler(resources_info: Dict): + resources_info['lambda']['handler'] = f"{resources_info.get('lambda').get('name')}.lambda_handler" + + +def _add_output(scar_info: Dict): + scar_info['cli_output'] = response_parser.OutputType.PLAIN_TEXT.value + if scar_info.get("json", False): + scar_info['cli_output'] = response_parser.OutputType.JSON.value + # Override json ouput if both of them are defined + if scar_info.get("verbose", False): + scar_info['cli_output'] = response_parser.OutputType.VERBOSE.value + if scar_info.get("output_file", False): + scar_info['cli_output'] = response_parser.OutputType.BINARY.value + + +def _add_config_file_path(scar_info: Dict, resources_info: Dict): + if scar_info.get("conf_file", False): + resources_info['lambda']['config_path'] = os.path.dirname(scar_info.get("conf_file")) + # Update the path of the files based on the path of the yaml (if any) + if resources_info['lambda'].get('init_script', False): + resources_info['lambda']['init_script'] = FileUtils.join_paths(resources_info['lambda']['config_path'], + resources_info['lambda']['init_script']) + if resources_info['lambda'].get('image_file', False): + 
resources_info['lambda']['image_file'] = FileUtils.join_paths(resources_info['lambda']['config_path'], + resources_info['lambda']['image_file']) + if resources_info['lambda'].get('run_script', False): + resources_info['lambda']['run_script'] = FileUtils.join_paths(resources_info['lambda']['config_path'], + resources_info['lambda']['run_script']) + +############################################ +### AWS CONTROLLER ### +############################################ class AWS(Commands): """AWS controller. Used to manage all the AWS calls and functionalities.""" - @lazy_property - def aws_lambda(self): - """It's called 'aws_lambda' because 'lambda' - it's a restricted word in python.""" - aws_lambda = Lambda(self.aws_properties, - self.scar_properties.supervisor_version) - return aws_lambda - - @lazy_property - def batch(self): - batch = Batch(self.aws_properties, - self.scar_properties.supervisor_version) - return batch - - @lazy_property - def cloudwatch_logs(self): - cloudwatch_logs = CloudWatchLogs(self.aws_properties) - return cloudwatch_logs - - @lazy_property - def api_gateway(self): - api_gateway = APIGateway(self.aws_properties) - return api_gateway - - @lazy_property - def aws_s3(self): - aws_s3 = S3(self.aws_properties) - return aws_s3 - - @lazy_property - def resource_groups(self): - resource_groups = ResourceGroups(self.aws_properties) - return resource_groups - - @lazy_property - def iam(self): - iam = IAM(self.aws_properties) - return iam + def __init__(self, func_call): + self.raw_args = FileUtils.load_tmp_config_file() + AWSValidator.validate_kwargs(self.raw_args) + self.aws_resources = self.raw_args.get('functions', {}).get('aws', {}) + self.storage_providers = self.raw_args.get('storage_providers', {}) + self.scar_info = self.raw_args.get('scar', {}) + _add_extra_aws_properties(self.scar_info, self.aws_resources) + # Call the user's command + getattr(self, func_call)() + +############################################ +### AWS COMMANDS ### +############################################ @excp.exception(logger) - def init(self): - if self.aws_lambda.find_function(): - raise excp.FunctionExistsError(function_name=self.aws_properties.lambdaf.name) - # We have to create the gateway before creating the function - self._create_api_gateway() - self._create_lambda_function() - self._create_log_group() - self._create_s3_buckets() - # The api_gateway permissions are added after the function is created - self._add_api_gateway_permissions() - self._create_batch_environment() - self._preheat_function() + def init(self) -> None: + for resources_info in self.aws_resources: + resources_info = deepcopy(resources_info) + _check_function_defined(resources_info) + # We have to create the gateway before creating the function + self._create_api_gateway(resources_info) + # Check the specified supervisor version + resources_info['lambda']['supervisor']['version'] = SupervisorUtils.check_supervisor_version( + resources_info.get('lambda').get('supervisor').get('version')) + self._create_lambda_function(resources_info) + self._create_log_group(resources_info) + self._create_s3_buckets(resources_info) + # The api_gateway permissions must be added after the function is created + self._add_api_gateway_permissions(resources_info) + self._create_batch_environment(resources_info) + _check_preheat_function(resources_info) @excp.exception(logger) def invoke(self): - self._update_local_function_properties() - response = self.aws_lambda.call_http_endpoint() - response_parser.parse_http_response(response, - 
self.aws_properties.lambdaf.name, - self.aws_properties.lambdaf.asynchronous, - self.aws_properties.output, - getattr(self.scar_properties, "output_file", "")) + index = 0 + if len(self.aws_resources) > 1: + index = _choose_function(self.aws_resources) + if index >= 0: + resources_info = self.aws_resources[index] + response = Lambda(resources_info).call_http_endpoint() + response_parser.parse_http_response(response, + resources_info, + self.scar_info) @excp.exception(logger) def run(self): - if hasattr(self.aws_properties, "s3") and hasattr(self.aws_properties.s3, "input_bucket"): - self._process_input_bucket_calls() - else: - if self.aws_lambda.is_asynchronous(): - self.aws_lambda.set_asynchronous_call_parameters() - self.aws_lambda.launch_lambda_instance() - - @excp.exception(logger) - def update(self): - if hasattr(self.aws_properties.lambdaf, "all") and self.aws_properties.lambdaf.all: - self._update_all_functions(self._get_all_functions()) - else: - self.aws_lambda.update_function_configuration() + index = 0 + if len(self.aws_resources) > 1: + index = _choose_function(self.aws_resources) + if index >= 0: + resources_info = self.aws_resources[index] + using_s3_bucket = False + if resources_info.get('lambda').get('input', False): + for storage in resources_info.get('lambda').get('input'): + if storage.get('storage_provider') == 's3': + print("This function has an associated 'S3' input bucket.") + response = input(f"Do you want to launch the function using the files in '{storage.get('path')}'? [Y/n] ") + if response in ('Y', 'y'): + using_s3_bucket = True + self._process_s3_input_bucket_calls(resources_info, storage) + if not using_s3_bucket: + response = Lambda(resources_info).launch_lambda_instance() + if self.scar_info.get("output_file", False): + response['OutputFile'] = self.scar_info.get("output_file") + response['OutputType'] = self.scar_info.get("cli_output") + response_parser.parse_invocation_response(**response) @excp.exception(logger) def ls(self): - if hasattr(self.aws_properties, "s3"): - file_list = self.aws_s3.get_bucket_file_list() + # If a bucket is defined, then we list their files + resources_info = self.aws_resources[0] + if resources_info.get('lambda').get('input', False): + file_list = S3(resources_info).get_bucket_file_list() for file_info in file_list: - print(file_info) + logger.info(file_info) else: - lambda_functions = self._get_all_functions() - response_parser.parse_ls_response(lambda_functions, - self.aws_properties.output) + # Return the resources of the region in the scar's configuration file + aws_resources = _get_all_functions(self.aws_resources[0]) + response_parser.parse_ls_response(aws_resources, self.scar_info.get('cli_output')) @excp.exception(logger) def rm(self): - if hasattr(self.aws_properties.lambdaf, "all") and self.aws_properties.lambdaf.all: - self._delete_all_resources() + if self.scar_info.get('all', False): + # Return the resources of the region in the scar's configuration file + for resources_info in _get_all_functions(self.aws_resources[0]): + self._delete_resources(resources_info) else: - function_info = self.aws_lambda.get_function_info(self.aws_properties.lambdaf.name) - self._delete_resources(function_info) + index = 0 + if len(self.aws_resources) > 1: + index = _choose_function(self.aws_resources) + # -1 means apply to all functions + if index == -1: + for resources_info in self.aws_resources: + _check_function_not_defined(resources_info) + self._delete_resources(resources_info) + else: + resources_info = self.aws_resources[index] 
+ _check_function_not_defined(resources_info) + self._delete_resources(resources_info) @excp.exception(logger) def log(self): - aws_log = self.cloudwatch_logs.get_aws_log() - batch_logs = self._get_batch_logs() - aws_log += batch_logs if batch_logs else "" - print(aws_log) + index = 0 + if len(self.aws_resources) > 1: + index = _choose_function(self.aws_resources) + # We only return the logs of one function each time + if index >= 0: + logger.info(CloudWatchLogs(self.aws_resources[index]).get_aws_logs()) @excp.exception(logger) def put(self): - self.upload_file_or_folder_to_s3() + self._upload_file_or_folder_to_s3(self.aws_resources[0]) @excp.exception(logger) def get(self): - self.download_file_or_folder_from_s3() + self._download_file_or_folder_from_s3(self.aws_resources[0]) + +############################################################################# +### Methods to create AWS resources ### +############################################################################# - @AWSValidator.validate() @excp.exception(logger) - def parse_arguments(self, **kwargs): - self.raw_kwargs = kwargs - self.aws_properties = AwsProperties(kwargs.get('aws', {})) - self.scar_properties = ScarProperties(kwargs.get('scar', {})) - self.add_extra_aws_properties() - - def add_extra_aws_properties(self): - self._add_tags() - self._add_output() - self._add_account_id() - self._add_config_file_path() - - def _add_tags(self): - self.aws_properties.tags = {"createdby": "scar", "owner": self.iam.get_user_name_or_id()} - - def _add_output(self): - self.aws_properties.output = response_parser.OutputType.PLAIN_TEXT - if hasattr(self.scar_properties, "json") and self.scar_properties.json: - self.aws_properties.output = response_parser.OutputType.JSON - # Override json ouput if both of them are defined - if hasattr(self.scar_properties, "verbose") and self.scar_properties.verbose: - self.aws_properties.output = response_parser.OutputType.VERBOSE - if hasattr(self.scar_properties, "output_file") and self.scar_properties.output_file: - self.aws_properties.output = response_parser.OutputType.BINARY - self.aws_properties.output_file = self.scar_properties.output_file - - def _add_account_id(self): - self.aws_properties.account_id = StrUtils.find_expression(self.aws_properties.iam.role, - _ACCOUNT_ID_REGEX) - - def _add_config_file_path(self): - if hasattr(self.scar_properties, "conf_file") and self.scar_properties.conf_file: - self.aws_properties.config_path = os.path.dirname(self.scar_properties.conf_file) - - def _get_all_functions(self): - arn_list = self.resource_groups.get_resource_arn_list(self.iam.get_user_name_or_id()) - return self.aws_lambda.get_all_functions(arn_list) - - def _get_batch_logs(self) -> str: - logs = "" - if hasattr(self.aws_properties.cloudwatch, "request_id") and \ - self.batch.exist_job(self.aws_properties.cloudwatch.request_id): - batch_jobs = self.batch.describe_jobs(self.aws_properties.cloudwatch.request_id) - logs = self.cloudwatch_logs.get_batch_job_log(batch_jobs["jobs"]) - return logs + def _create_api_gateway(self, resources_info: Dict): + if resources_info.get("api_gateway", {}).get('name', False): + APIGateway(resources_info).create_api_gateway() @excp.exception(logger) - def _create_lambda_function(self): - response = self.aws_lambda.create_function() - acc_key = self.aws_lambda.client.get_access_key() + def _create_lambda_function(self, resources_info: Dict) -> None: + lambda_client = Lambda(resources_info) + response = lambda_client.create_function() 
response_parser.parse_lambda_function_creation_response(response, - self.aws_properties.lambdaf.name, - acc_key, - self.aws_properties.output) + self.scar_info.get('cli_output'), + lambda_client.get_access_key()) @excp.exception(logger) - def _create_log_group(self): - response = self.cloudwatch_logs.create_log_group() + def _create_log_group(self, resources_info: Dict) -> None: + cloudwatch_logs = CloudWatchLogs(resources_info) + response = cloudwatch_logs.create_log_group() response_parser.parse_log_group_creation_response(response, - self.cloudwatch_logs.get_log_group_name(), - self.aws_properties.output) + cloudwatch_logs.get_log_group_name(), + self.scar_info.get('cli_output')) + + @excp.exception(logger) + def _create_s3_buckets(self, resources_info: Dict) -> None: + if resources_info.get('lambda').get('input', False): + s3_service = S3(resources_info) + for bucket in resources_info.get('lambda').get('input'): + if bucket.get('storage_provider') == 's3': + bucket_name, folders = s3_service.create_bucket_and_folders(bucket.get('path')) + Lambda(resources_info).link_function_and_bucket(bucket_name) + s3_service.set_input_bucket_notification(bucket_name, folders) + if not folders: + logger.info(f'Input bucket "{bucket_name}" successfully created') + + if resources_info.get('lambda').get('output', False): + s3_service = S3(resources_info) + for bucket in resources_info.get('lambda').get('output'): + if bucket.get('storage_provider') == 's3': + bucket_name, folders = s3_service.create_bucket_and_folders(bucket.get('path')) + if not folders: + logger.info(f'Output bucket "{bucket_name}" successfully created') @excp.exception(logger) - def _create_s3_buckets(self): - if hasattr(self.aws_properties, "s3"): - if hasattr(self.aws_properties.s3, "input_bucket"): - self.aws_s3.create_input_bucket(create_input_folder=True) - self.aws_lambda.link_function_and_input_bucket() - self.aws_s3.set_input_bucket_notification() - if hasattr(self.aws_properties.s3, "output_bucket"): - self.aws_s3.create_output_bucket() - - def _create_api_gateway(self): - if hasattr(self.aws_properties, "api_gateway"): - self.api_gateway.create_api_gateway() - - def _add_api_gateway_permissions(self): - if hasattr(self.aws_properties, "api_gateway"): - self.aws_lambda.add_invocation_permission_from_api_gateway() - - def _create_batch_environment(self): - if self.aws_properties.execution_mode == "batch" or \ - self.aws_properties.execution_mode == "lambda-batch": - self.batch.create_batch_environment() - - def _preheat_function(self): - # If preheat is activated, the function is launched at the init step - if hasattr(self.scar_properties, "preheat"): - self.aws_lambda.preheat_function() - - def _process_input_bucket_calls(self): - s3_file_list = self.aws_s3.get_bucket_file_list() + def _add_api_gateway_permissions(self, resources_info: Dict): + if resources_info.get("api_gateway").get('name', False): + Lambda(resources_info).add_invocation_permission_from_api_gateway() + + @excp.exception(logger) + def _create_batch_environment(self, resources_info: Dict) -> None: + mode = resources_info.get('lambda').get('execution_mode') + if mode in ("batch", "lambda-batch"): + Batch(resources_info).create_batch_environment() + +############################################################################# +### Methods to delete AWS resources ### +############################################################################# + + def _delete_resources(self, resources_info: Dict) -> None: + # Delete associated api + 
self._delete_api_gateway(resources_info) + # Delete associated log + self._delete_logs(resources_info) + # Delete associated notifications + self._delete_bucket_notifications(resources_info) + # Delete function + self._delete_lambda_function(resources_info) + # Delete resources batch + self._delete_batch_resources(resources_info) + + def _delete_api_gateway(self, resources_info: Dict) -> None: + api_gateway_id = Lambda(resources_info).get_function_configuration().get('Environment').get('Variables').get('API_GATEWAY_ID') + if api_gateway_id: + resources_info['lambda']['environment']['Variables']['API_GATEWAY_ID'] = api_gateway_id + response = APIGateway(resources_info).delete_api_gateway() + response_parser.parse_delete_api_response(response, + api_gateway_id, + self.scar_info.get('cli_output')) + + def _delete_logs(self, resources_info: Dict): + cloudwatch_logs = CloudWatchLogs(resources_info) + log_group_name = cloudwatch_logs.get_log_group_name(resources_info.get('lambda').get('name')) + response = cloudwatch_logs.delete_log_group(log_group_name) + response_parser.parse_delete_log_response(response, + log_group_name, + self.scar_info.get('cli_output')) + + def _delete_bucket_notifications(self, resources_info: Dict) -> None: + lambda_client = Lambda(resources_info) + function_name = resources_info.get('lambda').get('name') + resources_info['lambda']['arn'] = lambda_client.get_function_configuration(function_name).get('FunctionArn') + resources_info['lambda']['input'] = lambda_client.get_fdl_config(function_name).get('input', False) + if resources_info.get('lambda').get('input'): + for input_storage in resources_info.get('lambda').get('input'): + if input_storage.get('storage_provider') == 's3': + bucket_name = input_storage.get('path').split("/", 1)[0] + S3(resources_info).delete_bucket_notification(bucket_name) + + def _delete_lambda_function(self, resources_info: Dict) -> None: + response = Lambda(resources_info).delete_function() + response_parser.parse_delete_function_response(response, + resources_info.get('lambda').get('name'), + self.scar_info.get('cli_output')) + + def _delete_batch_resources(self, resources_info: Dict) -> None: + batch = Batch(resources_info) + if batch.exist_compute_environments(): + batch.delete_compute_environment() + +########################################################### +### Methods to manage S3 resources ### +########################################################### + + def _process_s3_input_bucket_calls(self, resources_info: Dict, storage: Dict) -> None: + s3_service = S3(resources_info) + lambda_service = Lambda(resources_info) + s3_file_list = s3_service.get_bucket_file_list(storage) + bucket_name, _ = get_bucket_and_folders(storage.get('path')) logger.info(f"Files found: '{s3_file_list}'") # First do a request response invocation to prepare the lambda environment if s3_file_list: - s3_event = self.aws_s3.get_s3_event(s3_file_list.pop(0)) - self.aws_lambda.launch_request_response_event(s3_event) + s3_event = s3_service.get_s3_event(bucket_name, s3_file_list.pop(0)) + lambda_service.launch_request_response_event(s3_event) # If the list has more elements, invoke functions asynchronously if s3_file_list: - s3_event_list = self.aws_s3.get_s3_event_list(s3_file_list) - self.aws_lambda.process_asynchronous_lambda_invocations(s3_event_list) + s3_event_list = s3_service.get_s3_event_list(bucket_name, s3_file_list) + lambda_service.process_asynchronous_lambda_invocations(s3_event_list) - def upload_file_or_folder_to_s3(self): - path_to_upload = 
self.scar_properties.path - self.aws_s3.create_input_bucket() + def _upload_file_or_folder_to_s3(self, resources_info: Dict) -> None: + path_to_upload = self.scar_info.get('path') files = [path_to_upload] if os.path.isdir(path_to_upload): files = FileUtils.get_all_files_in_directory(path_to_upload) + s3_service = S3(resources_info) + storage_path = resources_info.get('lambda').get('input')[0].get('path') + bucket, folder = s3_service.create_bucket_and_folders(storage_path) for file_path in files: - self.aws_s3.upload_file(folder_name=self.aws_properties.s3.input_folder, - file_path=file_path) + s3_service.upload_file(bucket=bucket, folder_name=folder, file_path=file_path) def _get_download_file_path(self, file_key=None): file_path = file_key - if hasattr(self.scar_properties, "path") and self.scar_properties.path: - file_path = FileUtils.join_paths(self.scar_properties.path, file_path) + if self.scar_info.get('path', False): + file_path = FileUtils.join_paths(self.scar_info.get('path'), file_path) return file_path - def download_file_or_folder_from_s3(self): - bucket_name = self.aws_properties.s3.input_bucket - s3_file_list = self.aws_s3.get_bucket_file_list() + def _download_file_or_folder_from_s3(self, resources_info: Dict) -> None: + + s3_service = S3(resources_info) + s3_file_list = s3_service.get_bucket_file_list() for s3_file in s3_file_list: # Avoid download s3 'folders' if not s3_file.endswith('/'): @@ -294,76 +404,5 @@ def download_file_or_folder_from_s3(self): dir_path = os.path.dirname(file_path) if dir_path and not os.path.isdir(dir_path): os.makedirs(dir_path, exist_ok=True) - self.aws_s3.download_file(bucket_name, s3_file, file_path) - - def _update_all_functions(self, lambda_functions): - for function_info in lambda_functions: - self.aws_lambda.update_function_configuration(function_info) - - def _update_local_function_properties(self, function_info): - self._reset_aws_properties() - """Update the defined properties with the AWS information.""" - if function_info: - self.aws_properties.lambdaf.update_properties(**function_info) - if 'API_GATEWAY_ID' in self.aws_properties.lambdaf.environment['Variables']: - api_gtw_id = self.aws_properties.lambdaf.environment['Variables'].get('API_GATEWAY_ID', - "") - if hasattr(self.aws_properties, 'api_gateway'): - self.aws_properties.api_gateway.id = api_gtw_id - else: - self.aws_properties.api_gateway = ApiGatewayProperties({'id' : api_gtw_id}) - -############################################################################# -### Methods to delete AWS resources ### -############################################################################# - - def _delete_all_resources(self): - for function_info in self._get_all_functions(): - self._delete_resources(function_info) - - def _delete_resources(self, function_info): - function_name = function_info['FunctionName'] - if not self.aws_lambda.find_function(function_name): - raise excp.FunctionNotFoundError(function_name=function_name) - # Delete associated api - self._delete_api_gateway(function_info['Environment']['Variables']) - # Delete associated log - self._delete_logs(function_name) - # Delete associated notifications - self._delete_bucket_notifications(function_info['FunctionArn'], - function_info['Environment']['Variables']) - # Delete function - self._delete_lambda_function(function_name) - # Delete resources batch - self._delete_batch_resources(function_name) - - def _delete_api_gateway(self, function_env_vars): - api_gateway_id = function_env_vars.get('API_GATEWAY_ID') - if 
api_gateway_id: - response = self.api_gateway.delete_api_gateway(api_gateway_id) - response_parser.parse_delete_api_response(response, api_gateway_id, - self.aws_properties.output) - - def _delete_logs(self, function_name): - log_group_name = self.cloudwatch_logs.get_log_group_name(function_name) - response = self.cloudwatch_logs.delete_log_group(log_group_name) - response_parser.parse_delete_log_response(response, - log_group_name, - self.aws_properties.output) - - def _delete_bucket_notifications(self, function_arn, function_env_vars): - s3_provider_id = _get_storage_provider_id('S3', function_env_vars) - input_bucket_id = f'STORAGE_PATH_INPUT_{s3_provider_id}' if s3_provider_id else '' - if input_bucket_id in function_env_vars: - input_path = function_env_vars[input_bucket_id] - input_bucket_name = input_path.split("/", 1)[0] - self.aws_s3.delete_bucket_notification(input_bucket_name, function_arn) - - def _delete_lambda_function(self, function_name): - response = self.aws_lambda.delete_function(function_name) - response_parser.parse_delete_function_response(response, function_name, - self.aws_properties.output) - - def _delete_batch_resources(self, function_name): - if self.batch.exist_compute_environments(function_name): - self.batch.delete_compute_environment(function_name) + bucket, _ = get_bucket_and_folders(resources_info.get('lambda').get('input')[0].get('path')) + s3_service.download_file(bucket, s3_file, file_path) diff --git a/scar/providers/aws/functioncode.py b/scar/providers/aws/functioncode.py index 34697415..50d75f57 100644 --- a/scar/providers/aws/functioncode.py +++ b/scar/providers/aws/functioncode.py @@ -11,106 +11,100 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
 """Module with methods and classes to create the function deployment package."""

+from typing import Dict
 from zipfile import ZipFile
 from scar.providers.aws.udocker import Udocker
 from scar.providers.aws.validators import AWSValidator
 from scar.exceptions import exception
 import scar.logger as logger
-from scar.http.request import get_file
-from scar.utils import FileUtils, lazy_property, GitHubUtils, \
-    GITHUB_USER, GITHUB_SUPERVISOR_PROJECT
+from scar.utils import FileUtils

-_INIT_SCRIPT_NAME = "init_script.sh"

+def create_function_config(resources_info):
+    function_cfg = {'storage_providers': FileUtils.load_tmp_config_file().get('storage_providers', {})}
+    function_cfg.update(resources_info.get('lambda'))
+    return function_cfg

 class FunctionPackager():
     """Class to manage the deployment package creation."""

-    @lazy_property
-    def udocker(self):
-        """Udocker client"""
-        udocker = Udocker(self.aws, self.scar_tmp_function_folder_path, self._supervisor_zip_path)
-        return udocker
-
-    def __init__(self, aws_properties, supervisor_version):
-        self.aws = aws_properties
-        self.supervisor_version = supervisor_version
-        self.scar_tmp_function_folder = FileUtils.create_tmp_dir()
-        self.scar_tmp_function_folder_path = self.scar_tmp_function_folder.name
-        self._supervisor_zip_path = FileUtils.join_paths(self.aws.lambdaf.tmp_folder_path, 'faas.zip')
-
-        self.package_args = {}
+    def __init__(self, resources_info: Dict, supervisor_zip_path: str):
+        self.resources_info = resources_info
+        self.supervisor_zip_path = supervisor_zip_path
+        # Temporary folder to store the supervisor and udocker files
+        self.tmp_payload_folder = FileUtils.create_tmp_dir()

     @exception(logger)
-    def create_zip(self):
+    def create_zip(self, lambda_payload_path: str) -> None:
         """Creates the lambda function deployment package."""
-        self._download_faas_supervisor_zip()
         self._extract_handler_code()
         self._manage_udocker_images()
         self._add_init_script()
         self._add_extra_payload()
-        self._zip_scar_folder()
+        self._copy_function_configuration()
+        self._zip_scar_folder(lambda_payload_path)
         self._check_code_size()

-    def _download_faas_supervisor_zip(self) -> None:
-        supervisor_zip_url = GitHubUtils.get_source_code_url(
-            GITHUB_USER,
-            GITHUB_SUPERVISOR_PROJECT,
-            self.supervisor_version)
-        with open(self._supervisor_zip_path, "wb") as thezip:
-            thezip.write(get_file(supervisor_zip_url))
-
     def _extract_handler_code(self) -> None:
-        function_handler_dest = FileUtils.join_paths(self.scar_tmp_function_folder_path, f"{self.aws.lambdaf.name}.py")
+        function_handler_dest = FileUtils.join_paths(self.tmp_payload_folder.name, f"{self.resources_info.get('lambda').get('name')}.py")
         file_path = ""
-        with ZipFile(self._supervisor_zip_path) as thezip:
+        with ZipFile(self.supervisor_zip_path) as thezip:
             for file in thezip.namelist():
                 if file.endswith("function_handler.py"):
-                    file_path = FileUtils.join_paths(self.aws.lambdaf.tmp_folder_path, file)
-                    thezip.extract(file, self.aws.lambdaf.tmp_folder_path)
+                    file_path = FileUtils.join_paths(FileUtils.get_tmp_dir(), file)
+                    # Extracts the complete folder structure along with the file (this cannot be avoided)
+                    thezip.extract(file, FileUtils.get_tmp_dir())
                     break
-        FileUtils.copy_file(file_path, function_handler_dest)
+        if file_path:
+            # Copy only the handler to the payload folder
+            FileUtils.copy_file(file_path, function_handler_dest)

-    def _manage_udocker_images(self):
-        if hasattr(self.aws.lambdaf, "image") and \
-           hasattr(self.aws, "s3") and \
-           hasattr(self.aws.s3, "deployment_bucket"):
-            
self.udocker.download_udocker_image() - if hasattr(self.aws.lambdaf, "image_file"): - if hasattr(self.aws, "config_path"): - self.aws.lambdaf.image_file = FileUtils.join_paths(self.aws.config_path, - self.aws.lambdaf.image_file) - self.udocker.prepare_udocker_image() + def _copy_function_configuration(self): + cfg_file_path = FileUtils.join_paths(self.tmp_payload_folder.name, "function_config.yaml") + function_cfg = create_function_config(self.resources_info) + FileUtils.write_yaml(cfg_file_path, function_cfg) - def _add_init_script(self): - if hasattr(self.aws.lambdaf, "init_script"): - if hasattr(self.aws, "config_path"): - self.aws.lambdaf.init_script = FileUtils.join_paths(self.aws.config_path, - self.aws.lambdaf.init_script) - FileUtils.copy_file(self.aws.lambdaf.init_script, - FileUtils.join_paths(self.scar_tmp_function_folder_path, _INIT_SCRIPT_NAME)) - self.aws.lambdaf.environment['Variables']['INIT_SCRIPT_PATH'] = \ - f"/var/task/{_INIT_SCRIPT_NAME}" - - def _add_extra_payload(self): - if hasattr(self.aws.lambdaf, "extra_payload"): - logger.info("Adding extra payload from {0}".format(self.aws.lambdaf.extra_payload)) - FileUtils.copy_dir(self.aws.lambdaf.extra_payload, self.scar_tmp_function_folder_path) - self.aws.lambdaf.environment['Variables']['EXTRA_PAYLOAD'] = "/var/task" - - def _zip_scar_folder(self): - FileUtils.zip_folder(self.aws.lambdaf.zip_file_path, - self.scar_tmp_function_folder_path, - "Creating function package") + def _manage_udocker_images(self): + if self.resources_info.get('lambda').get('container').get('image_file', False) or \ + self.resources_info.get('lambda').get('deployment').get('bucket', False): + Udocker(self.resources_info, self.tmp_payload_folder.name, self.supervisor_zip_path).prepare_udocker_image() + + def _add_init_script(self) -> None: + """Copy the init script defined by the user to the payload folder.""" + if self.resources_info.get('lambda').get('init_script', False): + init_script_path = self.resources_info.get('lambda').get('init_script') + FileUtils.copy_file(init_script_path, + FileUtils.join_paths(self.tmp_payload_folder.name, + FileUtils.get_file_name(init_script_path))) + + def _add_extra_payload(self) -> None: + if self.resources_info.get('lambda').get('extra_payload', False): + payload_path = self.resources_info.get('lambda').get('extra_payload') + logger.info(f"Adding extra payload '{payload_path}'") + if FileUtils.is_file(payload_path): + FileUtils.copy_file(self.resources_info.get('lambda').get('extra_payload'), + self.tmp_payload_folder.name) + else: + FileUtils.copy_dir(self.resources_info.get('lambda').get('extra_payload'), + self.tmp_payload_folder.name) + del(self.resources_info['lambda']['extra_payload']) + + def _zip_scar_folder(self, lambda_payload_path: str) -> None: + """Zips the tmp folder with all the function's files and + save it in the expected path of the payload.""" + FileUtils.zip_folder(lambda_payload_path, + self.tmp_payload_folder.name, + "Creating function package.") def _check_code_size(self): # Check if the code size fits within the AWS limits - if hasattr(self.aws, "s3") and hasattr(self.aws.s3, "deployment_bucket"): - AWSValidator.validate_s3_code_size(self.scar_tmp_function_folder_path, - self.aws.lambdaf.max_s3_payload_size) + if self.resources_info.get('lambda').get('deployment').get('bucket', False): + AWSValidator.validate_s3_code_size(self.tmp_payload_folder.name, + self.resources_info.get('lambda').get('deployment').get('max_s3_payload_size')) else: - 
AWSValidator.validate_function_code_size(self.scar_tmp_function_folder_path, - self.aws.lambdaf.max_payload_size) + AWSValidator.validate_function_code_size(self.tmp_payload_folder.name, + self.resources_info.get('lambda').get('deployment').get('max_payload_size')) diff --git a/scar/providers/aws/iam.py b/scar/providers/aws/iam.py index b00713d6..5130a585 100644 --- a/scar/providers/aws/iam.py +++ b/scar/providers/aws/iam.py @@ -17,6 +17,9 @@ class IAM(GenericClient): + def __init__(self, resources_info) -> None: + super().__init__(resources_info.get('iam')) + def get_user_name_or_id(self): user = self.client.get_user_info() if user: diff --git a/scar/providers/aws/lambdafunction.py b/scar/providers/aws/lambdafunction.py index 59cfe936..2a27901c 100644 --- a/scar/providers/aws/lambdafunction.py +++ b/scar/providers/aws/lambdafunction.py @@ -14,194 +14,123 @@ import base64 import json -import random +import io +from typing import Dict from multiprocessing.pool import ThreadPool +from zipfile import ZipFile, BadZipfile +import yaml from botocore.exceptions import ClientError +from scar.http.request import call_http_endpoint, get_file from scar.providers.aws import GenericClient from scar.providers.aws.functioncode import FunctionPackager from scar.providers.aws.lambdalayers import LambdaLayers from scar.providers.aws.s3 import S3 from scar.providers.aws.validators import AWSValidator import scar.exceptions as excp -import scar.http.request as request import scar.logger as logger -import scar.providers.aws.response as response_parser -from scar.utils import lazy_property, DataTypesUtils, FileUtils, StrUtils +from scar.utils import DataTypesUtils, FileUtils, StrUtils, SupervisorUtils +from scar.parser.cfgfile import ConfigFileParser + MAX_CONCURRENT_INVOCATIONS = 500 +ASYNCHRONOUS_CALL = {"invocation_type": "Event", + "log_type": "None", + "asynchronous": "True"} +REQUEST_RESPONSE_CALL = {"invocation_type": "RequestResponse", + "log_type": "Tail", + "asynchronous": "False"} class Lambda(GenericClient): - @lazy_property - def layers(self): - layers = LambdaLayers(self.client, self.supervisor_version) - return layers - - @lazy_property - def s3(self): - s3 = S3(self.aws) - return s3 - - def __init__(self, aws_properties, supervisor_version): - super().__init__(aws_properties) - self.supervisor_version = supervisor_version - self._initialize_properties(aws_properties) - - def _initialize_properties(self, aws_properties): - self.aws.lambdaf.environment = {'Variables': {}} - self.aws.lambdaf.invocation_type = "RequestResponse" - self.aws.lambdaf.log_type = "Tail" - self.aws.lambdaf.layers = [] - self.aws.lambdaf.tmp_folder = FileUtils.create_tmp_dir() - self.aws.lambdaf.tmp_folder_path = self.aws.lambdaf.tmp_folder.name - self.aws.lambdaf.zip_file_path = FileUtils.join_paths(self.aws.lambdaf.tmp_folder_path, 'function.zip') - if hasattr(self.aws.lambdaf, "name"): - self.aws.lambdaf.handler = "{0}.lambda_handler".format(self.aws.lambdaf.name) - if not hasattr(self.aws.lambdaf, "asynchronous"): - self.aws.lambdaf.asynchronous = False - self._set_default_call_parameters() - - def _set_default_call_parameters(self): - self.asynchronous_call_parameters = {"invocation_type": "Event", - "log_type": "None", - "asynchronous": "True"} - self.request_response_call_parameters = {"invocation_type": "RequestResponse", - "log_type": "Tail", - "asynchronous": "False"} - - def _get_creations_args(self): - return {'FunctionName': self.aws.lambdaf.name, - 'Runtime': self.aws.lambdaf.runtime, - 'Role': 
self.aws.iam.role, - 'Handler': self.aws.lambdaf.handler, - 'Code': self.aws.lambdaf.code, - 'Environment': self.aws.lambdaf.environment, - 'Description': self.aws.lambdaf.description, - 'Timeout': self.aws.lambdaf.time, - 'MemorySize': self.aws.lambdaf.memory, - 'Tags': self.aws.tags, - 'Layers': self.aws.lambdaf.layers} + def __init__(self, resources_info: Dict) -> None: + super().__init__(resources_info.get('lambda', {})) + self.resources_info = resources_info + self.function = resources_info.get('lambda', {}) + self.supervisor_version = resources_info.get('lambda').get('supervisor').get('version') + + def _get_creations_args(self, zip_payload_path: str, supervisor_zip_path: str) -> Dict: + return {'FunctionName': self.function.get('name'), + 'Runtime': self.function.get('runtime'), + 'Role': self.resources_info.get('iam').get('role'), + 'Handler': self.function.get('handler'), + 'Code': self._get_function_code(zip_payload_path, supervisor_zip_path), + 'Environment': self.function.get('environment'), + 'Description': self.function.get('description'), + 'Timeout': self.function.get('timeout'), + 'MemorySize': self.function.get('memory'), + 'Tags': self.function.get('tags'), + 'Layers': self.function.get('layers')} def is_asynchronous(self): - return self.aws.lambdaf.asynchronous + return self.function.get('asynchronous', False) + + def get_access_key(self) -> str: + """Returns the access key belonging to the boto_profile used.""" + return self.client.get_access_key() @excp.exception(logger) def create_function(self): - self._manage_supervisor_layer() - self._set_environment_variables() - self._set_function_code() - creation_args = self._get_creations_args() + # Create tmp folders + supervisor_path = FileUtils.create_tmp_dir() + tmp_folder = FileUtils.create_tmp_dir() + # Download supervisor + supervisor_zip_path = SupervisorUtils.download_supervisor( + self.supervisor_version, + supervisor_path.name + ) + # Manage supervisor layer + self._manage_supervisor_layer(supervisor_zip_path) + # Create function + zip_payload_path = FileUtils.join_paths(tmp_folder.name, 'function.zip') + self._set_image_id() + creation_args = self._get_creations_args(zip_payload_path, supervisor_zip_path) response = self.client.create_function(**creation_args) if response and "FunctionArn" in response: - self.aws.lambdaf.arn = response.get('FunctionArn', "") + self.function['arn'] = response.get('FunctionArn', "") return response - def _manage_supervisor_layer(self): - self.layers.check_faas_supervisor_layer() - self.aws.lambdaf.layers.append(self.layers.get_latest_supervisor_layer_arn()) - - def _add_lambda_environment_variable(self, key, value): - if key and value: - self.aws.lambdaf.environment['Variables'][key] = value + def _set_image_id(self): + image = self.function.get('container').get('image') + if image: + self.function['environment']['Variables']['IMAGE_ID'] = image - def _add_custom_environment_variables(self, env_vars, prefix=''): - if type(env_vars) is dict: - for key, val in env_vars.items(): - # Add an specific prefix to be able to find the variables defined by the user - self._add_lambda_environment_variable('{0}{1}'.format(prefix, key), val) - else: - for env_var in env_vars: - key_val = env_var.split("=") - # Add an specific prefix to be able to find the variables defined by the user - self._add_lambda_environment_variable('{0}{1}'.format(prefix, key_val[0]), key_val[1]) - - def _set_environment_variables(self): - # Add required variables - self._set_required_environment_variables() - # Add 
explicitly user defined variables - if hasattr(self.aws.lambdaf, "environment_variables"): - self._add_custom_environment_variables(self.aws.lambdaf.environment_variables, prefix='CONT_VAR_') - # Add explicitly user defined variables - if hasattr(self.aws.lambdaf, "lambda_environment"): - self._add_custom_environment_variables(self.aws.lambdaf.lambda_environment) - - def _set_required_environment_variables(self): - self._add_lambda_environment_variable('SUPERVISOR_TYPE', 'LAMBDA') - self._add_lambda_environment_variable('TIMEOUT_THRESHOLD', str(self.aws.lambdaf.timeout_threshold)) - self._add_lambda_environment_variable('LOG_LEVEL', self.aws.lambdaf.log_level) - self._add_udocker_variables() - self._add_execution_mode() - self._add_s3_environment_vars() - if hasattr(self.aws.lambdaf, "image"): - self._add_lambda_environment_variable('IMAGE_ID', self.aws.lambdaf.image) - if hasattr(self.aws, "api_gateway"): - self._add_lambda_environment_variable('API_GATEWAY_ID', self.aws.api_gateway.id) - - def _add_udocker_variables(self): - self._add_lambda_environment_variable('UDOCKER_EXEC', "/opt/udocker/udocker.py") - self._add_lambda_environment_variable('UDOCKER_DIR', "/tmp/shared/udocker") - self._add_lambda_environment_variable('UDOCKER_LIB', "/opt/udocker/lib/") - self._add_lambda_environment_variable('UDOCKER_BIN', "/opt/udocker/bin/") - - def _add_execution_mode(self): - self._add_lambda_environment_variable('EXECUTION_MODE', self.aws.execution_mode) - # if (self.aws.execution_mode == 'lambda-batch' or self.aws.execution_mode == 'batch'): - # self._add_lambda_environment_variable('BATCH_SUPERVISOR_IMG', self.aws.batch.supervisor_image) - - def _add_s3_environment_vars(self): - if hasattr(self.aws, "s3"): - provider_id = random.randint(1, 1000001) - - if hasattr(self.aws.s3, "input_bucket"): - self._add_lambda_environment_variable( - f'STORAGE_PATH_INPUT_{provider_id}', - self.aws.s3.storage_path_input - ) - - if hasattr(self.aws.s3, "output_bucket"): - self._add_lambda_environment_variable( - f'STORAGE_PATH_OUTPUT_{provider_id}', - self.aws.s3.storage_path_output - ) - else: - self._add_lambda_environment_variable( - f'STORAGE_PATH_OUTPUT_{provider_id}', - self.aws.s3.storage_path_input - ) - self._add_lambda_environment_variable( - f'STORAGE_AUTH_S3_USER_{provider_id}', - 'scar' - ) + def _manage_supervisor_layer(self, supervisor_zip_path: str) -> None: + layers_client = LambdaLayers(self.resources_info, self.client, supervisor_zip_path) + self.function.get('layers', []).append(layers_client.get_supervisor_layer_arn()) @excp.exception(logger) - def _set_function_code(self): - # Zip all the files and folders needed - FunctionPackager(self.aws, self.supervisor_version).create_zip() - if hasattr(self.aws, "s3") and hasattr(self.aws.s3, 'deployment_bucket'): - self._upload_to_S3() - self.aws.lambdaf.code = {"S3Bucket": self.aws.s3.deployment_bucket, "S3Key": self.aws.s3.file_key} + def _get_function_code(self, zip_payload_path: str, supervisor_zip_path: str) -> Dict: + '''Zip all the files and folders needed.''' + code = {} + FunctionPackager(self.resources_info, supervisor_zip_path).create_zip(zip_payload_path) + if self.function.get('deployment').get('bucket', False): + file_key = f"lambda/{self.function.get('name')}.zip" + s3_client = S3(self.resources_info) + s3_client.create_bucket(self.function.get('deployment').get('bucket')) + s3_client.upload_file(bucket=self.function.get('deployment').get('bucket'), + file_path=zip_payload_path, + file_key=file_key) + code = {"S3Bucket": 
self.function.get('deployment').get('bucket'), + "S3Key": file_key} else: - self.aws.lambdaf.code = {"ZipFile": FileUtils.read_file(self.aws.lambdaf.zip_file_path, mode="rb")} + code = {"ZipFile": FileUtils.read_file(zip_payload_path, mode="rb")} + return code - def _upload_to_S3(self): - self.aws.s3.input_bucket = self.aws.s3.deployment_bucket - self.aws.s3.file_key = 'lambda/{0}.zip'.format(self.aws.lambdaf.name) - self.s3.upload_file(file_path=self.aws.lambdaf.zip_file_path, file_key=self.aws.s3.file_key) + def delete_function(self): + return self.client.delete_function(self.resources_info.get('lambda').get('name')) - def delete_function(self, function_name): - return self.client.delete_function(function_name) - - def link_function_and_input_bucket(self): - kwargs = {'FunctionName' : self.aws.lambdaf.name, - 'Principal' : "s3.amazonaws.com", - 'SourceArn' : 'arn:aws:s3:::{0}'.format(self.aws.s3.input_bucket)} + def link_function_and_bucket(self, bucket_name: str) -> None: + kwargs = {'FunctionName': self.function.get('name'), + 'Principal': "s3.amazonaws.com", + 'SourceArn': f'arn:aws:s3:::{bucket_name}'} self.client.add_invocation_permission(**kwargs) def preheat_function(self): logger.info("Preheating function") self._set_request_response_call_parameters() - return self.launch_lambda_instance() + self.launch_lambda_instance() + logger.info("Preheating successful") def _launch_async_event(self, s3_event): self.set_asynchronous_call_parameters() @@ -212,14 +141,13 @@ def launch_request_response_event(self, s3_event): return self._launch_s3_event(s3_event) def _launch_s3_event(self, s3_event): - self.aws.lambdaf.payload = s3_event + self.function['payload'] = s3_event logger.info(f"Sending event for file '{s3_event['Records'][0]['s3']['object']['key']}'") return self.launch_lambda_instance() def process_asynchronous_lambda_invocations(self, s3_event_list): if (len(s3_event_list) > MAX_CONCURRENT_INVOCATIONS): - s3_file_chunk_list = DataTypesUtils.divide_list_in_chunks(s3_event_list, MAX_CONCURRENT_INVOCATIONS) - for s3_file_chunk in s3_file_chunk_list: + for s3_file_chunk in DataTypesUtils.divide_list_in_chunks(s3_event_list, MAX_CONCURRENT_INVOCATIONS): self._launch_concurrent_lambda_invocations(s3_file_chunk) else: self._launch_concurrent_lambda_invocations(s3_event_list) @@ -230,107 +158,87 @@ def _launch_concurrent_lambda_invocations(self, s3_event_list): pool.close() def launch_lambda_instance(self): + if self.is_asynchronous(): + self.set_asynchronous_call_parameters() response = self._invoke_lambda_function() response_args = {'Response' : response, - 'FunctionName' : self.aws.lambdaf.name, - 'OutputType' : self.aws.output, - 'IsAsynchronous' : self.aws.lambdaf.asynchronous} - if hasattr(self.aws, "output_file"): - response_args['OutputFile'] = self.aws.output_file - response_parser.parse_invocation_response(**response_args) + 'FunctionName' : self.function.get('name'), + 'IsAsynchronous' : self.function.get('asynchronous')} + return response_args def _get_invocation_payload(self): # Default payload - payload = self.aws.lambdaf.payload if hasattr(self.aws.lambdaf, 'payload') else {} - if not payload: + payload = self.function.get('payload', {}) + if not payload: # Check for defined run script - if hasattr(self.aws.lambdaf, "run_script"): - script_path = self.aws.lambdaf.run_script - if hasattr(self.aws, "config_path"): - script_path = FileUtils.join_paths(self.aws.config_path, script_path) + if self.function.get("run_script", False): + script_path = self.function.get("run_script") 
# We first code to base64 in bytes and then decode those bytes to allow the json lib to parse the data # https://stackoverflow.com/questions/37225035/serialize-in-json-a-base64-encoded-data#37239382 - payload = { "script" : StrUtils.bytes_to_base64str(FileUtils.read_file(script_path, 'rb')) } + payload = {"script": StrUtils.bytes_to_base64str(FileUtils.read_file(script_path, 'rb'))} # Check for defined commands # This overrides any other function payload - if hasattr(self.aws.lambdaf, "c_args"): - payload = {"cmd_args" : json.dumps(self.aws.lambdaf.c_args)} + if self.function.get("c_args", False): + payload = {"cmd_args": json.dumps(self.function.get("c_args"))} return json.dumps(payload) def _invoke_lambda_function(self): - invoke_args = {'FunctionName' : self.aws.lambdaf.name, - 'InvocationType' : self.aws.lambdaf.invocation_type, - 'LogType' : self.aws.lambdaf.log_type, - 'Payload' : self._get_invocation_payload()} + invoke_args = {'FunctionName': self.function.get('name'), + 'InvocationType': self.function.get('invocation_type'), + 'LogType': self.function.get('log_type'), + 'Payload': self._get_invocation_payload()} return self.client.invoke_function(**invoke_args) def set_asynchronous_call_parameters(self): - self.aws.lambdaf.update_properties(**self.asynchronous_call_parameters) + self.function.update(ASYNCHRONOUS_CALL) def _set_request_response_call_parameters(self): - self.aws.lambdaf.update_properties(**self.request_response_call_parameters) - - def _update_environment_variables(self, function_info, update_args): - # To update the environment variables we need to retrieve the - # variables defined in lambda and update them with the new values - env_vars = self.aws.lambdaf.environment - if hasattr(self.aws.lambdaf, "environment_variables"): - for env_var in self.aws.lambdaf.environment_variables: - key_val = env_var.split("=") - # Add an specific prefix to be able to find the variables defined by the user - env_vars['Variables']['CONT_VAR_{0}'.format(key_val[0])] = key_val[1] - if hasattr(self.aws.lambdaf, "timeout_threshold"): - env_vars['Variables']['TIMEOUT_THRESHOLD'] = str(self.aws.lambdaf.timeout_threshold) - if hasattr(self.aws.lambdaf, "log_level"): - env_vars['Variables']['LOG_LEVEL'] = self.aws.lambdaf.log_level - function_info['Environment']['Variables'].update(env_vars['Variables']) - update_args['Environment'] = function_info['Environment'] - - def _update_supervisor_layer(self, function_info, update_args): - if hasattr(self.aws.lambdaf, "supervisor_layer"): - # Set supervisor layer Arn - function_layers = [self.layers.get_latest_supervisor_layer_arn()] - # Add the rest of layers (if exist) - if 'Layers' in function_info: - function_layers.extend([layer for layer in function_info['Layers'] if self.layers.layer_name not in layer['Arn']]) - update_args['Layers'] = function_layers - - def update_function_configuration(self, function_info=None): - if not function_info: - function_info = self.get_function_info() - update_args = {'FunctionName' : function_info['FunctionName'] } -# if hasattr(self.aws.lambdaf, "memory"): -# update_args['MemorySize'] = self.aws.lambdaf.memory -# else: -# update_args['MemorySize'] = function_info['MemorySize'] -# if hasattr(self.aws.lambdaf, "time"): -# update_args['Timeout'] = self.aws.lambdaf.time -# else: -# update_args['Timeout'] = function_info['Timeout'] - self._update_environment_variables(function_info, update_args) - self._update_supervisor_layer(function_info, update_args) - self.client.update_function_configuration(**update_args) - 
logger.info("Function '{}' updated successfully.".format(function_info['FunctionName'])) + self.function.update(REQUEST_RESPONSE_CALL) def _get_function_environment_variables(self): - return self.get_function_info()['Environment'] + return self.get_function_configuration()['Environment'] + + def merge_aws_and_local_configuration(self, aws_conf: Dict) -> Dict: + result = ConfigFileParser().get_properties().get('aws') + result['lambda']['name'] = aws_conf['FunctionName'] + result['lambda']['arn'] = aws_conf['FunctionArn'] + result['lambda']['timeout'] = aws_conf['Timeout'] + result['lambda']['memory'] = aws_conf['MemorySize'] + result['lambda']['environment']['Variables'] = aws_conf['Environment']['Variables'].copy() + result['lambda']['layers'] = aws_conf['Layers'].copy() + result['lambda']['supervisor']['version'] = aws_conf['SupervisorVersion'] + return result def get_all_functions(self, arn_list): try: - return [self.get_function_info(function_arn) for function_arn in arn_list] + return [self.merge_aws_and_local_configuration(self.get_function_configuration(function_arn)) + for function_arn in arn_list] except ClientError as cerr: - print (f"Error getting function info by arn: {cerr}") - - def get_function_info(self, function_name_or_arn=None): - name_arn = function_name_or_arn if function_name_or_arn else self.aws.lambdaf.name - return self.client.get_function_info(name_arn) + print(f"Error getting function info by arn: {cerr}") + + def get_function_configuration(self, arn: str = None) -> Dict: + function = arn if arn else self.function.get('name') + return self.client.get_function_configuration(function) + + def get_fdl_config(self, arn: str = None) -> Dict: + function = arn if arn else self.function.get('name') + function_info = self.client.get_function(function) + dep_pack_url = function_info.get('Code').get('Location') + dep_pack = get_file(dep_pack_url) + # Extract function_config.yaml + try: + with ZipFile(io.BytesIO(dep_pack)) as thezip: + with thezip.open('function_config.yaml') as cfg_yaml: + return yaml.safe_load(cfg_yaml) + except (KeyError, BadZipfile): + return {} @excp.exception(logger) def find_function(self, function_name_or_arn=None): try: # If this call works the function exists - name_arn = function_name_or_arn if function_name_or_arn else self.aws.lambdaf.name - self.get_function_info(name_arn) + name_arn = function_name_or_arn if function_name_or_arn else self.function.get('name', '') + self.get_function_configuration(name_arn) return True except ClientError as ce: # Function not found @@ -340,17 +248,19 @@ def find_function(self, function_name_or_arn=None): raise def add_invocation_permission_from_api_gateway(self): - kwargs = {'FunctionName' : self.aws.lambdaf.name, - 'Principal' : 'apigateway.amazonaws.com', - 'SourceArn' : 'arn:aws:execute-api:{0}:{1}:{2}/*'.format(self.aws.region, - self.aws.account_id, - self.aws.api_gateway.id)} + api = self.resources_info.get('api_gateway') # Add Testing permission + kwargs = {'FunctionName': self.function.get('name'), + 'Principal': api.get('service_id'), + 'SourceArn': api.get('source_arn_testing').format(api_region=api.get('region'), + account_id=self.resources_info.get('iam').get('account_id'), + api_id=api.get('id'))} self.client.add_invocation_permission(**kwargs) # Add Invocation permission - kwargs['SourceArn'] = 'arn:aws:execute-api:{0}:{1}:{2}/scar/ANY'.format(self.aws.region, - self.aws.account_id, - self.aws.api_gateway.id) + kwargs['SourceArn'] = 
api.get('source_arn_invocation').format(api_region=api.get('region'), + account_id=self.resources_info.get('iam').get('account_id'), + api_id=api.get('id'), + stage_name=api.get('stage_name')) self.client.add_invocation_permission(**kwargs) def get_api_gateway_id(self): @@ -360,31 +270,32 @@ def get_api_gateway_id(self): def _get_api_gateway_url(self): api_id = self.get_api_gateway_id() if not api_id: - raise excp.ApiEndpointNotFoundError(self.aws.lambdaf.name) - return f'https://{api_id}.execute-api.{self.aws.region}.amazonaws.com/scar/launch' + raise excp.ApiEndpointNotFoundError(self.function.get('name')) + return self.resources_info.get('api_gateway').get('endpoint').format(api_id=api_id, + api_region=self.resources_info.get('api_gateway').get('region'), + stage_name=self.resources_info.get('api_gateway').get('stage_name')) def call_http_endpoint(self): invoke_args = {'headers' : {'X-Amz-Invocation-Type':'Event'} if self.is_asynchronous() else {}} - if hasattr(self.aws, "api_gateway"): - self._set_invoke_args(invoke_args) - return request.call_http_endpoint(self._get_api_gateway_url(), **invoke_args) + self._set_invoke_args(invoke_args) + return call_http_endpoint(self._get_api_gateway_url(), **invoke_args) def _set_invoke_args(self, invoke_args): - if hasattr(self.aws.api_gateway, "data_binary"): - invoke_args['data'] = self._get_b64encoded_binary_data(self.aws.api_gateway.data_binary) - invoke_args['headers'] = {'Content-Type': 'application/octet-stream'} - if hasattr(self.aws.api_gateway, "parameters"): - invoke_args['params'] = self._parse_http_parameters(self.aws.api_gateway.parameters) - if hasattr(self.aws.api_gateway, "json_data"): - invoke_args['data'] = self._parse_http_parameters(self.aws.api_gateway.json_data) - invoke_args['headers'] = {'Content-Type': 'application/json'} + if self.resources_info.get('api_gateway').get('data_binary', False): + invoke_args['data'] = self._get_b64encoded_binary_data() + invoke_args['headers'].update({'Content-Type': 'application/octet-stream'}) + if self.resources_info.get('api_gateway').get('parameters', False): + invoke_args['params'] = self._parse_http_parameters(self.resources_info.get('api_gateway').get('parameters')) + if self.resources_info.get('api_gateway').get('json_data', False): + invoke_args['data'] = self._parse_http_parameters(self.resources_info.get('api_gateway').get('json_data')) + invoke_args['headers'].update({'Content-Type': 'application/json'}) def _parse_http_parameters(self, parameters): return parameters if type(parameters) is dict else json.loads(parameters) @excp.exception(logger) - def _get_b64encoded_binary_data(self, data_path): - if data_path: - AWSValidator.validate_http_payload_size(data_path, self.is_asynchronous()) - with open(data_path, 'rb') as data_file: - return base64.b64encode(data_file.read()) + def _get_b64encoded_binary_data(self): + data_path = self.resources_info.get('api_gateway').get('data_binary') + AWSValidator.validate_http_payload_size(data_path, self.is_asynchronous()) + with open(data_path, 'rb') as data_file: + return base64.b64encode(data_file.read()) diff --git a/scar/providers/aws/lambdalayers.py b/scar/providers/aws/lambdalayers.py index c3f9609a..f85d5a58 100644 --- a/scar/providers/aws/lambdalayers.py +++ b/scar/providers/aws/lambdalayers.py @@ -12,57 +12,18 @@ # See the License for the specific language governing permissions and # limitations under the License. 
"""Module with methods and classes to manage the Lambda layers.""" - -import io import shutil -from typing import Dict +from typing import Dict, List import zipfile -from tabulate import tabulate -import scar.http.request as request import scar.logger as logger -from scar.utils import lazy_property, FileUtils, GitHubUtils, StrUtils, \ - GITHUB_USER, GITHUB_SUPERVISOR_PROJECT - - -def _create_tmp_folders() -> None: - tmp_zip_folder = FileUtils.create_tmp_dir() - layer_code_folder = FileUtils.create_tmp_dir() - return (tmp_zip_folder.name, layer_code_folder.name) - - -def _download_supervisor(supervisor_version: str, tmp_zip_path: str) -> str: - supervisor_zip_url = GitHubUtils.get_source_code_url(GITHUB_USER, GITHUB_SUPERVISOR_PROJECT, - supervisor_version) - supervisor_zip = request.get_file(supervisor_zip_url) - with zipfile.ZipFile(io.BytesIO(supervisor_zip)) as thezip: - for file in thezip.namelist(): - # Remove the parent folder path - parent_folder, file_name = file.split("/", 1) - if file_name.startswith("extra") or file_name.startswith("faassupervisor"): - thezip.extract(file, tmp_zip_path) - return parent_folder - - -def _copy_supervisor_files(parent_folder: str, tmp_zip_path: str, layer_code_path: str) -> None: - supervisor_path = FileUtils.join_paths(tmp_zip_path, parent_folder, 'faassupervisor') - shutil.move(supervisor_path, FileUtils.join_paths(layer_code_path, 'python', 'faassupervisor')) - - -def _copy_extra_files(parent_folder: str, tmp_zip_path: str, layer_code_path: str) -> None: - extra_folder_path = FileUtils.join_paths(tmp_zip_path, parent_folder, 'extra') - files = FileUtils.get_all_files_in_directory(extra_folder_path) - for file_path in files: - FileUtils.unzip_folder(file_path, layer_code_path) - - -def _create_layer_zip(layer_zip_path: str, layer_code_path: str) -> None: - FileUtils.zip_folder(layer_zip_path, layer_code_path) +from scar.utils import FileUtils +from scar.providers.aws.clients.lambdafunction import LambdaClient class Layer(): """Class used for layer management.""" - def __init__(self, lambda_client) -> None: + def __init__(self, lambda_client: LambdaClient) -> None: self.lambda_client = lambda_client def _find(self, layer_name: str) -> Dict: @@ -82,9 +43,12 @@ def exists(self, layer_name: str) -> bool: return True return False + def list_versions(self, layer_name: str) -> List: + return self.lambda_client.list_layer_versions(layer_name) + def delete(self, **kwargs: Dict) -> Dict: """Deletes a layer.""" - layer_args = {'LayerName' : kwargs['name']} + layer_args = {'LayerName': kwargs['name']} if 'version' in kwargs: layer_args['VersionNumber'] = int(kwargs['version']) else: @@ -92,7 +56,7 @@ def delete(self, **kwargs: Dict) -> Dict: layer_args['VersionNumber'] = version_info.get('Version', -1) return self.lambda_client.delete_layer_version(**layer_args) - def get_latest_layer_info(self, layer_name: str) -> str: + def get_latest_layer_info(self, layer_name: str) -> Dict: """Returns the latest matching version of the layer with 'layer_name'.""" layer = self._find(layer_name) return layer['LatestMatchingVersion'] if layer else {} @@ -101,76 +65,67 @@ def get_latest_layer_info(self, layer_name: str) -> str: class LambdaLayers(): """"Class used to manage the lambda supervisor layer.""" - _SUPERVISOR_LAYER_NAME = 'faas-supervisor' - - @lazy_property - def layer(self): - """Property used to manage the lambda layers.""" - layer = Layer(self.lambda_client) - return layer - - def __init__(self, lambda_client, supervisor_version: str) -> None: - 
self.lambda_client = lambda_client - self.supervisor_version = supervisor_version + # To avoid circular inheritance we need to receive the LambdaClient + def __init__(self, resources_info: Dict, lambda_client: LambdaClient, supervisor_zip_path: str): + self.resources_info = resources_info + self.supervisor_zip_path = supervisor_zip_path + self.layer_name = resources_info.get('lambda').get('supervisor').get('layer_name') + self.supervisor_version = resources_info.get('lambda').get('supervisor').get('version') + self.layer = Layer(lambda_client) def _get_supervisor_layer_props(self, layer_zip_path: str) -> Dict: - return {'LayerName' : self._SUPERVISOR_LAYER_NAME, - 'Description' : self.supervisor_version, - 'Content' : {'ZipFile': FileUtils.read_file(layer_zip_path, mode="rb")}, - 'LicenseInfo' : 'Apache 2.0'} - - def _create_layer(self) -> None: - tmp_zip_path, layer_code_path = _create_tmp_folders() - layer_zip_path = FileUtils.join_paths(FileUtils.get_tmp_dir(), - f"{self._SUPERVISOR_LAYER_NAME}.zip") - parent_folder = _download_supervisor(self.supervisor_version, tmp_zip_path) - _copy_supervisor_files(parent_folder, tmp_zip_path, layer_code_path) - _copy_extra_files(parent_folder, tmp_zip_path, layer_code_path) - _create_layer_zip(layer_zip_path, layer_code_path) - self.layer.create(**self._get_supervisor_layer_props(layer_zip_path)) - FileUtils.delete_file(layer_zip_path) - - def _create_supervisor_layer(self) -> None: - logger.info("Creating faas-supervisor layer.") - self._create_layer() - logger.info("Faas-supervisor layer created.") - - def _update_supervisor_layer(self) -> None: - logger.info("Updating faas-supervisor layer.") - self._create_layer() - logger.info("Faas-supervisor layer updated.") - - def print_layers_info(self) -> None: - """Prints the lambda layers information.""" - layers_info = self.lambda_client.list_layers() - headers = ['NAME', 'VERSION', 'ARN', 'RUNTIMES'] - table = [] - for layer in layers_info: - table.append([layer.get('LayerName', ""), - layer.get('LatestMatchingVersion', {}).get('Version', -1), - layer.get('LayerArn', ""), - layer.get('LatestMatchingVersion', {}).get('CompatibleRuntimes', '-')]) - print(tabulate(table, headers)) - - def get_latest_supervisor_layer_arn(self) -> str: - """Returns the ARN of the latest supervisor layer.""" - layer_info = self.layer.get_latest_layer_info(self._SUPERVISOR_LAYER_NAME) - return layer_info.get('LayerVersionArn', "") - - def check_faas_supervisor_layer(self): - """Checks if the supervisor layer exists, if not, creates the layer. 
- If the layer exists and it's not updated, updates the layer.""" - # Get the layer information - layer_info = self.layer.get_latest_layer_info(self._SUPERVISOR_LAYER_NAME) - # Compare supervisor versions - if layer_info and 'Description' in layer_info: - # If the supervisor layer version is lower than the passed version, - # we must update the layer - if StrUtils.compare_versions(layer_info.get('Description', ''), - self.supervisor_version) < 0: - self._update_supervisor_layer() - else: - logger.info("Using existent 'faas-supervisor' layer") - else: - # Layer not found, we have to create it - self._create_supervisor_layer() + return {'LayerName': self.layer_name, + 'Description': self.supervisor_version, + 'Content': {'ZipFile': FileUtils.read_file(layer_zip_path, mode="rb")}, + 'CompatibleRuntimes': ['python3.8', 'python3.7'], + 'LicenseInfo': self.resources_info.get('lambda').get('supervisor').get('license_info')} + + def _create_layer(self) -> str: + # Create tmp folders + tmp_path = FileUtils.create_tmp_dir() + layer_code_path = FileUtils.create_tmp_dir() + # Extract 'extra' and 'faassupervisor' from supervisor_zip_path + with zipfile.ZipFile(self.supervisor_zip_path) as thezip: + for file in thezip.namelist(): + # Remove the parent folder path + parent_folder, file_name = file.split('/', 1) + if file_name.startswith('extra') or file_name.startswith('faassupervisor'): + thezip.extract(file, tmp_path.name) + # Extract content of 'extra' files in layer_code_path + extra_folder_path = FileUtils.join_paths(tmp_path.name, parent_folder, 'extra') + files = FileUtils.get_all_files_in_directory(extra_folder_path) + for file_path in files: + FileUtils.unzip_folder(file_path, layer_code_path.name) + # Copy 'faassupervisor' to layer_code_path + supervisor_folder_path = FileUtils.join_paths(tmp_path.name, parent_folder, 'faassupervisor') + shutil.move(supervisor_folder_path, FileUtils.join_paths(layer_code_path.name, 'python', 'faassupervisor')) + # Create layer zip with content of layer_code_path + layer_zip_path = FileUtils.join_paths(tmp_path.name, f'{self.layer_name}.zip') + FileUtils.zip_folder(layer_zip_path, layer_code_path.name) + # Register the layer + props = self._get_supervisor_layer_props(layer_zip_path) + response = self.layer.create(**props) + return response['LayerVersionArn'] + + def _is_supervisor_created(self) -> bool: + return self.layer.exists(self.layer_name) + + def _is_supervisor_version_created(self) -> str: + versions = self.layer.list_versions(self.layer_name) + for version in versions: + if 'Description' in version: + if version['Description'] == self.supervisor_version: + return version['LayerVersionArn'] + return '' + + def get_supervisor_layer_arn(self) -> str: + """Returns the ARN of the specified supervisor layer version. 
+        If the layer or version doesn't exist, creates the layer."""
+        if self._is_supervisor_created():
+            is_created = self._is_supervisor_version_created()
+            if is_created != '':
+                logger.info(f'Using existing \'{self.layer_name}\' layer.')
+                return is_created
+        logger.info((f'Creating lambda layer with \'{self.layer_name}\''
+                     f' version \'{self.supervisor_version}\'.'))
+        return self._create_layer()
diff --git a/scar/providers/aws/launchtemplates.py b/scar/providers/aws/launchtemplates.py
index 8dd638c1..685bad38 100644
--- a/scar/providers/aws/launchtemplates.py
+++ b/scar/providers/aws/launchtemplates.py
@@ -30,11 +30,12 @@
     'RequestId': 'XXX',
     'RetryAttempts': 0}}"""

+from typing import Dict
 from string import Template
 from email.mime.multipart import MIMEMultipart
 from email.mime.text import MIMEText
 from scar.providers.aws import GenericClient
-from scar.utils import GitHubUtils, StrUtils
+from scar.utils import SupervisorUtils, StrUtils
 import scar.exceptions as excp
 import scar.logger as logger

@@ -42,11 +43,6 @@ class LaunchTemplates(GenericClient):
     """Class to manage the creation and update of launch templates."""

-    _TEMPLATE_NAME = 'faas-supervisor'
-    _SUPERVISOR_GITHUB_REPO = 'faas-supervisor'
-    _SUPERVISOR_GITHUB_USER = 'grycap'
-    _SUPERVISOR_GITHUB_ASSET_NAME = 'supervisor'
-
     # Script to download 'faas-supervisor'
     _LAUNCH_TEMPLATE_SCRIPT = Template(
         '#!/bin/bash\n'
@@ -55,21 +51,17 @@ class LaunchTemplates(GenericClient):
         'chmod +x /opt/faas-supervisor/bin/supervisor\n'
     )

-    def __init__(self, aws_properties, supervisor_version):
-        super().__init__(aws_properties)
-        self.supervisor_version = supervisor_version
-        if self.supervisor_version == 'latest':
-            self.supervisor_version = GitHubUtils.get_latest_release(
-                self._SUPERVISOR_GITHUB_USER,
-                self._SUPERVISOR_GITHUB_REPO
-            )
+    def __init__(self, resources_info: Dict):
+        super().__init__(resources_info.get('batch'))
+        self.supervisor_version = resources_info.get('lambda').get('supervisor').get('version')
+        self.template_name = resources_info.get('batch').get('compute_resources').get('launch_template_name')

     @excp.exception(logger)
     def _is_supervisor_created(self) -> bool:
         """Checks if 'faas-supervisor' launch template is created"""
         params = {'Filters': [
             {'Name': 'launch-template-name',
-             'Values': [self._TEMPLATE_NAME]}
+             'Values': [self.template_name]}
         ]}
         response = self.client.describe_launch_templates(params)
         return ('LaunchTemplates' in response and
@@ -80,12 +72,12 @@ def _is_supervisor_version_created(self) -> int:
         """Checks if the supervisor version specified is created. 
Returns the Launch Template version or -1 if it does not exists""" response = self.client.describe_launch_template_versions( - {'LaunchTemplateName': self._TEMPLATE_NAME}) + {'LaunchTemplateName': self.template_name}) versions = response['LaunchTemplateVersions'] # Get ALL versions while ('NextToken' in response and response['NextToken']): response = self.client.describe_launch_template_versions( - {'LaunchTemplateName': self._TEMPLATE_NAME, + {'LaunchTemplateName': self.template_name, 'NextToken': response['NextToken']}) versions.extend(response['LaunchTemplateVersions']) @@ -116,10 +108,7 @@ def _create_supervisor_user_data(self) -> str: chmod +x /opt/faas-supervisor/bin/supervisor --===============3595946014116037730==--""" multipart = MIMEMultipart() - url = GitHubUtils.get_asset_url(self._SUPERVISOR_GITHUB_USER, - self._SUPERVISOR_GITHUB_REPO, - self._SUPERVISOR_GITHUB_ASSET_NAME, - self.supervisor_version) + url = SupervisorUtils.get_supervisor_binary_url(self.supervisor_version) script = self._LAUNCH_TEMPLATE_SCRIPT.substitute( supervisor_binary_url=url) content = MIMEText(script, 'x-shellscript') @@ -133,14 +122,14 @@ def get_launch_template_version(self) -> int: if self._is_supervisor_created(): is_created = self._is_supervisor_version_created() if is_created is not -1: - logger.info('Using existent \'faas-supervisor\' launch template.') + logger.info(f"Using existent '{self.template_name}' launch template.") return is_created else: logger.info((f"Creating launch template version with 'faas-supervisor' " f"version '{self.supervisor_version}'.")) user_data = self._create_supervisor_user_data() response = self.client.create_launch_template_version( - self._TEMPLATE_NAME, + self.template_name, self.supervisor_version, {'UserData': user_data}) return response['LaunchTemplateVersion']['VersionNumber'] @@ -149,7 +138,7 @@ def get_launch_template_version(self) -> int: f"version '{self.supervisor_version}'.")) user_data = self._create_supervisor_user_data() response = self.client.create_launch_template( - self._TEMPLATE_NAME, + self.template_name, self.supervisor_version, {'UserData': user_data}) return response['LaunchTemplate']['LatestVersionNumber'] diff --git a/scar/providers/aws/properties.py b/scar/providers/aws/properties.py deleted file mode 100644 index f365411a..00000000 --- a/scar/providers/aws/properties.py +++ /dev/null @@ -1,191 +0,0 @@ -# Copyright (C) GRyCAP - I3M - UPV -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-"""Module with classes and methods to manage the different -properties needed by SCAR and the boto clients.""" - - -class ScarProperties(dict): - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self - - -class AwsProperties(dict): - - def __init__(self, *args, **kwargs): - """ - {'account_id': '914332', - 'batch': see_batch_props_class, - 'boto_profile': 'default', - 'cloudwatch': see_cloudwatch_props_class, - 'config_path': 'cowsay', - 'execution_mode': 'lambda', - 'iam': see_iam_props_class, - 'lambda': see_lambdaf_props_class, - 'output': , - 'region': 'us-east-1', - 's3': see_s3_props_class, - 'tags': {'createdby': 'scar', 'owner': 'alpegon'}} - """ - super().__init__(*args, **kwargs) - self.__dict__ = self - self._initialize_properties() - - def _initialize_properties(self): - if hasattr(self, "api_gateway"): - self.api_gateway = ApiGatewayProperties(self.api_gateway) - if hasattr(self, "batch"): - self.batch = BatchProperties(self.batch) - if hasattr(self, "cloudwatch"): - self.cloudwatch = CloudWatchProperties(self.cloudwatch) - if hasattr(self, "iam"): - self.iam = IamProperties(self.iam) - if hasattr(self, "lambda"): - self.lambdaf = LambdaProperties(self.__dict__['lambda']) - self.__dict__.pop('lambda', None) - if hasattr(self, "s3"): - self.s3 = S3Properties(self.s3) - - -class ApiGatewayProperties(dict): - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self - - -class BatchProperties(dict): - """ - Example of dictionary used to initialize the class properties: - {'comp_type': 'EC2', - 'desired_v_cpus': 0, - 'instance_types': ['m3.medium'], - 'max_v_cpus': 2, - 'min_v_cpus': 0, - 'security_group_ids': ['sg-2568'], - 'state': 'ENABLED', - 'subnets': ['subnet-568', - 'subnet-569', - 'subnet-570', - 'subnet-571', - 'subnet-572'], - 'type': 'MANAGED'} - """ - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self - - -class LambdaProperties(dict): - """ - Example of dictionary used to initialize the class properties: - {'asynchronous': False, - 'description': 'Automatically generated lambda function', - 'environment': {'Variables': {'EXECUTION_MODE': 'lambda', - 'INPUT_BUCKET': 'test1', - 'LOG_LEVEL': 'INFO', - 'SUPERVISOR_TYPE': 'LAMBDA', - 'TIMEOUT_THRESHOLD': '10', - 'UDOCKER_BIN': '/opt/udocker/bin/', - 'UDOCKER_DIR': '/tmp/shared/udocker', - 'UDOCKER_EXEC': '/opt/udocker/udocker.py', - 'UDOCKER_LIB': '/opt/udocker/lib/'}}, - 'extra_payload': '/test/', - 'handler': 'test.lambda_handler', - 'image_file': 'minicow.tar.gz', - 'init_script': 'test.sh', - 'invocation_type': 'RequestResponse', - 'layers': ['arn:aws:lambda:us-east-1:914332:layer:faas-supervisor:1'], - 'log_level': 'INFO', - 'log_type': 'Tail', - 'memory': 512, - 'name': 'test', - 'runtime': 'python3.6', - 'time': 300, - 'timeout_threshold': 10, - 'zip_file_path': '/tmp/function.zip'} - """ - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self - - def update_properties(self, **kwargs): - if 'ResponseMetadata' in kwargs: - # Parsing RAW function info - self.description = kwargs['Description'] - self.environment = kwargs['Environment'] - self.arn = kwargs['FunctionArn'] - self.name = kwargs['FunctionName'] - self.handler = kwargs['Handler'] - self.layers = [layer['Arn'] for layer in kwargs['Layers']] - self.memory = kwargs['MemorySize'] - self.time = kwargs['Timeout'] - self.role = kwargs['Role'] - self.runtime = kwargs['Runtime'] - else: - 
self.__dict__.update(**kwargs) - - -class IamProperties(dict): - """ - Example of dictionary used to initialize the class properties: - {'role': 'arn:aws:iam::914332:role/invented-role'} - """ - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self - - -class S3Properties(dict): - """ - Example of dictionary used to initialize the class properties: - {'input_bucket': 'test1'} - """ - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self - self.process_storage_paths() - - def process_storage_paths(self): - if hasattr(self, "input_bucket"): - self.storage_path_input = self.input_bucket - input_path = self.input_bucket.split("/", 1) - if len(input_path) > 1: - # There are folders defined - self.input_bucket = input_path[0] - self.input_folder = input_path[1] - - if hasattr(self, "output_bucket"): - self.storage_path_output = self.output_bucket - output_path = self.output_bucket.split("/", 1) - if len(output_path) > 1: - # There are folders defined - self.output_bucket = output_path[0] - self.output_folder = output_path[1] - - -class CloudWatchProperties(dict): - """ - Example of dictionary used to initialize the class properties: - {'log_retention_policy_in_days': 30} - """ - - def __init__(self, *args, **kwargs): - super().__init__(*args, **kwargs) - self.__dict__ = self diff --git a/scar/providers/aws/resourcegroups.py b/scar/providers/aws/resourcegroups.py index 26945832..c0be463c 100644 --- a/scar/providers/aws/resourcegroups.py +++ b/scar/providers/aws/resourcegroups.py @@ -22,6 +22,9 @@ class ResourceGroups(GenericClient): """Class to manage AWS Resource Groups""" + def __init__(self, resources_info) -> None: + super().__init__(resources_info.get('lambda')) + def get_resource_arn_list(self, iam_user_id: str, resource_type: str = 'lambda') -> List: """Returns a list of ARNs filtered by the resource_type passed and the tags created by scar.""" diff --git a/scar/providers/aws/response.py b/scar/providers/aws/response.py index eac9eb11..5cbb5ebb 100644 --- a/scar/providers/aws/response.py +++ b/scar/providers/aws/response.py @@ -13,10 +13,12 @@ # limitations under the License. 
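With ``properties.py`` removed, the provider classes stop using the attribute-style wrappers (``AwsProperties``, ``LambdaProperties``, ``S3Properties``, ...) and pass a plain nested ``resources_info`` dict around instead. A rough before/after sketch of the access style (the example values are made up)::

    # Before, with the removed wrapper classes:
    #     self.aws.lambdaf.name
    #     self.aws.s3.input_bucket
    # After, with plain nested dicts:
    resources_info = {'lambda': {'name': 'scar-cowsay', 'memory': 512}}
    function_name = resources_info.get('lambda').get('name')
    memory = resources_info.get('lambda').get('memory')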
import json +from typing import Dict from enum import Enum from tabulate import tabulate import scar.logger as logger from scar.utils import StrUtils +from requests import Response class OutputType(Enum): @@ -25,10 +27,15 @@ class OutputType(Enum): VERBOSE = 3 BINARY = 4 - -def parse_http_response(response, function_name, asynch, output_type, output_file): +def parse_http_response(response: Response, resources_info: Dict, scar_info: Dict) -> None: + '''Process the response generated by an API Gateway invocation.''' + output_type = scar_info.get('cli_output') + function_name = resources_info.get('lambda').get('name') + asynch = resources_info.get('lambda').get('asynchronous') + text_message = "" if response.ok: - if output_type == OutputType.BINARY: + if output_type == OutputType.BINARY.value: + output_file = scar_info.get('output_file', '') with open(output_file, "wb") as out: out.write(StrUtils.decode_base64(response.text)) text_message = f"Output saved in file '{output_file}'" @@ -39,10 +46,10 @@ def parse_http_response(response, function_name, asynch, output_type, output_fil else: text_message += f"\nLog Group Name: {response.headers['amz-log-group-name']}\n" text_message += f"Log Stream Name: {response.headers['amz-log-stream-name']}\n" - text_message += json.loads(response.text)["udocker_output"] + text_message += StrUtils.base64_to_utf8_string(response.text) else: if asynch and response.status_code == 502: - text_message = f"Function '{function_name}' launched sucessfully." + text_message = f"Function '{function_name}' launched successfully." else: error = json.loads(response.text) if 'message' in error: @@ -53,31 +60,33 @@ def parse_http_response(response, function_name, asynch, output_type, output_fil def _print_generic_response(response, output_type, aws_output, text_message=None, json_output=None, verbose_output=None, output_file=None): - if output_type == OutputType.BINARY: + if output_type == OutputType.BINARY.value: with open(output_file, "wb") as out: out.write(StrUtils.decode_base64(response['Payload']['body'])) - elif output_type == OutputType.PLAIN_TEXT: + elif output_type == OutputType.PLAIN_TEXT.value: output = text_message logger.info(output) else: - if output_type == OutputType.JSON: + if output_type == OutputType.JSON.value: output = json_output if json_output else {aws_output : {'RequestId' : response['ResponseMetadata']['RequestId'], 'HTTPStatusCode' : response['ResponseMetadata']['HTTPStatusCode']}} - elif output_type == OutputType.VERBOSE: + elif output_type == OutputType.VERBOSE.value: output = verbose_output if verbose_output else {aws_output : response} logger.info_json(output) -def parse_lambda_function_creation_response(response, function_name, access_key, output_type): +def parse_lambda_function_creation_response(response, output_type, access_key): if response: aws_output = 'LambdaOutput' - text_message = f"Function '{function_name}' successfully created." - json_message = {aws_output : {'AccessKey' : access_key, - 'FunctionArn' : response['FunctionArn'], - 'Timeout' : response['Timeout'], - 'MemorySize' : response['MemorySize'], - 'FunctionName' : response['FunctionName']}} + text_message = f"Function '{response['FunctionName']}' successfully created." 
+ json_message = {aws_output : { + 'AccessKey' : access_key, + 'FunctionArn' : response['FunctionArn'], + 'Timeout' : response['Timeout'], + 'MemorySize' : response['MemorySize'], + 'FunctionName' : response['FunctionName']} + } _print_generic_response(response, output_type, aws_output, text_message, json_output=json_message) @@ -105,36 +114,32 @@ def parse_delete_api_response(response, api_id, output_type): _print_generic_response(response, output_type, 'APIGateway', text_message) -def parse_ls_response(lambda_functions, output_type): +def parse_ls_response(aws_resources: Dict, output_type: int) -> None: aws_output = 'Functions' result = [] text_message = "" - if output_type == OutputType.VERBOSE: - result = lambda_functions + if output_type == OutputType.VERBOSE.value: + result = aws_resources else: - for lambdaf in lambda_functions: - result.append(_parse_lambda_function_info(lambdaf)) + for resources_info in aws_resources: + result.append(_parse_lambda_function_info(resources_info)) text_message = _get_table(result) - json_message = { aws_output : result } + json_message = {aws_output: result} _print_generic_response('', output_type, aws_output, text_message, json_output=json_message, verbose_output=json_message) -def _parse_lambda_function_info(function_info): - name = function_info.get('FunctionName', "-") - memory = function_info.get('MemorySize', "-") - timeout = function_info.get('Timeout', "-") - image_id = function_info['Environment']['Variables'].get('IMAGE_ID', "-") - api_gateway = function_info['Environment']['Variables'].get('API_GATEWAY_ID', "-") +def _parse_lambda_function_info(resources_info: Dict) -> Dict: + api_gateway = resources_info.get('lambda').get('environment').get('Variables').get('API_GATEWAY_ID', "-") if api_gateway != '-': - region = function_info['FunctionArn'].split(':')[3] - api_gateway = f"https://{api_gateway}.execute-api.{region}.amazonaws.com/scar/launch" - super_version = function_info.get('SupervisorVersion', '-') - return {'Name' : name, - 'Memory' : memory, - 'Timeout' : timeout, - 'Image_id': image_id, + stage_name = resources_info.get('api_gateway').get('stage_name') + region = resources_info.get('api_gateway').get('region') + api_gateway = f"https://{api_gateway}.execute-api.{region}.amazonaws.com/{stage_name}/launch" + return {'Name': resources_info.get('lambda').get('name', "-"), + 'Memory': resources_info.get('lambda').get('memory', "-"), + 'Timeout': resources_info.get('lambda').get('timeout', "-"), + 'Image_id': resources_info.get('lambda').get('environment').get('Variables').get('IMAGE_ID', "-"), 'Api_gateway': api_gateway, - 'Sup_version': super_version} + 'Sup_version': resources_info.get('lambda').get('supervisor').get('version', '-')} def _get_table(functions_info): diff --git a/scar/providers/aws/s3.py b/scar/providers/aws/s3.py index e6375263..75a5b0dc 100644 --- a/scar/providers/aws/s3.py +++ b/scar/providers/aws/s3.py @@ -13,73 +13,73 @@ # limitations under the License. 
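A side note on the ``response.py`` changes above: the comparisons switched from ``output_type == OutputType.JSON`` to ``output_type == OutputType.JSON.value``, which suggests the CLI layer now hands down the numeric value rather than the enum member, and an ``Enum`` member never compares equal to its plain integer value. A minimal illustration (``VERBOSE = 3`` and ``BINARY = 4`` appear in the diff; the numbering of ``PLAIN_TEXT`` and ``JSON`` is assumed)::

    from enum import Enum

    class OutputType(Enum):
        PLAIN_TEXT = 1  # assumed value
        JSON = 2        # assumed value
        VERBOSE = 3
        BINARY = 4

    cli_output = 2  # integer selected by the CLI layer
    print(cli_output == OutputType.JSON)        # False: int vs Enum member
    print(cli_output == OutputType.JSON.value)  # True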
import os +from typing import Tuple, Dict, List from scar.providers.aws import GenericClient import scar.exceptions as excp import scar.logger as logger from scar.utils import FileUtils -from scar.providers.aws.properties import S3Properties + + +def get_bucket_and_folders(storage_path: str) -> Tuple: + output_bucket = storage_path + output_folders = "" + output_path = storage_path.split("/", 1) + if len(output_path) > 1: + # There are folders defined + output_bucket = output_path[0] + output_folders = output_path[1] + return (output_bucket, output_folders) class S3(GenericClient): - def __init__(self, aws_properties): - super().__init__(aws_properties) - if hasattr(self.aws, 's3'): - if type(self.aws.s3) is dict: - self.aws.s3 = S3Properties(self.aws.s3) - self._initialize_properties() - - def _initialize_properties(self): - if not hasattr(self.aws.s3, "input_folder"): - self.aws.s3.input_folder = '' - if hasattr(self.aws.lambdaf, "name"): - self.aws.s3.input_folder = "{0}/input/".format(self.aws.lambdaf.name) - elif not self.aws.s3.input_folder.endswith("/"): - self.aws.s3.input_folder = "{0}/".format(self.aws.s3.input_folder) + def __init__(self, resources_info): + super().__init__(resources_info.get('s3')) + self.resources_info = resources_info @excp.exception(logger) - def create_bucket(self, bucket_name): + def create_bucket(self, bucket_name) -> None: if not self.client.find_bucket(bucket_name): self.client.create_bucket(bucket_name) - def create_output_bucket(self): - self.create_bucket(self.aws.s3.output_bucket) - @excp.exception(logger) - def add_bucket_folder(self): - if self.aws.s3.input_folder: - self.upload_file(folder_name=self.aws.s3.input_folder) - - def create_input_bucket(self, create_input_folder=False): - self.create_bucket(self.aws.s3.input_bucket) - if create_input_folder: - self.add_bucket_folder() - - def set_input_bucket_notification(self): + def add_bucket_folder(self, bucket: str, folders: str) -> None: + if not self.client.is_folder(bucket, folders): + self.upload_file(bucket, folder_name=folders) + + def create_bucket_and_folders(self, storage_path: str) -> Tuple: + bucket, folders = get_bucket_and_folders(storage_path) + self.create_bucket(bucket) + if folders: + self.add_bucket_folder(bucket, folders) + return bucket, folders + + def set_input_bucket_notification(self, bucket_name: str, folders: str) -> None: # First check that the function doesn't have other configurations - bucket_conf = self.client.get_notification_configuration(self.aws.s3.input_bucket) - trigger_conf = self.get_trigger_configuration() + bucket_conf = self.client.get_notification_configuration(bucket_name) + trigger_conf = self.get_trigger_configuration(folders) lambda_conf = [trigger_conf] if "LambdaFunctionConfigurations" in bucket_conf: lambda_conf = bucket_conf["LambdaFunctionConfigurations"] lambda_conf.append(trigger_conf) - notification = { "LambdaFunctionConfigurations": lambda_conf } - self.client.put_notification_configuration(self.aws.s3.input_bucket, notification) + notification = {"LambdaFunctionConfigurations": lambda_conf} + self.client.put_notification_configuration(bucket_name, notification) - def delete_bucket_notification(self, bucket_name, function_arn): + def delete_bucket_notification(self, bucket_name): bucket_conf = self.client.get_notification_configuration(bucket_name) if bucket_conf and "LambdaFunctionConfigurations" in bucket_conf: lambda_conf = bucket_conf["LambdaFunctionConfigurations"] - filter_conf = [x for x in lambda_conf if x['LambdaFunctionArn'] != 
function_arn] - notification = { "LambdaFunctionConfigurations": filter_conf } + filter_conf = [x for x in lambda_conf if x['LambdaFunctionArn'] != self.resources_info.get('lambda').get('arn')] + notification = {"LambdaFunctionConfigurations": filter_conf} self.client.put_notification_configuration(bucket_name, notification) - logger.info("Bucket notifications successfully deleted") + logger.info("Bucket notifications successfully deleted.") - def get_trigger_configuration(self): - return {"LambdaFunctionArn": self.aws.lambdaf.arn, - "Events": [ "s3:ObjectCreated:*" ], - "Filter": { "Key": { "FilterRules": [{ "Name": "prefix", "Value": self.aws.s3.input_folder }]}} - } + def get_trigger_configuration(self, folders: str) -> Dict: + conf = {"LambdaFunctionArn": self.resources_info.get('lambda').get('arn'), + "Events": ["s3:ObjectCreated:*"]} + if folders != '': + conf['Filter'] = {"Key": {"FilterRules": [{"Name": "prefix", "Value": f'{folders}/'}]}} + return conf def get_file_key(self, folder_name=None, file_path=None, file_key=None): if file_key: @@ -94,8 +94,8 @@ def get_file_key(self, folder_name=None, file_path=None, file_key=None): return file_key @excp.exception(logger) - def upload_file(self, folder_name=None, file_path=None, file_key=None): - kwargs = {'Bucket' : self.aws.s3.input_bucket} + def upload_file(self, bucket: str, folder_name: str=None, file_path: str=None, file_key: str=None) -> None: + kwargs = {'Bucket': bucket} kwargs['Key'] = self.get_file_key(folder_name, file_path, file_key) if file_path: try: @@ -103,34 +103,48 @@ def upload_file(self, folder_name=None, file_path=None, file_key=None): except FileNotFoundError: raise excp.UploadFileNotFoundError(file_path=file_path) if folder_name and not file_path: - logger.info("Folder '{0}' created in bucket '{1}'".format(kwargs['Key'], kwargs['Bucket'])) + kwargs['ContentType'] = 'application/x-directory' + logger.info(f"Folder '{kwargs['Key']}' created in bucket '{kwargs['Bucket']}'.") else: - logger.info("Uploading file '{0}' to bucket '{1}' with key '{2}'".format(file_path, kwargs['Bucket'], kwargs['Key'])) + logger.info(f"Uploading file '{file_path}' to bucket '{kwargs['Bucket']}' with key '{kwargs['Key']}'.") self.client.upload_file(**kwargs) @excp.exception(logger) - def get_bucket_file_list(self): - bucket_name = self.aws.s3.input_bucket + def get_bucket_file_list(self, storage: Dict=None): + files = [] + if storage: + files = self._list_storage_files(storage) + else: + for storage_info in self.resources_info.get('lambda').get('input'): + if storage_info.get('storage_provider') == 's3': + files.extend(self._list_storage_files(storage_info)) + return files + + def _list_storage_files(self, storage: Dict) -> List: + files = [] + bucket_name, folder_path = get_bucket_and_folders(storage.get('path')) if self.client.find_bucket(bucket_name): kwargs = {"Bucket" : bucket_name} - if hasattr(self.aws.s3, "input_folder") and self.aws.s3.input_folder: - kwargs["Prefix"] = self.aws.s3.input_folder - return self.client.list_files(**kwargs) + if folder_path: + kwargs["Prefix"] = folder_path + files = (self.client.list_files(**kwargs)) else: raise excp.BucketNotFoundError(bucket_name=bucket_name) + return files - def get_s3_event(self, s3_file_key): - return {"Records": [{"eventSource": "aws:s3", - "s3" : {"bucket" : {"name": self.aws.s3.input_bucket, - "arn": f'arn:aws:s3:::{self.aws.s3.input_bucket}'}, - "object" : {"key": s3_file_key}}}]} + def get_s3_event(self, bucket_name, file_key): + event = 
self.resources_info.get("s3").get("event") + event['Records'][0]['s3']['bucket']['name'] = bucket_name + event['Records'][0]['s3']['bucket']['arn'] = event['Records'][0]['s3']['bucket']['arn'].format(bucket_name=bucket_name) + event['Records'][0]['s3']['object']['key'] = file_key + return event - def get_s3_event_list(self, s3_file_keys): - return [self.get_s3_event(s3_key) for s3_key in s3_file_keys] + def get_s3_event_list(self, bucket_name, file_keys): + return [self.get_s3_event(bucket_name, file_key) for file_key in file_keys] def download_file(self, bucket_name, file_key, file_path): kwargs = {'Bucket' : bucket_name, 'Key' : file_key} - logger.info("Downloading file '{0}' from bucket '{1}' in path '{2}'".format(file_key, bucket_name, file_path)) + logger.info(f"Downloading file '{file_key}' from bucket '{bucket_name}' in path '{file_path}'.") with open(file_path, 'wb') as file: kwargs['Fileobj'] = file self.client.download_file(**kwargs) diff --git a/scar/providers/aws/udocker.py b/scar/providers/aws/udocker.py index 9a1671bc..dbfd75d1 100644 --- a/scar/providers/aws/udocker.py +++ b/scar/providers/aws/udocker.py @@ -13,7 +13,7 @@ # limitations under the License. from zipfile import ZipFile -from scar.utils import FileUtils, SysUtils +from scar.utils import FileUtils, SysUtils, StrUtils def _extract_udocker_zip(supervisor_zip_path) -> None: @@ -29,76 +29,46 @@ def _extract_udocker_zip(supervisor_zip_path) -> None: class Udocker(): - def __init__(self, aws_properties, function_tmp_folder, supervisor_zip_path): - self.aws = aws_properties - self.function_tmp_folder = function_tmp_folder - self.udocker_dir = FileUtils.join_paths(self.function_tmp_folder, "udocker") - self.udocker_dir_orig = "" - self._initialize_udocker(supervisor_zip_path) + _CONTAINER_NAME = "udocker_container" - def _initialize_udocker(self, supervisor_zip_path): - self.udocker_code = FileUtils.join_paths(self.udocker_dir, "udocker.py") - self.udocker_exec = ['python3', self.udocker_code] + def __init__(self, resources_info: str, tmp_payload_folder_path: str, supervisor_zip_path: str): + self.resources_info = resources_info + self._tmp_payload_folder_path = tmp_payload_folder_path + self._udocker_dir = FileUtils.join_paths(self._tmp_payload_folder_path, "udocker") + self._udocker_dir_orig = "" + self._udocker_code = FileUtils.join_paths(self._udocker_dir, "udocker.py") + self._udocker_exec = ['python3', self._udocker_code] self._install_udocker(supervisor_zip_path) - def _install_udocker(self, supervisor_zip_path): + def _install_udocker(self, supervisor_zip_path: str) -> None: udocker_zip_path = _extract_udocker_zip(supervisor_zip_path) with ZipFile(udocker_zip_path) as thezip: - thezip.extractall(self.function_tmp_folder) + thezip.extractall(self._tmp_payload_folder_path) - def save_tmp_udocker_env(self): + def _save_tmp_udocker_env(self): # Avoid override global variables if SysUtils.is_variable_in_environment("UDOCKER_DIR"): - self.udocker_dir_orig = SysUtils.get_environment_variable("UDOCKER_DIR") + self._udocker_dir_orig = SysUtils.get_environment_variable("UDOCKER_DIR") # Set temporal global vars - SysUtils.set_environment_variable("UDOCKER_DIR", self.udocker_dir) + SysUtils.set_environment_variable("UDOCKER_DIR", self._udocker_dir) - def restore_udocker_env(self): - if self.udocker_dir_orig: - SysUtils.set_environment_variable("UDOCKER_DIR", self.udocker_dir_orig) + def _restore_udocker_env(self): + if self._udocker_dir_orig: + SysUtils.set_environment_variable("UDOCKER_DIR", self._udocker_dir_orig) else: 
SysUtils.delete_environment_variable("UDOCKER_DIR") def _set_udocker_local_registry(self): - self.aws.lambdaf.environment['Variables']['UDOCKER_REPOS'] = '/var/task/udocker/repos/' - self.aws.lambdaf.environment['Variables']['UDOCKER_LAYERS'] = '/var/task/udocker/layers/' + self.resources_info['lambda']['environment']['Variables']['UDOCKER_REPOS'] = '/var/task/udocker/repos/' + self.resources_info['lambda']['environment']['Variables']['UDOCKER_LAYERS'] = '/var/task/udocker/layers/' - def _create_udocker_container(self): - """Check if the container fits in the limits of the deployment.""" - if hasattr(self.aws, "s3") and hasattr(self.aws.s3, "deployment_bucket"): - self._validate_container_size(self.aws.lambdaf.max_s3_payload_size) - else: - self._validate_container_size(self.aws.lambdaf.max_payload_size) - - def _validate_container_size(self, max_payload_size): - if FileUtils.get_tree_size(self.udocker_dir) < (max_payload_size / 2): - ucmd = self.udocker_exec + ["create", "--name=lambda_cont", self.aws.lambdaf.image] - SysUtils.execute_command_with_msg(ucmd, cli_msg="Creating container structure") - - elif FileUtils.get_tree_size(self.udocker_dir) > max_payload_size: - FileUtils.delete_folder(FileUtils.join_paths(self.udocker_dir, "containers")) - - else: - self.aws.lambdaf.environment['Variables']['UDOCKER_LAYERS'] = \ - '/var/task/udocker/containers/' - - def download_udocker_image(self): - self.save_tmp_udocker_env() - SysUtils.execute_command_with_msg(self.udocker_exec + ["pull", self.aws.lambdaf.image], - cli_msg="Downloading container image") - self._create_udocker_container() - self._set_udocker_local_registry() - self.restore_udocker_env() def prepare_udocker_image(self): - self.save_tmp_udocker_env() - image_path = FileUtils.join_paths(FileUtils.get_tmp_dir(), "udocker_image.tar.gz") - FileUtils.copy_file(self.aws.lambdaf.image_file, image_path) - cmd_out = SysUtils.execute_command_with_msg(self.udocker_exec + ["load", "-i", image_path], + self._save_tmp_udocker_env() + cmd_out = SysUtils.execute_command_with_msg(self._udocker_exec + ["load", "-i", + self.resources_info.get('lambda').get('container').get('image_file')], cli_msg="Loading image file") # Get the image name from the command output - self.aws.lambdaf.image = cmd_out.split('\n')[1] - self._create_udocker_container() - self.aws.lambdaf.environment['Variables']['IMAGE_ID'] = self.aws.lambdaf.image + self.resources_info['lambda']['container']['image'] = cmd_out.split('\n')[1] self._set_udocker_local_registry() - self.restore_udocker_env() + self._restore_udocker_env() diff --git a/scar/providers/aws/validators.py b/scar/providers/aws/validators.py index 89f4c7c9..31c8f8bf 100644 --- a/scar/providers/aws/validators.py +++ b/scar/providers/aws/validators.py @@ -15,7 +15,6 @@ from scar.exceptions import ValidatorError, S3CodeSizeError, \ FunctionCodeSizeError, InvocationPayloadError -from scar.validator import GenericValidator from scar.utils import FileUtils, StrUtils VALID_LAMBDA_NAME_REGEX = (r"(arn:(aws[a-zA-Z-]*)?:lambda:)?([a-z]{2}(-gov)?-[a-z]+-\d{1}:)?(" @@ -26,18 +25,19 @@ MAX_POST_BODY_SIZE_ASYNC = KB * 95 -class AWSValidator(GenericValidator): +class AWSValidator(): """Class with methods to validate AWS properties.""" - @classmethod + @staticmethod def validate_kwargs(cls, **kwargs): - prov_args = kwargs['aws'] - if 'iam' in prov_args: - cls.validate_iam(prov_args['iam']) - if 'lambda' in prov_args: - cls.validate_lambda(prov_args['lambda']) - if 'batch' in prov_args: - cls.validate_batch(prov_args['batch']) + 
aws_functions = kwargs.get('functions', {}).get('aws', {}) + for function in aws_functions: + if 'iam' in function: + cls.validate_iam(function['iam']) + if 'lambda' in function: + cls.validate_lambda(function['lambda']) + if 'batch' in function: + cls.validate_batch(function['batch']) @staticmethod def validate_iam(iam_properties): @@ -48,7 +48,7 @@ def validate_iam(iam_properties): parameter_value=iam_properties, error_msg=error_msg) - @classmethod + @staticmethod def validate_lambda(cls, lambda_properties): if 'name' in lambda_properties: cls.validate_function_name(lambda_properties['name']) @@ -57,7 +57,7 @@ def validate_lambda(cls, lambda_properties): if 'time' in lambda_properties: cls.validate_time(lambda_properties['time']) - @classmethod + @staticmethod def validate_batch(cls, batch_properties): if 'vcpus' in batch_properties: cls.validate_batch_vcpus(batch_properties['vcpus']) diff --git a/scar/scarcli.py b/scar/scarcli.py index c250c786..97efb23b 100755 --- a/scar/scarcli.py +++ b/scar/scarcli.py @@ -17,76 +17,52 @@ import sys sys.path.append('.') -from scar.cmdtemplate import Commands from scar.parser.cfgfile import ConfigFileParser from scar.parser.cli import CommandParser -from scar.parser.yaml import YamlParser from scar.providers.aws.controller import AWS +from scar.utils import FileUtils +import scar.parser.fdl as fdl import scar.exceptions as excp import scar.logger as logger -from scar.utils import DataTypesUtils -class ScarCLI(Commands): - - def __init__(self): - self.cloud_provider = AWS() - - def init(self): - self.cloud_provider.init() - - def invoke(self): - self.cloud_provider.invoke() - - def run(self): - self.cloud_provider.run() - - def update(self): - self.cloud_provider.update() - - def ls(self): - self.cloud_provider.ls() - - def rm(self): - self.cloud_provider.rm() - - def log(self): - self.cloud_provider.log() - - def put(self): - self.cloud_provider.put() - - def get(self): - self.cloud_provider.get() - - @excp.exception(logger) - def parse_arguments(self): - """ - Merge the scar.conf parameters, the cmd parameters and the yaml - file parameters in a single dictionary. - - The precedence of parameters is CMD >> YAML >> SCAR.CONF - That is, the CMD parameter will override any other configuration, - and the YAML parameters will override the SCAR.CONF settings - """ - merged_args = ConfigFileParser().get_properties() - cmd_args = CommandParser(self).parse_arguments() - if 'conf_file' in cmd_args['scar'] and cmd_args['scar']['conf_file']: - yaml_args = YamlParser(cmd_args['scar']).parse_arguments() - merged_args = DataTypesUtils.merge_dicts(yaml_args, merged_args) - merged_args = DataTypesUtils.merge_dicts(cmd_args, merged_args) - self.cloud_provider.parse_arguments(**merged_args) - merged_args['scar']['func']() +@excp.exception(logger) +def parse_arguments(): + """ + Merge the scar.conf parameters, the cmd parameters and the yaml + file parameters in a single dictionary. 
+ + The precedence of parameters is CMD >> YAML >> SCAR.CONF + That is, the CMD parameter will override any other configuration, + and the YAML parameters will override the SCAR.CONF settings + """ + config_args = ConfigFileParser().get_properties() + func_call, cmd_args = CommandParser().parse_arguments() + if 'conf_file' in cmd_args['scar'] and cmd_args['scar']['conf_file']: + yaml_args = FileUtils.load_yaml(cmd_args['scar']['conf_file']) + # YAML >> SCAR.CONF + merged_args = fdl.merge_conf(config_args, yaml_args) + merged_args = fdl.merge_cmd_yaml(cmd_args, merged_args) + else: + # CMD >> SCAR.CONF + merged_args = fdl.merge_conf(config_args, cmd_args) + #self.cloud_provider.parse_arguments(merged_args) + FileUtils.create_tmp_config_file(merged_args) + return func_call def main(): logger.init_execution_trace() try: - ScarCLI().parse_arguments() + func_call = parse_arguments() + # Default provider + # If more providers, analyze the arguments and build the required one + AWS(func_call) logger.end_execution_trace() except Exception as excp: print(excp) logger.exception(excp) logger.end_execution_trace_with_errors() + if __name__ == "__main__": main() diff --git a/scar/utils.py b/scar/utils.py index a6a82ed8..cb38fbe4 100644 --- a/scar/utils.py +++ b/scar/utils.py @@ -23,15 +23,18 @@ import tempfile import uuid import sys +from zipfile import ZipFile +from io import BytesIO from typing import Optional, Dict, List, Generator, Union, Any from distutils import dir_util from packaging import version +import yaml import scar.logger as logger import scar.http.request as request -from scar.exceptions import GitHubTagNotFoundError +from scar.exceptions import GitHubTagNotFoundError, YamlFileNotFoundError + +COMMANDS = ['scar-config'] -GITHUB_USER = 'grycap' -GITHUB_SUPERVISOR_PROJECT = 'faas-supervisor' def lazy_property(func): # Skipped type hinting: https://github.com/python/mypy/issues/3157 @@ -74,26 +77,14 @@ def delete_environment_variable(variable: str) -> None: del os.environ[variable] @staticmethod - def execute_command_with_msg(command: List[str], cmd_wd: Optional[str] = None, - cli_msg: str = '') -> str: + def execute_command_with_msg(command: List[str], cmd_wd: Optional[str]=None, + cli_msg: str='') -> str: """Execute the specified command and return the result.""" cmd_out = subprocess.check_output(command, cwd=cmd_wd).decode('utf-8') logger.debug(cmd_out) logger.info(cli_msg) return cmd_out[:-1] - @staticmethod - def get_filtered_env_vars(key_filter: str) -> Dict: - """Returns the global variables that start with the - key_filter provided and removes the filter used.""" - size = len(key_filter) - env_vars = {} - for key, val in os.environ.items(): - # Find global variables with the specified prefix - if key.startswith(key_filter): - env_vars[key[size:]] = val - return env_vars - @staticmethod def get_user_home_path() -> str: """Returns the path of the current user's home.""" @@ -115,14 +106,29 @@ def merge_dicts(dict1: Dict, dict2: Dict) -> Dict: 'dict2' has precedence over 'dict1'.""" for key, val in dict2.items(): if val is not None: - if key not in dict1: - dict1[key] = val - elif isinstance(val, dict): + if isinstance(val, dict) and key in dict1: dict1[key] = DataTypesUtils.merge_dicts(dict1[key], val) - elif isinstance(val, list): + elif isinstance(val, list) and key in dict1: dict1[key] += val + else: + dict1[key] = val return dict1 + @staticmethod + def merge_dicts_with_copy(dict1: Dict, dict2: Dict) -> Dict: + """Merge 'dict1' and 'dict2' dicts into a new Dict. 
+ 'dict2' has precedence over 'dict1'.""" + result = dict1.copy() + for key, val in dict2.items(): + if val is not None: + if isinstance(val, dict) and key in result: + result[key] = DataTypesUtils.merge_dicts_with_copy(result[key], val) + elif isinstance(val, list) and key in result: + result[key] += val + else: + result[key] = val + return result + @staticmethod def divide_list_in_chunks(elements: List, chunk_size: int) -> Generator[List, None, None]: """Yield successive n-sized chunks from th elements list.""" @@ -187,6 +193,12 @@ def create_tmp_dir() -> tempfile.TemporaryDirectory: When the context is finished, the folder is automatically deleted.""" return tempfile.TemporaryDirectory() + @staticmethod + def create_tmp_file(**kwargs) -> tempfile.NamedTemporaryFile: + """Creates a directory in the temporal folder of the system. + When the context is finished, the folder is automatically deleted.""" + return tempfile.NamedTemporaryFile(**kwargs) + @staticmethod def get_tree_size(path: str) -> int: """Return total size of files in given path and subdirs.""" @@ -216,7 +228,7 @@ def get_file_size(file_path: str) -> int: @staticmethod def create_file_with_content(path: str, content: Optional[Union[str, bytes]], - mode: str = 'w') -> None: + mode: str='w') -> None: """Creates a new file with the passed content. If the content is a dictionary, first is converted to a string.""" with open(path, mode) as fwc: @@ -225,7 +237,7 @@ def create_file_with_content(path: str, fwc.write(content) @staticmethod - def read_file(file_path: str, mode: str = 'r') -> Optional[Union[str, bytes]]: + def read_file(file_path: str, mode: str='r') -> Optional[Union[str, bytes]]: """Reads the whole specified file and returns the content.""" with open(file_path, mode) as content_file: return content_file.read() @@ -256,7 +268,7 @@ def extract_tar_gz(tar_path: str, destination_path: str) -> None: tar.extractall(path=destination_path) @staticmethod - def unzip_folder(zip_path: str, folder_where_unzip_path: str, msg: str = '') -> None: + def unzip_folder(zip_path: str, folder_where_unzip_path: str, msg: str='') -> None: """Must use the unzip binary to preserve the file properties and the symlinks.""" zip_exe = '/usr/bin/unzip' SysUtils.execute_command_with_msg([zip_exe, zip_path], @@ -264,7 +276,7 @@ def unzip_folder(zip_path: str, folder_where_unzip_path: str, msg: str = '') -> cli_msg=msg) @staticmethod - def zip_folder(zip_path: str, folder_to_zip_path: str, msg: str = '') -> None: + def zip_folder(zip_path: str, folder_to_zip_path: str, msg: str='') -> None: """Must use the zip binary to preserve the file properties and the symlinks.""" zip_exe = '/usr/bin/zip' SysUtils.execute_command_with_msg([zip_exe, '-r9y', zip_path, '.'], @@ -272,10 +284,43 @@ def zip_folder(zip_path: str, folder_to_zip_path: str, msg: str = '') -> None: cli_msg=msg) @staticmethod - def is_file(file_path): + def is_file(file_path: str): """Test whether a path is a regular file.""" return os.path.isfile(file_path) + @staticmethod + def load_yaml(file_path: str) -> Dict: + """Returns the content of a YAML file as a Dict.""" + if os.path.isfile(file_path): + with open(file_path) as cfg_file: + return yaml.safe_load(cfg_file) + else: + raise YamlFileNotFoundError(file_path=file_path) + + @staticmethod + def write_yaml(file_path: str, content: Dict) -> None: + with open(file_path, 'w') as cfg_file: + yaml.safe_dump(content, cfg_file) + + @staticmethod + def create_tmp_config_file(cfg_args): + cfg_path = 
FileUtils.join_paths(SysUtils.get_user_home_path(), ".scar", "scar_tmp.yaml") + os.environ['SCAR_TMP_CFG'] = cfg_path + FileUtils.write_yaml(cfg_path, cfg_args) + + @staticmethod + def load_tmp_config_file(): + return FileUtils.load_yaml(os.environ['SCAR_TMP_CFG']) + + @staticmethod + def get_file_name(file_path: str) -> str: + return os.path.basename(file_path) + + @staticmethod + def extract_zip_from_url(url: str, dest_path: str) -> None: + with ZipFile(BytesIO(url)) as thezip: + thezip.extractall(dest_path) + class StrUtils: """Common methods for string management.""" @@ -303,12 +348,12 @@ def utf8_to_base64_string(value: str) -> str: """Encode a 'utf-8' string using Base64 and return the encoded value as a string.""" return StrUtils.encode_base64(bytes(value, 'utf-8')).decode('utf-8') - + @staticmethod def bytes_to_base64str(value, encoding='utf-8') -> str: """Encode a 'utf-8' string using Base64 and return the encoded value as a string.""" - return StrUtils.encode_base64(value).decode(encoding) + return StrUtils.encode_base64(value).decode(encoding) @staticmethod def dict_to_base64_string(value: Dict) -> str: @@ -358,14 +403,17 @@ def get_latest_release(user: str, project: str) -> str: def exists_release_in_repo(user: str, project: str, tag_name: str) -> bool: """Check if a tagged release exists in a repository.""" url = f'https://api.github.com/repos/{user}/{project}/releases/tags/{tag_name}' - response = json.loads(request.get_file(url)) + response = request.get_file(url) + if not response: + return False + response = json.loads(response) if 'message' in response and response['message'] == 'Not Found': return False return True @staticmethod def get_asset_url(user: str, project: str, asset_name: str, - tag_name: str = 'latest') -> Optional[str]: + tag_name: str='latest') -> Optional[str]: """Get the download asset url from the specified github tagged project.""" if tag_name == 'latest': url = f'https://api.github.com/repos/{user}/{project}/releases/latest' @@ -382,7 +430,7 @@ def get_asset_url(user: str, project: str, asset_name: str, return None @staticmethod - def get_source_code_url(user: str, project: str, tag_name: str = 'latest') -> str: + def get_source_code_url(user: str, project: str, tag_name: str='latest') -> str: """Get the source code's url from the specified github tagged project.""" source_url = "" repo_url = "" @@ -398,3 +446,52 @@ def get_source_code_url(user: str, project: str, tag_name: str = 'latest') -> st if isinstance(response, dict): source_url = response.get('zipball_url') return source_url + + +class SupervisorUtils: + """Common methods for FaaS Supervisor management. + https://github.com/grycap/faas-supervisor/""" + + _SUPERVISOR_GITHUB_REPO = 'faas-supervisor' + _SUPERVISOR_GITHUB_USER = 'grycap' + _SUPERVISOR_GITHUB_ASSET_NAME = 'supervisor' + + @classmethod + def download_supervisor(cls, supervisor_version: str, path: str) -> str: + """Downloads the FaaS Supervisor .zip package to the specified path.""" + supervisor_zip_path = FileUtils.join_paths(path, 'faas-supervisor.zip') + supervisor_zip_url = GitHubUtils.get_source_code_url( + cls._SUPERVISOR_GITHUB_USER, + cls._SUPERVISOR_GITHUB_REPO, + supervisor_version) + with open(supervisor_zip_path, "wb") as thezip: + thezip.write(request.get_file(supervisor_zip_url)) + return supervisor_zip_path + + @classmethod + def check_supervisor_version(cls, supervisor_version: str) -> str: + """Checks if the specified version exists in FaaS Supervisor's GitHub + repository. 
Returns the version if exists and 'latest' if not.""" + if GitHubUtils.exists_release_in_repo(cls._SUPERVISOR_GITHUB_USER, + cls._SUPERVISOR_GITHUB_REPO, + supervisor_version): + return supervisor_version + latest_version = SupervisorUtils.get_latest_release() + if supervisor_version != 'latest': + logger.info('Defined supervisor version does not exists.') + logger.info(f'Using latest supervisor release: \'{latest_version}\'.') + return latest_version + + @classmethod + def get_supervisor_binary_url(cls, supervisor_version: str) -> str: + """Returns the supervisor's binary download url.""" + return GitHubUtils.get_asset_url(cls._SUPERVISOR_GITHUB_USER, + cls._SUPERVISOR_GITHUB_REPO, + cls._SUPERVISOR_GITHUB_ASSET_NAME, + supervisor_version) + + @classmethod + def get_latest_release(cls) -> str: + """Returns the latest FaaS Supervisor version.""" + return GitHubUtils.get_latest_release(cls._SUPERVISOR_GITHUB_USER, + cls._SUPERVISOR_GITHUB_REPO) diff --git a/scar/validator.py b/scar/validator.py deleted file mode 100644 index 0f9a3a40..00000000 --- a/scar/validator.py +++ /dev/null @@ -1,36 +0,0 @@ -# Copyright (C) GRyCAP - I3M - UPV -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import abc - -class GenericValidator(metaclass=abc.ABCMeta): - ''' All the different cloud provider validators must inherit - from this class to ensure that the commands are defined consistently''' - - @classmethod - def validate(cls): - ''' - A decorator that wraps the passed in function and validates the dictionary parameters passed - ''' - def decorator(func): - def wrapper(*args, **kwargs): - cls.validate_kwargs(**kwargs) - return func(*args, **kwargs) - return wrapper - return decorator - - @classmethod - @abc.abstractmethod - def validate_kwargs(**kwargs): - pass diff --git a/scar/version.py b/scar/version.py index a0cd79a3..dc73d7d2 100644 --- a/scar/version.py +++ b/scar/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -__version__ = '3.2.2' \ No newline at end of file +__version__ = '4.0.0' diff --git a/test/__init__.py b/test/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/test/functional/__init__.py b/test/functional/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/test/functional/aws.py b/test/functional/aws.py index 2ed62191..4688974e 100644 --- a/test/functional/aws.py +++ b/test/functional/aws.py @@ -35,9 +35,9 @@ def create_function(self, function_name): cmd = self.get_cmd(["init","-n", function_name, "-i", "centos:7"]) cmd_out = self.execute_command(cmd) self.assertTrue("Packing udocker files" in cmd_out) - self.assertTrue("Creating function package" in cmd_out) - self.assertTrue("Function '{0}' successfully created".format(function_name) in cmd_out) - self.assertTrue("Log group '/aws/lambda/{0}' successfully created".format(function_name) in cmd_out) + self.assertTrue("Creating function package." 
in cmd_out) + self.assertTrue("Function '{0}' successfully created.".format(function_name) in cmd_out) + self.assertTrue("Log group '/aws/lambda/{0}' successfully created.".format(function_name) in cmd_out) def test_empty_ls_table(self): cmd = self.get_cmd(["ls"]) diff --git a/test/functional/parser/__init__.py b/test/functional/parser/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/test/functional/parser/fdl.py b/test/functional/parser/fdl.py new file mode 100644 index 00000000..587a9ab2 --- /dev/null +++ b/test/functional/parser/fdl.py @@ -0,0 +1,52 @@ +# Copyright (C) GRyCAP - I3M - UPV +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import unittest +from scar.parser.fdl import FDLParser + + +class Test(unittest.TestCase): + + def testFDL(self): + ''' Expected return value (except random ids): + [{'env_vars':{ + 'STORAGE_AUTH_MINIO_PASS_TMPXCMCNA9S': 'mpass', + 'STORAGE_AUTH_MINIO_USER_TMPXCMCNA9S': 'muser', + 'STORAGE_PATH_INPUT_TMPOTOWSDYE': 's3-bucket/test1', + 'STORAGE_PATH_INPUT_TMPXCMCNA9S': 'my-bucket/test', + 'STORAGE_PATH_OUTPUT_TMPOTOWSDYE': 's3-bucket/test1-output', + 'STORAGE_PATH_OUTPUT_TMPXCMCNA9S': 'my-bucket/test-output', + 'STORAGE_PATH_SUFIX_TMPOTOWSDYE': 'avi', + 'STORAGE_PATH_SUFIX_TMPXCMCNA9S': 'wav:srt'}, + 'name': 'function1'}, + {'env_vars': { + 'STORAGE_AUTH_MINIO_PASS_TMPXCMCNA9S': 'mpass', + 'STORAGE_AUTH_MINIO_USER_TMPXCMCNA9S': 'muser', + 'STORAGE_PATH_INPUT_TMPXCMCNA9S': 'my-bucket2/test', + 'STORAGE_PATH_OUTPUT_TMPXCMCNA9S': 'my-bucket2/test-output', + 'STORAGE_PATH_PREFIX_TMPXCMCNA9S': 'my_file'}, + 'name': 'function2'}] + ''' + result = FDLParser().parse_yaml('fdl.yaml') + self.assertEqual(len(result), 2) + for function in result: + self.assertTrue(('name' in function) and ('env_vars' in function)) + if function['name'] == 'function1': + self.assertEqual(len(function['env_vars'].items()), 8) + elif function['name'] == 'function2': + self.assertEqual(len(function['env_vars'].items()), 5) + + +if __name__ == "__main__": + unittest.main() diff --git a/test/functional/parser/fdl.yaml b/test/functional/parser/fdl.yaml new file mode 100644 index 00000000..6d432a70 --- /dev/null +++ b/test/functional/parser/fdl.yaml @@ -0,0 +1,39 @@ +functions: + - name: function1 + input: + - name: minio-local # Match with automatically generated id -> 123 + path: my-bucket/test #STORAGE_PATH_INTPUT_123=my-bucket/test + - name: s3-bucket # Match with automatically generated id -> 456 + path: s3-bucket/test1 #STORAGE_PATH_INTPUT_456=s3-bucket/test1 + output: + - name: minio-local # Match with automatically generated id -> 123 + path: my-bucket/test-output #STORAGE_PATH_OUTPUT_123=my-bucket/test-output + files: + sufix: #STORAGE_PATH_SUFIX_123=wav:srt + - wav + - srt + - name: s3-bucket # Match with automatically generated id -> 456 + path: s3-bucket/test1-output #STORAGE_PATH_OUTPUT_456=s3-bucket/test1-output + files: + sufix: # Possible values: 'prefix', 'sufix' + - avi #STORAGE_PATH_SUFIX_123=avi + - name: function2 + input: + - name: minio-local # Match 
with automatically generated id -> 123 + path: my-bucket2/test #STORAGE_PATH_INTPUT_123=my-bucket/test + output: + - name: minio-local # Match with automatically generated id -> 123 + path: my-bucket2/test-output #STORAGE_PATH_OUTPUT_123=my-bucket/test-output + files: + prefix: #STORAGE_PATH_SUFIX_123=wav:srt + - my_file + + +storages: + - name: minio-local # Generate random id -> 123 + type: minio # Possible values: 'minio', 's3', 'onedata' + auth: # Possible values: 'user', 'pass', 'token', 'space', 'host' + user: muser #STORAGE_AUTH_MINIO_USER_123=muser + pass: mpass #STORAGE_AUTH_MINIO_PASS_123=mpass + - name: s3-bucket # Generate random id -> 456 + type: S3 diff --git a/video-process.yaml b/video-process.yaml new file mode 100644 index 00000000..804177a8 --- /dev/null +++ b/video-process.yaml @@ -0,0 +1,35 @@ +functions: + aws: + - lambda: + name: scar-batch-ffmpeg-split + image: grycap/ffmpeg + init_script: split-video.sh + execution_mode: batch + log_level: debug + input: + - storage_provider: s3 + path: scar-ffmpeg + output: + - storage_provider: s3 + path: scar-ffmpeg/scar-batch-ffmpeg-split/video-output + - lambda: + name: scar-lambda-darknet + image: grycap/darknet + memory: 3008 + log_level: debug + init_script: yolo-sample-object-detection.sh + input: + - storage_provider: s3 + path: scar-ffmpeg/scar-batch-ffmpeg-split/video-output + output: + - storage_provider: s3 + path: scar-ffmpeg/scar-batch-ffmpeg-split/image-output + +storage_providers: + s3: + - name: s3-bucket + minio: + - name: minio-bucket + auth: + user: muser + pass: mpass
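For reference, a workflow file like the one above can be inspected programmatically in the same way the new ``FileUtils.load_yaml`` helper does it. A small sketch (the file name and printed fields are chosen for illustration only)::

    import yaml  # FileUtils.load_yaml wraps yaml.safe_load

    with open('video-process.yaml') as fdl_file:
        fdl = yaml.safe_load(fdl_file)

    for function in fdl['functions']['aws']:
        lambda_cfg = function['lambda']
        print(lambda_cfg['name'], lambda_cfg['image'], lambda_cfg['init_script'])

    # Expected output for the file above:
    #   scar-batch-ffmpeg-split grycap/ffmpeg split-video.sh
    #   scar-lambda-darknet grycap/darknet yolo-sample-object-detection.sh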