start of work for aws prices and region filtering (#6)
* start of work for aws prices and region filtering
* automation docker build
* addition of "Regions" attribute to AWS instance data, allowing for --region to be a filter string (Google already had it)
* addition of sorting based on field, and ascending order of results
* a parameter in settings to ask to use the cache only (set now to false, will be true when we can provide remote cache)
* filtering by regions for both instances AWS/Google
* moved original design notes into separate doc in docs (preparing for prettier docs at some point)
Signed-off-by: vsoch <[email protected]>
vsoch authored Dec 7, 2022
1 parent 5a52791 commit f28af61
Showing 17 changed files with 385 additions and 108 deletions.
1 change: 1 addition & 0 deletions .dockerignore
48 changes: 48 additions & 0 deletions .github/workflows/build-deploy.yaml
@@ -0,0 +1,48 @@
name: build cloud-select

on:

  # Publish packages on release
  release:
    types: [published]

  pull_request: []

  # On push to main we build and deploy images
  push:
    branches:
      - main

jobs:
  build:
    permissions:
      packages: write

    runs-on: ubuntu-latest
    name: Build
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Build Container
        run: docker build -t ghcr.io/converged-computing/cloud-select:latest .

      - name: GHCR Login
        if: (github.event_name != 'pull_request')
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Tag and Push Release Image
        if: (github.event_name == 'release')
        run: |
          tag=${GITHUB_REF#refs/tags/}
          echo "Tagging and releasing ghcr.io/converged-computing/cloud-select:${tag}"
          docker tag ghcr.io/converged-computing/cloud-select:latest ghcr.io/converged-computing/cloud-select:${tag}
          docker push ghcr.io/converged-computing/cloud-select:${tag}
      - name: Deploy
        if: (github.event_name != 'pull_request')
        run: docker push ghcr.io/converged-computing/cloud-select:latest
6 changes: 0 additions & 6 deletions .gitignore
@@ -7,13 +7,7 @@ env
build
docs/_build
release
_site
dist/
OLD
__pycache__
*.simg
*.sif
*.img
/.eggs
/modules
/views
16 changes: 16 additions & 0 deletions Dockerfile
@@ -0,0 +1,16 @@
FROM ubuntu

# docker build -t cloud-select .

LABEL MAINTAINER @vsoch
ENV PATH /opt/conda/bin:${PATH}
ENV LANG C.UTF-8
RUN apt-get update && \
apt-get install -y wget && \
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
rm Miniconda3-latest-Linux-x86_64.sh

WORKDIR /code
COPY . /code
RUN pip install -e .[all] && pip install ipython
122 changes: 86 additions & 36 deletions README.md
@@ -112,62 +112,112 @@ Ask for a specific cloud on the command line (note you can also ask via your set
$ cloud-select --cloud google instance --cpus-min 200 --cpus-max 400
```

## Design
#### Sorting

We can follow the design of the [aws selector tool](https://github.com/aws/amazon-ec2-instance-selector).
By default we sort results based on the order the solver produces them.
However, you can ask to sort your results by an attribute, e.g., here is memory:

## Details
```bash
$ cloud-select --sort-by memory instance
```

By default, when sorting on an attribute we sort in descending order, so the largest values
are at the top. You can reverse that with `--asc` for ascending, meaning we sort
from least to greatest:

```bash
$ cloud-select --asc --sort-by memory instance
```

#### Max Results

You can always change the max results (which defaults to 25):

```bash
$ cloud-select --max-results 100 instance
```

We currently sort from greatest to least. Set max results to 0 for no limit:

```bash
$ cloud-select --max-results 0 instance
```

Note that this argument comes before the instance command.

It is non-trivial to find the correct instances, or more generally, do cost comparison across clouds. A tool that can intelligently map a resource request to a set of options, and then present the user with a set of options (or a tool) can alleviate this current challenge. Importantly, we don't want to provide one answer, as the tool needs to be agnostic and not suggest a specific cloud.
#### Regions

### Implementation Idea
For regions, note that you have a default set in your settings.yml, e.g.:

The implementation needs three parts: 1. a database of contender machines that is automatically updated at some frequency, 2. a tool that can parse this database and select a subset based on a user criteria, and 3. a final mapping of each in that selection to a cost estimate (using live or active APIs).
```yaml
google:
  regions: ["us-east1", "us-west1", "us-central1"]

aws:
  regions: ["us-east-1"]
```
1. Start with APIs that can list instance types. We likely want to filter down into different groups.
2. Think about how to do a mapping across clouds. Likely this means being able to generalize (e.g., describe based on memory, size, GPU or other features, etc)
3. Save metadata about instances given the above attributes.
4. Can we generate a solve to find an optimal instance?
These regions are used in API calls to retrieve a filtered set, but not to filter that set afterwards.
You should generally be more inclusive in this set, as it is the meta set we further
filter. When appropriate, "global" is also added to find resources across regions. To
filter by region for a one-off query:
```bash
$ cloud-select instance --region east
```

As an example use case, we could create a simple web app (and underlying user interface) that allows the user to define a jobspec:
Jobspec → filter to top options → price API.
Since region names are not consistent across clouds, the value above is treated as a regular expression.
This means that to change regions:

> Why Python?
- edit settings.yml to change the global set you use
- add `--region` to a particular query to filter (within the set above).

To start, I was thinking we should use Python APIs for quick prototyping.
If you have a cache with older data (and different regions) you will want to clear it.
If we eventually store the cache by region this might be easier to manage;
however, that isn't done yet, to keep the design simple.

> Why use ASP / clingo and do a solve?
**Note** We use regions and zones a bit loosely: at a high level a region encompasses
many zones, so a specification of `regions` (as in the settings example) typically
indicates regions, but under the hood we might be filtering the specific zones.
A result might generally be labeled with "region" and include a zone name.
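
To illustrate the idea, here is a minimal sketch of a regex-based region filter. This is not the project's `filter_region` implementation; the `Regions` attribute name simply mirrors the one added to the AWS instance data in this commit:

```python
import re


def filter_by_region(instances, region):
    """
    Keep instances whose region list matches a regular expression.

    Sketch only: assumes each instance is a dict carrying a "Regions" list,
    similar to the attribute added to the AWS instance data here.
    """
    pattern = re.compile(region)
    return [
        instance
        for instance in instances
        if any(pattern.search(name) for name in instance.get("Regions", []))
    ]


# "east" matches both "us-east-1" (AWS) and "us-east1" (Google)
```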

Given matching requests for amounts, this is probably overkill - we could have iterables over a range of options filter this very easily.
The honest answer is that I thought it would be more fun to try using ASP. We can always
remove it for a simpler solution, as it does go against my better judgment to add extra dependencies that aren't needed.
That said, if the solve becomes more complex, it could be cool to have it.
#### Cache Only

To only use the cache (and skip trying to authenticate to a cloud provider),
you can set `cache_only` in your settings.yml to true:

## Previous Art
```yaml
cache_only: true
```
- AWS already has an instance selector in Go https://github.com/aws/amazon-ec2-instance-selector
- GCP has one in perl https://github.com/Cyclenerd/google-cloud-compute-machine-types
This will be the default when we are able to provide a remote cache,
as then you won't be required to have your own credentials to use the
tool out of the box!
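
For intuition, the kind of logic `cache_only` implies might look like the sketch below - prefer cached data and never fall back to an authenticated API call. The file layout and the `fetch` callable are assumptions for illustration, not the project's actual cache code:

```python
import json
import os


def load_instances(cloud_name, cache_dir, fetch=None, cache_only=False):
    """
    Sketch of cache_only semantics (not cloud-select's real implementation).

    Use cached data when present; otherwise call the (authenticated) fetch
    function, unless cache_only is set, in which case fail rather than try
    to authenticate.
    """
    cache_file = os.path.join(cache_dir, cloud_name, "instances.json")
    if os.path.exists(cache_file):
        with open(cache_file) as fd:
            return json.load(fd)
    if cache_only or fetch is None:
        raise RuntimeError(f"No cached data for {cloud_name} and cache_only is set.")
    return fetch(cloud_name)
```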
I think I'm still going to use Python for faster prototyping.
## TODO and Questions
- Are we allowed to provide a cache of instance types (e.g., automated update in GitHub?)
- should be able to set custom instances per cloud - either directly for a cloud, or generic string to match (e.g., "east")
- some logic to standardize regions (e.g., "east")
- add tests and testing workflow
- properties testing for handling min/max/numbers
- Add Docker build / automated builds
- ensure that required set of attributes for each instance are returned (e.g., name, cpu, memory)
- how to handle instances that don't have an attribute of interest? Should we unselect them?
- pretty branded documentation
- selection should have sorting ability
See our current [design document](docs/design.md) for background on the design.
- [ ] create cache of instance types and maybe prices in GitHub (e.g., automated update)
- [ ] add tests and testing workflow
- [ ] properties testing for handling min/max/numbers
- [ ] ensure that required set of attributes for each instance are returned (e.g., name, cpu, memory)
- [ ] how to handle instances that don't have an attribute of interest? Should we unselect them?
- [ ] pretty branded documentation and spell checking
- [ ] add GPU memory - available in AWS and I cannot find for GCP
- [ ] should cache be organized by region to allow easier filtering (data for AWS doesn't have that attribute)
- [ ] need to do something with costs
- [ ] test performance of using solver vs. not
### Future desires
These are either "nice to have" items or small details we can improve upon, i.e., not top priority.
- should we allow currency outside USD? Probably not for now.
- aws instance listing (based on regions) should validate regions - an invalid region simply returns no results
- could eventually support different resource types (beyond compute or types of prices, e.g., pre-emptible vs. on demand)
- add GPU memory - available in AWS, not sure about GCP
- add AWS description from metadata (similar to GCP)
- for AWS description, when appropriate convert to TB (like Google does)
Planning for minimizing cost:
14 changes: 14 additions & 0 deletions cloud_select/client/__init__.py
@@ -112,6 +112,7 @@ def get_parser():
dest="max_results",
help="Maximum results to return per cloud provider.",
type=int,
default=25,
)
parser.add_argument(
"--cloud",
@@ -121,6 +122,19 @@
action="append",
)

parser.add_argument(
"--sort-by",
dest="sort_by",
help="Sort by a result attribute.",
choices=["name", "cpus", "gpus", "memory"],
)
parser.add_argument(
"--asc",
dest="ascending",
help="Sort results ascending instead of descending (default)",
action="store_true",
default=False,
)
parser.add_argument(
"--cache-expire",
dest="cache_expire",
11 changes: 10 additions & 1 deletion cloud_select/client/instance.py
@@ -27,6 +27,10 @@ def main(args, parser, extra, subparser):
    # Update config settings on the fly
    cli.settings.update_params(args.config_params)

    # If max results is 0, set to None (no limit)
    if args.max_results == 0:
        args.max_results = None

    # Are we writing ASP to an output file?
    asp_out = None
    out = args.out
@@ -55,4 +59,9 @@
        utils.write_json(out, instances)
    else:
        t = table.Table(instances)
        t.table(title="Cloud Instances Selected")
        t.table(
            title="Cloud Instances Selected",
            sort_by=args.sort_by,
            limit=args.max_results,
            ascending=args.ascending,
        )
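
For context, the `sort_by`, `limit`, and `ascending` values above are handed to the table renderer. A rough sketch of that preparation step (assuming rows are dicts with the sorted attribute present; this is not the project's `table.Table` code):

```python
def prepare_rows(rows, sort_by=None, limit=None, ascending=False):
    """
    Sketch: sort and truncate result rows before rendering a table.

    Descending order is the default, matching the CLI behavior described
    in the README; --asc flips it.
    """
    if sort_by is not None:
        rows = sorted(rows, key=lambda row: row[sort_by], reverse=not ascending)
    if limit is not None:
        rows = rows[:limit]
    return rows


# Example: top 3 instances by memory
# prepare_rows(instances, sort_by="memory", limit=3)
```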
19 changes: 16 additions & 3 deletions cloud_select/main/client.py
@@ -71,7 +71,13 @@ def get_clouds(self, force=False, lookup=False):
        # We should always be able to get cloud classes, even without auth
        # The class knows how to parse the data types into a standard space
        for cloud_name, CloudClass in self._cloudclass.items():
            self._clouds[cloud_name] = CloudClass()

            # Regions default to settings then defaults
            cloud_settings = getattr(self.settings, cloud_name)
            self._clouds[cloud_name] = CloudClass(
                regions=cloud_settings.get("regions"),
                cache_only=self.settings.cache_only,
            )
        return self._clouds if lookup else list(self._clouds.values())

    def instances(self):
@@ -130,8 +136,6 @@ def update_from_cache(self, items, datatype):
    def instance_select(self, max_results=20, out=None, **kwargs):
        """
        Select an instance.
        We don't currently do anything with kwargs (but will eventually to filter)
        """
        # Start with already cached data
        instances = self.update_from_cache(self.instances(), "instances")
@@ -145,6 +149,11 @@
        if self.settings.disable_prices is not True:
            prices = self.update_from_cache(self.prices(), "prices")

        # Attributes that can't go into the solver
        region = kwargs.get("region")
        if "region" in kwargs:
            del kwargs["region"]

        # By here we have a lookup (by cloud) of instance groups
        # Filter down kwargs (properties) to those relevant to instances
        properties = solve.Properties(schemas.instance_properties, **kwargs)
Expand All @@ -154,6 +163,10 @@ def instance_select(self, max_results=20, out=None, **kwargs):
        # 2. filter down to desired set based on these common functions
        for cloud_name, instance_group in instances.items():

            # Do we have a request to filter by region?
            if region is not None:
                instance_group.filter_region(region)

            # Do we have prices for the cloud?
            if cloud_name in prices:
                instance_group.add_instance_prices(prices[cloud_name])