Update to the new OML Notebooks format (oracle-samples#354)
Old Notebooks are being kept for compatibility under a notebooks-classic sub-folder, while SQL, Python and R notebooks are replaced with the new OML Notebooks engine, and a new REST folder is created to accommodate the new REST APIs for OML Services.
marancibia authored May 17, 2024
1 parent 1c1bb32 commit 9146946
Showing 258 changed files with 339 additions and 42 deletions.
28 changes: 28 additions & 0 deletions machine-learning/notebooks/notebooks-classic/python/README.md
@@ -0,0 +1,28 @@
# Oracle Machine Learning for Python
Oracle Machine Learning for Python (OML4Py) supports scalable in-database data exploration and preparation using native Python syntax, scalable in-database algorithms for machine learning model building and scoring, and automated machine learning (AutoML). Users can also invoke user-defined Python functions from Python, SQL and REST APIs using database-spawned Python engines. OML4Py increases data scientist productivity and reduces solution deployment complexity.

Python is a major programming language used for data science and machine learning. OML4Py is a feature on Oracle Autonomous Database that gives Python users access to powerful in-database functionality, supporting data scientists with scalability, performance, and ease of solution deployment.

Oracle Machine Learning Notebooks is a collaborative user interface for data scientists and business and data analysts who perform machine learning in Oracle Autonomous Database.

Oracle Machine Learning Notebooks enables data scientists, citizen data scientists, and data analysts to work together to explore their data visually and develop analytical methodologies in the Autonomous Database. Oracle's high performance, parallel and scalable in-Database implementations of machine learning algorithms are exposed via SQL and PL/SQL using notebook technologies. Oracle Machine Learning enables teams to collaborate to build, assess, and deploy machine learning models, while increasing data scientist productivity. Oracle Machine Learning focuses on ease of use and simplified machine learning for data science, from preparation through deployment, all in Oracle Autonomous Database.

Based on Apache Zeppelin notebook technology, Oracle Machine Learning Notebooks provides a common platform with a single interface that can connect to multiple data sources and access multiple back-end Autonomous Database servers. Multi-user collaboration enables the same notebook document to be opened simultaneously by different users, such that changes made by one user to a notebook are instantaneously reflected to all users viewing that notebook. To support enterprise requirements for security, authentication, and auditing, Oracle Machine Learning supports privilege-based access to data, models, and notebooks, as well as being integrated with Oracle security protocols.

Key Features:

* Collaborative UI for data scientists
* Enables sharing of notebooks and templates with permissions and execution scheduling
* Access to 30+ parallel, scalable in-Database implementations of machine learning algorithms
* Python, R, SQL and PL/SQL scripting languages supported
* Enables and supports deployments of enterprise machine learning methodologies in all versions of Autonomous Database

This folder contains examples based on Oracle Machine Learning for Python (OML4Py), from data preparation and transformation to machine learning algorithms and scoring.

The label "OfficeHours_" in a file name indicates that the notebook was presented at a previous session of the [AskTOM Oracle Machine Learning Office Hours](https://asktom.oracle.com/pls/apex/asktom.search?oh=6801#sessions).

See [Oracle Machine Learning Notebooks - Get Started](https://docs.oracle.com/en/database/oracle/machine-learning/oml-notebooks/) for more information.

#### Copyright (c) 2023 Oracle Corporation and/or its affiliates.

###### [The Universal Permissive License (UPL), Version 1.0](https://oss.oracle.com/licenses/upl/)
28 changes: 28 additions & 0 deletions machine-learning/notebooks/notebooks-classic/r/README.md
@@ -0,0 +1,28 @@
# Oracle Machine Learning for R
Oracle Machine Learning for R (OML4R) supports scalable in-database data exploration and preparation using native R syntax, and scalable in-database algorithms for machine learning model building and scoring. Users can also invoke user-defined R functions from R, SQL and REST APIs using database-spawned R engines. OML4R increases data scientist productivity and reduces solution deployment complexity.

R is a major programming language used for data science and machine learning. OML4R is a feature on Oracle Autonomous Database that gives R users access to powerful in-database functionality, supporting data scientists with scalability, performance, and ease of solution deployment.

Oracle Machine Learning Notebooks is a collaborative user interface for data scientists and business and data analysts who perform machine learning in Oracle Autonomous Database.

Oracle Machine Learning Notebooks enables data scientists, citizen data scientists, and data analysts to work together to explore their data visually and develop analytical methodologies in the Autonomous Database. Oracle's high performance, parallel and scalable in-Database implementations of machine learning algorithms are exposed via SQL and PL/SQL using notebook technologies. Oracle Machine Learning enables teams to collaborate to build, assess, and deploy machine learning models, while increasing data scientist productivity. Oracle Machine Learning focuses on ease of use and simplified machine learning for data science, from preparation through deployment, all in Oracle Autonomous Database.

Based on Apache Zeppelin notebook technology, Oracle Machine Learning Notebooks provides a common platform with a single interface that can connect to multiple data sources and access multiple back-end Autonomous Database servers. Multi-user collaboration enables the same notebook document to be opened simultaneously by different users, such that changes made by one user to a notebook are instantaneously reflected to all users viewing that notebook. To support enterprise requirements for security, authentication, and auditing, Oracle Machine Learning supports privilege-based access to data, models, and notebooks, as well as being integrated with Oracle security protocols.

Key Features:

* Collaborative UI for data scientists
* Enables sharing of notebooks and templates with permissions and execution scheduling
* Access to 30+ parallel, scalable in-Database implementations of machine learning algorithms
* Python, R, SQL and PL/SQL scripting languages supported
* Enables and supports deployments of enterprise machine learning methodologies in all versions of Autonomous Database

This folder contains examples based on Oracle Machine Learning for R (OML4R), from data preparation and transformation to machine learning algorithms and scoring.

The label "OfficeHours_" in a file name indicates that the notebook was presented at a previous session of the [AskTOM Oracle Machine Learning Office Hours](https://asktom.oracle.com/pls/apex/asktom.search?oh=6801#sessions).

See [Oracle Machine Learning Notebooks - Get Started](https://docs.oracle.com/en/database/oracle/machine-learning/oml-notebooks/) for more information.

#### Copyright (c) 2023 Oracle Corporation and/or its affiliates.

###### [The Universal Permissive License (UPL), Version 1.0](https://oss.oracle.com/licenses/upl/)
File renamed without changes.
26 changes: 26 additions & 0 deletions machine-learning/notebooks/notebooks-classic/sql/README.md
@@ -0,0 +1,26 @@
# Oracle Machine Learning Notebooks
Oracle Machine Learning Notebooks is a collaborative user interface for data scientists and business and data analysts who perform machine learning in Oracle Autonomous Database.

Oracle Machine Learning Notebooks enables data scientists, citizen data scientists, and data analysts to work together to explore their data visually and develop analytical methodologies in the Autonomous Data Warehouse Cloud. Oracle's high performance, parallel and scalable in-Database implementations of machine learning algorithms are exposed via SQL and PL/SQL using notebook technologies. Oracle Machine Learning enables teams to collaborate to build, assess, and deploy machine learning models, while increasing data scientist productivity. Oracle Machine Learning focuses on ease of use and simplified machine learning for data science, from preparation through deployment, all in Oracle Autonomous Database.

Based on Apache Zeppelin notebook technology, Oracle Machine Learning Notebooks provides a common platform with a single interface that can connect to multiple data sources and access multiple back-end Autonomous Database servers. Multi-user collaboration enables the same notebook document to be opened simultaneously by different users, such that changes made by one user to a notebook are instantaneously reflected to all users viewing that notebook. To support enterprise requirements for security, authentication, and auditing, Oracle Machine Learning supports privilege-based access to data, models, and notebooks, as well as being integrated with Oracle security protocols.

Key Features:

* Collaborative UI for data scientists
* Enables sharing of notebooks and templates with permissions and execution scheduling
* Access to 30+ parallel, scalable in-Database implementations of machine learning algorithms
* SQL, PL/SQL, R and Python scripting languages supported
* Enables and supports deployments of enterprise machine learning methodologies in all versions of Autonomous Database

The examples here cover a range of functionality and methods, from data preparation and data cleansing to machine learning algorithms and scoring.

A "21c" or "23c" label in a file name means that the algorithm used in that demo is supported on that release only.

See [Oracle Machine Learning Notebooks - Get Started](https://docs.oracle.com/en/database/oracle/machine-learning/oml-notebooks/) for more information.

Last updated on: June 2023

#### Copyright (c) 2023 Oracle Corporation and/or its affiliates.

###### [The Universal Permissive License (UPL), Version 1.0](https://oss.oracle.com/licenses/upl/)

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions machine-learning/notebooks/python/OML4Py -5- AutoML.dsnb

Large diffs are not rendered by default.


@@ -0,0 +1 @@
[{"layout":null,"template":null,"templateConfig":null,"name":"OML4Py Data Cleaning Duplicates Removal","description":null,"readOnly":false,"type":"low","paragraphs":[{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":null,"title":null,"message":["%md"," "],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":true,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","# OML4Py Data Cleaning: Duplicates Removal","In this notebook, we demonstrate how to remove duplicate records using OML4Py.","","We use the customer insurance lifetime value data set which contains customer financial information, lifetime value, and whether or not the customer bought insurance. ","","The dataset `CUSTOMER_INSURANCE_LTV_PY` is generated by the `\"OML Run-me-first\"` notebook, which `MUST` be run before this notebook.","","---","","###### IMPORTANT: The `\"OML Run-me-first\"` notebook is available under the menu `Templates -> Examples` and is a pre-requisite to the current notebook.","","---","","Copyright (c) 2024 Oracle Corporation ","###### <a href=\"https://oss.oracle.com/licenses/upl/\" onclick=\"return ! 
window.open('https://oss.oracle.com/licenses/upl/');\">The Universal Permissive License (UPL), Version 1.0<\/a>","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"For more information ...","message":["%md","","* <a href=\"https://docs.oracle.com/en/cloud/paas/autonomous-data-warehouse-cloud/index.html\" target=\"_blank\">Oracle ADW Documentation<\/a>","* <a href=\"https://github.com/oracle/oracle-db-examples/tree/master/machine-learning\" target=\"_blank\">OML folder on Oracle GitHub<\/a>","* <a href=\"https://www.oracle.com/machine-learning\" target=\"_blank\">OML Web Page<\/a>","* <a href=\"https://docs.oracle.com/en/database/oracle/machine-learning/oml4py/1/mlpug/clean-data.html\" target=\"_blank\">OML4Py Data Cleaning<\/a>","","","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Import python libraries ","message":["%python","","import warnings","warnings.filterwarnings('ignore')","","import pandas as pd","import oml"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Get proxy object for CUSTOMER_INSURANCE_LTV_PY table","message":["%python","","CUST_DF = oml.sync(table = 
'CUSTOMER_INSURANCE_LTV_PY')"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Count number of unique customer IDs","message":["%python","","CUST_DF['CUSTOMER_ID'].nunique()"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":5,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Show table dimensions - note there are more rows than customer IDs","message":["%python","","CUST_DF.shape"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":7,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Remove duplicate rows","message":["%python","","CUST_DF = CUST_DF.drop_duplicates()"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Check number of unique customer IDs","message":["%python","","CUST_DF['CUSTOMER_ID'].nunique()"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":5,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Check dimension of the OML dataframe - duplicated rows 
removed","message":["%python","","CUST_DF.shape"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":7,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","## End of Script"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":true,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"}],"version":"6","snapshot":false,"tags":null}]
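
The notebook's `%python` paragraphs follow a count, deduplicate, recount pattern. As a rough local illustration only, the same steps can be sketched with plain pandas on a small in-memory frame. The sample rows and column values below are made up for the example, and pandas is standing in for the OML4Py proxy object that `oml.sync(table = 'CUSTOMER_INSURANCE_LTV_PY')` returns in the notebook, so this is an analogy under those assumptions rather than the notebook's in-database code.

```python
import pandas as pd

# Hypothetical stand-in for the CUSTOMER_INSURANCE_LTV_PY table.
# The real table is created by the "OML Run-me-first" notebook and lives
# in the database; these rows are invented purely for illustration.
cust_df = pd.DataFrame(
    {
        "CUSTOMER_ID": ["CU1", "CU2", "CU2", "CU3", "CU3"],
        "LTV": [100.0, 250.0, 250.0, 75.0, 75.0],
        "BUY_INSURANCE": ["No", "Yes", "Yes", "No", "No"],
    }
)

# Step 1: count distinct customer IDs and compare with the row count.
unique_ids_before = cust_df["CUSTOMER_ID"].nunique()
shape_before = cust_df.shape  # more rows than unique IDs means duplicates

# Step 2: drop rows that are exact duplicates across all columns.
deduped = cust_df.drop_duplicates()

# Step 3: recheck. The ID count is unchanged, and the row count now matches it.
unique_ids_after = deduped["CUSTOMER_ID"].nunique()
shape_after = deduped.shape

print("before:", unique_ids_before, "unique IDs,", shape_before)
print("after: ", unique_ids_after, "unique IDs,", shape_after)
```

In the notebook itself, `CUST_DF` is a proxy object for a database table, so the equivalent `nunique()`, `shape`, and `drop_duplicates()` calls are issued against the database rather than against local memory.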