Clean data is a prerequisite for any good model or useful analysis. When data is large in volume or poorly stored, cleaning can take up most of the time in a project. A dataset is a collection of related data stored in a structured format. Storing data in a row-column dataset makes cleaning activities more efficient.
It also ensures that the cleaned dataset can be reused for multiple analyses and ML models. In this article, we’ll walk through the steps to create a dataset in Oracle Analytics Cloud (OAC) from a pre-existing Autonomous Database connection.
Before creating a dataset, you need a connection to the database where your data resides. This connection is used to access the data and to perform cleaning and preparation activities on it.
- In the upper-right corner of the Analytics Home page, click Create and select Dataset.
- A prompt opens asking you to select the dataset’s source. There are three options: a. From a local file, b. From a Local Subject Area, and c. From a Connection. For this article, we will use an existing connection to create the dataset; see the previous article to learn how to create a connection. In this case, select Test connection as shown in the following screenshot:
- Once you have selected the connection, a new page opens, and all the schemas present in your database are displayed on the left side of the page.
- Select the schema from which you want to create the dataset. Once it is selected, the screen displays the first few rows and provides basic information such as the number of unique values present.
OAC automatically detects the data type of each column (Attribute or Measure), but it can be changed manually depending on the need.
- Click the Save button in the upper-right corner and provide a Name and Description for your dataset as shown in the following screenshot:
- Once the dataset is saved, a ‘Dataset saved successfully’ message is displayed.
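To make the preview step more concrete, here is a minimal Python sketch of the kind of profiling described above: showing the first few rows, counting unique values per column, and labelling each column as an Attribute or a Measure. This is only an illustration, not how OAC works internally. The sample rows, the `profile_columns` function, and the rule "numeric columns are measures" are all assumptions made for this example.

```python
from numbers import Number

# Hypothetical sample rows standing in for a table read through a connection.
ROWS = [
    {"order_id": 1, "region": "East", "amount": 120.5},
    {"order_id": 2, "region": "West", "amount": 80.0},
    {"order_id": 3, "region": "East", "amount": 45.25},
    {"order_id": 4, "region": "North", "amount": 80.0},
]


def profile_columns(rows, preview_size=3):
    """Return the first few rows and a per-column summary."""
    preview = rows[:preview_size]
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows if row[column] is not None]
        # Simplified assumption: numeric columns are measures,
        # everything else is an attribute.
        is_numeric = bool(values) and all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        summary[column] = {
            "unique_values": len(set(values)),
            "treat_as": "Measure" if is_numeric else "Attribute",
        }
    return preview, summary


preview, summary = profile_columns(ROWS)

# Manual override, mirroring the option to change a detected type by hand:
# an identifier is numeric but is not something you would aggregate.
summary["order_id"]["treat_as"] = "Attribute"

for column, info in summary.items():
    print(column, info)
```

The manual override at the end shows why automatic detection is only a starting point: a numeric identifier would be labelled a measure by this simple rule, even though summing it is meaningless.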
Check Dataset Creation
If the message is not displayed or an error occurs, you can check whether the dataset was saved in OAC by following these steps:
- In the top-left corner of the home page, click the hamburger menu and select the Data tab as shown in the following screenshot:
- A list of all the saved datasets will be displayed.
This dataset can now be used for further data preparation.