Welcome to SINGA-Auto’s Documentation!

What is SINGA-Auto?

SINGA-Auto is a distributed system that trains machine learning (ML) models and deploys trained models, built with ease-of-use in mind. To do so, it leverages automated machine learning (AutoML).

Application Developers and Application Users, even without any ML expertise, can:

  • Create a model training job for supported tasks, with their own datasets
  • Deploy an ensemble of trained models for inference
  • Integrate model predictions in their apps over HTTP

Model Developers can:

  • Contribute to SINGA-Auto’s pool of model templates

Check out Quick Setup to deploy/develop SINGA-Auto on your machine, and/or Quick Start to use a deployed instance of SINGA-Auto.

Index

User Guide

Quick Start

This guide assumes you have deployed your own empty instance of SINGA-Auto and you want to try a full train-inference flow as the Super Admin:

  1. Authenticating on SINGA-Auto
  2. Submitting models
  3. Uploading datasets
  4. Creating a model training job
  5. Creating a model serving job after the model training job completes
  6. Making predictions

Follow the sequence of examples below to submit the Fashion MNIST dataset for training and inference. Alternatively, refer to and run the scripted version of this quickstart at ./examples/scripts/quickstart.py.

To learn more about what else you can do on SINGA-Auto, explore the methods of singa_auto.client.Client.

Note

If you haven’t set up SINGA-Auto on your local machine, refer to Quick Setup before continuing.

Installing the client
  1. Install Python 3.6 such that the python and pip commands point to the correct installation of Python (see Installing Python)

  2. Clone the project at https://github.com/nusdbsystem/singa-auto (e.g. with Git)

  3. Within the project’s root folder, install SINGA-Auto’s client-side Python dependencies by running:

    pip install -r ./singa_auto/requirements.txt
    
Initializing the client

Example:

from singa_auto.client import Client
client = Client(admin_host='localhost', admin_port=3000) # 'localhost' can be replaced by '127.0.0.1' or other server address
client.login(email='superadmin@singaauto', password='singa_auto')

See also

singa_auto.client.Client.login()

Creating models

To create a model, you’ll need to submit a model class that conforms to the specification by singa_auto.model.BaseModel, written in a single Python file. The model’s implementation should conform to a specific task (see tasks).

Refer to the parameters of singa_auto.client.Client.create_model() for configuring how your model runs on SINGA-Auto, and refer to Model Development Guide to understand more about how to write & test models for SINGA-Auto.

Example:

client.create_model(
    name='TfFeedForward',
    task='IMAGE_CLASSIFICATION',
    model_file_path='examples/models/image_classification/TfFeedForward.py',
    model_class='TfFeedForward',
    dependencies={ 'tensorflow': '1.12.0' }
)

client.create_model(
    name='SkDt',
    task='IMAGE_CLASSIFICATION',
    model_file_path='examples/models/image_classification/SkDt.py',
    model_class='SkDt',
    dependencies={ 'scikit-learn': '0.20.0' }
)

See also

singa_auto.client.Client.create_model()

Listing available models by task

Example:

client.get_available_models(task='IMAGE_CLASSIFICATION')
# If "task" is left unspecified, the method retrieves information on all uploaded models
client.get_available_models()

Output:

[{'access_right': 'PRIVATE',
 'datetime_created': 'Mon, 17 Dec 2018 07:06:03 GMT',
 'dependencies': {'tensorflow': '1.12.0'},
 'id': '45df3f34-53d7-4fb8-a7c2-55391ea10030',
 'name': 'TfFeedForward',
 'task': 'IMAGE_CLASSIFICATION',
 'user_id': 'fb5671f1-c673-40e7-b53a-9208eb1ccc50'},
 {'access_right': 'PRIVATE',
 'datetime_created': 'Mon, 17 Dec 2018 07:06:03 GMT',
 'dependencies': {'scikit-learn': '0.20.0'},
 'id': 'd0ea96ce-478b-4167-8a84-eb36ae631235',
 'name': 'SkDt',
 'task': 'IMAGE_CLASSIFICATION',
 'user_id': 'fb5671f1-c673-40e7-b53a-9208eb1ccc50'}]

See also

singa_auto.client.Client.get_available_models()

Creating datasets

You’ll first need to convert your dataset into a format specified by one of the tasks (see tasks), and split them into two files: one for training & one for validation. After doing so, you’ll create 2 corresponding datasets on SINGA-Auto by uploading them from your filesystem.

Example (pre-processing step):

# Run this in shell
python examples/datasets/image_files/load_fashion_mnist.py

Example:

client.create_dataset(
    name='fashion_mnist_train',
    task='IMAGE_CLASSIFICATION',
    dataset_path='data/fashion_mnist_train.zip'
)

client.create_dataset(
    name='fashion_mnist_val',
    task='IMAGE_CLASSIFICATION',
    dataset_path='data/fashion_mnist_val.zip'
)

Output:

{'id': 'ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
'name': 'fashion_mnist_train',
'size_bytes': 36702897,
'task': 'IMAGE_CLASSIFICATION'}

{'id': '7e9a2f8a-c61d-4365-ae4a-601e90892b88',
'name': 'fashion_mnist_val',
'size_bytes': 6116386,
'task': 'IMAGE_CLASSIFICATION'}

See also

singa_auto.client.Client.create_dataset()

Note

The code that preprocesses the original Fashion MNIST dataset is available at ./examples/datasets/image_files/load_mnist_format.py.

Creating a train job

To create a model training job, you’ll specify the train & validation datasets by their IDs, together with your application’s name and its associated task.

After creating a train job, you can monitor it on SINGA-Auto Web Admin (see Using SINGA-Auto’s Web Admin).

Refer to the parameters of singa_auto.client.Client.create_train_job() for configuring how your train job runs on SINGA-Auto, such as enabling GPU usage & specifying which models to use.

Example:

client.create_train_job(
    app='fashion_mnist_app',
    task='IMAGE_CLASSIFICATION',
    train_dataset_id='ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
    val_dataset_id='7e9a2f8a-c61d-4365-ae4a-601e90892b88',
    budget={ 'MODEL_TRIAL_COUNT': 5 },
    model_ids='["652db9f7-d23d-4b79-945b-a56446ceff33"]'
)
# Omitting GPU_COUNT is the same as setting GPU_COUNT to 0, which means training will run on CPU only
# MODEL_TRIAL_COUNT is the number of trials; the minimum MODEL_TRIAL_COUNT is 1 for a valid training job
# TIME_HOURS is the assigned training time limit, in hours
# train_args={} can be left empty or unspecified if not in use
client.create_train_job(
    app='fashion_mnist_app',
    task='IMAGE_CLASSIFICATION',
    train_dataset_id='ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
    val_dataset_id='7e9a2f8a-c61d-4365-ae4a-601e90892b88',
    budget={'TIME_HOURS': 0.01,
            'GPU_COUNT': 0,
            'MODEL_TRIAL_COUNT': 1},
    model_ids='["652db9f7-d23d-4b79-945b-a56446ceff33"]',
    train_args={}
)

Output:

{'app': 'fashion_mnist_app',
'app_version': 1,
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

Using distributed training

For distributed training, refer to https://pytorch.org/docs/stable/distributed.html.

Output:

{'app': 'DistMinist',
'app_version': 1,
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

See also

singa_auto.client.Client.create_train_job()

Listing train jobs

Example:

client.get_train_jobs_of_app(app='fashion_mnist_app')

Output:

[{'app': 'fashion_mnist_app',
'app_version': 1,
'budget': {'MODEL_TRIAL_COUNT': 5},
'datetime_started': 'Mon, 17 Dec 2018 07:08:05 GMT',
'datetime_stopped': None,
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8',
'status': 'RUNNING',
'task': 'IMAGE_CLASSIFICATION',
'val_dataset_id': '7e9a2f8a-c61d-4365-ae4a-601e90892b88',
'train_dataset_id': 'ecf87d2f-6893-4e4b-8ed9-1d9454af9763'}]

See also

singa_auto.client.Client.get_train_jobs_of_app()

Creating an inference job with the latest train job

To create a model serving job, you’ll have to wait for your train job to stop. Then, you’ll submit the app name associated with the train job (which should have a status of STOPPED). The inference job will be created from the best trials of that train job.

Example:

client.create_inference_job(app='fashion_mnist_app')
# Or with more details specified, such as the number of GPUs ('GPU_COUNT')
client.create_inference_job(app='fashion_mnist_app', app_version=1, budget={'GPU_COUNT': 1} )

Output:

{'app': 'fashion_mnist_app',
'app_version': 1,
'id': '0477d03c-d312-48c5-8612-f9b37b368949',
'predictor_host': '127.0.0.1:30001',
'train_job_id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

See also

singa_auto.client.Client.create_inference_job()

Listing inference jobs

Example:

client.get_inference_jobs_of_app(app='fashion_mnist_app')

Output:

{'app': 'fashion_mnist_app',
  'app_version': 1,
  'datetime_started': 'Mon, 17 Dec 2018 07:15:12 GMT',
  'datetime_stopped': None,
  'id': '0477d03c-d312-48c5-8612-f9b37b368949',
  'predictor_host': '127.0.0.1:30000',
  'status': 'RUNNING',
  'train_job_id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

See also

singa_auto.client.Client.get_inference_jobs_of_app()

Making predictions

Send a POST /predict to predictor_host with a body of the following format in JSON:

{
    "query": <query>
}

…where the format of <query> depends on the associated task (see tasks).

The body of the response will be of the following format in JSON:

{
    "prediction": <prediction>
}

…where the format of <prediction> depends on the associated task.

Example:

If predictor_host is 127.0.0.1:30000, run the following in Python:

predictor_host = '127.0.0.1:30000'
query_path = 'examples/data/image_classification/fashion_mnist_test_1.png'

# Load query image as 3D list of pixels
from singa_auto.model import utils
[query] = utils.dataset.load_images([query_path]).tolist()

# Make request to predictor
import requests
import json
res = requests.post('http://{}/predict'.format(predictor_host), json={ 'query': query })
print(res.json())

Output:

{'prediction': [0.9364003576825639, 1.016065009906697e-08, 0.0027604885399341583, 0.00014587241457775235, 6.018594376655528e-06, 1.042887332047826e-09, 0.060679372351310566, 2.024707311532037e-11, 7.901770004536957e-06, 1.5299328026685544e-08],
'predictions': []}

Prediction for QUESTION_ANSWERING

The query question should be uploaded in the following format:

data = {"questions": ["How long individuals are contagious?"]}
res = requests.post('http://{}/predict'.format(predictor_host), json=data)

To print out the prediction result, use res.text:

print(res.text)

Prediction for SPEECH_RECOGNITION

The query data is passed using the following steps:

data = ['data/ldc93s1/ldc93s1/LDC93S1.wav']
data = json.dumps(data)
res = requests.post('http://{}/predict'.format(predictor_host), json=data[0])

To print out the prediction result, use res.text:

print(res.text)

If the SINGA-Auto instance is deployed with Kubernetes, all inference jobs are served at the default Ingress port 3005 in the format <host>:3005/<app>, where <host> is the host name of the SINGA-Auto instance, and <app> is the name of the application provided when submitting train jobs.

Stopping a running inference job

Example:

client.stop_inference_job(app='fashion_mnist_app')

See also

singa_auto.client.Client.stop_inference_job()

Quick Start (Model Developers)

As a Model Developer, you can manage models, datasets, train jobs & inference jobs on SINGA-Auto. This guide only highlights the key methods available to manage models.

To learn about how to manage datasets, train jobs & inference jobs, go to Quick Start (Application Developers).

This guide assumes that you have access to a running instance of SINGA-Auto Admin at <singa_auto_host>:<admin_port> and SINGA-Auto Web Admin at <singa_auto_host>:<web_admin_port>.

To learn more about what else you can do on SINGA-Auto, explore the methods of singa_auto.client.Client

Installing the client
  1. Install Python 3.6 such that the python and pip commands point to the correct installation of Python (see Installing Python)

  2. Clone the project at https://github.com/nusdbsystem/singa-auto (e.g. with Git)

  3. Within the project’s root folder, install SINGA-Auto’s client-side Python dependencies by running:

    pip install -r ./singa_auto/requirements.txt
    
Initializing the client

Example:

from singa_auto.client import Client
client = Client(admin_host='localhost', admin_port=3000)
client.login(email='superadmin@singaauto', password='singa_auto')

See also

singa_auto.client.Client.login()

Creating models

To create a model, you’ll need to submit a model class that conforms to the specification by singa_auto.model.BaseModel, written in a single Python file. The model’s implementation should conform to a specific task (see tasks).

Refer to the parameters of singa_auto.client.Client.create_model() for configuring how your model runs on SINGA-Auto, and refer to Model Development Guide to understand more about how to write & test models for SINGA-Auto.

Example:

client.create_model(
    name='TfFeedForward',
    task='IMAGE_CLASSIFICATION',
    model_file_path='examples/models/image_classification/TfFeedForward.py',
    model_class='TfFeedForward',
    dependencies={ 'tensorflow': '1.12.0' }
)

client.create_model(
    name='SkDt',
    task='IMAGE_CLASSIFICATION',
    model_file_path='examples/models/image_classification/SkDt.py',
    model_class='SkDt',
    dependencies={ 'scikit-learn': '0.20.0' }
)

See also

singa_auto.client.Client.create_model()

Listing available models by task

Example:

client.get_available_models(task='IMAGE_CLASSIFICATION')
# If "task" is left unspecified, the method retrieves information on all uploaded models
client.get_available_models()

Output:

[{'access_right': 'PRIVATE',
 'datetime_created': 'Mon, 17 Dec 2018 07:06:03 GMT',
 'dependencies': {'tensorflow': '1.12.0'},
 'id': '45df3f34-53d7-4fb8-a7c2-55391ea10030',
 'name': 'TfFeedForward',
 'task': 'IMAGE_CLASSIFICATION',
 'user_id': 'fb5671f1-c673-40e7-b53a-9208eb1ccc50'},
 {'access_right': 'PRIVATE',
 'datetime_created': 'Mon, 17 Dec 2018 07:06:03 GMT',
 'dependencies': {'scikit-learn': '0.20.0'},
 'id': 'd0ea96ce-478b-4167-8a84-eb36ae631235',
 'name': 'SkDt',
 'task': 'IMAGE_CLASSIFICATION',
 'user_id': 'fb5671f1-c673-40e7-b53a-9208eb1ccc50'}]

See also

singa_auto.client.Client.get_available_models()

Deleting a model

Example:

client.delete_model('fb5671f1-c673-40e7-b53a-9208eb1ccc50')

See also

singa_auto.client.Client.delete_model()

Quick Start (Application Developers)

As an App Developer, you can manage datasets, train jobs & inference jobs on SINGA-Auto. This guide walks through a full train-inference flow:

  1. Authenticating on SINGA-Auto
  2. Uploading datasets
  3. Creating a model training job
  4. Creating a model serving job after the model training job completes

This guide assumes that you have access to a running instance of SINGA-Auto Admin at <singa_auto_host>:<admin_port> and SINGA-Auto Web Admin at <singa_auto_host>:<web_admin_port>, and there have been models added to SINGA-Auto under the task of IMAGE_CLASSIFICATION.

To learn more about what else you can do on SINGA-Auto, explore the methods of singa_auto.client.Client.

Installing the client
  1. Install Python 3.6 such that the python and pip commands point to the correct installation of Python (see Installing Python)

  2. Clone the project at https://github.com/nusdbsystem/singa-auto (e.g. with Git)

  3. Within the project’s root folder, install SINGA-Auto’s client-side Python dependencies by running:

    pip install -r ./singa_auto/requirements.txt
    
Initializing the client

Example:

from singa_auto.client import Client
client = Client(admin_host='localhost', admin_port=3000)
client.login(email='superadmin@singaauto', password='singa_auto')

See also

singa_auto.client.Client.login()

Listing available models by task

Example:

client.get_available_models(task='IMAGE_CLASSIFICATION')
# If "task" is left unspecified, the method retrieves information on all uploaded models
client.get_available_models()

Output:

[{'access_right': 'PRIVATE',
 'datetime_created': 'Mon, 17 Dec 2018 07:06:03 GMT',
 'dependencies': {'tensorflow': '1.12.0'},
 'id': '45df3f34-53d7-4fb8-a7c2-55391ea10030',
 'name': 'TfFeedForward',
 'task': 'IMAGE_CLASSIFICATION',
 'user_id': 'fb5671f1-c673-40e7-b53a-9208eb1ccc50'},
 {'access_right': 'PRIVATE',
 'datetime_created': 'Mon, 17 Dec 2018 07:06:03 GMT',
 'dependencies': {'scikit-learn': '0.20.0'},
 'id': 'd0ea96ce-478b-4167-8a84-eb36ae631235',
 'name': 'SkDt',
 'task': 'IMAGE_CLASSIFICATION',
 'user_id': 'fb5671f1-c673-40e7-b53a-9208eb1ccc50'}]

See also

singa_auto.client.Client.get_available_models()

Creating datasets

You’ll first need to convert your dataset into a format specified by one of the tasks (see tasks), and split them into two files: one for training & one for validation. After doing so, you’ll create 2 corresponding datasets on SINGA-Auto by uploading them from your filesystem.

Example (pre-processing step):

# Run this in shell
python examples/datasets/image_files/load_fashion_mnist.py

Example:

client.create_dataset(
    name='fashion_mnist_train',
    task='IMAGE_CLASSIFICATION',
    dataset_path='data/fashion_mnist_train.zip'
)

client.create_dataset(
    name='fashion_mnist_val',
    task='IMAGE_CLASSIFICATION',
    dataset_path='data/fashion_mnist_val.zip'
)

Output:

{'id': 'ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
'name': 'fashion_mnist_train',
'size_bytes': 36702897,
'task': 'IMAGE_CLASSIFICATION'}

{'id': '7e9a2f8a-c61d-4365-ae4a-601e90892b88',
'name': 'fashion_mnist_val',
'size_bytes': 6116386,
'task': 'IMAGE_CLASSIFICATION'}

See also

singa_auto.client.Client.create_dataset()

Note

The code that preprocesses the original Fashion MNIST dataset is available at ./examples/datasets/image_files/load_mnist_format.py.

Creating a train job

To create a model training job, you’ll specify the train & validation datasets by their IDs, together with your application’s name and its associated task.

After creating a train job, you can monitor it on SINGA-Auto Web Admin (see Using SINGA-Auto’s Web Admin).

Refer to the parameters of singa_auto.client.Client.create_train_job() for configuring how your train job runs on SINGA-Auto, such as enabling GPU usage & specifying which models to use.

Example:

client.create_train_job(
    app='fashion_mnist_app',
    task='IMAGE_CLASSIFICATION',
    train_dataset_id='ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
    val_dataset_id='7e9a2f8a-c61d-4365-ae4a-601e90892b88',
    budget={ 'MODEL_TRIAL_COUNT': 5 },
    model_ids='["652db9f7-d23d-4b79-945b-a56446ceff33"]'
)
# Omitting GPU_COUNT is the same as setting GPU_COUNT to 0, which means training will run on CPU only
# MODEL_TRIAL_COUNT is the number of trials; the minimum MODEL_TRIAL_COUNT is 1 for a valid training job
# TIME_HOURS is the assigned training time limit, in hours
# train_args={} can be left empty or unspecified if not in use
client.create_train_job(
    app='fashion_mnist_app',
    task='IMAGE_CLASSIFICATION',
    train_dataset_id='ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
    val_dataset_id='7e9a2f8a-c61d-4365-ae4a-601e90892b88',
    budget={'TIME_HOURS': 0.01,
            'GPU_COUNT': 0,
            'MODEL_TRIAL_COUNT': 1},
    model_ids='["652db9f7-d23d-4b79-945b-a56446ceff33"]',
    train_args={}
)

Output:

{'app': 'fashion_mnist_app',
'app_version': 1,
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

Using distributed training

For distributed training, refer to https://pytorch.org/docs/stable/distributed.html.

Output:

{'app': 'DistMinist',
'app_version': 1,
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

See also

singa_auto.client.Client.create_train_job()

Listing train jobs

Example:

client.get_train_jobs_of_app(app='fashion_mnist_app')

Output:

[{'app': 'fashion_mnist_app',
'app_version': 1,
'budget': {'MODEL_TRIAL_COUNT': 5},
'datetime_started': 'Mon, 17 Dec 2018 07:08:05 GMT',
'datetime_stopped': None,
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8',
'status': 'RUNNING',
'task': 'IMAGE_CLASSIFICATION',
'val_dataset_id': '7e9a2f8a-c61d-4365-ae4a-601e90892b88',
'train_dataset_id': 'ecf87d2f-6893-4e4b-8ed9-1d9454af9763'}]

See also

singa_auto.client.Client.get_train_jobs_of_app()

Retrieving the latest train job’s details

Example:

client.get_train_job(app='fashion_mnist_app')

Output:

{'app': 'fashion_mnist_app',
'app_version': 1,
'datetime_started': 'Mon, 17 Dec 2018 07:08:05 GMT',
'datetime_stopped': 'Mon, 17 Dec 2018 07:11:11 GMT',
'id': 'ec4db479-b9b2-4289-8086-52794ffc71c8',
'status': 'STOPPED',
'task': 'IMAGE_CLASSIFICATION'
'val_dataset_id': '7e9a2f8a-c61d-4365-ae4a-601e90892b88',
'train_dataset_id': 'ecf87d2f-6893-4e4b-8ed9-1d9454af9763',
'workers': [{'datetime_started': 'Mon, 17 Dec 2018 07:08:05 GMT',
            'datetime_stopped': 'Mon, 17 Dec 2018 07:11:14 GMT',
            'model_name': 'SkDt',
            'replicas': 2,
            'service_id': '2ada1ff3-84e9-4eca-bac9-241cd8c765ef',
            'status': 'STOPPED'},
            {'datetime_started': 'Mon, 17 Dec 2018 07:08:05 GMT',
            'datetime_stopped': 'Mon, 17 Dec 2018 07:11:42 GMT',
            'model_name': 'TfFeedForward',
            'replicas': 2,
            'service_id': '81ff23a7-ddd0-4a62-9d86-a3cc985ca6fe',
            'status': 'STOPPED'}]}

See also

singa_auto.client.Client.get_train_job()

Listing best trials of the latest train job

Example:

client.get_best_trials_of_train_job(app='fashion_mnist_app')

Output:

[{'datetime_started': 'Mon, 17 Dec 2018 07:09:17 GMT',
'datetime_stopped': 'Mon, 17 Dec 2018 07:11:38 GMT',
'id': '1b7dc65a-87ae-4d42-9a01-67602115a4a4',
'knobs': {'batch_size': 32,
            'epochs': 3,
            'hidden_layer_count': 2,
            'hidden_layer_units': 36,
            'image_size': 32,
            'learning_rate': 0.014650971133579896},
'model_name': 'TfFeedForward',
'score': 0.8269},
{'datetime_started': 'Mon, 17 Dec 2018 07:08:38 GMT',
'datetime_stopped': 'Mon, 17 Dec 2018 07:11:11 GMT',
'id': '0c1f9184-7b46-4aaf-a581-be62bf3f49bf',
'knobs': {'criterion': 'entropy', 'max_depth': 4},
'model_name': 'SkDt',
'score': 0.6686}]

See also

singa_auto.client.Client.get_best_trials_of_train_job()

Creating an inference job with the latest train job

Your app’s users will make queries to the /predict endpoint of predictor_host over HTTP.

To create a model serving job, you’ll have to wait for your train job to stop. Then, you’ll submit the app name associated with the train job (which should have a status of STOPPED). The inference job will be created from the best trials of that train job.

Example:

client.create_inference_job(app='fashion_mnist_app')
# Or with more details specified, such as the number of GPUs ('GPU_COUNT')
client.create_inference_job(app='fashion_mnist_app', app_version=1, budget={'GPU_COUNT': 1} )

Output:

{'app': 'fashion_mnist_app',
'app_version': 1,
'id': '0477d03c-d312-48c5-8612-f9b37b368949',
'predictor_host': '127.0.0.1:30001',
'train_job_id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

See also

singa_auto.client.Client.create_inference_job()

Listing inference jobs

Example:

client.get_inference_jobs_of_app(app='fashion_mnist_app')

Output:

{'app': 'fashion_mnist_app',
  'app_version': 1,
  'datetime_started': 'Mon, 17 Dec 2018 07:15:12 GMT',
  'datetime_stopped': None,
  'id': '0477d03c-d312-48c5-8612-f9b37b368949',
  'predictor_host': '127.0.0.1:30000',
  'status': 'RUNNING',
  'train_job_id': 'ec4db479-b9b2-4289-8086-52794ffc71c8'}

See also

singa_auto.client.Client.get_inference_jobs_of_app()

Retrieving details of running inference job

See also

singa_auto.client.Client.get_running_inference_job()

Example:

client.get_running_inference_job(app='fashion_mnist_app')

Output:

{'app': 'fashion_mnist_app',
'app_version': 1,
'datetime_started': 'Mon, 17 Dec 2018 07:25:36 GMT',
'datetime_stopped': None,
'id': '09e5040e-2134-411b-855f-793927c80b4b',
'predictor_host': '127.0.0.1:30000',
'status': 'RUNNING',
'train_job_id': 'ec4db479-b9b2-4289-8086-52794ffc71c8',
'workers': [{'datetime_started': 'Mon, 17 Dec 2018 07:25:36 GMT',
            'datetime_stopped': None,
            'replicas': 2,
            'service_id': '661035bb-3966-46e8-828c-e200960a76c0',
            'status': 'RUNNING',
            'trial': {'id': '1b7dc65a-87ae-4d42-9a01-67602115a4a4',
                        'knobs': {'batch_size': 32,
                                'epochs': 3,
                                'hidden_layer_count': 2,
                                'hidden_layer_units': 36,
                                'image_size': 32,
                                'learning_rate': 0.014650971133579896},
                        'model_name': 'TfFeedForward',
                        'score': 0.8269}},
            {'datetime_started': 'Mon, 17 Dec 2018 07:25:36 GMT',
            'datetime_stopped': None,
            'replicas': 2,
            'service_id': '6a769007-b18f-4271-b3db-8b60ed5fb545',
            'status': 'RUNNING',
            'trial': {'id': '0c1f9184-7b46-4aaf-a581-be62bf3f49bf',
                        'knobs': {'criterion': 'entropy', 'max_depth': 4},
                        'model_name': 'SkDt',
                        'score': 0.6686}}]}
Stopping a running inference job

Example:

client.stop_inference_job(app='fashion_mnist_app')

See also

singa_auto.client.Client.stop_inference_job()

Downloading the trained model for a trial

After running a train job, you might want to download the trained model instance of a trial of the train job, instead of creating an inference job to make predictions. Subsequently, you’ll be able to make batch predictions locally with the trained model instance.

To do this, you must have the trial’s model class file already in your local filesystem, the dependencies of the model must have been installed separately, and the model class must have been imported and passed into this method.

To download the model class file, use the method singa_auto.client.Client.download_model_file().

Example:

In shell,

# Install the dependencies of the `TfFeedForward` model
pip install tensorflow==1.12.0

In Python,

# Find the best trial for model `TfFeedForward`
trials = [x for x in client.get_best_trials_of_train_job(app='fashion_mnist_app')
    if x.get('model_name') == 'TfFeedForward' and x.get('status') == 'COMPLETED']
trial = trials[0]
trial_id = trial.get('id')

# Import the model class
from examples.models.image_classification.TfFeedForward import TfFeedForward

# Load an instance of the model with trial's parameters
model_inst = client.load_trial_model(trial_id, TfFeedForward)

# Make predictions with trained model instance associated with best trial
queries = [[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 1, 0, 0, 7, 0, 37, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 27, 84, 11, 0, 0, 0, 0, 0, 0, 119, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 88, 143, 110, 0, 0, 0, 0, 22, 93, 106, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 53, 129, 120, 147, 175, 157, 166, 135, 154, 168, 140, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 11, 137, 130, 128, 160, 176, 159, 167, 178, 149, 151, 144, 0, 0],
[0, 0, 0, 0, 0, 0, 1, 0, 2, 1, 0, 3, 0, 0, 115, 114, 106, 137, 168, 153, 156, 165, 167, 143, 157, 158, 11, 0],
[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 89, 139, 90, 94, 153, 149, 131, 151, 169, 172, 143, 159, 169, 48, 0],
[0, 0, 0, 0, 0, 0, 2, 4, 1, 0, 0, 0, 98, 136, 110, 109, 110, 162, 135, 144, 149, 159, 167, 144, 158, 169, 119, 0],
[0, 0, 2, 2, 1, 2, 0, 0, 0, 0, 26, 108, 117, 99, 111, 117, 136, 156, 134, 154, 154, 156, 160, 141, 147, 156, 178, 0],
[3, 0, 0, 0, 0, 0, 0, 21, 53, 92, 117, 111, 103, 115, 129, 134, 143, 154, 165, 170, 154, 151, 154, 143, 138, 150, 165, 43],
[0, 0, 23, 54, 65, 76, 85, 118, 128, 123, 111, 113, 118, 127, 125, 139, 133, 136, 160, 140, 155, 161, 144, 155, 172, 161, 189, 62],
[0, 68, 94, 90, 111, 114, 111, 114, 115, 127, 135, 136, 143, 126, 127, 151, 154, 143, 148, 125, 162, 162, 144, 138, 153, 162, 196, 58],
[70, 169, 129, 104, 98, 100, 94, 97, 98, 102, 108, 106, 119, 120, 129, 149, 156, 167, 190, 190, 196, 198, 198, 187, 197, 189, 184, 36],
[16, 126, 171, 188, 188, 184, 171, 153, 135, 120, 126, 127, 146, 185, 195, 209, 208, 255, 209, 177, 245, 252, 251, 251, 247, 220, 206, 49],
[0, 0, 0, 12, 67, 106, 164, 185, 199, 210, 211, 210, 208, 190, 150, 82, 8, 0, 0, 0, 178, 208, 188, 175, 162, 158, 151, 11],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]]
print(model_inst.predict(queries))

See also

singa_auto.client.Client.load_trial_model()

Quick Start (Application Users)

As an App User, you can make predictions on models deployed on SINGA-Auto.

Making a single prediction

Your app developer should have created an inference job and shared predictor_host, the host at which you can send queries to and receive predictions over HTTP.

Send a POST /predict to predictor_host with a body of the following format in JSON:

{
    "query": <query>
}

…where the format of <query> depends on the associated task (see tasks).

The body of the response will be of the following format in JSON:

{
    "prediction": <prediction>
}

…where the format of <prediction> depends on the associated task.

Example:

If predictor_host is 127.0.0.1:30000, run the following in Python:

predictor_host = '127.0.0.1:30000'
query_path = 'examples/data/image_classification/fashion_mnist_test_1.png'

# Load query image as 3D list of pixels
from singa_auto.model import utils
[query] = utils.dataset.load_images([query_path]).tolist()

# Make request to predictor
import requests
import json
res = requests.post('http://{}/predict'.format(predictor_host), json={ 'query': query })
print(res.json())

Output:

{'prediction': [0.9364003576825639, 1.016065009906697e-08, 0.0027604885399341583, 0.00014587241457775235, 6.018594376655528e-06, 1.042887332047826e-09, 0.060679372351310566, 2.024707311532037e-11, 7.901770004536957e-06, 1.5299328026685544e-08],
'predictions': []}

Prediction for QUESTION_ANSWERING

The query question should be uploaded in the following format:

data = {"questions": ["How long individuals are contagious?"]}
res = requests.post('http://{}/predict'.format(predictor_host), json=data)

To print out the prediction result, use res.text:

print(res.text)

Prediction for SPEECH_RECOGNITION

The query data is passed using the following steps:

data = ['data/ldc93s1/ldc93s1/LDC93S1.wav']
data = json.dumps(data)
res = requests.post('http://{}/predict'.format(predictor_host), json=data[0])

To print out the prediction result, use res.text:

print(res.text)

If the SINGA-Auto instance is deployed with Kubernetes, all inference jobs are served at the default Ingress port 3005 in the format <host>:3005/<app>, where <host> is the host name of the SINGA-Auto instance, and <app> is the name of the application provided when submitting train jobs.

Making batch predictions

Similar to making a single prediction, but use the queries attribute instead of query in your request and pass an array of queries instead.

Example:

If predictor_host is 127.0.0.1:30000, run the following in Python:

predictor_host = '127.0.0.1:30000'
query_paths = ['examples/data/image_classification/fashion_mnist_test_1.png',
            'examples/data/image_classification/fashion_mnist_test_2.png']

# Load query image as 3D list of pixels
from singa_auto.model import utils
queries = utils.dataset.load_images(query_paths).tolist()

# Make request to predictor
import requests
res = requests.post('http://{}/predict'.format(predictor_host), json={ 'queries': queries })
print(res.json())

Output:

{'prediction': None,
'predictions': [[0.9364002384732744, 1.0160608354681244e-08, 0.0027604878414422274, 0.0001458720798837021, 6.018587100697914e-06, 1.0428869989809186e-09, 0.06067946175827773, 2.0247028012509993e-11, 7.901745448180009e-06, 1.5299294275905595e-08], [0.866741563402005, 5.757699909736402e-05, 0.0006144539802335203, 0.03480150588134776, 3.4249271266162395e-05, 1.3578344004727683e-09, 0.09774905198545598, 6.071191726436664e-12, 1.5324986861742218e-06, 1.583319586551113e-10]]}

Quick Start (Admins)

As an Admin, you can manage users, datasets, models, train jobs & inference jobs on SINGA-Auto. This guide only highlights the key methods available to manage users.

To learn about how to manage models, go to Quick Start (Model Developers).

To learn about how to manage train & inference jobs, go to Quick Start (Application Developers).

This guide assumes that you have access to a running instance of SINGA-Auto Admin at <singa_auto_host>:<admin_port>, e.g., 127.0.0.1:3000, and SINGA-Auto Web Admin at <singa_auto_host>:<web_admin_port>, e.g., 127.0.0.1:3001.

Installation
  1. Install Python 3.6 such that the python and pip commands point to the correct installation of Python (see Installing Python)

  2. Clone the project at https://github.com/nusdbsystem/singa-auto (e.g. with Git)

  3. Within the project’s root folder, install SINGA-Auto’s client-side Python dependencies by running:

    pip install -r ./singa_auto/requirements.txt
    
Initializing the client

Example:

from singa_auto.client import Client
client = Client(admin_host='localhost', admin_port=3000) # 'localhost' can be replaced by '127.0.0.1' or other server address
client.login(email='superadmin@singaauto', password='singa_auto')

See also

singa_auto.client.Client.login()

Creating users

Examples:

client.create_user(
    email='admin@singaauto',
    password='singa_auto',
    user_type='ADMIN'
)

client.create_user(
    email='model_developer@singaauto',
    password='singa_auto',
    user_type='MODEL_DEVELOPER'
)

client.create_user(
    email='app_developer@singaauto',
    password='singa_auto',
    user_type='APP_DEVELOPER'
)

See also

singa_auto.client.Client.create_user()

Listing all users

Example:

client.get_users()
[{'email': 'superadmin@singaauto',
'id': 'c815fa08-ce06-467d-941b-afc27684d092',
'user_type': 'SUPERADMIN'},
{'email': 'admin@singaauto',
'id': 'cb2c0d61-acd3-4b65-a5a7-d78aa5648283',
'user_type': 'ADMIN'},
{'email': 'model_developer@singaauto',
'id': 'bfe58183-9c69-4fbd-a7b3-3fdc267b3290',
'user_type': 'MODEL_DEVELOPER'},
{'email': 'app_developer@singaauto',
'id': '958a7d65-aa1d-437f-858e-8837bb3ecf32',
'user_type': 'APP_DEVELOPER'}]

See also

singa_auto.client.Client.get_users()

Banning a user

Example:

client.ban_user('app_developer@singaauto')

See also

singa_auto.client.Client.ban_user()

Using SINGA-Auto’s Web Admin

SINGA-Auto Web Admin is accessible at <singa_auto_host>:<web_admin_port> (e.g. 127.0.0.1:3001 by default).

Log in with the same credentials for SINGA-Auto Admin.

You’re currently able to view your own train jobs & datasets, and additionally create train jobs & datasets for the IMAGE_CLASSIFICATION task.

Supported Tasks

Each task has an associated Dataset Format, Query Format and Prediction Format.

A task’s Dataset Format specifies the format of the dataset files. Datasets are prepared by Application Developers when they create Train Jobs and received by Model Developers when they define singa_auto.model.BaseModel.train and singa_auto.model.BaseModel.evaluate.

A task’s Query Format specifies the format of queries when they are passed to models. Queries are generated by Application Users when they send queries to Inference Jobs and received by Model Developers when they define singa_auto.model.BaseModel.predict.

A task’s Prediction Format specifies the format of predictions made by models. Predictions are generated by Model Developers when they define singa_auto.model.BaseModel.predict and received by Application Users as predictions to their queries sent to Inference Jobs.

IMAGE_SEGMENTATION
Dataset Format

dataset-type: SEGMENTATION_IMAGES

note

We use the same annotation format as the Pascal VOC segmentation dataset.

  • An image and its corresponding mask should have the same width and height, while the number of channels can be different. For example, an image can have three channels representing RGB values, but its mask should only have one grayscale channel.
  • In the mask image, each pixel’s grayscale value represents its label, and a specific value can be reserved to mark meaningless pixels such as paddings or borders (the same definition as ignore_label in some loss functions).
Query Format

An image file in the following common formats: .jpg, .jpeg, .png, .gif, .bmp, or .tiff.

Prediction Format

A W x H single-channel mask image file with each pixel’s grayscale value representing its label.

IMAGE_CLASSIFICATION
Dataset Format

dataset-type: IMAGE_FILES

  • There is only 1 tag column of class, corresponding to the class of the image as an integer from 0 to k - 1, where k is the total no. of classes.
  • The train & validation datasets’ images should have the same dimensions W x H and the same total no. of classes.

An example:

path,class
image-0-of-class-0.png,0
image-1-of-class-0.png,0
...
image-0-of-class-1.png,1
...
image-99-of-class-9.png,9

note

You can refer to and run ./examples/datasets/image_files/load_folder_format.py (https://github.com/nusdbsystem/singa-auto/tree/master/examples/datasets/load_folder_format.py) for converting directories of images to SINGA-Auto’s IMAGE_CLASSIFICATION format.
Query Format

An image file in the following common formats: .jpg, .jpeg, .png, .gif, .bmp, or .tiff.

Prediction Format

A jsonified string representing the classification result. There are no strict requirements for the format of the output string, which is entirely determined by the model itself, such as directly outputting the label or class name of the classification result, one-hot encoding, or the probability corresponding to each class.
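
For illustration, a model might return any of the following jsonified strings (hypothetical examples, not prescribed by SINGA-Auto):

import json
json.dumps(2)                # e.g. the predicted class index
json.dumps('Ankle boot')     # e.g. the predicted class name
json.dumps([0.1, 0.0, 0.9])  # e.g. the probability corresponding to each class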

OBJECT_DETECTION
Dataset Format

dataset-type: DETECTION_DATASET

It is recommended to follow the YOLO dataset format.

  • For the folder hierarchy, two folders, ‘images’ and ‘labels’, should be prepared. The ‘images’ folder holds PIL-loadable images, and the corresponding .txt label files should be placed in the ‘labels’ folder, with the same basenames as the images.
  • Each line of a label file has the following format, where object-id is the index of the object, and the following four numbers are normalized to the range between 0 and 1 by dividing by the width and height of the image. center_x and center_y are the central coordinates of the bounding box, and width and height are its side lengths. Empty label files (negative samples) are allowed, meaning there are no objects to detect in the image.
object-id center_x center_y width height
...
  • In addition, train.txt and valid.txt can be provided to list the images used for training/validation, containing only the paths of image files. A class.names file contains the category names, and their line numbers are the object-ids. An illustrative layout follows below.
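
For illustration, a minimal (hypothetical) layout of such a dataset:

+ images
    + 0001.jpg
    + 0002.jpg
    + ...
+ labels
    + 0001.txt
    + 0002.txt
    + ...
+ train.txt
+ valid.txt
+ class.names
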
Query Format

An image file in the following common formats: .jpg, .jpeg, .png, .gif, .bmp, or .tiff.

Prediction Format

A jsonified dict (string) indicating the bounding boxes and their corresponding classes. The format of the keys and values is strictly required, as follows:

{'explanations':
    {'box_info': [{'coord': (224, 275, 281, 357),
                   'class_name': 'person'},
                  {'coord': (64, 263, 150, 368),
                   'class_name': 'person'}]
    }
}
GENERAL_TASK
Dataset Format

dataset-type: GENERAL_FILES

  • For the general task, as its name states, any domain’s task (or model) can be included within this category, such as image processing, NLP, speech, or video.
  • There are no requirements on the form of the dataset, as long as it can be read into memory in the form of a file. However, the model developer has to know in advance how to handle the read-in file.
Query Format

A file is required as the query format. As long as this file corresponds to the input required by the model, it can be in any file format.

Prediction Format

As with the input query, the prediction returns an output file as defined by the model.

POS_TAGGING
Dataset Format

dataset-type:CORPUS

  • Sentences are delimited by \n tokens.
  • There is only 1 tag column of tag corresponding to the POS tag of the token as an integer from 0 to k-1.

An example:

token       tag
Two         3
leading     2
...
line-item   1
veto        5
.           4
\n          0
Professors  6
Philip      6
...
previous    1
presidents  8
.           4
\n          0
Query Format

An array of strings representing a sentence as a list of tokens in that sentence.

Prediction Format

An array of integers representing the predicted tags for each token, in sequence, for the sentence.
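
For illustration, a hypothetical query-and-prediction pair, reusing the tag assignments from the example above:

query = ['Two', 'leading', 'Professors']  # a sentence as a list of tokens
prediction = [3, 2, 6]                    # one predicted tag per token, in sequence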

QUESTION_ANSWERING
COVID19 Task Dataset Format

dataset-type:QUESTION_ANSWERING_COVID19

The dataset can be used to fine-tune the SQuAD pre-trained BERT model.

  • The dataset zips folders containing JSON files. JSON files under different levels of folders will be automatically read all together.

Dataset structure example:

/DATASET_NAME.zip
│
├──FOLDER_NAME_1                                              # first level folder
│  └──FOLDER_NAME_2                                           # second level folder, not necessarily to be included
│      └──FOLDER_NAME_3                                       # third level folder, not necessarily to be included
│           ├── 003d2e515e1aaf06f0052769953e8.json            # JSON file name is a random combination of either alphabets/numbers or both
│           ├── 00a407540a8bdd.json
│           ...
│
├──FOLDER_NAME_4                                              # first level folder
│  ├── 0015023cc06b5362d332b3.json
│  ├── 001b4a31684c8fc6e2cfbb70304354978317c429.json
│  ...
...
│
└──metadata.csv                                               # if additional information is provided for above JSON files, user can add a metadata.csv
  • Each JSON file includes body_text, providing a list of paragraphs in the full body which can be used for question answering. body_text can contain different entries; only the “text” field of each entry will be read.
  1. For JSON files extracted from papers, there is one JSON file per paper. If additional information is given in metadata.csv for the papers, each JSON file and its metadata.csv entry are linked via their shared sha value.
  2. For datasets carrying their additional information in the paragraph itself, the body_text “text” entry is in <question> + <\n> + <information paragraph> string format. In this case, neither a sha value nor a metadata.csv file is needed.

Sample of JSON file:

# JSON file 1                           # for example, a JSON file extracted from one paper
{
    "sha": <str>,                       # 40-character sha1 of the PDF, this field is only required for JSON extracted from papers. it will be read into model in forms of string

    "body_text": [                      # list of paragraphs in full body, this is must-have
        {
            "text": <str>,              # text body for first entry, which is for one paragraph of this paper. this is must-have. it will be read as string into model
        }
        ...                             # other 'text' blocks, i.e. paragraph blocks the same as above; all 'text' strings will be handled and processed into a pandas DataFrame
    ],
}

# ---------------------------------------------------------------------------------------------------------------------- #

# JSON file 2                           # for example, a JSON file extracted from SQuAD2.0
{
    "body_text": [                      # list of paragraphs in full body, this is must-have
        {
            "text": 'What are the treatments for Age-related Macular Degeneration ?\n If You Have Advanced AMD Once dry AMD reaches the advanced stage, no form of treatment can prevent vision loss...',
                                        # text body for first entry, this is must-have

        },
        ...                             # other 'text' blocks, i.e. paragraphs blocks look the same as above
    ],
}
  • metadata.csv is not strictly required. Users can provide additional information with it, e.g. authors, title, journal and publish_time, mapped to each JSON file by its sha value. cord_uid serves as a unique value identifying each entry. Time-sensitive entries are advised to have the publish_time value in Date format; for other values, General format is recommended.

Sample of metadata.csv entry:

Column Name    Column Value
cord_uid       zjufx4fo
sha            b2897e1277f56641193a6db73825f707eed3e4c9
source_x       PMC
title          Sequence requirements for RNA strand transfer during nidovirus …
doi            10.1093/emboj/20.24.7220
pmcid          PMC125340
pubmed_id      11742998
license        unk
abstract       Nidovirus subgenomic mRNAs contain a leader sequence derived …
publish_time   2001-12-17
Query Format

note

  • The pretrained model should be fine-tuned with a dataset first to adapt to particular question domains when necessary.
  • Otherwise, following the question, the input should contain relevant information (a context paragraph or candidate answers, or both), whether or not it directly addresses the question.
  • Optionally, when relevant information is provided in the query as an additional paragraph, the question always comes first, followed by the additional paragraph. A “\n” separator is used between the question and its paragraph in the input.

The query is in JSON format. It could be a list of one or more questions in the questions field. The model will only read the questions field.

{
 'questions': ["Is individual's age considered a potential risk factor of COVID19? \n  People of all ages can be infected by the new coronavirus (2019-nCoV). Older people, and people with pre-existing medical conditions (such as asthma, diabetes, heart disease) appear to be more vulnerable to becoming severely ill with the virus. WHO advises people of all ages to take steps to protect themselves from the virus, for example by following good hand hygiene and good respiratory hygiene.",
               # query string can include optional context which follows the question with `\n` syntax
               'Is COVID-19 associated with cardiomyopathy and cardiac arrest?'],     # will be read as a list of string by model, and each question will be extracted as string to process the question answering stage recursively
               ...                                                                    # questions in string format
 ...                                                                                  # other fields; fields other than 'questions' won't be read into the model
}
Prediction Format

The output is in JSON format.

['Given a higher mortality rate for older cases, in one study, li et al showed that more than 50% of early patients with covid-19 in wuhan were more than 60 years old',
 'cardiac involvement has been reported in patients with covid-19, which may be reflected by ecg changes.'
 ...
 ]   # output field is a list of string
MedQuAD Task Dataset Format

dataset-type:QUESTION_ANSWERING_MEDQUAD

Dataset structure example:

/MedQuAD.zip
│
├──FOLDER_NAME_1                                              # first level folder
│  └──FOLDER_NAME_2                                           # second level folder, not necessarily to be included
│      └──FOLDER_NAME_3                                       # third level folder, not necessarily to be included
│           ├── 003d2e515e1aaf0052769953e8.xml                # xml file name is a random combination of either alphabets/numbers or both
│           ├── 00a40758bdd.xml
│           ...
│
├──FOLDER_NAME_4                                              # first level folder
│  ├── 0015023cc06b5332b3.xml
│  ├── 001b4a31684c8fc6e2cfbb70304c429.xml
│  ...
...

note

  • For the following .xml sample, the model will only take the Question and Answer fields into the question answering processing.
  • Each xml file contains multiple <QAPair> blocks. Each <QAPair> contains one question and its answer.

Sample .xml file:

<?xml version="1.0" encoding="UTF-8"?>
<Document>
...
<QAPairs>
 <QAPair pid="1">                                                           # pair #1
   <Question qid="000001-1"> A question here ... </Question>                # question #1, will be read as string by model
   <Answer> An answer here ... </Answer>                                    # answer of question #1, will be read as string by model
 </QAPair>
 ...                                                                        # multiple subsequent <QAPair> blocks; each Question and its Answer pair will be combined into one string by the model, and the QAPair strings are then processed into a pandas DataFrame
</QAPairs>
</Document>
Query Format

note

  • The pretrained model should be fine-tuned with a dataset first to adapt to particular question domains when necessary.
  • Otherwise, following the question, the input should contain relevant information (a context paragraph or candidate answers, or both), whether or not it directly addresses the question.
  • Optionally, when relevant information is provided in the query as an additional paragraph, the question always comes first, followed by the additional paragraph. A “\n” separator is used between the question and its paragraph in the input.

The query is in JSON format. It could be a list of one or more questions in the questions field. The model will only read the questions field.

{
 'questions': ['Who is at risk for Adult Acute Lymphoblastic Leukemia?',
              'What are the treatments for Adult Acute Lymphoblastic Leukemia ?'],     # will be read as a list of string by model, and each question will be extracted as string to process the question answering stage recursively
              ...                                                                      # questions in string format
 ...                                                                                   # other fields; fields other than 'questions' won't be read into the model
}
Prediction Format

The output is in JSON format.

{'answers':['Past treatment with chemotherapy or radiation therapy. Having certain genetic disorders.',    # output 'answers' field is a list of string
            'Chemotherapy. Radiation therapy. Chemotherapy with stem cell transplant. Targeted therapy.'
            ...
            ]}
SPEECH_RECOGNITION

Speech recognition for the English language.

Dataset Type

dataset-type:AUDIO_FILES

The audios.csv should be of a .CSV format with 3 columns of wav_filename, wav_filesize and transcript.

For each row,

wav_filename should be a file path to a .wav audio file within the archive, relative to the root of the directory. Each audio file’s sample rate must be 16kHz.

wav_filesize should be an integer representing the size of the .wav audio file, in number of bytes.

transcript should be a string of the true transcript for the audio file. Transcripts should only contain the following characters:

a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z


'

An example of audios.csv follows:

wav_filename,wav_filesize,transcript
6930-81414-0000.wav,412684,audio transcript one
6930-81414-0001.wav,559564,audio transcript two
...
672-122797-0005.wav,104364,audio transcript one thousand
...
1995-1837-0001.wav,279404,audio transcript three thousand
Query Format

A Base64-encoded string of the bytes of the audio as a 16kHz .wav file
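
For illustration, a minimal sketch of building such a query in Python (the file path is hypothetical):

import base64

with open('data/ldc93s1/ldc93s1/LDC93S1.wav', 'rb') as f:
    query = base64.b64encode(f.read()).decode('utf-8')  # Base64-encoded string of the 16kHz .wav bytes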

Prediction Format

A string, representing the predicted transcript for the audio.

TABULAR_CLASSIFICATION
Dataset Type

dataset-type:TABULAR

The following optional train arguments are supported:

Train Argument   Description
features         List of feature columns’ names as a list of strings (defaults to the first N-1 columns in the CSV file)
target           Target column name as a string (defaults to the last column in the CSV file)

The train & validation datasets should have the same columns.

An example of the dataset follows:

age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
48,0,2,130,275,0,1,139,0,0.2,2,0,2,1
58,0,0,170,225,1,0,146,1,2.8,1,2,1,0
Query Format

A size-(N-1) dictionary representing feature-value pairs.

E.g.

queries=[
{'age': 48,'sex': 1,'cp': 2,'trestbps': 130,'chol': 225,'fbs': 1,'restecg': 1,'thalach': 172,'exang': 1,'oldpeak': 1.7,'slope': 2,'ca': 0,'thal': 3},
{'age': 48,'sex': 0,'cp': 2,'trestbps': 130,'chol': 275,'fbs': 0,'restecg': 1,'thalach': 139,'exang': 0,'oldpeak': 0.2,'slope': 2,'ca': 0,'thal': 2},
]
Prediction Format

A size-k list of floats, representing the probabilities of each class from 0 to k-1 for the target column.
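
For example, with k = 2 classes, a hypothetical prediction could be:

[0.18, 0.82]  # probabilities of class 0 and class 1 respectively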

TABULAR_REGRESSION
Dataset Type

dataset-type:TABULAR

The following optional train arguments are supported:

Train Argument   Description
features         List of feature columns’ names as a list of strings (defaults to the first N-1 columns in the CSV file)
target           Target column name as a string (defaults to the last column in the CSV file)

The train & validation datasets should have the same columns.

An example of the dataset follows:

density,bodyfat,age,weight,height,neck,chest,abdomen,hip,thigh,knee,ankle,biceps,forearm,wrist
1.0708,12.3,23,154.25,67.75,36.2,93.1,85.2,94.5,59,37.3,21.9,32,27.4,17.1
1.0853,6.1,22,173.25,72.25,38.5,93.6,83,98.7,58.7,37.3,23.4,30.5,28.9,18.2
1.0414,25.3,22,154,66.25,34,95.8,87.9,99.2,59.6,38.9,24,28.8,25.2,16.6
...
Query Format

A size-(N-1) dictionary representing feature-value pairs.
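
E.g. (a hypothetical query built from the first row of the example dataset above, assuming the default features, i.e. all columns except the last):

queries = [
    {'density': 1.0708, 'bodyfat': 12.3, 'age': 23, 'weight': 154.25, 'height': 67.75, 'neck': 36.2, 'chest': 93.1,
     'abdomen': 85.2, 'hip': 94.5, 'thigh': 59, 'knee': 37.3, 'ankle': 21.9, 'biceps': 32, 'forearm': 27.4}
]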

Prediction Format

A float, representing the value of the target column.

Dataset Types

note

Refer to ./examples/datasets/ for examples on pre-processing common dataset formats to conform to SINGA-Auto’s own dataset formats.

CORPUS

The dataset file must be of the .zip archive format with a corpus.tsv at the root of the directory.

The corpus.tsv should be of a .TSV format with columns of token and N other variable column names (tag columns).

For each row,

token should be a string, a token (e.g. a word) in the corpus. These tokens should appear in the same order as in the text of the corpus. To delimit sentences, token can take the value of \n.

The other N columns describe the corresponding token as part of the text of the corpus, depending on the task.

SEGMENTATION_IMAGES
  • Inside the uploaded .zip file, the training and validation sets should be wrapped separately, and be named strictly as train and val.

  • For the train folder (and likewise for the val folder), the images and annotated masks should also be wrapped separately, and be named strictly as image and mask.

  • The mask folder should contain only .png files, and each file name should be the same as that of the mask’s corresponding image (e.g. for an image named 0001.jpg, its corresponding mask should be named 0001.png).

  • A JSON file named params.json must also be included in the .zip file, to indicate essential training parameters such as num_classes, for example:

    {
        "num_classes": 21
    }
    

An example of the upload .zip file structure:

+ dataset.zip
    + train
        + image
            + 0001.jpg
            + 0002.jpg
            + ...
        + mask
            + 0001.png
            + 0002.png
            + ...
    + val
        + image
            + 0003.jpg
            + ...
        + mask
            + 0003.png
            + ...
    + params.json
IMAGE_FILES

The dataset file must be of the .zip archive format with an images.csv at the root of the directory.

The images.csv should be of a .CSV format with columns of path and N other variable column names (tag columns).

For each row,

path should be a file path to a .png, .jpg or .jpeg image file within the archive, relative to the root of the directory.

The other N columns describe the corresponding image, depending on the task.

DETECTION_DATASET

It is recommended to follow the YOLO dataset format.

  • For folder hierarchy, two folders ‘images’ and ‘labels’ should be prepared. In ‘images’ folder, there are PIL loadable images, and the corresponding txt label files should be placed in ‘labels’ folder, with the same basename with the images.
  • The label file format is as follows, where object-id is the index of object, the following four numbers should be normalized to range between 0 and 1 by dividing by the width and height of the image. center_x center_y are the central coordinates of bounding box, and width heigh is the sides lengths of it. It is allowable to use empty label file (negative samples), which means there are no objects to detect in the image.
object-id center_x center_y width height
...
  • In addition, train.txt and valid.txt can be provided to list the images used for training/validation, each containing only the paths of image files. A class.names file contains the category names; their line numbers correspond to object-id.
GENERAL_FILES
  • For the general task, as its name states, any domain's task (or model) can be included within this category, such as image processing, NLP, speech, or video.
  • There are no requirements on the form of the dataset, as long as it can be read into memory in the form of a file. However, the model developer has to know in advance how to handle the read-in file.
QUESTION_ANSWERING_COVID19

The dataset file must be of the .zip archive format, containing JSON files. JSON files under different levels of folders will automatically be read all together.

Each JSON file is extracted from one paper. Each JSON structure contains the field body_text, which is a list of {“text”: <str>} blocks; each text block corresponds to one paragraph of the paper.

Meanwhile, a metadata.csv file at the root of the archive directory is optional. It provides the model with a publish_time column, where each entry is in Date format, e.g. 2001-12-17. In this case, each metadata entry is also required to have a sha column in General format, and each JSON file is required to have a “sha”: <str> field, with the two sha values linked. When neither metadata.csv nor a publish_time Date value is provided, the model will not check the timeliness of the corresponding JSON body_text field.

QUESTION_ANSWERING_MEDQUAD

The dataset file must be of the .zip archive format, containing XML files. XML files under different levels of folders will automatically be read all together.

The model only reads the <Document> <QAPairs> … </QAPairs> </Document> field, and this field contains multiple <QAPair> … </QAPair> elements. Each QAPair has one <Question> … </Question> and its <Answer> … </Answer> combination.

TABULAR

The dataset file must be a tabular dataset of the .csv format with N columns.

AUDIO_FILES

The dataset file must be of the .zip archive format with an audios.csv at the root of the directory.

The audios.csv should be of a .CSV format with 3 columns of wav_filename, wav_filesize and transcript.

For each row,

wav_filename should be a file path to a .wav audio file within the archive, relative to the root of the directory. Each audio file's sample rate must be 16kHz.

wav_filesize should be an integer representing the size of the .wav audio file, in number of bytes.

transcript should be a string of the true transcript for the audio file. Transcripts should only contain the lowercase letters a to z, the space character, and the apostrophe (').

An example of audios.csv follows:

wav_filename,wav_filesize,transcript
6930-81414-0000.wav,412684,audio transcript one
6930-81414-0001.wav,559564,audio transcript two
...
672-122797-0005.wav,104364,audio transcript one thousand
...
1995-1837-0001.wav,279404,audio transcript three thousand
Query Format

A Base64-encoded string of the bytes of the audio as a 16kHz .wav file

Prediction Format

A string, representing the predicted transcript for the audio.

Installing Python

Usage of SINGA-Auto requires Python 3.6. Specifically, you’ll need the command python to point to a Python 3.6 program, and pip to point to PIP for that Python 3.6 installation.

To achieve this, we recommend using Conda with a Python 3.6 environment as per the instructions below:

  1. Install the latest version of miniconda

  2. Run the following commands on shell:

    conda create --name singa_auto python=3.6
    
  3. Every time you need to use python or pip for SINGA-Auto, run the following command on shell:

    conda activate singa_auto
    

Otherwise, you can refer to guides on installing Python 3.6 natively for your platform.

Installing Kubernetes

Usage of SINGA-Auto in Kubernetes mode requires Kubernetes 1.15+.

To achieve this, we recommend the instructions below:

  1. Install kubelet kubeadm kubectl

    apt-get install -y kubelet kubeadm kubectl --allow-unauthenticated
    
  2. Disable swap

    swapoff -a
    
  3. Configure the CRI and change Docker's cgroup driver to systemd; refer to Kubernetes' Container runtimes documentation

  4. Edit /etc/default/kubelet

    Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"

  5. Reset kubeadm (this may not be necessary)

    kubeadm reset
    
  6. Initialize the Kubernetes cluster, using your own host IP and the node name you want

    kubeadm init --kubernetes-version=1.15.1 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=YOURHOSTIP --node-name=YOURNODENAME --ignore-preflight-errors=ImagePull
    
  7. Add Kubernetes config to current user

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
  8. If just a single node, set master node as worker node

    kubectl taint nodes --all node-role.kubernetes.io/master-
    
  9. Install flannel from github

    kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    
  10. Config role

    kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=default:default
    
  11. Set the NodePort range

    sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
    

    set "- --service-node-port-range=1-65535" in the spec.containers.command node

Otherwise, you can refer to the official Kubernetes documentation on installing Kubernetes.

Model Development Guide

SINGA-Auto leverages on a dynamic pool of model templates contributed by Model Developers.

As a Model Developer, you’ll define a Python class that conforms to SINGA-Auto’s base model specification, and submit it to SINGA-Auto with the singa_auto.client.Client.create_model() method.

Implementing the Base Model Interface

As an overview, your model template needs to provide the following logic for deployment on SINGA-Auto:

  • Definition of the space of your model’s hyperparameters (knob configuration)
  • Initialization of the model with a concrete set of hyperparameters (knobs)
  • Training of the model given a (train) dataset on the local file system
  • Evaluation of the model given a (validation) dataset on the local file system
  • Dumping of the model’s parameters for serialization, after training
  • Loading of the model with trained parameters
  • Making batch predictions with the model, after being trained

Full details of SINGA-Auto’s base model interface is documented at singa_auto.model.BaseModel. Your model implementation has to follow a specific task’s specification (see tasks).

To aid your implementation, you can refer to Sample Models.

Testing

After implementing your model, you’ll use singa_auto.model.dev.test_model_class() to test your model. Refer to its documentation for more details on how to use it, or refer to the sample models’ usage of the method.

Logging

utils.logger in the singa_auto.model module provides a set of methods to log messages & metrics while your model is training. These messages & metrics would be displayed on SINGA-Auto Web Admin for monitoring & debugging purposes. Refer to singa_auto.model.LoggerUtils for more details.

Dataset Loading

utils.dataset in the singa_auto.model module provides a simple set of in-built dataset loading methods. Refer to singa_auto.model.DatasetUtils for more details.

Defining Hyperparameter Search Space

Refer to How Model Tuning Works for the specifics of how you can tune your models on SINGA-Auto.

Sample Models

To illustrate how to write models for SINGA-Auto, we have written the following:

Example: Testing Models for IMAGE_CLASSIFICATION
  1. Download & pre-process the original Fashion MNIST dataset to the dataset format specified by IMAGE_CLASSIFICATION:

    python examples/datasets/image_files/load_fashion_mnist.py
    
  2. Install the Python dependencies for the sample models:

    pip install scikit-learn==0.20.0
    pip install tensorflow==1.12.0
    
  3. Test the sample models in ./examples/models/image_classification:

    python examples/models/image_classification/SkDt.py
    python examples/models/image_classification/TfFeedForward.py
    
Example: Testing Models for POS_TAGGING
  1. Download & pre-process the subsample of the Penn Treebank dataset to the dataset format specified by POS_TAGGING:

    python examples/datasets/corpus/load_sample_ptb.py
    
  2. Install the Python dependencies for the sample models:

    pip install torch==0.4.1
    
  3. Test the sample models in ./examples/models/pos_tagging:

    python examples/models/pos_tagging/BigramHmm.py
    python examples/models/pos_tagging/PyBiLstm.py
    
Configuring the Model’s Environment

Your model will be run in Python 3.6 with the following Python libraries pre-installed:

requests==2.20.0
numpy==1.14.5
Pillow==7.1.0

Additionally, you’ll specify a list of Python dependencies to be installed for your model, prior to model training and inference. This is configurable with the dependencies option during model creation. These dependencies will be lazily installed on top of the worker’s Docker image before your model’s code is executed. If the model is to be run on GPU, SINGA-Auto would map dependencies to their GPU-supported versions, if supported. For example, { 'tensorflow': '1.12.0' } will be installed as { 'tensorflow-gpu': '1.12.0' }. SINGA-Auto could also parse specific dependency names to install certain non-PyPI packages. For example, { 'singa': '1.1.1' } will be installed as singa-cpu=1.1.1 or singa-gpu=1.1.1 using conda.

Refer to the list of officially supported dependencies below. For dependencies that are not listed, they will be installed as PyPI packages of the specified name and version.

Dependency      Installation Command
tensorflow      pip install tensorflow==${ver} or pip install tensorflow-gpu==${ver}
singa           conda install -c nusdbsystem singa-cpu=${ver} or conda install -c nusdbsystem singa-gpu=${ver}
Keras           pip install Keras==${ver}
scikit-learn    pip install scikit-learn==${ver}
torch           pip install torch==${ver}

Alternatively, you can build a custom Docker image that extends rafikiai/rafiki_worker, installing the required dependencies for your model. This is configurable with the docker_image option during model creation.

See also

singa_auto.client.Client.create_model()

Your model should be GPU-sensitive based on the environment variable CUDA_AVAILABLE_DEVICES. If CUDA_AVAILABLE_DEVICES is set to -1, your model should simply run on CPU. You can assume that your model has exclusive access to the GPUs listed in CUDA_AVAILABLE_DEVICES.

How Model Tuning Works

Traditionally, getting the best-performing model on a dataset involves tedious manual hyperparameter tuning. On SINGA-Auto, model hyperparameter tuning is automated by conducting multiple trials in a train job.

Over the trials, the model is initialized with different hyperparameters (knobs), trained and evaluated. A hyperparameter tuning advisor on SINGA-Auto ingests the validation scores from these trials to suggest better hyperparameters for future trials, to maximise performance of a model on the dataset. At the very end of the train job, SINGA-Auto could deploy the best-scoring trials for predictions.

Defining Hyperparameter Search Space

You’ll define a search space of hyperparameters (knob configuration) in a declarative manner with the static method singa_auto.model.BaseModel.get_knob_config(). The method should return a mapping of hyperparameter names (knob names) to hyperparameter specifications (knob specifications). A hyperparameter specification is an instance of a class that extends singa_auto.model.BaseKnob, which is limited to any of the following:

  • singa_auto.model.FixedKnob
  • singa_auto.model.CategoricalKnob
  • singa_auto.model.FloatKnob
  • singa_auto.model.IntegerKnob
  • singa_auto.model.PolicyKnob
  • singa_auto.model.ArchKnob

Refer to their documentation for more details on each type of knob specification, and refer to Sample Models to see examples of how knob configurations are declared.

Model Policies

singa_auto.model.PolicyKnob is a special type of knob specification that allows SINGA-Auto to configure the behaviour of a model on a trial basis.

In a modern model hyperparameter tuning scheme, a model tends to switch between different “modes”, which we call policies. For example, when you tune your model manually, you might want the model to do early-stopping for the first e.g. 100 trials, then conduct a final trial for a full e.g. 300 epochs. As such, the concept of model policies enables SINGA-Auto's tuning advisor to externally configure your model to switch between these “modes”.

Your model communicates to SINGA-Auto which policies it supports by adding PolicyKnob(policy_name) to your model’s knob_configuration. On the other hand, during training, SINGA-Auto configures the activation of the model’s policies on a trial basis by realising the values of PolicyKnob to either True (activated) or False (not activated).

For example, if SINGA-Auto's tuning scheme for your model requires it to engage in early-stopping for all trials except the final trial, and your model has { 'early_stop': PolicyKnob('EARLY_STOP'), ... }, SINGA-Auto will pass early_stop=False for just the final trial as part of its knobs, and pass early_stop=True for all other trials. Your model would situationally do early-stopping based on the value of the knob early_stop.

Below is the list of officially recognized model policies:

Policy Description
SHARE_PARAMS Whether model should load the shared parameters passed in train()
EARLY_STOP Whether model should stop training early in train(), e.g. with use of early stopping or reduced no. of epochs
SKIP_TRAIN Whether model should skip training its parameters
QUICK_EVAL Whether model should stop evaluation early in evaluate(), e.g. by evaluating on only a subset of their validation dataset
DOWNSCALE Whether a smaller version of the model should be constructed e.g. with fewer layers
Model Tuning Schemes

At a model level, SINGA-Auto automatically selects the appropriate tuning scheme (advisor) based on the composition of the model’s knob configuration and the incoming train job’s budget.

Specifically, it employs the following rules, in the given order, to select the type of advisor to use:

Rule: Only PolicyKnob, FixedKnob
Tuning Scheme: Only conduct a single trial.

Rule: Only PolicyKnob, FixedKnob, FloatKnob, IntegerKnob, CategoricalKnob, with policy SHARE_PARAMS
Tuning Scheme: Hyperparameter tuning with Bayesian Optimization & cross-trial parameter sharing. Shares globally best-scoring parameters across workers in an epsilon-greedy manner. Optionally employs early stopping (EARLY_STOP policy) for all trials.

Rule: Only PolicyKnob, FixedKnob, FloatKnob, IntegerKnob, CategoricalKnob
Tuning Scheme: Hyperparameter tuning with Bayesian Optimization. Optionally employs early stopping (EARLY_STOP policy) before the last 1h, and performs standard trials during the last 1h.

Rule: Only PolicyKnob, FixedKnob, ArchKnob, with policies SHARE_PARAMS, EARLY_STOP, SKIP_TRAIN, QUICK_EVAL, DOWNSCALE, and a TIME_HOURS budget >= 12h
Tuning Scheme: Architecture tuning with cell-based ENAS. Conducts ENAS architecture search before the last 12h, then performs the final training of the best architectures found in the last 12h.

Rule: All others
Tuning Scheme: Hyperparameter tuning with uniformly random knobs.

The following subsections briefly explain how to leverage on the various model tuning schemes on SINGA-Auto.

Hyperparameter Tuning with Bayesian Optimization

To tune the hyperparameters of your model, where the hyperparameters are simply floats, integers or categorical, use singa_auto.model.FixedKnob, singa_auto.model.CategoricalKnob, singa_auto.model.FloatKnob & singa_auto.model.IntegerKnob.

Hyperparameter Tuning with Bayesian Optimization & Early Stopping

To additionally employ early stopping during hyperparameter tuning to speed up the tuning process, declare an extra singa_auto.model.PolicyKnob of the EARLY_STOP policy (see Model Policies).

Refer to the sample model ./examples/models/image_classification/TfFeedForward.py.

Hyperparameter Tuning with Bayesian Optimization & Parameter Sharing

To additionally have best-scoring model parameters shared between trials to speed up the tuning process (as outlined in “SINGA-Auto: Machine Learning as an Analytics Service System”), declare an extra singa_auto.model.PolicyKnob of the SHARE_PARAMS policy (see Model Policies).

Refer to the sample model ./examples/models/image_classification/PyDenseNetBc.py and its corresponding usage script ./examples/scripts/image_classification/train_densenet.py to better understand how to do parameter sharing.

Architecture Tuning with ENAS

To tune the architecture for your model with the modern architecture search algorithm “Efficient Neural Architecture Search via Parameter Sharing” (ENAS), declare a singa_auto.model.ArchKnob and offer the policies SHARE_PARAMS, EARLY_STOP, SKIP_TRAIN, QUICK_EVAL and DOWNSCALE (see Model Policies). Specifically, you’ll need your model to support parameter sharing, stopping training early, skipping the training step, evaluating on a subset of the validation dataset, and downscaling the model e.g. to use fewer layers. These policies are critical in the speed & performance of ENAS. See Deep Dive on ENAS to understand more about SINGA-Auto’s implementation of ENAS.

Refer to the sample model ./examples/models/image_classification/TfEnas.py and its corresponding usage script ./examples/scripts/image_classification/run_enas.py to better understand how to do architecture tuning.

Deep Dive on ENAS

The ENAS paper outlines a new methodology for automatic neural network construction, speeding up the original Neural Architecture Search (NAS) methodology by 1000x without affecting its ability to search for a competitive architecture. The authors made the crucial observation that 2 different architectures would share a common subgraph, and the model parameters in that subgraph could be reused across trials without having to re-train these parameters from scratch every trial.

The following is an overview of how ENAS works. As explained in the ENAS paper, during an ENAS search for the best CNN architecture (ENAS Search), there is an alternation between 2 phases: the training of the ENAS CNN's shared parameters (CNN Train Phase), and the training of the ENAS controller (Controller Train Phase). While CNN parameters are carried over between the phases, the CNN's shared parameters are not trained during Controller Train Phases. After ENAS Search is done, there is a final training of the best CNN architecture found (ENAS Train), this time initializing its CNN parameters from scratch.

On SINGA-Auto, we've replicated the Cell-Based ENAS controller for image classification as one of SINGA-Auto's tuning schemes, together with a SINGA-Auto model TfEnas, with very close reference to the authors' code. In this specific setup for ENAS, ENAS Search is done with the construction of a single supergraph of all possible architectures, while ENAS Train is done with the construction of a fixed graph of the best architecture (with slight architectural differences from ENAS Search). Each CNN Train Phase involves training the CNN for 1 epoch, while within each Controller Train Phase, the controller is trained for 30 steps. In each controller step, 10 architectures are sampled from the controller, evaluated on the ENAS CNN by dynamically changing its architecture, and losses based on validation accuracies are back-propagated in the controller to update the controller's parameters. Each validation accuracy is computed on only a batch of the validation dataset. The alternation between CNN Train Phase and Controller Train Phase happens for X cycles during ENAS Search; then, during ENAS Train, the architecture samples with the highest validation accuracies, this time computed on the full validation dataset, are trained from scratch to arrive at the final best models.

We’ve generalized the ENAS controller, its architecture encoding scheme and its overall tuning scheme on SINGA-Auto, such that SINGA-Auto models can leverage on architecture tuning with a flexible architecture encoding, and SINGA-Auto’s application developers can train with these models in an end-to-end manner.

We’ve also devised a simple, yet effective strategy to run ENAS in a distributed setting. When given multiple GPUs, SINGA-Auto performs ENAS locally at each worker in a train job, with these workers sharing a central ENAS controller.

Developer Guide

Setup & Configuration

Quick Setup

We assume development or deployment in a MacOS or Linux environment.

As a User:

  1. Install Docker 18 (Ubuntu, MacOS) and, if required, add your user to docker group (Linux).

Note

If you’re not a user in the docker group, you’ll instead need sudo access and prefix every bash command with sudo -E.

  2. Install Kubernetes 1.15+ (see Installing Kubernetes) if using Kubernetes.
  3. Install Python 3.6 such that the python and pip commands point to the correct installation of Python 3.6 (see Installing Python).
  4. pip install singa-auto==0.3.4
  5. Start the service using sago, stop the service using sastop, and clean the service using saclean.

As a Developer:

  1. Install Docker 18 (Ubuntu, MacOS) and, if required, add your user to docker group (Linux).

Note

If you’re not a user in the docker group, you’ll instead need sudo access and prefix every bash command with sudo -E.

  2. Install Kubernetes 1.15+ (see Installing Kubernetes) if using Kubernetes.

  3. Install Python 3.6 such that the python and pip commands point to the correct installation of Python 3.6 (see Installing Python).

  4. Clone the project at https://github.com/nusdbsystem/singa-auto (e.g. with Git)

    In file web/src/HTTPconfig.js, there are parameters specifying backend server and port that Web UI interacts with. Developers have to modify the following values to conform with their server setting:

    const adminHost = '127.0.0.1' // Singa-Auto server address, in str format
    const adminPort = '3000'      // Singa-Auto server port, in str format
    
    const LocalGateways = {...
      // NOTE: must append '/' at the end!
      singa_auto: "http://127.0.0.1:3000/", // http://<ServerAddress>:<Port>/, in str format
    }
    
    HTTPconfig.adminHost = `127.0.0.1`  // Singa-Auto server address, in str format
    HTTPconfig.adminPort = `3000`       // Singa-Auto server port, in str format
    

    Using 127.0.0.1 as the Singa-Auto server address means Singa-Auto will be deployed on your local machine.

  5. If using Docker, set up SINGA-Auto's complete stack with the setup script:

    bash scripts/docker_swarm/start.sh
    

    If using Kubernetes, set up SINGA-Auto's complete stack with the setup script:

    bash scripts/kubernetes/start.sh
    

SINGA-Auto Admin and SINGA-Auto Web Admin will be available at 127.0.0.1:3000 and 127.0.0.1:3001 respectively, or the server specified as ‘IP_ADRESS’ in scripts/docker_swarm/.env.sh or scripts/kubernetes/.env.sh.

If using Docker, to destroy SINGA-Auto's complete stack:

bash scripts/docker_swarm/stop.sh

If using Kubernetes, to destroy SINGA-Auto's complete stack:

bash scripts/kubernetes/stop.sh
Updating docker images
bash scripts/kubernetes/build_images.sh

or

bash scripts/docker_swarm/build_images.sh
bash scripts/push_images.sh

By default, you can read logs of SINGA-Auto Admin & any of SINGA-Auto’s workers in ./logs directory at the root of the project’s directory of the master node.

Scaling SINGA-Auto

SINGA-Auto’s default setup runs on a single machine and only runs its workloads on CPUs.

SINGA-Auto's model training workers run in Docker containers that extend the Docker image nvidia/cuda:9.0-runtime-ubuntu16.04, and are capable of leveraging on CUDA-capable GPUs.

Scaling SINGA-Auto horizontally and enabling GPU usage involves setting up a Network File System (NFS) at a common path across all nodes, and installing & configuring the default Docker runtime to nvidia for each GPU-bearing node. If using Docker Swarm, all these nodes should then be put into a single Docker Swarm; if using Kubernetes, all these nodes should be joined to the Kubernetes cluster.

To run SINGA-Auto on multiple machines with GPUs on Docker Swarm, do the following:

  1. If SINGA-Auto is running, stop SINGA-Auto with

    bash scripts/docker_swarm/stop.sh
    
  2. Have all nodes leave any Docker Swarm they are in

  3. Set up NFS such that the master node is an NFS host, the other nodes are NFS clients, and the master node shares an ancestor directory containing SINGA-Auto's project directory. Here are instructions for Ubuntu

  4. All nodes should be in a common network. On the master node, change DOCKER_SWARM_ADVERTISE_ADDR in the project’s .env.sh to the IP address of the master node in the network that your nodes are in

  5. For each node (including the master node), ensure the firewall rules allow TCP & UDP traffic on ports 2377, 7946 and 4789

  6. For each node that has GPUs:

    6.1. Install NVIDIA drivers for CUDA 9.0 or above

    6.2. Install nvidia-docker2

    6.3. Set the default-runtime of Docker to nvidia (e.g. instructions here)

  7. On the master node, start SINGA-Auto with

    bash scripts/docker_swarm/start.sh
    
  8. For each worker node, have the node join the master node’s Docker Swarm

  9. On the master node, for each node (including the master node), configure it with the script:

    bash scripts/docker_swarm/setup_node.sh
    

To run SINGA-Auto on multiple machines with GPUs on Kubernetes, do the following:

  1. If SINGA-Auto is running, stop SINGA-Auto with

    bash scripts/kubernetes/stop.sh
    
  2. Join all the nodes you need to the Kubernetes cluster; refer to kubeadm join

  3. Set up NFS such that the master node is an NFS host, the other nodes are NFS clients, and the master node shares an ancestor directory containing SINGA-Auto's project directory. Here are instructions for Ubuntu

  4. Change KUBERNETES_ADVERTISE_ADDR in the project’s scripts/kubernetes/.env.sh to the IP address of the master node in the network that your nodes are in

  5. For each node that has GPUs:

    5.1. Install NVIDIA drivers for CUDA 9.0 or above

    5.2. Install nvidia-docker2

    5.3. Set the default-runtime of Docker to nvidia (e.g. instructions here)

    5.4. Install the nvidia-device-plugin by running the following command on the master node:

      kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.10/nvidia-device-plugin.yml

  6. On the master node, start SINGA-Auto with:

    bash scripts/kubernetes/start.sh
Exposing SINGA-Auto Publicly

SINGA-Auto Admin and SINGA-Auto Web Admin run on the master node. If using Docker Swarm, change SINGA_AUTO_ADDR in .env.sh to the IP address of the master node in the network you intend to expose SINGA-Auto in. If using Kubernetes, change SINGA_AUTO_ADDR in scripts/kubernetes/.env.sh to the IP address of the master node in the network you intend to expose SINGA-Auto in.

Example:

export SINGA_AUTO_ADDR=172.28.176.35

Re-deploy SINGA-Auto with step 5 of the Quick Setup, changing the Singa-Auto server address to conform. SINGA-Auto Admin and SINGA-Auto Web Admin will be available at that IP address, over ports 3000 and 3001 (by default), assuming incoming connections to these ports are allowed.

Before you expose SINGA-Auto to the public, it is highly recommended to change the master passwords for the superadmin, server and database (located in .env.sh as SUPERADMIN_PASSWORD, APP_SECRET & POSTGRES_PASSWORD).

Reading SINGA-Auto’s logs

By default, you can read logs of SINGA-Auto Admin & any of SINGA-Auto’s workers in ./logs directory at the root of the project’s directory of the master node.

Troubleshooting

Q: There seems to be connectivity issues amongst containers across nodes!

A: Ensure that containers are able to communicate with one another through the Docker Swarm overlay network

Development

Before running any individual scripts, make sure to run the shell configuration script:

source scripts/docker_swarm/.env.sh

In .env.sh, the default server is fixed by IP_ADRESS=127.0.0.1, which means that Singa-Auto will use the local machine as the server. HOST_WORKDIR_PATH defaults to the current directory, and SINGA_AUTO_VERSION is set to dev for development mode; otherwise, a specific version should be given.

Refer to SINGA-Auto’s Architecture and Folder Structure for a developer’s overview of SINGA-Auto.

Testing Latest Code Changes

To test the latest code changes e.g. in the dev branch, you'll need to do the following:

  1. Build SINGA-Auto's images on each participating node (the quickstart instructions pull pre-built SINGA-Auto images from Docker Hub):

    bash scripts/docker_swarm/build_images.sh
    
  2. Purge all of SINGA-Auto's data (since there might be database schema changes):

    bash scripts/clean.sh
Making a Release to master

In general, before making a release to master from dev, ensure that the code at dev is stable & well-tested:

  1. Consider running all of SINGA-Auto’s tests (see Running SINGA-Auto’s Tests). Remember to re-build the Docker images to ensure the latest code changes are reflected (see Testing Latest Code Changes)
  2. Consider running all of SINGA-Auto’s example models in ./examples/models/
  3. Consider running all of SINGA-Auto’s example usage scripts in ./examples/scripts/
  4. Consider running all of SINGA-Auto’s example dataset-preparation scripts in ./examples/datasets/
  5. Consider visiting SINGA-Auto Web Admin and manually testing it
  6. Consider building SINGA-Auto’s documentation site and checking if the documentation matches the codebase (see Building SINGA-Auto’s Documentation)

After merging dev into master, do the following:

  1. Build & push SINGA-Auto’s new Docker images to SINGA-Auto’s own Docker Hub account:

    bash scripts/docker_swarm/build_images.sh
    bash scripts/push_images.sh
    

    Get Docker Hub credentials from @nginyc.

  2. Build & deploy SINGA-Auto’s new documentation to SINGA-Auto's microsite powered by Github Pages. Run the following:

    bash scripts/docker_swarm/build_docs.sh latest
    

    Finally, commit all resultant generated documentation changes and push them to dev branch. The latest documentation should be reflected at https://singa-auto.readthedocs.io/en/latest/.

    Refer to the documentation on Github Pages (https://guides.github.com/features/pages/) to understand more about how this works.

  3. Draft a new Singa-Auto Github release. Make sure to include the list of changes relative to the previous release.

Subsequently, you’ll need to increase SINGA_AUTO_VERSION in .env.sh to reflect a new release.

Managing SINGA-Auto’s DB

By default, you can connect to the PostgreSQL DB using a PostgreSQL client (e.g Postico) with these credentials:

SINGA_AUTO_ADDR=127.0.0.1
POSTGRES_EXT_PORT=5433
POSTGRES_USER=singa_auto
POSTGRES_DB=singa_auto
POSTGRES_PASSWORD=singa_auto
Connecting to SINGA-Auto’s Redis

You can connect to Redis DB with rebrow:

bash scripts/docker_swarm/test/start_rebrow.sh

…with these credentials by default:

SINGA_AUTO_ADDR=127.0.0.1
REDIS_EXT_PORT=6380
Pushing Images to Docker Hub

To push the SINGA-Auto’s latest images to Docker Hub (e.g. to reflect the latest code changes):

bash scripts/push_images.sh
Building SINGA-Auto’s Documentation

SINGA-Auto uses Sphinx documentation and hosts the documentation with Github Pages on the dev branch. Build & view SINGA-Auto’s Sphinx documentation on your machine with the following commands:

bash scripts/docker_swarm/build_docs.sh latest
open docs/index.html
Running SINGA-Auto’s Tests

SINGA-Auto uses pytest.

First, start SINGA-Auto.

Then, run all integration tests with:

pip install -r singa_auto/requirements.txt
pip install -r singa_auto/advisor/requirements.txt
bash scripts/docker_swarm/test/test.sh
Troubleshooting

While building SINGA-Auto’s images locally, if you encounter errors like “No space left on device”, you might be running out of space allocated for Docker. Try one of the following:

# Remove all unused containers, networks & images
docker system prune --all
# Delete all containers
docker rm $(docker ps -a -q)
# Delete all images
docker rmi $(docker images -q)

From macOS Mojave onwards, due to macOS's new privacy protection features, you might need to explicitly give Docker Full Disk Access, restart Docker, or even do a factory reset of Docker.

Using SINGA-Auto Admin’s HTTP interface

To make calls to the HTTP endpoints of SINGA-Auto Admin, you'll first need to authenticate with email & password against the POST /tokens endpoint to obtain an authentication token, and subsequently add the Authorization header for every other call:

Authorization: Bearer {{token}}

Users of SINGA-Auto

[Figure: Users of SINGA-Auto (system context diagram)]

There are 4 types of users on SINGA-Auto:

Application Developers create, manage, monitor and stop model training and serving jobs on SINGA-Auto. They are the primary users of SINGA-Auto - they upload their datasets onto SINGA-Auto and create model training jobs that train on these datasets. After model training, they trigger the deployment of these trained ML models as a web service that Application Users interact with. While their model training and serving jobs are running, they administer these jobs and monitor their progress.

Application Users send queries to trained models exposed as a web service on SINGA-Auto, receiving predictions back. Not to be confused with Application Developers, these users may be developers that are looking to conveniently integrate ML predictions into their mobile, web or desktop applications. These application users have consumer-provider relationships with the aforementioned ML application developers, having delegated the work of training and deploying ML models to them.

Model Developers create, update and delete model templates to form SINGA-Auto’s dynamic repository of ML model templates. These users are key external contributors to SINGA-Auto, and represent the main source of up-to-date ML expertise on SINGA-Auto, playing a crucial role in consistently expanding and diversifying SINGA-Auto’s underlying set of ML model templates for a variety of ML tasks. Coupled with SINGA-Auto’s modern ML model tuning framework on SINGA-Auto, these contributions heavily dictate the ML performance that SINGA-Auto provides to Application Developers.

SINGA-Auto Admins create, update and remove users on SINGA-Auto. They regulate access of the other types of users to a running instance of SINGA-Auto.

SINGA-Auto’s Architecture

SINGA-Auto’s system architecture consists of 3 static components, 2 central databases, 4 types of dynamic components, and 1 client-side SDK, which can be illustrated with a 3-layer architecture diagram.

[Figure: Architecture of SINGA-Auto (container diagram)]

Static Stack of SINGA-Auto

SINGA-Auto’s static stack consists of the following:

SINGA-Auto Admin (Python/Flask) is the centrepiece of SINGA-Auto. It is a multi-threaded HTTP server which presents a unified REST API over HTTP that fully administrates the SINGA-Auto instance. When users send requests to SINGA-Auto Admin, it handles these requests by accordingly modifying SINGA-Auto’s Metadata Store or deploying/stopping the dynamic components of SINGA-Auto’s stack (i.e. workers for model training & serving).

SINGA-Auto Metadata Store (PostgreSQL) is SINGA-Auto’s centralized, persistent database for user metadata, job metadata, worker metadata and model templates.

SINGA-Auto Redis (Redis) is SINGA-Auto’s temporary in-memory store for the implementation of fast asynchronous cross-worker communication, in a way that decouples senders from receivers. It synchronizes the back-and-forth of queries & predictions between multiple SINGA-Auto Inference Workers and a single SINGA-Auto Predictor for an Inference Job.

SINGA-Auto Web Admin (NodeJS/ExpressJS) is an HTTP server that serves SINGA-Auto's web front-end to users, allowing Application Developers to survey their jobs on a friendly web GUI.

SINGA-Auto Client (Python) is SINGA-Auto’s client-side Python SDK to simplify communication with Admin.

Dynamic Stack of SINGA-Auto

On the other hand, SINGA-Auto’s dynamic stack consists of a dynamic pool of workers. Internally within SINGA-Auto’s architecture, Admin adopts master-slave relationships with these workers, managing the deployment and termination of these workers in real-time depending on Train Job and Inference Job requests, as well as the stream of events it receives from its workers. When a worker is deployed, it is configured with the identifier for an associated job, and once it starts running, it would first initialize itself by pulling the job’s metadata from Metadata Store before starting on its task.

The types of workers are as follows:

SINGA-Auto Advisor Workers (Python) propose knobs & training configurations for Train Workers. For each model, there is a single Advisor Worker centrally orchestrating the tuning of the model together with multiple Train Workers.

SINGA-Auto Train Workers (Python) train models for Train Jobs by conducting Trials.

SINGA-Auto Predictors (Python/Flask) are multi-threaded HTTP servers that receive queries from Application Users and respond with predictions as part of an Inference Job. They do this through producer-consumer relationships with multiple SINGA-Auto Inference Workers. If necessary, they perform model ensembling on predictions received from different workers.

SINGA-Auto Inference Workers (Python) serve models for Inference Jobs. In a single Inference Job, there could be multiple Inference Workers concurrently making predictions for a single batch of queries.

Container Orchestration Strategy

All of SINGA-Auto's components' environments and configurations have been fully specified as replicable, portable Docker images, publicly available as Dockerfiles and on SINGA-Auto's own Docker Hub account.

When an instance of SINGA-Auto is deployed on the master node, a Docker Swarm is initialized and all of SINGA-Auto’s components run within a single Docker routing-mesh overlay network. Subsequently, SINGA-Auto can be horizontally scaled by adding more worker nodes to the Docker Swarm. Dynamically-deployed workers run as Docker Swarm Services and are placed in a resource-aware manner.

Distributed File System Strategy

All components depend on a shared file system across multiple nodes, powered by Network File System (NFS). Each component written in Python continually writes logs to this shared file system.

Folder Structure

  • singa_auto/

    SINGA-Auto’s Python package

    • admin/

      SINGA-Auto’s static Admin component

    • advisor/

      SINGA-Auto’s advisors

    • client/

      SINGA-Auto’s client-side SDK

      See also

      singa_auto.client

    • worker/

      SINGA-Auto’s train, inference & advisor workers

    • predictor/

      SINGA-Auto’s predictor

    • meta_store/

      Abstract data access layer for singa_auto’s main metadata store (backed by PostgreSQL)

    • param_store/

      Abstract data access layer for SINGA-Auto’s store of model parameters (backed by filesystem)

    • data_store/

      Abstract data access layer for SINGA-Auto’s store of datasets (backed by filesystem)

    • cache/

      Abstract data access layer for SINGA-Auto’s temporary store of model parameters, train job metadata and queries & predictions in train & inference jobs (backed by Redis)

    • container/

      Abstract access layer for dynamic deployment of workers

    • utils/

      Collection of SINGA-Auto-internal utility methods (e.g. for logging, authentication)

    • model/

      Definition of abstract singa_auto.model.BaseModel that all SINGA-Auto models should extend, programming abstractions used in model development, as well as a collection of utility methods for model developers in the implementation of their own models

    • constants.py

      SINGA-Auto’s programming abstractions & constants (e.g. valid values for user types, job statuses)

  • web/

    SINGA-Auto’s Web Admin component

  • dockerfiles/

    Stores Dockerfiles for customized components of SINGA-Auto

  • examples/

    Sample usage code for SINGA-Auto, such as standard models, dataset downloading and processing code, sample image/question data, and quick test code

  • docs/

    Source documentation for SINGA-Auto (e.g. Sphinx documentation files)

  • test/

    Test code for SINGA-Auto

  • scripts/

    Shell & python scripts for initializing, starting and stopping various components of SINGA-Auto’s stack

    • docker_swarm/

      Containing server environment settings and scripts for running Docker

    • kubernetes/

      Containing server environment settings and scripts for running Kubernetes

    • .base_env.sh

      Stores configuration variables for SINGA-Auto

  • log_minitor/

    Dockerfile and configurations for Elasticsearch and Logstash

  • singa_auto_scheduler/

    Dockerfiles and configurations for scheduler and monitor

Python Documentation

singa_auto.client

singa_auto.model

Core
Knobs
Utility

singa_auto.constants

Acknowledgements

The research is supported by the National Research Foundation, Prime Minister’s Office, Singapore under its National Cybersecurity R&D Programme (Grant No. NRF2016NCR-NCR002-020), National Natural Science Foundation of China (No. 61832001), National Key Research and Development Program of China (No. 2017YFB1201001), China Thousand Talents Program for Young Professionals (3070011 181811).