All About Scripts
What is a Script?
Scripts are specific elements that are part of a LOST annotation pipeline. A script element is implemented as a python3 module. The listing below shows an example of such a script. This script will request image annotations for all images of a dataset. You can find the script here.
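The following is a minimal sketch in the spirit of that script (the class name is illustrative and details of the original script may differ; the pyapi calls it uses are explained in the sections below):

from lost.pyapi import script
import os

class RequestAnnos(script.Script):
    '''Request image annotations for all images of the connected datasources.'''
    def main(self):
        # Iterate over all Datasource elements connected to this script
        for ds in self.inp.datasources:
            for img_file in os.listdir(ds.path):
                img_path = os.path.join(ds.path, img_file)
                # Send an annotation request for this image to the
                # connected annotation task(s)
                self.outp.request_annos(img_path)

if __name__ == "__main__":
    RequestAnnos()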
In order to implement a script you need to create a python class that
inherits from lost.pyapi.script.Script. Your class needs to implement a main method
and needs to be instantiated within your python script. The
listing below shows a minimal example of a script.
from lost.pyapi import script

class MyScript(script.Script):
    def main(self):
        self.logger.info('Hello World!')

if __name__ == "__main__":
    MyScript()
Example Scripts
More script examples can be found here:
lost/backend/lost/pyapi/examples/pipes
The LOST PyAPI Script Model
Like all pipeline elements, a script has an input and an output object. Via these objects, it is connected to other elements in a pipeline (see also here).
Inside a script you can exchange information with the connected elements
by using the self.inp object and the
self.outp object.
Reading Imagesets
It is a common pattern to read a path to an imageset from a
Datasource element in your annotation pipeline. See
the listing below for
a code example. Since multiple Datasources could be connected to our
script, we iterate over all connected Datasources of the input with
self.inp.datasources. For each Datasource
element we can read the
path attribute to get the filesystem path to a folder with images.
from lost.pyapi import script
import os

class MyScript(script.Script):
    def main(self):
        # Iterate over all Datasource elements connected to the script input
        for ds in self.inp.datasources:
            for img_file in os.listdir(ds.path):
                # Build the full path to each image in the datasource folder
                img_path = os.path.join(ds.path, img_file)

if __name__ == "__main__":
    MyScript()
Requesting Annotations
The most important feature of the LOST PyAPI is the ability to request
annotations for a connected AnnotationTask element. Inside a
Script you can access the output element and call the
self.outp.request_annos method (see the listing below).
self.outp.request_annos(img_path)
Sometimes you also want to send annotation proposals to an AnnotationTask in order to support your annotator. In most cases these proposals will be generated by an AI, like an object detector. The listing below shows a simple example to send a dummy box and a dummy point to an annotation tool.
self.outp.request_annos(img_path,
                        annos=[[0.1, 0.1, 0.2, 0.2], [0.1, 0.2]],
                        anno_types=['bbox', 'point'])
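In practice the proposals would come from a model instead of being hard coded. The following is a hedged sketch, where detect_objects is a hypothetical helper that returns boxes as four relative coordinates in the same format as the dummy box above:

from lost.pyapi import script
import os

def detect_objects(img_path):
    '''Hypothetical detector stub.

    A real implementation would run an object detection model and return
    a list of boxes, each given as four relative coordinates (the same
    format as the dummy box in the listing above).
    '''
    return [[0.1, 0.1, 0.2, 0.2]]

class ProposalScript(script.Script):
    def main(self):
        for ds in self.inp.datasources:
            for img_file in os.listdir(ds.path):
                img_path = os.path.join(ds.path, img_file)
                boxes = detect_objects(img_path)
                # Request annotations together with the generated proposals
                self.outp.request_annos(img_path,
                                        annos=boxes,
                                        anno_types=['bbox'] * len(boxes))

if __name__ == "__main__":
    ProposalScript()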
Annotation Broadcasting
If multiple AnnoTask elements are connected to your
ScriptOutput and you call
self.outp.request_annos,
the annotation request will be broadcast to all connected AnnoTasks. So each AnnoTask will get its own copy of
your annotation request. Technically, for each annotation request an
empty ImageAnno will be created for each AnnoTask. During the
annotation process this
ImageAnno
will be filled with information.
Reading Annotations
Another important task is to read annotations from previous pipeline elements. In most cases these will be AnnoTask elements.
If you would like to read all annotations at the
script input in a vectorized way, you can use
self.inp.to_df()
to get a pandas DataFrame
or self.inp.to_vec() to get a list of lists.
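As a small sketch of the vectorized variant inside the main method of a script (only the logging of the DataFrame size is added here for illustration):

# Read all annotations at the script input into a pandas DataFrame
df = self.inp.to_df()
self.logger.info('Received {} annotation rows'.format(len(df)))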
If you prefer to iterate over all
ImageAnnos you can use the respective iterator
self.inp.img_annos. See the
listing below for
an example.
for img_anno in self.inp.img_annos:
    for twod_anno in img_anno.twod_annos:
        self.logger.info('image path: {}, 2d_anno_data: {}'.format(img_anno.img_path, twod_anno.data))
Contexts to Store Files
There are three different contexts that can be used to store files that should be handled by your script. Each context is modeled as a specific folder in the LOST filesystem. In order to get the path to a context, call self.get_path.
The listing below shows an application of self.get_path in order to get the path to the instance context. The full example script can be found at lost/backend/lost/pyapi/examples/pipes/sia/export_csv.py.
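A minimal sketch of this pattern, assuming the script wants to write a CSV export into its instance context (the file name export.csv and the DataFrame export are illustrative):

# Get a path inside the instance context of this script instance
csv_path = self.get_path('export.csv', context='instance')
# Write all incoming annotations to that path
self.inp.to_df().to_csv(csv_path)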
There are three types of contexts that can be accessed: instance, pipe, and static.
The instance context is only accessible by the current instance of your script. Each time a pipeline is started, each script gets its own instance folder in the LOST filesystem. No other script in the same pipeline has access to this folder.
If you'd like to exchange files among the script instances of a started
pipeline, you can choose the pipe context. When calling
self.get_path
with context = 'pipe' you will get a path to a
folder that is available to all script instances of a pipeline instance.
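For example (the file name shared_stats.json is just an illustrative assumption):

# Path to a file in the pipe context, shared by all scripts of this pipeline instance
shared_path = self.get_path('shared_stats.json', context='pipe')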
The static context is a path to the pipeline project folder, to which all script instances have access. In this way you can access files that you have provided inside the pipeline project. For example, if you'd like to load a pretrained machine learning model inside of your script, you can put it into the pipeline project folder and access it via the static context:
path_to_model = self.get_path('pretrained_model.md5', context='static')
Logging
Each Script has its own logger.
This logger is an instance of the standard python
logger. The
example below shows
how to log an info message, a warning and an error. All logs are
redirected to a pipeline log file that can be downloaded via the
pipeline view inside the web gui.
self.logger.info('I am an info message')
self.logger.warning('I am a warning')
self.logger.error('An error occurred!')
Script Errors and Exceptions
If an error occurs in your script, the traceback of the exception will be visible in the web gui, when clicking on the respective script in your pipeline. The error will also be automatically logged to the pipeline log file.
Script ARGUMENTS
The ARGUMENTS variable is used to provide script arguments that can be set during the start of a pipeline within the web gui. ARGUMENTS are defined as a dictionary of dictionaries. Each argument dictionary has the keys value and help. As you can see in the listing below, the first argument is called my_arg. Its value is 'true' and its help text is 'A boolean argument.'.
ARGUMENTS = {'my_arg': {'value': 'true',
                        'help': 'A boolean argument.'}
            }
Within your script you can access the value of an argument with the get_arg(...) method as shown below.
if self.get_arg('my_arg').lower() == 'true':
    self.logger.info('my_arg was true')
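Since the value in the listing above is a string, other argument types typically need an explicit conversion inside the script. A small sketch, assuming a hypothetical my_threshold argument whose value was defined in ARGUMENTS as a numeric string:

# Convert the string value of a (hypothetical) numeric argument
threshold = float(self.get_arg('my_threshold'))
if threshold > 0.5:
    self.logger.info('Using a strict threshold: {}'.format(threshold))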
Script ENVS
The ENVS variable provides meta information for the pipeline engine by defining a list of environments (similar to conda environments) in which this script may be executed. In this way you can assure that a script will only be executed in environments where all of its dependencies are installed. The environments are installed in workers that may execute your script. If multiple environments are defined within the ENVS list of a script, the pipeline engine will try to assign the script to a worker in the same order as defined within the ENVS list: if a worker is online that provides the first environment in the list, the pipeline engine will assign the script to this worker; if no worker with the first environment is online, it will try to assign the script to a worker with the second environment in the list, and so on. The listing below shows an example of an ENVS definition for a script that may be executed in two different environments.
ENVS = ['lost', 'lost-cv']
Script RESOURCES
Sometimes a script will require all resources of a worker. Therefore, no other script should be executed in parallel by the worker that executes your script. This is often the case if you train an AI model and you need all GPU memory to do this. In those cases, you can define a RESOURCES variable inside your python script and assign a list containing the string lock_all to it. See the listing below for an example:
RESOURCES = ['lock_all']
Debugging a Script
Most likely, when you import your pipeline and run it for the first time, some scripts will not work, since some tiny bug has sneaked into your code :-)
Inside the web GUI, all exceptions and errors of your script will be visualized when clicking on the respective script element in the pipeline visualization. This way, you get a first hint at what's wrong.
In order to debug your code you need to log in to the docker container and find the instance folder that is created for each script instance. Inside this folder, there is a bash script called debug.sh that needs to be executed in order to start the pudb debugger. You will find your script by its unique pipeline element id. The path to the script instance folder will be /home/lost/app/debug/i-<pipe_element_id>. You can find the ID when inspecting the pipeline in the LOST web GUI.
# Log in to docker
docker exec -it lost bash
# Change directory to the instance path of your script
cd /home/lost/app/debug/i-<pipe_element_id>
# Start debugging
bash debug.sh
If your script requires a special ENV to be executed, you need to log in to a container that has this environment installed for debugging.