# Momba Gym¶

This documentation describes the Momba Gym API which is a part of MoGym.

MoGym is an integrated toolbox enabling the training and verification of machine-learned decision-making agents based on formal models. Given a formal representation of a decision-making problem in the JANI format and a reach-avoid objective, MoGym (a) enables training a decision-making agent with respect to that objective directly on the model using reinforcement learning (RL) techniques, and (b) it supports the rigorous assessment of the quality of the induced decision-making agent by means of deep statistical model checking (DSMC). MoGym implements the standard interface for training environments established by OpenAI Gym, thereby connecting to the vast body of existing work in the RL community.

A tool paper describing MoGym is currently under submission for CAV22.

The Momba Gym API has two parts:

1. It exposes an OpenAI Gym compatible training environment.

2. It exposes convenience functions for deep statistical model checking.

## Training Environment¶

The OpenAI Gym compatible training environment is implemented by the class MombaEnv. It is generic over an explorer implementing the abstract base class Explorer:

class momba.gym.env.MombaEnv(explorer, *, renderer=None)[source]

Implementation of an OpenAI Gym environment.

Parameters
• explorer – The abstract.Explorer to use.

• renderer – An optional renderer for the OpenAI Gym API.

fork()[source]

Forks the environment.

render(mode='human')[source]

Renders the environment assuming a render has been supplied.

reset()[source]

Resets the environment to an initial state and returns an initial observation.

step(action)[source]

Takes a decision in response to the last observation.

A generic MombaEnv working with a large class of JANI models and providing various configuration options, e.g., with regard to the observation and action space, can be constructed with create_generic_env():

momba.gym.create_generic_env(network, controlled_instance, property_name, *, parameters=None, rewards=Rewards(goal_reached=100, dead_end=- 100, step_taken=0, invalid_action=- 100), actions=Actions.EDGE_BY_INDEX, observations=Observations.GLOBAL_ONLY, renderer=None)[source]

Constructs a generic training environment from a JANI model based on the provided options.

Parameters
• network – A JANI automaton network.

• controlled_instance – An instance of an automaton in the provided network. The decision-making agent trained on the resulting environment is assumed to act by resolving the non-determinism in this automaton.

• property_name – The name of a reach-avoid JANI property (specified as part of the JANI model the network originates from) for which the agent should be trained.

• parameters – Allows defining values for parameters of the JANI model.

• rewards – Specifies the reward structure used for training.

• actions – Specifies the action space for the environment.

• observations – Specifies the observation space for the environment.

• renderer – Is an optional renderer for the OpenAI Gym API.

## Deep Statistical Model Checking¶

The DSMC functionality of Momba Gym is based on the statistical model checker modes of the Modest Toolset which has been extended by two new resolution strategies for nondeterminism. Unfortunately, the Modest Toolset is not open-source and the extensions have not been integrated into the official version yet. Binary versions of the extended version are available here. Please make sure that you have the extended version in your PATH, otherwise the following two functions will not work.

The Momba Gym API provides two functions for (a) checking an arbitrary Python function, check_oracle(), and (b) for checking a PyTorch neural network,check_nn():

momba.gym.checker.check_oracle(network, instance, oracle, *, parameters={}, toolset=Toolset(executable='modest', environment=None), options=ModesOptions(max_run_length_as_end=True, additional_options=()), actions=Actions.EDGE_BY_INDEX, observations=Observations.GLOBAL_ONLY)[source]

Checks an arbitrary Python function.

Parameters
• network – A JANI automaton network.

• instance – An instance of an automaton in the provided network. The decision-making agent trained on the resulting environment is assumed to act by resolving the non-determinism in this automaton.

• orcace – The Python function to check.

• parameters – Allows defining values for parameters of the JANI model.

• toolset – Specifies the location of the Modest Toolset.

• actions – Specifies the action space for the environment.

• observations – Specifies the observation space for the environment.

momba.gym.checker.check_nn(network, instance, nn, *, parameters={}, toolset=Toolset(executable='modest', environment=None), options=ModesOptions(max_run_length_as_end=True, additional_options=()), actions=Actions.EDGE_BY_INDEX, observations=Observations.GLOBAL_ONLY)[source]

Checks a PyTorch neural network.

Parameters
• network – A JANI automaton network.

• instance – An instance of an automaton in the provided network. The decision-making agent trained on the resulting environment is assumed to act by resolving the non-determinism in this automaton.

• nn – The PyTorch neural network.

• parameters – Allows defining values for parameters of the JANI model.

• toolset – Specifies the location of the Modest Toolset.

• actions – Specifies the action space for the environment.

• observations – Specifies the observation space for the environment.