=========== Tutorials =========== This page gives you an overview of the general usage paradigms of SmartManPy. We have prepared additional (complete) examples that demonstrate various capabilities. These examples can be found in manpy/simulation/tutorials and here: :doc:`Full examples ` We recommend the following order: 1. Quality_Control.py 2. Dependency.py 3. ExampleTS.py 4. Conditional_Failure.py 5. Interpolation.py 6. Data_Extraction.py 7. Quality_Control_Gym.py Introduction ============= Basic principles ----------------- The original ManPy package is based on the SimPy discrete event simulation. If you are already familiar with SimPy, you should recognize many paradigms. The start and finish point of a production line is a Source and Exit: .. code-block:: python :linenos: from manpy.simulation.imports import Machine, Source, Exit start = Source("S1", "Source", interArrivalTime={"Fixed": {"mean": 0.4}}, entity="manpy.Part", capacity=1) exit = Exit("E1", "Exit1") Then, we can add a machine to our small example. Our machine has a mean processing time of 0.8 seconds, with a standard deviation of 0.075 seconds. However, the minimal and maximal possible time is set to 0.425 and 1.175 seconds: .. code-block:: python :linenos: m1 = Machine("M1", "Machine1", processingTime={"Normal": {"mean": 0.8, "stdev": 0.075, "min": 0.425, "max": 1.175} }) In order to complete our small example, we need to define the routing between the components. This is done for each component individually: .. code-block:: python :linenos: start.defineRouting(successorList=[m1]) m1.defineRouting(predecessorList=[start], successorList=[exit]) exit.defineRouting(predecessorList=[m1]) We can now run and evaluate our simulation, e.g. for 50 time steps. To do this, we use the runSimulation function defined in Globals.py. This function needs a list with all simulated objects (machines, features/timeseries, failures, ...) and the maximum simulation time. .. code-block:: python :linenos: from manpy.simulation.core.Globals import runSimulation, getFeatureData maxSimTime = 50 objectList = [start, m1, exit] runSimulation(objectList, maxSimTime) df = getFeatureData([m1]) df.to_csv("ExampleLine.csv", index=False, encoding="utf8") print(f"Produced: {exit.numOfExits}\\ Simulationszeit: {maxSimTime}") .. attention:: The order of the objects in the object list is important! If there exist dependencies (e.g. temporal) between simulated objects, you need to reflect this in the object list. This is especially important for functional dependencies in features (see `Functional dependencies`_) If you want to run another simulation directly after the previous one has finished, you can use the "resetSimulation()" in Globals. Adding more machines ---------------------- Our example currently consists of only one production step. Since ManPy was designed to simulate production lines, let's see what it takes to add more machines to a simulation. First of all, we need to define a second machine: .. code-block:: python :linenos: m2 = Machine("M2", "Machine2", processingTime={"Normal": {"mean": 2.0, "stdev": 0.1, "min": 1.7, "max": 2.3} }) Now we need to define how the output of Machine1 proceeds to Machine2. ManPy is capable of simulating complex routings, e.g. using conveyor belts. This makes sense if you are interested in the overall behaviour of the production line. For this example, we'll stick to the simplest connection between two machines: the queue. Queues in ManPy act as a simple buffer with a certain capacity. In order to work correctly, we also need to update the routing of the production line and add the new objects to the objectList: .. code-block:: python :linenos: q1 = Queue("Q1", "Queue1", capacity=10) start.defineRouting(successorList=[m1]) m1.defineRouting(predecessorList=[start], successorList=[q1]) q1.defineRouting(predecessorList=[m1], successorList=[m2]) m2.defineRouting(predecessorList=[q1], successorList=[exit]) exit.defineRouting(predecessorList=[m2]) objectList = [start, m1, m2, q1, exit] Improved Routing ----------------- The default routing mechanism requires you to manually set the predecessors and successors of objects, with makes multiple definitions necessary if you add an object to the production line. Furthermore, if you decide to change the order or want to (temporarily) remove a station, you also need to make changes at multiple locations. As an improvement, we added an easier way of defining the routing that's based on list. The whole production line is defined in one list. Each "stage", i.e. all machines at the same level, are contained in separate lists: .. code-block:: python :linenos: routing = [ [start], [m1], [q1], [m2], [exit] ] It is also possible to add multiple machines or sources to the same level. To actually perform the routing definition, you need to use generate_routing_from_list defined in core/ProductionLineModule.py: .. code-block:: python :linenos: from manpy.simulation.core.ProductionLineModule import generate_routing_from_list generate_routing_from_list(routing) Using this approach for routing, you can easily change the order or remove parts of the production line with minimal changes. Advanced usage ================ The following sections provides an introduction into the more advanced concepts of SmartManpy. Features --------- Features are our most important extension to the original ManPy and also the most complex one. Basic usage ............ Features are a sub-class of ObjectProperty, which is a generic base class for all kinds of data a machine/object can generate during production. We currently have two sub_classes of ObjectProperty: Features and TimeSeries. While TimeSeries is concerned with (as the name suggests) time series data that is generated during production (e.g. temperature curves), Feature is concerned with properties that are measured/logged once for each entity at a production station. In the following section, we will explore the most important mechanics of the Feature class. The following statement shows the most basic definition of a Feature: .. code-block:: python feature1 = Feature(id="f1", name="Feature1", victim=m1, distribution={"Feature": {"Normal": {"mean": 0, "stdev": 1.0}}} ) This statements assigns a new Feature with the internal id "f1" and name "Feature1" (used for data output) to Machine m1. The feature values are randomly drawn from a normal distribution with mean 0 and standard deviation 1. It is possible to select different distributions and to control the behaviour of the underlying distribution over the course of a simulation. Further explanations for these mechanics are provided in `Distributions and StateControllers`_. Functional dependencies ........................ We can of course add many more features to a machine. Sometimes, there exist certain relationships between features, e.g. physical dependencies. We can model these dependencies using the "dependency" parameter: .. code-block:: python feature2 = Feature(id="f2", name="Feature2", victim=m1, dependent={"Function": "10*x + 3", "x": feature1} distribution={"Feature": {"Normal": {"stdev": 0.1}}} ) Here, we define a functional dependency between feature2 and feature1, in this case the linear function 10x + 3. To simulate eventual measurement errors, we can apply a standard deviation to this dependency, in this case 0.1. However, it is also possible to have strict functional dependencies between features by simply not passing anything as an argument for distribution: .. code-block:: python feature2 = Feature(id="f2", name="Feature2", victim=m1, dependent={"Function": "10*x + 3", "x": feature1} ) .. attention:: The order in the object list matters! If you define features with functional dependencies, you need to know that the order in the object list that's passed to runSimulation is important! A feature that depends on other features values needs these features to be generate before itself. To ensure this, you need to place the features that are used in functional dependencies before the features that use them. Random walks ............. Sometimes, Features depend on the previous value, e.g. Temperatures. To model this, we can use random walks. When the random walk mode is activated, the randomly drawn feature value is added to the last feature value. A Feature generated using a random walk can be defined as follows: .. code-block:: python random_walk_feature = Feature(id="ftr_rw", name="Feature_Random_Walk", victim=m1, random_walk=True, start_value=20, distribution={"Feature": {"Normal": {"mean": 0, "stdev": 1.0}}}) Feature "Feature_Random_Walk" has a starting value of 20. For each data point, a value is drawn from a normal distribution with mean 0 and standard deviation 1 and then added to the current value. The starting value is 20, which can be interpreted as the "mean" of the random walk. Time Series ------------ Basic usage ............ TimeSeries represents the second type of ObjectProperty in our ManPy Extension. At each production step, TimeSeries generates a configurable amount of data points in a certain time frame. Let's have a look at a simple example: .. code-block:: python ts_features = Feature(id="ftr_ts, name="Feature_Time_Series", distribution={"Function": {(0, 2): "0.5*x + 2"}, "DataPoints": 20, "Feature": {"Normal": {"stdev": 0.1}} } ) This example generates a time series in which the data points are 0.1 second apart. The time series is defined in the interval [0, 2], in which 20 data points are sampled. The resulting values are governed by a linear function. At each data point in the time series, a standard deviation of 0.1 is applied to model small differences between production steps. Multiple Intervals ................... It is possible to define multiple intervals to further customize the mathematical description of the time series: .. code-block:: python ts_features = Feature(id="ftr_ts, name="Feature_Time_Series", distribution={"Function": {(0, 1): "0.5*x + 2", (1, 2): "0.1*x + 2"}, "DataPoints": 20, "Feature": {"Normal": {"stdev": 0.1}} } ) Now, the TimeSeries behaves differently from 1 to 2 than from 0 to 1. Interpolation .............. The aforementioned ways of creating time series are quite powerful, but only if a functional relationship ís known. Sometimes, only certain values are known, which makes interpolation a very useful tool for these cases: .. code-block:: python ts_features = Feature(id="ftr_ts, name="Feature_Time_Series", distribution={"Function": {(0, 1): "0.5*x + 2", (1, 3): [[1, 1.5, 2, 3], [4, 4.2, 4.3, 5.1]]}, "DataPoints": 20, "Feature": {"Normal": {"stdev": 0.1}} } ) In this example, we provide the interpolation algorithm 4 data points in the interval in which it interpolates. No matter how small or large the interval is, the interpolation algorithm needs at least 4 values. The data points for interpolation can also be determined by a Feature with all its customization possibilities: .. code-block:: python :linenos: endVal = Feature(id="endVal", name="endVal", victim=m1, distribution={"Feature": {"Normal": {"mean": 5.2, "stdev": 0.1}}} ) ts_features = Feature(id="ftr_ts, name="Feature_Time_Series", distribution={"Function": {(0, 1): "0.5*x + 2", (1, 3): [[1, 1.5, 2, 3], [4, 4.2, 4.3, "EndVal"]]}, "EndVal": endVal, "DataPoints": 20, "Feature": {"Normal": {"stdev": 0.1}} } ) In this example, the final value for interpolation is received from Feature "endVal". Example plot ............. The following plot shows two complex TimeSeries that were created using both interpolation and functional dependencies: .. image:: ./images/ts_complex.png :width: 600 :alt: Complex TimeSeries Quality Control ----------------- Quality control is a standard process in manufacturing. Therefore, we added the option for quality control to machines. As a result, machines can either have an additional quality control step at the end of their production step or be a standalone quality control instance. The condition for quality control can be set via a custom defined function, which is simply called "condition" in the following example. We can access the currently active entity in a machine with the following statement: .. code-block:: python activeEntity = self.getActiveEntity() We can then use any simulated value of the entity as measurement for quality control, e.g. feature values or internal labels. The condition function must return True if a defect was found, otherwise False must be returned. In the following example, we simply check if a given Feature value is inside a certain interval ([3, 7]). .. code-block:: python :linenos: def condition(self): # self is w.r.t. to the machine in which we apply the condition! activeEntity = self.getActiveEntity() if activeEntity.features[0] > 7 or activeEntity.features[0] < 3: return True else: return False In this example, we had to access the feature value by index, which is usually very tedious. We therefore added the function "get_feature_values_by_id" in Globals.py, that let's you access certain feature values of an entity by the feature ID: .. code-block:: python :linenos: from manpy.simulation.core.Globals import get_feature_values_by_id def condition(self): # self is w.r.t. to the machine in which we apply the condition! activeEntity = self.getActiveEntity() # Access first element since function returns a list feature_value = get_feature_values_by_id(activeEntity, ["f1"])[0] if feature_value > 7 or feature_value < 3: return True else: return False Failures --------- Basic usage ............ Despite not being desired, failures play a big role in production lines. Therefore, in order to accurately model a production line, we must be able to model failures in a sophisticated way. ManPy already provides such a complex model through its Failure classe. The following example demonstrates a simple ManPy Failure: .. code-block:: python simple_failure = Failure(id="Flr0", name="SimpleFailure", victim=m1, distribution={"TTF": {"Fixed": {"mean": 0.8}}, "TTR": {"Normal": {"mean": 100, "stdev": 25, "min":50, "probability": 0.01}}}) This failure is potentially triggered every 0.8 seconds, which is determined by the time-to-failure (TTF) distribution. At each potential trigger point, a time-to-repair (TTR) is calculated, which determines the down time of the victim (i.e. the machine at which the failure occurs) Since we additionally passed a probability value to the TTR distribution, we only get actual downtime with a 1% chance. If we don't pass the probability value, the frequency of the failure is solely determined by TTF. Conditional failures ..................... A more flexible way of triggering failures are conditional failures. Conditional failures are comparable to `Quality Control`_ in Machines. You implement the condition as a function and pass it to the failure using the "conditional" parameter: .. code-block:: python :linenos: # Any function can be employed as the condition for a Failure to occur # You can utilize any simulation values for the condition # Return True to let the Failure occur def condition(self): value_1 = Ftr1.get_feature_value() value_2 = Ftr2.get_feature_value() if (value_1 + 20 * value_2) > 200: return True else: return False conditional_failure = Failure(victim=m1, conditional=condition, distribution={"TTR": {"Fixed": {"mean": 30}}}) Here, the triggering of the failure is solely controlled by the function condition, we only need to specify TTR. Similarly to Quality Control, we can access the feature values to determine whether a failure should be triggered or not. .. tip:: A Failure is automatically resolved after TTR is passed. Additionally, ManPy offers the possibility to model repairmen, which can be used to model constrained maintenance resources. In our case, we always assume that a failure can be repaired in the given time period, which may be unrealistic. Distributions and StateControllers ----------------------------------- Using our StateControllers in combination with distributions allows for complex control over the lifecycle behaviour of features. This can be used to model data drifts or distribution shifts. The StateControllers are relatively generic and would easily allow extensions to other use cases, but we focus on controlling different probability distributions. The motivation for StateControllers was the need for modelling changing behaviour of features depending on their wear. If a machine part (e.g. a bearing) shows signs of wear, it's underlying probability distribution changes slightly. In the case of a bearing, this could be modelled using a steadily increasing standard deviation. In the following, the different types of yet implemented StateControllers are explained. SimpleStateController ...................... SimpleStateController is the most simple case of a StateController (surprise!). It models a simple "break point", e.g. a very different behaviour of a machine part after it broke. This can be achieved using the following piece of code: .. code-block:: python :linenos: dists = [{"Time": {"Fixed": {"mean": feature_cycle_time}}, "Feature": {"Normal": {"mean": 0, "stdev": 1}}}, {"Time": {"Fixed": {"mean": feature_cycle_time}}, "Feature": {"Normal": {"mean": 100, "stdev": 10}}}] boundaries = {(0, 25): 0, (25, None): 1} controller = SimpleStateController(states=dists, boundaries=boundaries, wear_per_step=1.0) f3 = Feature("f3", "F3", victim=m2, distribution_state_controller=controller) This SimpleStateController controls the distributions of Feature F3. The actual behaviour is defined in "boundaries", which controls the exact distribution that should be used at a certain amount of wear. In each production step, wear_per_step is added to the total amount of wear. If the total amount of wear crosses a boundary, a different distribution is used for Feature F3. In this case, the break point is defined at 25 units of wear, which leads to a new normal distribution with a drastically different mean (100). .. tip:: By default, a StateController is reset to its initial state after the victim (= the machine) of its assigned feature has ended a failure, i.e. it's been repaired. This behavior can be deactivated through the "reset_distributions" parameter of Feature. ContinuousNormalDistribution ............................ SimpleStateController is very generic by simply retrieving the element in the states list that is determined by boundaries. ContinuousNormalDistribution is a more specialized StateController. It is specifically designed for Features that are generated using a Gaussian distribution. In ContinuousNormalDistribution, we assume that wear immediately influences the underlying probability distribution, even if it's by a very small amount. We model this by adding a certain amount (mean_change_per_step) in each production step to the mean of the normal distribution. Additionally, the break point mechanic from SimpleStateController is still present. However, it's now simplified such that the normal distribution after the defect occurred is only defined by a mean and STD: .. code-block:: python :linenos: mean_change_per_step = 0.05 controller1 = ContinuousNormalDistribution(wear_per_step=0.1, mean_change_per_step=mean_change_per_step, initial_mean=2.0, std=2.0, break_point=10, defect_mean=7.0, defect_std=3.0 ) # not using a break point controller2 = ContinuousNormalDistribution(wear_per_step=0.7, mean_change_per_step=mean_change_per_step, initial_mean=2.0, std=2.0, break_point=None, defect_mean=None, defect_std=None ) f3 = Feature("f3", "F3", victim=m2, reset_distributions=True, distribution_state_controller=controller1) # f3 = Feature("f3", "F3", victim=m2, reset_distributions=True, distribution_state_controller=controller2) The typical behaviour of ContinuousNormalDistribution can be seen in the following plot. It contains the evolution of the feature value of two ContinuousNormalDistribution StateControllers over the span of 250 steps. .. image:: ./images/continuos_normal_dist.png :width: 500 :alt: Two ContinuousNormalDistributions RandomDefectStateController ............................ SimpleStateController and ContinuousNormalDistribution are best used to model properties related to wear. But sometimes, failures can occur without obvious reason. For these cases, we designed RandomDefectStateController, which models a defect using a Bernoulli distribution. If the Bernoulli distribution returns 1, it selects a defect StateController from a list, otherwise it uses a "ok" StateController that model normal behaviour. .. code-block:: python :linenos: mean_change_per_step = 0.02 ok_controller = ContinuousNormalDistribution(wear_per_step=0.7, break_point=None, mean_change_per_step=mean_change_per_step, initial_mean=3.0, std=2.0, defect_mean=7.0, defect_std=3.0 ) defect_controller1 = ContinuousNormalDistribution(wear_per_step=0.7, mean_change_per_step=mean_change_per_step, initial_mean=7.0, std=2.0, break_point=None, defect_mean=None, defect_std=None ) defect_controller2 = ContinuousNormalDistribution(wear_per_step=0.1, mean_change_per_step=mean_change_per_step, initial_mean=1.0, std=2.0, break_point=None, defect_mean=None, defect_std=None ) random_defect_controller = RandomDefectStateController(failure_probability=0.05, ok_controller=ok_controller, defect_controllers=[defect_controller1, defect_controller2]) The typical behavior of a RandomDefectStateController looks similar to the following plot (Red crosses = defect): .. image:: ./images/random_defect.png :width: 600 :alt: Plot with random defects. The defect_controllers list can contain multiple StateControllers, which can be used to model minor deviations from the planned behaviour in multiple directions, e.g. too much or not enough glue. RandomDefectController introduces an additional way of performing quality control. Depending on the distribution that gets selected (ok/defect), an internal label is set to either True or False, indicating whether a defect is present or not. This label can be used for quality control, which should create more non-obvious relationships: .. code-block:: python :linenos: def quality_control(self): activeEntity = self.Res.users[0] if any(activeEntity.labels): return True This function marks an entity as "defect" if at least one feature was the result of a "defect" probability distribution. .. tip:: StateControllers are highly customizable. If necessary, you can write your own StateController that perfectly fits you demands. The interface is defined in core/StateController.py. Cost of entities ---------------- Since every part and every production step in reality costs money, cost is an interesting aspect for simulation. We therefore added an attribute "cost" to Entity, which allows us to accurately simulate the value of an entity over the course of a production line. Every CoreObject has a cost parameter, which is set to 0 by default. When an entity passes through a CoreObject, the cost of the entity is increased by the cost of the CoreObject. This even includes the Source and Exit objects, which can be used to model the initial cost or the final value of an entity. Additionally it is possible to add a cost to any Failure. This cost is added to the current entity that is being processed by the victim of the Failure. .. code-block:: python :linenos: m1 = Machine("M1", "Machine1", processingTime={"Normal": {"mean": 0.8, "stdev": 0.075, "min": 0.425, "max": 1.175} }, cost=10) e1 = Exit("E1", "Exit1", cost=-50) expensive_failure = Failure(id="Flr0", name="ExpensiveFailure", victim=m1, distribution={"TTF": {"Fixed": {"mean": 0.8}}, "TTR": {"Normal": {"mean": 100, "stdev": 25, "min":50, "probability": 0.01}}}, cost=100) Furthermore, every cost can be negative. Just like in this example, where e1 has a negative cost, which means that finishing the part adds value to it instead of costing money. Export ------ SmartManPy offers two ways to export the simulated data: Pandas DataFrames and Databases. Pandas DataFrames .................. To export the data to a Pandas DataFrame, you can use the getFeatureData and getTimeSeriesData functions: .. code-block:: python :linenos: m1_data = getFeatureData([m1]) print(m1_data.to_string(index=False), "\n") # With 'time=True', timestamps of the feature values are included in the DataFrame m1_data_time = getFeatureData([m1], time=True) print(m1_data_time.to_string(index=False), "\n") # The function supports multiple machines both = getFeatureData([m1, m2]) print(both.to_string(index=False), "\n") # To retrieve timeseries data from the simulation, utilize the getTimeSeriesData function # The function accepts a timeSeries and returns a DataFrame representing that timeseries ts_data = getTimeSeriesData(ts_features) While getFeatureData accepts machines as input, getTimeSeriesData accepts a TimeSeries instance. From there on, Pandas Dataframes offer a variety of exports, e.g. to CSV. Databases .......... Additionally, we support data export to QuestDB and Kafka: .. code-block:: python :linenos: from manpy.simulation.core.Database import ManPyQuestDBDatabase, ManPyKafkaConnection db = ManPyQuestDBDatabase() # alternatively: db = ManPyKafkaConnection(...) runSimulation(objectList, maxSimTime, db=db) In QuestDB, you can use SQL queries to access the datapoints. QuestDB also has plotting capabilities, as you can see in the following screenshot: .. image:: ./images/timeseries.PNG :width: 800 :alt: A screenshot of the QuestDB database. .. tip:: Our Database interface is highly customizable. If necessary, you can write your own DB interface that perfectly fits you demands. The interface is defined in core/Database.py. ProductionLineModules ---------------------- The definition of long and complex production lines can get very extensive and confusing. To improve the clarity of complex production lines, we added ProductionLineModules, which allow the encapsulation of parts of the production line. A ProductionLineModule can contain an arbitrary amount of simulatable objects. The main advantage is the possibility to define complex production stations in a different file, without the need for importing a large amount of objects. ProductionLineModules only need to know their internal routing, the routing with external components is done via the known lists or using defineRouting. The most simple ProductionLineModule is SequentialProductionLineModule, which simply takes the routing between objects in sequential order and applies it. This type of module should be enough to cover most of the needs for such modules. If you need additional functionality, you can write custom ProductionLineModule by inheriting from core/ProductionLineModule. The following example demonstrate the definition of a very basic module. .. code-block:: python :linenos: from manpy.simulation.imports import Machine, Feature from manpy.simulation.core.ProductionLineModule import SequentialProductionLineModule m1 = Machine("M1", "Machine1", processingTime={"Normal": {"mean": 0.8, "stdev": 0.075, "min": 0.425, "max": 1.175} }) feature1 = Feature(id="f1", name="Feature1", victim=m1, distribution={"Feature": {"Normal": {"mean": 0, "stdev": 1.0}}} ) internal_routing = [[m1]] features = [feature1] example_module = SequentialProductionLineModule(internal_routing, features, "ExampleModule") This module can then be imported into other files and easily incorporated in the overall definition of a production line. .. code-block:: python :linenos: from manpy.simulation.core.ProductionLineModule import generate_routing_from_list from FILENAME import example_module object_list = [...] example_module_objects = example_module.getObjectList() object_list.extend(example_module_objects) routing = [ ... [...], [example_module], [...], ... ] generate_routing_from_list(routing) Since all objects of the module need to be added to the global object list of the production line, we need to access the module's object. We can conveniently do so by using example_module.getObjectList(). When defining the routing, a ProductionLineModule behaves like every Machine, Source, Exit, etc. Training an AI agent using deep RL ----------------------------------- ManPy as a simluation framework is a great playground for training Reinforcement Learning agents. Because they can interact with and influence the production line during the simulation. This opens up different possibilities than training on static datasets that are generated at the end of a simulation. We have implemented a custom Gym environment, starting with a simple example of a Quality Control problem. The environment is defined in GymEnv.py as QualityEnv. In order to use it, create a class, inherit from the class QualityEnv and override the abstract methods prepare, obs, and rew. prepare is used to define the simulation, like in the normal simulation setup. obs is used to define the observations that the agent receives from the simulation. rew is used to define the reward that the agent receives for a certain action. When writing rew, the input action 1 means discarding the part. Because we now have an environment and a simulation, taking a step is different from other Gym environments. Instead of letting the environment make a step, we let the simulation run, and call the agent at an appropriate time through QualityEnv.step(). For this Quality Control example, the agent replaces the control function of a machine. .. code-block:: python :linenos: from manpy.simulation.core.GymEnv import QualityEnv class ExampleEnv(QualityEnv): def prepare(self): s = Source("S1", "Source", interArrivalTime={"Fixed": {"mean": 0.1}}, entity="manpy.Part" ) m1 = Machine("M1", "Machine1", processingTime={"Normal": {"mean": 0.2, "stdev": 0.1, "min": 0.08, "max": 0.34}}, control=self.step ) e1 = Exit("E1", "Exit1") dists = [{"Feature": {"Normal": {"mean": 200, "stdev": 50, "min": 0, "max": 400}}}, {"Feature": {"Normal": {"mean": 600, "stdev": 30, "min": 400, "max": 800}}}] boundaries = {(0, 25): 0, (25, None): 1} controller = SimpleStateController(states=dists, labels=["ok", "defect"], boundaries=boundaries, wear_per_step=1.0, reset_amount=40 ) f1 = Feature("Ftr1", "Feature1", victim=M1, distribution_state_controller=controller, ) s.defineRouting([M1]) m1.defineRouting([S], [E1]) e1.defineRouting([M1]) return [s, m1, e1, f1] def obs(self): activeEntity = self.machine.Res.users[0] return np.array(activeEntity.features) def rew(self, action): activeEntity = self.machine.Res.users[0] if action == 1 and activeEntity.labels[-1] == "ok": return -1 elif action == 1 and activeEntity.labels[-1] == "defect": return 1 elif action == 0 and activeEntity.labels[-1] == "ok": return 1 elif action == 0 and activeEntity.labels[-1] == "defect": return -1 simu = ExampleEnv(observation_extremes=[(0, 800)], policy_network=PolicyNetwork(1), maxSteps=2000, steps_between_updates=10, save_policy_network=True) simu.reset() In this example, we produce "defect" and "ok" parts with different feature values by using a SimpleStateController. The agent can decide whether to discard a part or let it pass and receives a reward of 1 for a correct decision and -1 for a wrong decision. We set m1 as the machine that the agent controls, by setting the "control" parameter to self.step. To start the training, create an instance of ExampleEnv and call reset() to reset, prepare and run the simulation. In order to access the results of the simulation, use simulation.objectList to get all objects and therefore the data of the simulation.