The logical trend in manufacturing companies is to integrate systems both horizontally and vertically. Manufacturing execution systems (MES) play an indispensable role in bridging the gap between enterprise resource planning (ERP) and production automation systems such as PLCs. While enterprise information systems are centralized, manufacturing controllers are inherently distributed. Distributed MES based on the principle of multi-agent systems attempt to reconcile the different nature of these systems.
Traditional centralized systems (especially planning systems and centralized control systems) are facing the new demands of the modern production environment. These include:
- Unpredictable development of orders – changes occur in production orders already in progress.
- Changes in the production environment – changes in the workshop environment also occur during the execution of orders.
- Complexity of production environment and orders – the modern manufacturing environment is characterized by increasing complexity of orders and a high degree of variability in the layout and setup of production equipment.
Due to their hierarchical nature, centralized systems are considered to be highly static, with a low degree of adaptability to increasingly dynamic changes in orders and production environments. Decision-making is concentrated at the top layers of the imaginary pyramid of enterprise information systems (especially in ERP systems and related tools). This makes it difficult for production planning to respond to dynamic changes and emerging exceptions at the lowest layers of the production environment.
Greater flexibility in production can be achieved by applying two main strategies:
- Moving some decision-making processes from enterprise resource planning (ERP) systems to a lower layer of management information systems (MES), which is characterized by shorter planning cycles and faster response to change.
- Distributing the decision-making process across a set of independently functioning entities that are able to implement and optimize the process thanks to mutual cooperation, their own capabilities, and the availability of local information and resources.
In order to implement these principles, the paradigm of Distributed Control Systems (DCS) was defined. The basic idea of DCS is to distribute the decision-making process and system functionalities into independently functioning entities called “holons” or “agents”.
So far, it seems unrealistic that current ERP systems will start to operate on the distribution paradigm in the near future. Therefore, MES are the main platform for implementing this principle, even though they too are usually built on the principle of centralisation. The following sections describe how typical MES functions can be distributed. But first, let us look at one particular architecture to describe the different entities used to distribute MES functions.
ELEMENTS OF A DISTRIBUTED SYSTEM
As an example of a distributed architecture, consider the PABADIS concept, which was developed under the auspices of the European Union (6th Framework Programme) with the participation of SAP, Siemens, the Austrian Academy of Sciences, Fiat and others.
PABADIS brings complete vertical integration of ERP, MES and automation systems according to the distributed systems paradigm.
On the interface side with ERP, there are three basic entities:
- Order Agent Supervisor (OAS) – managing production orders sent from the ERP system to the MES system and processed by Order Agents (OA).
- Resource Agent Supervisor (RAS) – a direct interface between the ERP system and the so-called Resource Agents (RA). It allows the ERP system to influence the use of resources.
- Product Data Repository (PDR) – used by the ERP system and the OAs to exchange production data. The PDR provides the master data (operation code lists, material code lists,…) that the OAs need to execute a production order.
The core of the MES system consists of:
- Order Agents (OA) – representing a production order.
On the interface side with automation systems, there are:
- Resource Agents (RA) – representing a production resource.
MES auxiliary tools are:
- Ability Broker (AB) – manages the RA resource database and provides information for OA.
- Information Collector (IC) – manages historical data on production order execution and resource utilization. The data can be used by internal entities (RAS,…) or external systems.
- Device Observer (DO) – an auxiliary agent that searches for and registers new resources to the MES. It establishes communication with the newly incorporated resource, informs other agents of the existence of the new resource and initiates the process to enable communication between the control device and its RA.
OAs and RAs are responsible for planning the execution of production orders, including the provision of production resources. The decision-making process is implemented by a group of OAs that work independently of each other but coordinate their actions and decisions according to the production orders they execute and according to a set of rules that ensure the agents behave cooperatively rather than selfishly in their decision-making.
THE LIFE CYCLE OF A PRODUCTION ORDER IN A DISTRIBUTED SYSTEM
The distributed approach also affects the life cycle of a production order. The life cycle consists of the following steps:
- The ERP sends the production order to the OAS.
- The OAS decomposes the order, creates an OA and assigns the appropriate segment of the production order to it.
- The OA receives production data from the PDR.
- The OA asks the AB for available RAs.
- The OA schedules the operation by booking an RA.
- The RA performs the operation.
- The OA sends a report to the OAS when the operation is complete, and then terminates.
- The OAS forwards the report to the ERP.
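The life cycle above can be sketched in a few lines of Python. This is a minimal illustration, not the PABADIS implementation: the class and function names, the dictionary standing in for the PDR, and the rule of booking the first matching RA are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceAgent:
    """RA: represents one production resource (hypothetical minimal model)."""
    name: str
    abilities: frozenset

    def perform(self, ability: str) -> str:
        return f"{self.name} performed {ability}"


@dataclass
class AbilityBroker:
    """AB: keeps the registry of RAs and answers capability queries."""
    registry: list = field(default_factory=list)

    def find(self, ability: str) -> list:
        return [ra for ra in self.registry if ability in ra.abilities]


@dataclass
class OrderAgent:
    """OA: handles one segment of a production order, then terminates."""
    segment: str

    def run(self, pdr: dict, broker: AbilityBroker) -> dict:
        ability = pdr[self.segment]            # production data from the PDR
        candidates = broker.find(ability)      # ask the AB for suitable RAs
        if not candidates:
            return {"segment": self.segment, "status": "no resource"}
        ra = candidates[0]                     # book an RA (first match, an assumption)
        detail = ra.perform(ability)           # the RA performs the operation
        return {"segment": self.segment, "status": "done", "detail": detail}


def order_agent_supervisor(order: list, pdr: dict, broker: AbilityBroker) -> list:
    """OAS: decomposes the order, creates one OA per segment, collects the reports.

    The returned list stands in for the reports forwarded to the ERP.
    """
    return [OrderAgent(segment).run(pdr, broker) for segment in order]
```

In a real system the OAs would run concurrently and negotiate a time slot before booking; the sequential loop only makes the order of the steps visible.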
PLANNING IN A DISTRIBUTED SYSTEM
Planning is an important step in the life cycle of a production order. Scheduling in distributed systems consists of resource-oriented scheduling and order-oriented rescheduling. In the first phase, the OA determines the timeframe for the execution of the assigned segment of the production order.
In the next phase, the OA asks the AB for resources with the required capabilities. The operations of the production order are not linked to specific machines, but refer only to the required resource capabilities (resource types). This increases the flexibility of the system when rescheduling orders.
The OA receives the address of the RA that is capable of executing that segment of the order. The OA sends this RA a timeframe in which the segment should be executed.
In the next phase, the selected (lead) RA contacts other RAs with the same capabilities and asks about their availability within the required timeframe. The RAs that are available are added to a cluster.
Once the cluster is formed, the search for a quasi-optimal solution begins. Individual RAs generate solution proposals from which the lead RA selects. The RAs use an evaluation function that assesses the availability and cost of resources as well as the lengths of downtime incurred and machine runtimes. The goal is to create the optimal solution for a given resource with respect to the efficient use of all resources.
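A possible shape of such an evaluation function is sketched below. The text names only the criteria (availability, cost, downtime, machine runtime); the weighted sum, the field names and the "lower is better" convention are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """One RA's offer for the requested timeframe (all fields are illustrative)."""
    resource: str
    start: float      # earliest available start, hours from now
    cost: float       # cost of using the resource
    idle_time: float  # downtime the proposal would introduce, hours
    runtime: float    # machine runtime needed, hours


def evaluate(p: Proposal, weights: dict) -> float:
    """Score a proposal as a weighted sum of the criteria; lower is better."""
    return (weights["start"] * p.start
            + weights["cost"] * p.cost
            + weights["idle_time"] * p.idle_time
            + weights["runtime"] * p.runtime)


def select_proposal(proposals: list, weights: dict) -> Proposal:
    """The lead RA picks the proposal with the lowest score."""
    return min(proposals, key=lambda p: evaluate(p, weights))
```

The weights are where the parameters supplied by the OA, or set globally for the plant, would enter: raising the weight of `start` favours early completion, while raising `cost` or `idle_time` favours efficient resource use.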
After the lead RA receives the solution proposals from all cluster members, it selects one of them based on the parameters given by the OA or set globally for the whole operation, and sends it to the OA. The OA then evaluates the selected solution and decides whether to accept or reject it.
If the solution is accepted, the OA allocates the necessary resources. If the solution is rejected, there are two options for proceeding:
- The OA restarts the entire solution selection process. That is, it again asks the AB for resources with the necessary capabilities, and so on. Due to the dynamics of production operations and order flow, the outcome of the new selection may differ from the previous one.
- The OA asks the cluster for a new solution that better suits the OA’s requirements but is less advantageous in terms of resource usage. This mechanism depends on the system configuration, that is, on the balance between optimizing production operations (resource allocation) and optimizing production order flow.
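The second option can be pictured as the OA walking down the cluster's ranking until it finds an acceptable offer. The sketch below assumes that the OA's only requirement is a deadline and that the cluster ranks offers by resource cost; both are simplifications, and the function name and tuple layout are hypothetical.

```python
def negotiate(proposals, deadline, max_rounds=3):
    """Offer the OA successively less resource-efficient solutions.

    proposals:  list of (resource, finish_time, resource_cost) tuples.
    deadline:   latest finish time the OA accepts (assumed acceptance rule).
    max_rounds: how many offers the cluster makes before giving up.
    """
    # The cluster prefers the cheapest use of resources, so it offers those first.
    ranked = sorted(proposals, key=lambda p: p[2])
    offers = ranked[:max_rounds]
    for round_no, (resource, finish, cost) in enumerate(offers, start=1):
        if finish <= deadline:  # the OA accepts and allocates the resource
            return {"accepted": resource, "round": round_no, "cost": cost}
    # No acceptable offer: the OA would fall back to restarting via the AB.
    return {"accepted": None, "round": len(offers), "cost": None}
```

Each rejected round costs the plant some resource efficiency, which is why `max_rounds` is a configuration decision: it sets how far order flow may be favoured over resource allocation.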
RESCHEDULING IN A DISTRIBUTED SYSTEM
Production order rescheduling is an essential element of distributed control systems. In centralized systems, rescheduling is triggered by resource shortages or order changes, and it usually means rescheduling the entire production. In distributed systems, by contrast, rescheduling occurs periodically at certain stages of production and is intended to keep changes local and to reduce the effort required to reschedule.
DECOMPOSITION OF THE PRODUCTION ORDER
The previous sections mentioned the decomposition of the production order. Decomposition is possible if the production order can be divided into autonomous and concurrent subparts processed by different OAs. The structure of the production order is therefore a key factor in the distribution of the MES.
The basic description of the production order includes information about the product, quantity, date and time of completion, etc. Other elements of the production order are:
- Process Segments (PS) – The basic building block in the description of a production order. A PS defines the individual tasks and operations that the system must perform in the production of a product. It consists of so-called Abilities, which are predefined and reusable operations. These operations have a set of parameters specific to a particular product or job. Furthermore, each PS contains a list of materials (input and output). This makes it possible to decompose a job into a set of sub-jobs that can be executed in parallel, each managed by a separate OA.
- Node Operators (NO) – These represent logical links between individual PSs. There are several types of NOs that represent logical operators (Sequence, BranchOr, BranchAnd, JoinOr, JoinAnd). In addition, the input and output PSs are defined in the NO. There can be multiple inputs and multiple outputs, which allows alternative routes through a production order. These alternative routes increase the flexibility and adaptability of the system to resource shortages in production.
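The structure described above maps naturally onto a small graph model. The following sketch uses the operator names from the text; the field names and the firing rule (JoinOr needs any one input finished, every other operator needs all of them) are assumed semantics for illustration, not taken from the PABADIS specification.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    SEQUENCE = "Sequence"
    BRANCH_OR = "BranchOr"
    BRANCH_AND = "BranchAnd"
    JOIN_OR = "JoinOr"
    JOIN_AND = "JoinAnd"


@dataclass(frozen=True)
class ProcessSegment:
    name: str
    abilities: tuple           # required Abilities, e.g. ("milling",)
    materials_in: tuple = ()   # input materials
    materials_out: tuple = ()  # output materials


@dataclass(frozen=True)
class NodeOperator:
    kind: NodeKind
    inputs: tuple   # names of the preceding PSs (empty for the start node)
    outputs: tuple  # names of the following PSs


def enabled_segments(operators, done):
    """Return the names of PSs that may start, given the set of finished PSs.

    For BranchOr the outputs are alternatives: all are reported as enabled
    and the caller is expected to pick one of them.
    """
    enabled = set()
    for op in operators:
        finished = [name in done for name in op.inputs]
        satisfied = any(finished) if op.kind is NodeKind.JOIN_OR else all(finished)
        if satisfied:
            enabled.update(op.outputs)
    return enabled - set(done)
```

With a BranchAnd after the first segment, `enabled_segments` returns several PSs at once, and each of them could be handed to its own OA for parallel execution.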
CONTROL FUNCTIONS
The functions of MES include data collection, product tracking, batch genealogy, document management, etc. Some of these activities are performed by the OAs and RAs. However, there are also the centrally oriented supervisors, OAS and RAS (Order Agent Supervisor, Resource Agent Supervisor). They manage the activities of the individual agents, but their main role is to provide the link between the layers of the imaginary automation pyramid (between the MES layer and the ERP layer). Within this role, they perform two basic groups of activities:
- They respond to queries from the ERP. If the ERP requests any information about order progress or resource performance, OAS and RAS query the relevant agents and forward the response to the ERP.
- Periodic reports. Checkpoints are defined in the structure of each production order, and reaching a checkpoint triggers the creation of a report. These reports are automatically sent from the OAs and RAs to the OAS and RAS. It is the job of the supervisor agents to collect and evaluate these reports and, if necessary, forward them to the ERP system.
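Both groups of activities can be sketched with a small supervisor class. The method names, the report format and the rule that only deviating reports are forwarded to the ERP are assumptions; the text says only that reports are forwarded "if necessary".

```python
from collections import defaultdict


class OrderAgentSupervisor:
    """Hypothetical OAS that collects checkpoint reports and answers ERP queries."""

    def __init__(self, forward_to_erp):
        self._reports = defaultdict(list)  # order id -> reports received so far
        self._forward = forward_to_erp     # callable standing in for the ERP link

    def receive_report(self, order_id, checkpoint, status):
        """Called by an OA when its production order reaches a checkpoint."""
        report = {"order": order_id, "checkpoint": checkpoint, "status": status}
        self._reports[order_id].append(report)
        if status != "ok":                 # assumed rule: escalate only deviations
            self._forward(report)

    def answer_query(self, order_id):
        """Return the latest known report for an order, or None if unknown."""
        reports = self._reports.get(order_id, [])
        return reports[-1] if reports else None
```

A real OAS would query the live agents when the ERP asks for progress; returning the last stored report is a simplification that keeps the example self-contained.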
CONTROL APPLICATION LOCATED ON THE PRODUCT
The paradigm of distributed systems is taken a step further by the PABADIS PROMISE architecture, which combines material flow and information flow. The link is implemented using next-generation RFID chips that contain control and master data about the production order and are attached directly to the product. These RFID chips carry the mobile software agents on which the distributed manufacturing system is based. When the mobile software agent stored on the RFID chip arrives at a production station together with the product, data is read from the chip in order to process the manufacturing operation, plan the next steps and move the materials or product. The resource agents, in turn, reside in the computing units of the individual resources.
The new architecture should, among other things, bring advantages in the form of greater independence from ERP, tolerance to network outages, synchronization of material and information flows, autonomous communication between agents without the intervention of the central system, etc.
However, these benefits are not clear-cut. We can find circumstances in which these benefits are not achieved. Let us look at some of them:
- Independence from ERP – In an ideal scenario, the mobile agent should contain all the information and logic needed to manage production tasks and material flow. As long as no changes are made, the agent does not need to communicate with the central system during job execution. However, one of the main benefits of distributed systems is the ability to react to continuous order changes, which requires more frequent communication between the central system and the mobile agent. Real-time monitoring of job progress also requires communication. Therefore, the mobile agent still depends on the central system to some extent during the course of the job.
- Resilience to network outages – By containing complete control information, the mobile agent can theoretically operate even if connectivity is lost. However, the operating principle of a distributed system is continuous communication between agents. Therefore, at least connectivity to the local production network must be maintained.
- Faster communication – Direct transfer of control data between the mobile agent and the machine could theoretically be faster than communication via a central server. However, current RFID networks have significantly lower speeds than LANs (e.g. Ethernet) that can be used to transfer control data from the central system to the machine.
But there are also other areas that the PABADIS PROMISE architecture has to deal with:
- Security – It is difficult to ensure the integrity of data and code on RFID chips when materials are transported (e.g. on a ship, in an airplane, …). Even if the data is redundantly stored, loss or modification of the RFID chip can compromise the overall integrity of the data.
- Redundant data storage – If the RFID tag carries control data and code in addition to identification data, it is necessary to keep this data redundantly in the central system. This introduces additional overhead and risk of conflicts.
- Debugging – Identifying and solving problems is much more challenging in a physically distributed and constantly changing environment. This makes it very difficult to debug the system.
- Control – From a business perspective, process traceability is necessary. A highly distributed system that uses mobile software agents makes auditing the process much more difficult.
- Cost – Mobile software agents require high-capacity RFID chips. At present, the price of such chips makes it impossible to use them in high-volume, low-cost products.
APPLICATION OF THE SYSTEM IN PRACTICE
Systems for decentralised production control based on distributed software agents can be used for products with a high degree of variability in smaller production runs. A good example is the automotive industry, where the final product has a high price and a high degree of customization. Other possible application areas of distributed control include automotive subcontractors, the furniture industry, aerospace accessories, and chemical and food manufacturing.
Practical applications of the PABADIS architecture can be found, for example, at Rittal and Hatzopoulos. Rittal is a German manufacturer from Herborn specialising in industrial enclosures. Hatzopoulos is involved in the flexible production of packaging materials for food companies.