AEM Forms JEE workflows provide tools to design, create, and manage business processes. A workflow process consists of a series of steps that execute in a specified order. Each step performs a specific action such as assigning a task to a user or sending an email message. A process can interact with assets, user accounts, and services, and can be triggered using any of the following methods:
- Starting a process from AEM Forms Workspace
- Using SOAP or RESTful service
- Submitting an adaptive form
- Using watched folder
- Using Email
For more information about creating AEM Forms JEE workflow process, see Workbench Help.
When a process is triggered and as it progresses, it captures data about the process participants, data entered by participants in the form associated with the process, and attachments added to the form. The data is stored in AEM Forms JEE server database, and if configured, some data like attachments are stored in the Global Document Storage (GDS) directory. The GDS directory can be configured on a shared file system or a database.
When a process is triggered, a unique process instance ID and long-lived invocation ID is generated and associated with the process instance. You can access and delete data for a process instance based on the long-lived invocation ID. You can deduce the long-lived invocation ID of a process instance with the user name of the process initiator or process participants who have submitted their tasks.
However, you cannot identify the process instance ID for an initiator in the following scenarios:
- Process triggered through a watched folder: A process instance cannot be identified using its initiator if the process is triggered by a watched folder. In this case, the user information is encoded in the stored data.
- Process initiated from publish AEM instance: All process instances triggered from AEM publish instance do not capture information about the initiator. However, user data may be captured in the form associated with the process, which is stored in workflow variables.
- Process initiated through email: The email ID of the sender is captured as a property in an opaque blob column of the tb_job_instance database table, which cannot be queried directly.
Perform the following steps to identify process instance IDs for a workflow initiator or a participant:
(For workflow initiator) Execute the following command to retrieve all tasks associated with the principal ID for the initiator from the tb_task database table.
select * from tb_task where start_task = 1 and create_user_id= 'initiator_principal_id'
The query returns tasks initiated by the specified initiator_principal_id. The tasks are of two types:
- Completed tasks: These tasks have been submitted and display an alphanumeric value in the process_instance_id field. Take note of all process instance IDs for submitted tasks and continue with the steps.
- Tasks initiated but not complete: These tasks have initiated but not submitted yet. The value in the process_instance_id field for these tasks is 0 (zero). In this case, take note of the corresponding task IDs and see Work with orphan tasks.
(For workflow participants) Execute the following command to retrieve process instance IDs associated with the principal ID of the process participant for the initiator from the tb_assignment database table.
select distinct a.process_instance_id from tb_assignment a join tb_queue q on a.queue_id = q.id where q.workflow_user_id='participant_principal_id'
The query returns instance IDs for all processes associated with the participant including those where the participant has not submitted any task.
Take note of all process instance IDs for submitted tasks and continue with the steps.
For orphan tasks or tasks where the process_instance_id is 0 (zero), take note of the corresponding task IDs and see Work with orphan tasks.
A workflow can be designed such that the user data is captured in a variable that gets stored as a blob in the database. In such cases, you can query user data only if it is stored in one of the following primitive-type variables:
- String: Contains the user ID directly or as a substring and can be queried using SQL.
- Numeric: Contains the user ID directly.
- XML: Contains the user ID as a substring within the text stored as text columns in database and can be queried like strings.
Perform the following steps to determine if a workflow that stores data in primitive-type variables contains data for the user:
select database_table from omd_object_type where name='pt_<app_name>/<workflow_name>'
The query returns a table name in tb_<number> format for the specified application (app_name) and workflow (workflow_name).
The value of the name property can be complex if the workflow is nested within sub-folders inside the application. Ensure that you specify the exact full path to the workflow, which you can get from the omd_object_type database table.
Review the tb_<number> table schema. The table contains variables that store user data for the specified workflow. The variables in the table correspond to the variables in the workflow.
Identify and take note of the variable that corresponds to workflow variable containing the user ID. If the identified variable is of primitive-type, you can run a query to determine workflow instances associated with a user ID.
Now that you have identified the process instance IDs associated with a user, do the following to delete user data from the respective process instances.
Execute the following command to retrieve long-lived invocation ID and status for a process instance from the tb_process_instance table.
select long_lived_invocation_id, status from tb_process_instance where id='process_instance_id'
Create an instance of the public ProcessManager client (com.adobe.idp.workflow.client.ProcessManager) using a ServiceClientFactory instance with the correct connection settings.
For more information, see Java API reference for Class ProcessManager.
Orphan tasks are the tasks whose containing process has been initiated but not submitted yet. in this case, the process_instance_id is 0 (zero). Therefore, you cannot trace user data stored for orphan tasks using process instance IDs. However, you can trace it using the task ID for an orphan task. You can identify the tasks IDs from the tb_task table for a user as described in Identify process instance IDs when workflow initiator or participant is known.
Once you have the task IDs, do the following to purge the associated files and data with an orphan task from GDS and database.
Execute the following command on AEM Forms server database to retrieve IDs for the identified task IDs.
select id from tb_form_data where task_id=<task_id>
Depending on whether your GDS points to a file system or a database, perform one of the following steps:
GDS in file system
In the GDS file system:
a. Search for files with the following session ID strings as their extensions:
The files with these extensions are the marker files. They are stored with filenames in the following format:
b. Delete all marker files and other files with the exact filename as <file_name_guid> from the file system.
delete from tb_dm_chunk where documentid in (select documentid from tb_dm_session_reference where sessionid=<session_id>) delete from tb_dm_session_reference where sessionid=<session_id> delete from tb_dm_deletion where sessionid=<session_id>
delete from tb_task_acl where task_id=<task_id> delete from tb_task_attachment where task_id=<task_id> delete from tb_form_data where task_id=<task_id> delete from tb_assignment where task_id=<task_id> delete from tb_task where id=<task_id>