Adobe Campaign relies on external databases. It has no limitation on the size of tables. To optimize performance, large tables need to have a specific design. This article explains how to optimize the database design for larger volumes.
For a better understanding of Campaign built-in tables and their interaction, refer to the Campaign datamodel description.
- A small-size table is similar to the delivery table.
- A medium size table is the same as the size of the recipient table. It has one record per customer.
- A large-size table is similar to the broad log table. It has many records per customer.
For example, if your database contains 10 million recipients, the broad log table contains about 100 to 200 million messages, and the delivery table contains a few thousand records.
It is a best practice to design large tables with fewer fields and more numeric data.
In this example, the Transactions and Transaction Item tables are large: more than 10 million.
The Product and Store tables are smaller: less than 10,000.
The product label and reference have been placed in the Product table.
The Transaction Item table only has a link to the Product table, which is numerical.
A large table should have mostly numeric fields and contain links to reference tables.
You can use "expr" attributes to avoid duplicating fields. For instance, the Recipient table uses an expression for the domain, which is already present in the email field.
The "XML" type is a good trick to avoid creating too many fields. But it also takes up disk space as it uses a CLOB column in the database.
The drivers to ensure that a field is required to be stored in a table are: targeting purpose or personalization purpose. In other words, if a field is not used to send a personalized email or used as a criterion in a query, you have to consider that this field is useless and takes up disk space for nothing.
Keep in mind that FDA (Federated Data Access, an optional feature that allows to access external data) covers the need to add a field "on-the-fly" during a campaign process. You don't need to import everything if you have FDA.
In addition to the "Autopk" defined by default in most tables, you should consider adding some logical or business keys (account number, client number, and so on). It can be used later for imports/reconciliation or data packages.
Efficient keys are essential for performance. Numeric data types should always be preferred as keys for tables.
For SQLServer database, you could consider using "clustered index" if performance is needed. Since Adobe does not handle this, you need to create it in SQL.
Indexes are essential for performance. When you declare a key in the schema, Adobe will automatically create an index on the fields of the key. You can also declare more indexes for queries that don't use the key.
Indexes should be limited in size and number because this impacts the performance during the insertion of data.
When you design a link, make sure that the target record is unique when a 1-1 relationship has been declared. Otherwise the join may return multiple records when only one is expected. This results in errors during delivery preparation when "the query returns more rows than expected". Set the link name to the same name as the target schema.
Beware of the "own" integrity on large tables. Deleting records that have wide tables in "own" integrity can stop the instance. The table is locked and the deletions are made one by one. So it's best to use "neutral" integrity on child tables that have large volumes.
Declaring a link as an external join is not good for performance. The zero ID record emulates the external join functionality. It is not necessary to declare external joins if the link uses the autopk.
Partitioning at the database level could be a solution to optimize queries on certain tables with large volumes. Strategies differ from the objectives of the table.
- For response or reporting aggregates, partition broad logs and transactions by week or month to optimize access to the latest data.
- For distributed marketing, partition the recipient table by region or agency to optimize access to local data.
The tablespace attribute in the schema allows you to specify a dedicated tablespace for a table.
The installation wizard allows you to store objects by type (data, temporary, and index).
Dedicated tablespaces are better for partitioning, security rules, and allow fluid and flexible administration, better optimization, and performance.
Data design impacts usability and functionality. If you design your data model with lots of one-to-many relationships, it makes it more difficult for users to construct meaningful logic in the application. One-to-many filter logic can be difficult for non-technical marketers to correctly construct and understand.
It's good to have all the essential fields in one table because it makes it easier for users to build queries. Sometimes it's also good for performance to duplicate some fields across tables if it can avoid a join.
Certain built-in functionalities will not be able to reference one-to-many relationships, for example, Offer Weighting formula and Deliveries.