Issue

When you attempt to log in to the Adobe Digital Publishing Suite (DPS) portal, the DPS Folio Builder panel, or the Adobe Content Viewer on a device, an error message indicates that the login was unsuccessful, possibly because the server could not be located. This issue affects users whose network configurations are not picking up changed IP addresses in Domain Name System (DNS) entries following a change to the DPS infrastructure.

Solution

Ensure that your network responds to changes in public DNS entries and does not rely on locally cached or hard-coded IP addresses from previous DPS environments.

Background

Adobe's Digital Publishing Suite (DPS) has been migrated from a third-party data center to Amazon Web Services (AWS). The DPS Publishing service migration was completed on February 2, 2014. One of the key benefits of the migration is a more efficient load-balancing implementation.

Historically, the DPS services were load balanced by a limited solution that was slow to respond to changes in service load. As a result of the migration to AWS, the services use new IP addresses, which can change or increase in number.

The reason there is not a one-time DNS change is that DPS now takes advantage of the AWS elastic load balancing feature, where connection requests are dynamically allocated based on existing server load distribution. If current server instances are at capacity, new instances of the server are created dynamically and added to the pool of available servers. So the next time the DNS gets refreshed, your connection to DPS services may use an IP address you've never seen before.

The original server environment was one where the IP address for some DPS domain names never changed, and the new server environment is one where the DNS records will change as frequently as once per minute.

DNS Troubleshooting Tips

Slow-to-Update DNS

Some networks cache DNS lookups in the configuration of their software, in a network proxy or HTTP service, or in the software platform upon which the software is deployed (for example, JVM, .NET, PHP). The Domain Name System (DNS) is a network of name servers that resolves human-friendly names like origin.adobe-dcfs.com to the actual IP address that can be the destination for network traffic. For any given Internet host, there are typically a handful of authoritative servers that maintain the official 'addresses' for browser lookups, for mail exchange addressing information, and so on.

The authoritative name servers for digitalpublishing.acrobat.com, dpsapi2.digitalpublishing.acrobat.com, and the wildcard DNS entry *.digitalpublishing.acrobat.com are Amazon DNS servers. Specifically:

  • ns-108.awsdns-13.com
  • ns-1831.awsdns-36.co.uk
  • ns-1238.awsdns-26.org
  • ns-991.awsdns-59.net

The authoritative name servers for download.digitalpublishing.acrocomcontent.com are also Amazon DNS servers. They are:

  • ns-1067.awsdns-05.org
  • ns-935.awsdns-52.net
  • ns-258.awsdns-32.com
  • ns-1718.awsdns-22.co.uk
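
To check which name servers your own resolver reports for these domains, the output of a name-server lookup can be filtered for anything unexpected. This is a minimal sketch, assuming the dig utility is installed; the helper name non_aws_name_servers is made up for illustration, and the pattern simply matches the naming style of the servers listed above.

```shell
# Hypothetical helper: read name-server names on stdin (one per line, as
# printed by `dig +short NS <domain>`) and print any that do not follow the
# ns-NNN.awsdns-NN.<tld> naming style of the servers listed above.
non_aws_name_servers() {
  grep -v -E '^ns-[0-9]+\.awsdns-[0-9]+\.(com|net|org|co\.uk)\.?$'
}

# Example (needs network access):
#   dig +short NS digitalpublishing.acrobat.com | non_aws_name_servers
```

No output means every name server reported by your resolver matches the Amazon naming style.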

DNS entries pass through multiple DNS resolvers between Amazon and your workstation. Each resolver that caches a record counts the record's TTL (time to live) down by 1 for each second that passes and hands the remaining TTL to the next resolver in the chain. It's not an exact science, but overall it works.
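
A rough way to see the TTL from a workstation is to query the same name several times and watch the TTL value in the answers. This is a minimal sketch, assuming the dig utility is installed; the helper names print_a_record_ttls and watch_ttl are made up for illustration.

```shell
# Hypothetical helper: read `dig +noall +answer` output on stdin and print
# "TTL address" for every A record found.
print_a_record_ttls() {
  awk '$3 == "IN" && $4 == "A" { print $2, $5 }'
}

# Hypothetical helper: query a name several times (default 5), pausing a few
# seconds (default 5) between queries, so the TTL reported by the local
# resolver can be observed over time.
watch_ttl() {
  for i in $(seq "${2:-5}"); do
    dig +noall +answer "$1" A | print_a_record_ttls
    sleep "${3:-5}"
  done
}

# Example (needs network access):
#   watch_ttl digitalpublishing.acrobat.com
```

A TTL that never drops, or addresses that never change over several minutes, may point to a resolver that is holding records longer than it should.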

The problem comes into play when one of the DNS resolvers in that network chain caches the DNS record longer than it should. There are incentives for resolvers to do so, so it's not surprising. Even though your network might behave correctly and refresh its DNS every 60 seconds, it's possible that the workstation grabs a DNS record that is stale by an extra minute.

There are two consequences of this:

  • The efficiency of the load balancing is reduced – this is not something that is measurable for any specific customer but does affect all customers overall.
  • Occasionally, server instances are taken offline for various reasons – stale caches result in your workstations being unable to connect to a server until your DNS is refreshed, which could take as long as 60 seconds. This could result in erratic and intermittent connectivity.

Eternally cached DNS

If your workstation, or any of the DNS resolvers in its chain, caches DNS entries for much longer than 60 seconds, it will seem as if the DPS servers disappeared overnight, and you may see connection timeouts and even 404 errors. Unlike the slow-to-update DNS issue above, this condition is not self-healing, and corrective action is required to ensure uninterrupted use of the DPS services. This can be the default behavior in Java, and other environments can be configured in a way that causes the same condition.

If the DNS record is cached forever, DPS services may be unreachable. This is of particular concern if your solution relies upon the Java Virtual Machine for connecting to the services. However, this is not the default behavior for PHP, .NET, etc.

If an element of your solution is running in the JVM (Java Virtual Machine), there's a good chance that the default policy for the JVM is to cache any DNS lookups forever (or at least until the next restart).

And since Amazon is creating and destroying these virtual servers dynamically, your server retains an invalid IP address, at great cost.

Hence the restart.

But that's just a band-aid.

The real fix involves identifying the culprit that's caching the DNS and changing its behavior. If the software is running on top of Java, this can be done by changing a Java policy setting or by executing a specific command to disable DNS caching during startup. If you must specify a TTL, use a value of 60 (seconds) or less. If you specify a TTL greater than 60 seconds, you risk experiencing the Slow-to-Update DNS symptoms.
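
As a starting point for a Java-based installation, the current cache policy can be read from the JVM's java.security file. This is a minimal sketch; the helper name show_dns_cache_settings is made up for illustration, and the location of java.security varies by platform and Java version.

```shell
# Hypothetical helper: print the DNS cache TTL lines (active or commented
# out) found in the java.security file given as the first argument.
show_dns_cache_settings() {
  grep -E '^[[:space:]]*#?[[:space:]]*networkaddress\.cache\.(negative\.)?ttl[[:space:]]*=' "$1"
}

# Example, using the Mac path given in this article:
#   show_dns_cache_settings /Library/Java/Home/lib/security/java.security
#
# A startup-time alternative on Oracle/OpenJDK JVMs is the system property
# sun.net.inetaddr.ttl (your-app.jar is a placeholder name):
#   java -Dsun.net.inetaddr.ttl=60 -jar your-app.jar
```

A line that is still commented out means the JVM is using its built-in default for that setting.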

Determining Risk

There's an easy way to determine if a specific installation is at risk. Here's a small bash script that you can type into a Linux or Mac terminal session (on a Mac, open the Terminal application by typing 'Terminal' in the Spotlight search field in the upper-right corner of the screen):

for i in {1..10}; do curl -v \
https://cc-us1-prod.adobesc.com/healthcheck/v1 2>&1 | grep Trying; \
sleep 10; done

Set up two machines on the same network to run the same script, and start the two scripts as close to simultaneously as possible. Wait about two minutes for the scripts to finish. Each run prints the IP address that curl tried on every attempt. In an environment that is unaffected by DNS caching issues, you would expect the two machines to report different IP addresses, while an affected environment might report the same IP addresses on both machines.
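
The comparison of the two machines' results can also be scripted. This is a minimal sketch, assuming each machine's output from the loop above was saved to a file (machine-a.log and machine-b.log are placeholder names) and that curl prints lines of the form "* Trying 192.0.2.10..."; the exact wording varies between curl versions, and only IPv4 addresses are extracted here.

```shell
# Hypothetical helper: list the distinct IPv4 addresses that appear on
# curl's "Trying" lines in a saved log file.
trying_addresses() {
  grep 'Trying' "$1" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
}

# Hypothetical helper: report whether two logs saw the same set of addresses.
compare_logs() {
  if [ "$(trying_addresses "$1")" = "$(trying_addresses "$2")" ]; then
    echo "same addresses on both machines (possible DNS caching)"
  else
    echo "different addresses (DNS appears to be refreshing)"
  fi
}

# Example:
#   compare_logs machine-a.log machine-b.log
```

Two empty logs also compare as "same", so confirm that each log actually contains "Trying" lines before relying on the result.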

Testing for bad DNS caching configuration/policy

To determine if your workstation or the network it is reliant upon is at risk, first ensure that the DPS services are working as expected, then modify your /etc/hosts file by typing the following at a 'Terminal' command prompt:
sudo vi /etc/hosts

After you enter your system password when prompted, the hosts file opens in the vi editor. Use the cursor arrows to move to the beginning of a line, press the 'i' key (for insert), type the line shown below, and press RETURN. Then press ESCAPE, type :wq (loosely translated as command + write + quit), and press RETURN.
127.0.0.1 digitalpublishing.acrobat.com

At some point in the next minute, the DPS services should become unavailable. At that point, or after an extra minute or two if the connection seems to remain, remove the line that you added (do the same as above, except move the cursor to the line in question, type 'dd', and then type ':wq'). Your workstation should reconnect to the service, and you will be able to log in again.

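If you are able to continue to use the DPS service while the added line is in place, this is cause for concern: it suggests that something on your workstation or network is still using a cached DNS entry.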

Work-arounds

  • Restart the servers involved in your network chain.
  • Consult your network administrator about other DNS recommendations.
  • Change your system or platform configuration file(s) to disable DNS caching (particularly if you're using Java).
    On a Mac, you can do this by editing the file /Library/Java/Home/lib/security/java.security (precede the editing command with sudo to invoke admin privileges) and uncommenting the line that reads:
    #networkaddress.cache.ttl=-1

    Remove the # character at the beginning and change the -1 to 0, so that the line reads:
    networkaddress.cache.ttl=0

    Although you may need to restart your software, you shouldn't need to restart the server for the new settings to take effect.
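
The edit described above can be previewed with sed before the real file is touched. This is a minimal sketch; the helper name set_dns_cache_ttl and the output file name java.security.new are made up for illustration, and the helper only prints the modified text rather than changing the file in place.

```shell
# Hypothetical helper: print the contents of a java.security file (first
# argument) with the networkaddress.cache.ttl line uncommented and set to
# the TTL given as the second argument. The original file is not modified.
set_dns_cache_ttl() {
  sed -E "s/^[[:space:]]*#?[[:space:]]*networkaddress\.cache\.ttl[[:space:]]*=.*/networkaddress.cache.ttl=$2/" "$1"
}

# Example: write the modified copy next to your working directory, review
# it, and only then copy it over the original with sudo.
#   set_dns_cache_ttl /Library/Java/Home/lib/security/java.security 0 > java.security.new
```

Writing to a separate file first keeps a mistake in the pattern from damaging the JVM's security configuration.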
More details and options are in the table which follows:
DNS cache override and JDK comparison matrix

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License  Twitter™ and Facebook posts are not covered under the terms of Creative Commons.
