Spooky Stories: Chilling Temporal Anti-Patterns (part 1)

I was excited to present this topic, because my favorite part of my job is helping people learn how to use Temporal, solve their pain points, avoid perilous pitfalls, and just have fun using Temporal to make being a developer easier.

Because it's October—the spookiest time of year—I will use some of my favorite spooky movies to explain Temporal best practices and their corresponding anti-patterns in a fun, top 10 list format. If you prefer video/audio format, you can also check out the on-demand version of the original webinar.

10. Wrapping SDKs in Scary Ways

One of the things we observe at Temporal is that developers love to wrap libraries (for example, our SDKs) in other libraries, and then wrap those libraries in other libraries. (Hence, the chance for a mummy pun!)

Sometimes, wrapping the Temporal SDKs can be okay. For example, a thin "shim" layer can simplify security practices at a company, make connections to Temporal Cloud easier, or make hooking up to metrics simple for developers. Another useful pattern is building a layer on top of Temporal so that new developers can create simple workflows really easily (as long as they can also opt to use the SDKs directly). There's a fantastic Replay talk on this very topic from our friends at Cash App, "Great tech is not enough: building trust to get the most out of Temporal." Essentially, adding a little bit to the Temporal SDKs to make development easier is awesome!

The anti-pattern is wrapping the Temporal SDK so deeply that you end up hiding important features or making it really difficult to update the SDK as improvements are made upstream. We've worked hard to ensure our SDKs are idiomatic, they're open source, and if you have suggestions for improvements, we'd love to hear them!

In short: Don't wrap the SDKs too deeply, to the point that they are hard to upgrade or maintain, or to the point where you end up breaking or hiding useful SDK features. We encourage you to work with us if you want improvements!

9. Jump Scares: Not Done Yet

Idempotency and Local Activities

One of my favorite series of movies is Scream, and if you watch the Scream movies you might know that there's always a killer (sometimes two!) and without fail, once the killer's finally defeated, they come back one more time and jump scare everybody in the audience.

How does this relate to Temporal? Sometimes, people think things are done when they're not done yet. This is important when it comes to things like idempotency and Local Activities.

When you're writing a function or a method in any programming language, "idempotency" means that executing it multiple times has the same result as executing it once, so you can safely execute it more than once.

Local Activities are a variation of Activities in Temporal that execute in the same process as the Workflow Execution that spawns them. To reduce latency, they don't write their completion to the Workflow History until they have all completed.

If you put these two together (a Workflow with a series of Local Activities that are not idempotent), the consequences can be surprising.

Take, for example, a money transfer use case. You have a series of Local Activities that move money, and you are not using Idempotency Keys correctly to detect when an Activity is called more than once. Suppose your application hits a bug (for example, an API that one of the later Local Activities relies on goes down after an earlier Local Activity has already moved the money). Your Local Activity sequence will keep firing and end up moving far more money than it's supposed to, because the series hasn't completed in full yet, so nothing in the Workflow History tells it to stop.
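The money-transfer scenario can be sketched in plain Python, without the Temporal SDK. Everything in this sketch is hypothetical: the `Ledger` class, its `transfer` method, and the key format are illustrative names, not Temporal APIs. The idea is that the receiving service remembers which idempotency keys it has already applied, so a retried call with the same key does nothing.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy stand-in for a payment service (hypothetical, not a Temporal API)."""
    balances: dict[str, int]
    seen_keys: set[str] = field(default_factory=set)

    def transfer(self, src: str, dst: str, amount: int, idempotency_key: str) -> bool:
        """Move money once per key. Returns False if the key was already applied."""
        if idempotency_key in self.seen_keys:
            return False  # duplicate call (e.g. a retry): do nothing
        self.balances[src] -= amount
        self.balances[dst] += amount
        self.seen_keys.add(idempotency_key)
        return True


ledger = Ledger(balances={"alice": 100, "bob": 0})

# Simulate the same step being executed three times, as could happen when a
# series of Local Activities is re-run after a later step fails.
# Assumption: the key is built from stable identifiers (here a made-up
# workflow ID and step name), so every retry sends the same key.
key = "transfer-workflow-123/move-funds"
results = [ledger.transfer("alice", "bob", 25, key) for _ in range(3)]
```

Only the first call moves money; the two repeats are recognized by their key and skipped. Without the `seen_keys` check, the same three calls would move 75 instead of 25.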

For this reason, I generally recommend using regular Activities rather than Local Activities. If you do use Local Activities, know how they work, and always make your Temporal Activities idempotent.

See Idempotency and Durable Execution for best practices around this topic, as well as Max's community post that lays out the exact execution sequences between Local and regular Activities.

In short: Don't let your Workflow processing jump out at you and not be done yet when you think it's done.

8. Not Using the Time Machine

Use Time Travel to Save ~~your mom~~ your workflows

Like the movie Totally Killer, Temporal also gives you time travel superpowers.

Let's say you have 100,000 Workflows and they all hit the same bug in one of your Activities. Maybe that bug did some math, and now your calculations are wrong, and all of your customers have 10,000 extra widgets.

But fear not… Temporal has time travel! Every Workflow records every step within it to the Workflow Event History, and you can rewind that history using a feature called Temporal Reset. If the Worker running a Workflow crashes, its Workflow Event History is also replayed.

So in our example, you would deploy a fix for your math Activity to production, and then reset your Workflows in batch back to an earlier step. Temporal will rewind all of your Workflows back before the problematic Activity was called, and then re-execute them through the same fixed code path, and now the math is correct.
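To make the rewind-and-re-execute idea concrete, here is a toy model in plain Python. It is not how Temporal implements Reset, and none of these names (`run`, `reset`, the step names) come from the SDK; a real reset is performed by Temporal itself, not by application code like this. The sketch only shows why recorded history plus fixed code gives corrected results.

```python
from typing import Callable

Step = tuple[str, Callable[[int], int]]
History = list[tuple[str, int]]


def run(steps: list[Step], start: int, history: History) -> History:
    """Run steps in order, reusing results already recorded in history."""
    history = list(history)
    value = history[-1][1] if history else start
    for name, fn in steps[len(history):]:
        value = fn(value)
        history.append((name, value))
    return history


def reset(history: History, before_step: str) -> History:
    """Drop the named step and everything recorded after it."""
    names = [name for name, _ in history]
    return history[: names.index(before_step)]


def buggy_calculate(qty: int) -> int:
    return qty + 10_000  # the bug: 10,000 extra widgets


def fixed_calculate(qty: int) -> int:
    return qty


steps_v1 = [("validate", lambda q: q), ("calculate", buggy_calculate), ("ship", lambda q: q)]
steps_v2 = [("validate", lambda q: q), ("calculate", fixed_calculate), ("ship", lambda q: q)]

history_v1 = run(steps_v1, 5, [])            # original run, with the bug
rewound = reset(history_v1, "calculate")     # rewind to just before the buggy step
history_v2 = run(steps_v2, 5, rewound)       # re-execute with the fixed code
```

The `validate` result is kept from the original history, while `calculate` and `ship` run again through the fixed code.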

I've worked with several people who've done this in production with their workflows, and it's basically magic! You have a bug in production, you can rewind and replay with the bug fixed, and it fixes all of the workflows automatically.

For more information on this feature, see Temporal Time Traveling: Replay.

In short: Use Temporal time travel to save your mom… or your workflows, in this case.

7. An Overwhelming Amount of Tribbles

Manage workflow history size

Temporal's Workflow History is amazing: it lets you go back in time, it lets you keep track of every workflow you've ever done, and it's very performant. But one of the things you can't do is have unlimited Workflow History size, or you end up with Star Trek Tribble Trouble.

Workflow History has limits (albeit pretty high ones) in order to keep Workflow Replay and Reset performant. New Temporal users can incorrectly assume that the Workflow History can grow without limit, because Temporal is kind of magic. Unfortunately, if the Event History exceeds 50Ki (51,200) Events, the Workflow Execution is terminated.

Here are a couple of ways to keep the size of Workflow History down:

Don't keep too much data in your Temporal Workflows. If you need to work with large data, access it externally to the Workflow History.
Use Continue-As-New as needed; this passes the latest relevant state to a new Workflow Execution, with a fresh Event History.
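The Continue-As-New idea can be illustrated with a small plain-Python model. This is a sketch under stated assumptions, not SDK code: the `Run` class, the `process` function, and the tiny `MAX_EVENTS` threshold are all made up for illustration. In a real Workflow you would use your SDK's Continue-As-New call instead.

```python
from dataclasses import dataclass, field

MAX_EVENTS = 5  # tiny hypothetical threshold; the real limit mentioned above is 51,200


@dataclass
class Run:
    """One toy 'Workflow Execution' with its own event history."""
    total: int  # state carried in from the previous run
    events: list[str] = field(default_factory=list)


def process(items: list[int]) -> list[Run]:
    """Sum the items, starting a fresh run whenever the history gets too long."""
    runs = [Run(total=0)]
    for item in items:
        current = runs[-1]
        if len(current.events) >= MAX_EVENTS:
            # "Continue-As-New": pass only the latest relevant state forward
            # and start over with an empty history.
            current = Run(total=current.total)
            runs.append(current)
        current.total += item
        current.events.append(f"processed {item}")
    return runs


runs = process(list(range(1, 13)))  # 12 items
```

Twelve items end up spread across three runs, none of which exceeds the threshold, and the running total survives each handoff because it is passed forward as input rather than rebuilt from history.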

In short: Be aware that Workflow History size has a limit, and keep this in mind when designing your workflows.

6. Crossing the Streams

Test for Determinism

In the Ghostbusters movies, there's a rule: don't cross the streams of the proton packs. In Temporal, we have a rule as well. Workflows can be rewound, replayed, and reset, and to support that, Workflows must be deterministic (meaning that, given a particular input, they always produce the same output).

What you don't want is to cross one time stream with another and cause weird things to start happening to your timeline. If you need to change Workflow code, including code for a Workflow that's already running somewhere, you can do that as long as you don't break determinism. To support this, we have two features, Workflow Versioning and Workflow Patching, which both let you make changes to your Workflow time stream without breaking Replay and Reset for existing Workflows.

Note: Activities do not have to be deterministic, so you can change them without worrying about breaking Replay. The same goes for a Workflow that's still under active development. But once a Workflow is running in production, this is where Versioning and Patching come in.
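Here is a rough plain-Python model of why a code change can break Replay, and how keeping the old path for existing executions avoids it. All names are hypothetical (`replay`, `NonDeterminismError`, the `started_on_v1` flag). In the real SDKs, the Versioning and Patching features decide which branch to take; you would not pass a boolean like this.

```python
class NonDeterminismError(Exception):
    """Raised when replayed code disagrees with the recorded history."""


def workflow_v1(order_id: str) -> list[str]:
    """Original Workflow code, modeled as the sequence of commands it issues."""
    return [f"charge:{order_id}", f"ship:{order_id}"]


def workflow_v2_unsafe(order_id: str) -> list[str]:
    """Changed code that inserts a new step for every execution, old or new."""
    return [f"charge:{order_id}", f"check_fraud:{order_id}", f"ship:{order_id}"]


def workflow_v2_versioned(order_id: str, started_on_v1: bool) -> list[str]:
    """Changed code that keeps the old path for executions started on v1."""
    if started_on_v1:
        return workflow_v1(order_id)
    return workflow_v2_unsafe(order_id)


def replay(recorded: list[str], commands: list[str]) -> None:
    """Fail if the current code no longer produces the recorded commands."""
    if commands != recorded:
        raise NonDeterminismError(f"expected {recorded}, got {commands}")


recorded_history = workflow_v1("order-42")  # history written by the old code

try:
    replay(recorded_history, workflow_v2_unsafe("order-42"))
    unsafe_ok = True
except NonDeterminismError:
    unsafe_ok = False  # the inserted step does not match the history

replay(recorded_history, workflow_v2_versioned("order-42", started_on_v1=True))
versioned_ok = True  # reached only if the replay above did not raise
```

The unversioned change fails against the recorded history, while the versioned change replays cleanly and still gives new executions the extra fraud check.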

There's an excellent course on Versioning Workflows which covers this topic more in-depth, and is highly recommended.

A couple of great resources on testing for determinism are:

Replay Testing To Avoid Non-Determinism in Temporal Workflows
Altering Space-Time Continuum: Testing for Determinism in Temporal Workflows

In short: Don't cross your streams. Test for determinism and use Versioning to make sure you don't have any errors when you go to production.

5. Leaving Handy Tools Sitting There

Use Features like Signals, Updates, Polling…

Sometimes when you watch a horror movie, the heroes run right by something that would be super helpful—a fire extinguisher, a first aid kit, a crowbar, a flashlight—and it's SO frustrating! "NO! Why?? Why don't you pick up the thing that would be super helpful to you?!"

I sometimes have the same feelings when I'm helping people out with Temporal. There are so many great Temporal features in our documentation and training you can rely on, such as:

Signals, Queries, and Updates
Best practices for Activity polling (code sample)
Interactive Workflows hands-on course
Examples of how to use Temporal

In short: Don't forget to "loot the bodies" and take from / learn about all the helpful Temporal features and resources that are available.

Keep reading! Spooky Stories: Chilling Temporal Anti-Patterns (part 2)
